Information Storage

Manjunath Ramachandra (MSR School of Advanced Studies, Philips, India)
DOI: 10.4018/978-1-60566-888-8.ch011

Abstract

The success of information transfer from suppliers depends largely upon organizing the data to serve different categories of users. This calls for quick, competitive, and cost-effective solutions. To meet this need, hierarchical data representation is introduced in this chapter. A data warehouse is used as an example to explain the concept.

Introduction

The previous chapter presented different techniques for integrating data. Once the data is available in a common format, it is stored for future consumption. This chapter discusses the different storage techniques.

The right storage device has to be provided for the right content, depending upon the application, frequency of access, retrieval method, content security, underlying data transfer mechanism, etc. The devices are expected to support simultaneous multi-channel access, which calls for data caching and fast access, together with queuing and buffering policies.

Data warehousing systems consolidate the data from a large number of distributed data sources and store the integrated data for efficient data analysis and mining. The various techniques of data storage, including data abstraction in a data warehouse, are addressed in this chapter. The case study of a data warehouse is used to examine the issues with storage, and solutions are proposed.

For efficient decision making, data has to be accessible to all the players in an organization. This requires the data captured from various data sources to be stored in a homogeneous and consistent form at a centralized location, in a separate centralized database that caters to various independent applications. There are two technologies for handling the data:

  • OLTP: Online transaction processing, the transaction-oriented system

  • OLAP: Online analytical processing, the analysis-oriented system (S. Chaudhuri and U. Dayal, 1997).

These two systems are repeatedly and interchangeably used in the organization. The process of storing heterogeneous data in a homogeneous form involves three stages:

  • Data acquisition: The data is fetched from the various data sources, each stored in its own database.

  • Data cleansing: The data is purified to make it homogeneous.

  • Loading: Finally, the data is loaded into the storage system.
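
The three stages can be illustrated with a minimal Python sketch. It is not taken from the chapter: the two source systems (`SALES_DB`, `BILLING_DB`), their record layouts, the day/month/year reading of the billing date, and the `facts` table are all hypothetical, and an in-memory SQLite database stands in for the storage system.

```python
import sqlite3

# Hypothetical source systems, each keeping records in its own layout.
SALES_DB = [{"cust": " Alice ", "amount": "120.50", "date": "2009-03-01"}]
BILLING_DB = [("BOB", 80, "01/03/2009")]  # assumed day/month/year


def acquire():
    """Data acquisition: pull raw records from each source, tagged by origin."""
    for row in SALES_DB:
        yield "sales", row
    for row in BILLING_DB:
        yield "billing", row


def cleanse(source, row):
    """Data cleansing: convert a source-specific record to one common layout."""
    if source == "sales":
        name, amount, date = row["cust"], row["amount"], row["date"]
    else:
        name, amount, raw_date = row
        day, month, year = raw_date.split("/")
        date = f"{year}-{month}-{day}"
    return name.strip().title(), float(amount), date


def load(conn, records):
    """Loading: write the homogeneous records into the storage system."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts (customer TEXT, amount REAL, day TEXT)"
    )
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    load(conn, [cleanse(source, row) for source, row in acquire()])
    return conn


conn = run_pipeline()
rows = conn.execute(
    "SELECT customer, amount, day FROM facts ORDER BY customer"
).fetchall()
```

After the run, both records share one layout (name, numeric amount, ISO date) regardless of which source they came from.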

Background

Access to the right data is required for effective decision making. The data is generally spread across the organization or over the supply chain, with each part maintained in a different format. The data storage system has to collate this heterogeneous data, which calls for processing the data for storage and for subsequent retrieval and rendering. Computationally intensive algorithms have to be executed to support the depiction of complex reports.

Case Study: Data Warehouse

A data warehouse (DW) is a common means of storing data from multiple, distributed, and heterogeneous sources. A good introduction to data warehousing is provided in (W.H. Inmon and C. Kelley, 1993). Instead of collecting and integrating the data from the sources on demand at query time, data warehousing systems collect the data in advance, then cleanse and integrate it. The data warehouse masks the actual data sources from the end users, so there is less burden on these sources. When the source data changes, the change has to be reflected in the data warehouse to keep it up to date. The ready-to-use integrated information is stored in a centralized data repository. With the data available in the warehouse in usable form, end users can interact directly with the data warehouse and mine the data for analysis and subsequent use. The data warehouse caters for different users. Correspondingly, the relevant data has to be made available to each of them, with each kind of data requiring a different access technology.
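
The difference between collecting in advance and collecting at query time, and the need to propagate source changes, can be sketched in Python. The `Source` and `Warehouse` classes below are hypothetical and greatly simplified. The version counter is only one assumed way of detecting a change; the chapter does not prescribe a refresh mechanism.

```python
class Source:
    """Hypothetical operational data source that counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0      # number of full reads served to the warehouse
        self.version = 0    # bumped on every change to the source data

    def update(self, key, value):
        self.rows[key] = value
        self.version += 1

    def read_all(self):
        self.reads += 1
        return dict(self.rows)


class Warehouse:
    """Collects data in advance, so queries never touch the sources."""

    def __init__(self, sources):
        self.sources = sources
        self.seen = {}    # source name -> version last loaded
        self.store = {}   # (source name, key) -> value
        self.refresh()

    def refresh(self):
        """Reload only those sources whose data changed since the last load."""
        for name, src in self.sources.items():
            if self.seen.get(name) != src.version:
                for stale in [k for k in self.store if k[0] == name]:
                    del self.store[stale]
                for key, value in src.read_all().items():
                    self.store[(name, key)] = value
                self.seen[name] = src.version

    def query(self, name, key):
        return self.store.get((name, key))


orders = Source({"A1": 10})
stock = Source({"A1": 4})
dw = Warehouse({"orders": orders, "stock": stock})

first = dw.query("orders", "A1")        # served from the warehouse
orders.update("A1", 12)                 # the source data changes
stale = dw.query("orders", "A1")        # warehouse not yet refreshed
dw.refresh()
fresh = dw.query("orders", "A1")
```

Any number of queries leaves `orders.reads` unchanged; only `refresh` reads a source, and only when that source has changed. The sketch also shows the cost of the approach: between a source update and the next refresh, the warehouse returns the old value.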

Structure of the Warehouse Data

The stored data consists of two components: the structured part, where there is some pattern in the data, and the unstructured part, where the data is totally uncorrelated and random. The stored data caters for the needs of business decision making based on historical data, trend analysis based on predictions, etc.

The stored data can come in many forms from the heterogeneous sources, including documents, queries, reports, etc., and is finally rendered in a user-friendly manner. The issue of heterogeneity is resolved by associating a standard schema that is independent of the platform.
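
As a rough illustration of a platform-independent standard schema, the Python sketch below checks incoming items against one field list and serializes them as JSON. The field names in `STANDARD_SCHEMA` and the helper functions are assumptions made for this example, not the schema used in the chapter.

```python
import json

# Hypothetical standard schema: field name -> type every stored item must carry.
STANDARD_SCHEMA = {"source": str, "kind": str, "title": str, "body": str}


def conform(record, schema=STANDARD_SCHEMA):
    """Return a copy of the record holding exactly the schema fields.

    Fields outside the schema are dropped; a missing field is an error.
    """
    missing = [field for field in schema if field not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {field: ftype(record[field]) for field, ftype in schema.items()}


def to_portable(record):
    """Serialize a conforming record as JSON, a platform-neutral text format."""
    return json.dumps(conform(record), sort_keys=True)


report = {"source": "finance", "kind": "report", "title": "Q1 sales",
          "body": "Totals by region", "page_count": 12}
document = {"source": "legal", "kind": "document", "title": "Contract",
            "body": "Terms and conditions"}

stored = [to_portable(item) for item in (report, document)]
```

Items of different kinds end up with the same set of fields, so a rendering layer can treat them uniformly.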
