Since its origin in the 1970’s research and development into databases systems has evolved from simple file storage and processing systems to complex relational databases systems, which have provided a remarkable contribution to the current trends or environments. Databases are now such an integral part of day-to-day life that often people are unaware of their use. For example, purchasing goods from the local supermarket is likely to involve access to a database. In order to retrieve the price of the item, the application program will access the product database. A database is a collection of related data and the database management system (DBMS) is software that manages and controls access to the database (Elmasri & Navathe, 2004).
A data warehouse is a specialized type of database. More specifically, a data warehouse is a “repository (or archive) of information gathered from multiple sources, stored under a unified schema, at a single site” (Silberschatz, Korth, & Sudarshan, 2002, p. 843). Chaudhuri and Dayal (1997) consider that a data warehouse should be separately maintained from the organization’s operational database since the functional and performance requirements of online analytical processing (OLAP) supported by data warehouses are quite different from those of the online transaction processing (OLTP) traditionally supported by the operational database.
OLAP Versus OLTP
Two reasons why traditional OLTP is not suitable for data warehousing are presented: (a) Given that operational databases are finely tuned to support known OLTP workloads, trying to execute complex OLAP queries against the operational databases would result in unacceptable performance. Furthermore, decision support requires data that might be missing from the operational databases; for instance, understanding trends or making predictions requires historical data, whereas operational databases store only current data. (b) Decision support usually requires consolidating data from many heterogeneous sources: these might include external sources such as stock market feeds, in addition to several operational databases. The different sources might contain data of varying quality, or use inconsistent representations, codes and formats, which have to be reconciled.
Traditional Online Transaction Processing (OLTP)
Traditional relational databases have been used primarily to support OLTP systems. The transactions in an OLTP system usually retrieve and update a small number of records accessed typically on their primary keys. Operational databases tend to be hundreds of megabytes to gigabytes in size and store only current data (Ramakrishnan & Gehrke, 2003).
Figure 1 shows a simple overview of the OLTP system. The operational database is managed by a conventional relational DBMS. OLTP is designed for day-to-day operations. It provides a real-time response. Examples include Internet banking and online shopping.
Online Analytical Processing (OLAP)
OLAP is a term that describes a technology that uses a multi-dimensional view of aggregate data to provide quick access to strategic information for the purposes of advanced analysis (Ramakrishnan & Gehrke, 2003).
OLAP supports queries and data analysis on aggregated databases built in data warehouses. It is a system for collecting, managing, processing and presenting multidimensional data for analysis and management purposes (Figure 2). There are two main implementation methods to support OLAP applications: relational OLAP (ROLAP) and multidimensional OLAP (MOLAP).