1. Introduction
The introduction of computing systems has increasingly led to the computerization of day-to-day business operations. Various applications are written to collect, process and store business transaction data in centralized operational databases. These Online Transaction Processing (OLTP) systems (Inmon, 2005) carry out the business operations of all the departments of an organization. However, these systems failed to adequately cater to the strategic decision-making needs of an organization, since no data analytical capabilities were built into them. This led to an information crisis. The main cause of this crisis was not a lack of data, but the lack of integrated, coherent, time-stamped, subject-oriented and archived data stored within a common database. The other cause was the lack of an architecture to support data analytical processing. The data warehouse, or information house, was designed to address this information crisis (Inmon, 2005; Rainardi, 2008). A data warehouse is a centralized database of historical, subject-oriented, time-variant, non-volatile and integrated data drawn from multiple, heterogeneous, remote and independent operational databases, for the purpose of data analysis to support business decision making (Choong et al., 2007; Hoffer et al., 2005; Song and Gao, 2010). Data from operational databases in OLTP systems are extracted, transformed, cleaned and loaded into a data warehouse. In a relational data warehouse, the data are stored in the form of de-normalized relations, and in multidimensional databases they are stored in the form of data cubes, since these forms are well suited to data analysis (Golfarelli et al., 2004; Gray et al., 1997; Agarwal et al., 1996; Gyssens and Lakshmanan, 1997; Kumar et al., 2006; Neimi et al., 2001).
Online analytical processing (OLAP) is a chain of interactive analyses performed by a business analyst on the data residing in a data warehouse in order to identify business trends, compare different variables, and determine the contributing factors and other information about an organization (Cabuboo and Torlone, 1998; Choong et al., 2007; Codd et al., 1993; Rainardi, 2008). OLAP queries are complex and computation-intensive, yet they require a short response time of a few seconds or minutes. When OLAP queries are posed directly on the raw data in the base tables of a data warehouse, answers can take hours or days (Chaudhari and Dayal, 1997; Agarwal et al., 1996). To answer OLAP queries efficiently, indices and materialized views have been widely used. In particular, OLAP query processing time can be substantially reduced by materializing pre-computed summarized tables (Chirkova et al., 2002). Among the issues with materialized views are the selection of views for materialization, the use of materialized views to answer queries, and the efficient maintenance of materialized views (Chaudhari and Dayal, 1997). This paper focuses on the view selection issue, which is discussed next.
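As a minimal illustration of why a pre-computed summary table helps, the following sketch uses Python's built-in sqlite3 module. The table names (sales, mv_sales_region_year), the columns and the sample rows are hypothetical and are not taken from any system discussed here. SQLite has no native materialized views, so the sketch assumes an ordinary table created from an aggregate query as a stand-in.

```python
import sqlite3


def build_demo():
    """Create an in-memory base table and one materialized summary table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical base (fact) table holding raw transaction data.
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL)"
    )
    rows = [
        ("North", "Widget", 2009, 100.0),
        ("North", "Widget", 2010, 150.0),
        ("North", "Gadget", 2010, 50.0),
        ("South", "Widget", 2010, 200.0),
        ("South", "Gadget", 2009, 75.0),
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    # Stand-in for a materialized view: the aggregate is computed once and stored.
    conn.execute(
        "CREATE TABLE mv_sales_region_year AS "
        "SELECT region, year, SUM(amount) AS total "
        "FROM sales GROUP BY region, year"
    )
    return conn


def total_by_region_from_base(conn):
    """Answer the query by scanning the raw base table."""
    cur = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(cur.fetchall())


def total_by_region_from_view(conn):
    """Answer the same query by rolling up the smaller summary table."""
    cur = conn.execute(
        "SELECT region, SUM(total) FROM mv_sales_region_year GROUP BY region"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = build_demo()
    print(total_by_region_from_base(conn))
    print(total_by_region_from_view(conn))
```

Both functions return the same totals, but the second reads the summary table, which holds one row per (region, year) pair rather than one row per transaction. On a realistic warehouse this difference in row counts is what reduces response time. Which summaries are worth storing is the view selection issue named above.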