1. Introduction
A data warehouse has become an essential component of the information systems of almost every enterprise. It stores subject-oriented, time-stamped, non-volatile and integrated data in the form of multidimensional data cubes to support fast, complex data analysis for decision making (Inmon, 2005; Rainardi, 2008; Choong, et al., 2007; Yu et al., 2004; Shukla et al., 1998). The size of a data warehouse and the complexity of analytical queries can significantly delay query response time. Complex analytical queries posed directly against the base dimension tables of a data cube implemented using a star schema in a data warehouse can take days to answer, whereas the general requirement for Online Analytical Processing (OLAP) query execution time is a few seconds or minutes (Chirkova et al., 2001; Chaudhuri and Dayal, 1997; Gupta, 1997; Agarwal et al., 1996; Kumar et al., 2006). As query execution time increases, the availability of information for strategic and tactical business decision making in real time is greatly reduced. OLAP query execution time can be greatly reduced by techniques such as query optimizers, query evaluation techniques, indexing strategies and materialized views, which in turn supports decision making (Harinarayan et al., 1996; Gyssens and Lakshmanan, 1997; Niemi et al., 2001; Gray et al., 1996; Golfarelli et al., 2004). This paper focuses on the use of materialized views to improve OLAP query execution time in a data warehouse environment. An analytical query defines a view, in whose context it analyzes a fact to arrive at business decisions (Shukla et al., 1998; Shukla et al., 2000). A view whose fact (aggregate) has been pre-computed and stored is a materialized view (Kotidis, 2002). OLAP queries involving computationally expensive joins and aggregations are generally materialized to improve query performance (Chaudhuri and Dayal, 1997; Shukla et al., 1998; Yang et al., 1997).
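The idea of materializing an aggregate can be illustrated with a minimal sketch. It assumes a toy star schema (a hypothetical `sales` fact table and `product` dimension table) held in an in-memory SQLite database. SQLite has no native materialized views, so the sketch emulates one by storing the query result in an ordinary table with `CREATE TABLE ... AS SELECT`. All table, column and function names here are illustrative assumptions, not taken from the paper.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny, hypothetical star schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE sales (product_id INTEGER, amount REAL);
        INSERT INTO product VALUES (1, 'books'), (2, 'books'), (3, 'toys');
        INSERT INTO sales VALUES (1, 10.0), (2, 5.0), (3, 7.5), (1, 2.5);
        """
    )
    return conn


# An analytical query: a join between fact and dimension plus an aggregation.
AGGREGATE_QUERY = """
    SELECT p.category AS category, SUM(s.amount) AS total_amount
    FROM sales AS s
    JOIN product AS p ON p.product_id = s.product_id
    GROUP BY p.category
"""


def materialize(conn, name, query):
    """Emulate a materialized view by storing the query result as a table.

    `name` is trusted here; a real system would validate the identifier.
    """
    conn.execute(f"CREATE TABLE {name} AS {query}")


conn = build_demo_warehouse()
materialize(conn, "mv_sales_by_category", AGGREGATE_QUERY)

# Answering from the base tables repeats the join and aggregation each time.
from_base = sorted(conn.execute(AGGREGATE_QUERY).fetchall())

# Answering from the stored result is a plain scan of pre-computed rows.
from_view = sorted(
    conn.execute(
        "SELECT category, total_amount FROM mv_sales_by_category"
    ).fetchall()
)
```

Both paths return the same rows, but the second avoids recomputing the join and aggregation. The stored table does not update itself when new sales arrive, which is why view maintenance is a concern in its own right.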
The challenges associated with materialized views are: identifying views for materialization, efficiently computing the selected views from the base tables, using materialized views to answer queries, and efficiently updating materialized views when new data arrives in the data warehouse (Golfarelli et al., 2004; Gupta et al., 1997; Gupta and Mumick, 2005). Some computationally expensive queries are posed more frequently than others; pre-computing and materializing such queries would enable the system to respond quickly. Furthermore, some queries may help in answering many other queries and may thereby improve the system’s query performance. Identifying a set of such queries over a database schema, given a query workload and a storage space constraint, is referred to as view selection (Chaudhuri and Dayal, 1997; Shukla et al., 1998).
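The view selection problem described above can be made concrete with a small sketch. It is not the method proposed in this paper. It is one simple greedy heuristic that repeatedly picks the candidate view with the largest reduction in workload cost per unit of storage until no remaining candidate fits in the space budget. The cost model, view names, sizes, frequencies and costs are all hypothetical values chosen for illustration.

```python
def query_cost(query, selected, base_cost, view_cost):
    """Cheapest way to answer `query`: the base tables or any selected view."""
    options = [
        view_cost[(query, view)]
        for view in selected
        if (query, view) in view_cost
    ]
    return min(options + [base_cost[query]])


def workload_cost(selected, frequency, base_cost, view_cost):
    """Frequency-weighted total cost of the whole query workload."""
    return sum(
        freq * query_cost(query, selected, base_cost, view_cost)
        for query, freq in frequency.items()
    )


def greedy_view_selection(view_size, frequency, base_cost, view_cost, space_budget):
    """Greedily pick views by cost reduction per unit of storage used."""
    selected = []
    remaining = space_budget
    current = workload_cost(selected, frequency, base_cost, view_cost)
    while True:
        best_view, best_ratio, best_cost = None, 0.0, current
        for view, size in view_size.items():
            if view in selected or size > remaining:
                continue  # already chosen, or does not fit in the budget
            cost = workload_cost(selected + [view], frequency, base_cost, view_cost)
            ratio = (current - cost) / size
            if ratio > best_ratio:
                best_view, best_ratio, best_cost = view, ratio, cost
        if best_view is None:
            return selected, current  # nothing fits or nothing helps
        selected.append(best_view)
        remaining -= view_size[best_view]
        current = best_cost


# Hypothetical candidate views (storage units) and query workload.
view_size = {"by_category": 10, "by_category_month": 40, "by_store": 30}
frequency = {"q_category": 50, "q_category_month": 20, "q_store": 5}
base_cost = {"q_category": 1000, "q_category_month": 1000, "q_store": 1000}
# Cost of answering a query from a view that is able to answer it.
view_cost = {
    ("q_category", "by_category"): 10,
    ("q_category", "by_category_month"): 40,
    ("q_category_month", "by_category_month"): 40,
    ("q_store", "by_store"): 30,
}

selected, total = greedy_view_selection(
    view_size, frequency, base_cost, view_cost, space_budget=50
)
```

With a budget of 50 storage units, this toy run selects `by_category` and then `by_category_month`, leaving no room for `by_store`. The frequently posed query and the view that answers several queries are favoured, mirroring the two observations in the paragraph above. A greedy choice of this kind is not guaranteed to be optimal.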