Materialized View Selection using Marriage in Honey Bees Optimization

Biri Arun (School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India) and T.V. Vijay Kumar (School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India)
Copyright: © 2015 | Pages: 25
DOI: 10.4018/IJNCR.2015070101

Abstract

A data warehouse is designed to cater to the strategic decision-making needs of an organization. Most queries posed on it are on-line analytical queries, which are complex and computation intensive and have high response times when processed against a large data warehouse. This response time can be substantially reduced by materializing pre-computed summarized views and storing them in the data warehouse. Not all possible views can be materialized, owing to storage space constraints, and selecting an optimal subset of views has been shown to be an NP-Complete problem. This paper addresses the view selection problem by selecting a beneficial set of views, from amongst all possible views, using the swarm intelligence technique Marriage in Honey Bees Optimization (MBO). An MBO based view selection algorithm (MBOVSA), which aims to select views that incur the minimum total cost of evaluating all the views (TVEC), is proposed. In MBOVSA, the search is intensified by incorporating a royal jelly feeding phase into MBO. Compared with HRUA, the most fundamental greedy view selection algorithm, MBOVSA is able to select comparatively better quality views.

1. Introduction

The introduction of computing systems has increasingly led to the computerization of day-to-day business operations. Various applications are written to collect, process and store business transaction data in centralized operational databases. These Online Transaction Processing (OLTP) systems (Inmon, 2005) carry out the business operations of all the departments of an organization. However, these systems fail to cater adequately to the strategic decision-making needs of an organization, since no data analytical capabilities are built into them. This led to an information crisis. The main cause of this crisis was not a lack of data, but the lack of integrated, coherent, time-stamped, subject-oriented and archived data stored within a common database. The other cause was the lack of an architecture to support data analytical processing. The data warehouse, or information house, was designed to address this information crisis (Inmon, 2005; Rainardi, 2008). A data warehouse is a centralized database of historical, subject-oriented, time-variant, non-volatile and integrated data drawn from multiple, heterogeneous, remote and independent operational databases, for the purpose of data analysis to support business decision making (Choong et al., 2007; Hoffer et al., 2005; Song and Gao, 2010). Data from the operational databases of OLTP systems are extracted, transformed, cleaned and loaded into the data warehouse. In a relational data warehouse the data is stored in the form of de-normalized relations, and in multidimensional databases it is stored in the form of data cubes, since these forms are well suited to data analysis (Golfarelli et al., 2004; Gray et al., 1997; Agarwal et al., 1996; Gyssens and Lakshmanan, 1997; Kumar et al., 2006; Neimi et al., 2001).
Online analytical processing (OLAP) is a chain of interactive analyses performed by a business analyst on the data residing in a data warehouse in order to discover business trends, compare different variables and identify contributing factors and other information about an organization (Cabuboo and Torlone, 1998; Choong et al., 2007; Codd et al., 1993; Rainardi, 2008). OLAP queries are complex and computation intensive, yet they require a short response time of a few seconds or minutes. When OLAP queries are posed directly on the raw data in the base tables of a data warehouse, answers can take hours or even days (Chaudhari and Dayal, 1997; Agarwal et al., 1996). To answer OLAP queries efficiently, indices and materialized views have been widely used. OLAP query processing time can be substantially reduced by materializing pre-computed summarized tables (Chirkova et al., 2002). The issues that arise with materialized views include the selection of views for materialization, the use of materialized views to answer queries and the efficient maintenance of materialized views (Chaudhari and Dayal, 1997). This paper focuses on the view selection issue, which is discussed next.
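
How materializing summarized views lowers query cost can be illustrated with a small sketch. It is not taken from the paper. It assumes the linear cost model often used with greedy view selection, in which evaluating a view costs as many rows as the smallest materialized view it can be computed from, and it assumes that the root view (the base data) is always available. The dimension names, the view sizes and the function names `evaluation_cost` and `tvec` are hypothetical and chosen only for illustration.

```python
# Minimal sketch of a total view evaluation cost (TVEC) calculation.
# Assumptions (not from the source): a linear cost model, a three-dimension
# lattice, and made-up view sizes expressed in rows.

DIMENSIONS = ("part", "supplier", "customer")


def view(*dims):
    """A view is identified by the set of dimensions it groups by."""
    return frozenset(dims)


# The root view groups by every dimension and stands in for the base data.
ROOT = view(*DIMENSIONS)

# Hypothetical row counts for every view in the lattice.
SIZES = {
    view("part", "supplier", "customer"): 100,
    view("part", "supplier"): 50,
    view("part", "customer"): 75,
    view("supplier", "customer"): 20,
    view("part"): 30,
    view("supplier"): 10,
    view("customer"): 15,
    view(): 1,
}


def evaluation_cost(target, materialized):
    """Cost of evaluating `target`: size of its smallest materialized ancestor.

    A materialized view m can answer `target` when target's dimensions are a
    subset of m's dimensions (a view counts as its own ancestor).
    """
    return min(SIZES[m] for m in materialized if target <= m)


def tvec(selected):
    """Sum of evaluation costs over all views, given the selected views."""
    materialized = set(selected) | {ROOT}  # the root is always available
    return sum(evaluation_cost(v, materialized) for v in SIZES)


if __name__ == "__main__":
    print("no extra views:", tvec(set()))
    print("supplier-customer:", tvec({view("supplier", "customer")}))
    print("plus part-supplier:",
          tvec({view("supplier", "customer"), view("part", "supplier")}))
```

Under these assumptions, each additional materialized view can only lower or preserve the total, which is why a selection algorithm must trade the reduction in cost against the limited storage space.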
