An Approach for Retrieving Faster Query Results From Data Warehouse Using Synonymous Materialized Queries

Sonali Ashish Chakraborty, Jyotika Doshi

Source Title: International Journal of Data Warehousing and Mining (IJDWM) 17(2)

DOI: 10.4018/IJDWM.2021040105

OnDemand:

(Individual Articles)

Available

$37.50

Current Special Offers

No Current Special Offers

Abstract

The enterprise data warehouse stores an enormous amount of data collected from multiple sources for analytical processing and strategic decision making. The analytical processing is done using online analytical processing (OLAP) queries where the performance in terms of result retrieval time is an important factor. The major existing approaches for retrieving results from a data warehouse are multidimensional data cubes and materialized views that incur more storage, processing, and maintenance costs. The present study strives to achieve a simpler and faster query result retrieval approach from data warehouse with reduced storage space and minimal maintenance cost. The execution time of frequent queries is saved in the present approach by storing their results for reuse when the query is fired next time. The executed OLAP queries are stored along with the query results and necessary metadata information in a relational database is referred as materialized query database (MQDB). The tables, fields, functions, relational operators, and criteria used in the input query are matched with those of stored query, and if they are found to be same, then the input query and the stored query are considered as a synonymous query. Further, the stored query is checked for incremental updates, and if no incremental updates are required, then the existing stored results are fetched from MQDB. On the other hand, if the stored query requires an incremental update of results, then the processing of only incremental result is considered from data marts. The performance of MQDB model is evaluated by comparing with the developed novel approach, and it is observed that, using MQDB, a significant reduction in query processing time is achieved as compared to the major existing approaches. The developed model will be useful for the organizations keeping their historical records in the data warehouse.

Article Preview

Top

1. Introduction

Huge amount of data collected from multiple sources is loaded into enterprise data warehouse through a process known as ETL (Extraction, Transformation and Loading) (Thareja, 2009). Analytical processing is performed on this warehouse data using OLAP queries for strategic decision making. The results are generated after traversing through enormous amount of data. The performance of OLAP queries in terms of result retrieval time is an important aspect. To generate the results of frequent OLAP queries (Thareja (2009), Gupta (2014), Han et al. (2011), Rusu et al. (2004), Rusu et al. (2005), Tjioe and Taniar (2005)) the data warehouse is invoked repeatedly and is quite time-consuming resulting in performance overhead on the system. The OLAP queries may also use data marts (Thareja, 2009) for generating results which are subset of data warehouse. The data marts store data based on the needs of the users and contain less amount of data as compared to a data warehouse. Depending on the source of data, the data marts are classified as dependent data mart or independent data mart. The type of data mart used in the present study is dependent data mart where the source of data is the data warehouse as depicted in Figure 1.

Figure 1.

Data Warehouse generated with data from multiple sources and generating dependent Data Marts

The two major existing approaches used for retrieving query results from a data warehouse are multidimensional data cubes and materialized views. The multidimensional data cubes (Gupta (2014) and Han et al. (2011)) are used to store the results of aggregate queries while materialized views (Gupta et al. (1993) and Gupta and Mumick (1995)) store query results along with the view definition. The major issues faced while using data cubes and materialized views is explained in detail in section 2.

In the present study, the executed OLAP queries are stored along with their results and some necessary metadata information into a relational database referred here as MQDB (Materialized Query Database) (Chakraborty and Doshi, 2018a). The metadata of a query includes timestamp, frequency, threshold, number of records in output, path of result table and path of data mart (for processing incremental data). When an OLAP query is fired, first it is determined if its synonymous query exists in MQDB. If the tables, fields, functions and criteria of the input query and stored query are same then they generate same results and therefore these queries are considered as synonymous queries. For a synonymous query, the requirement of incremental update is determined. If no incremental updates are required then the stored results are fetched from MQDB (Chakraborty and Doshi, 2018a). For the synonymous queries requiring an incremental update, incremental results are generated using data marts. Generating incremental results of the query using data mart is faster due to less number of records as compared to a data warehouse (Chakraborty and Doshi, 2018b). Thereafter, the final results are derived by combining stored results with the incremental results using arithmetic operations.

Top

2. Major Existing Approaches And Drawbacks

The two major existing approaches used for retrieving query results from a data warehouse are using multidimensional data cubes and materialized views. A brief discussion about the approaches with their issues is presented in section 2.1 and 2.2.

Complete Article List

Search this Journal:

Reset

Volume 20: 1 Issue (2024)

Volume 19: 6 Issues (2023)

Volume 18: 4 Issues (2022): 2 Released, 2 Forthcoming

Volume 17: 4 Issues (2021)

Volume 16: 4 Issues (2020)

Volume 15: 4 Issues (2019)

Volume 14: 4 Issues (2018)

Volume 13: 4 Issues (2017)

Volume 12: 4 Issues (2016)

Volume 11: 4 Issues (2015)

Volume 10: 4 Issues (2014)

Volume 9: 4 Issues (2013)

Volume 8: 4 Issues (2012)

Volume 7: 4 Issues (2011)

Volume 6: 4 Issues (2010)

Volume 5: 4 Issues (2009)

Volume 4: 4 Issues (2008)

Volume 3: 4 Issues (2007)

Volume 2: 4 Issues (2006)

Volume 1: 4 Issues (2005)

View Complete Journal Contents Listing

MLA

APA

Chicago

Export Reference

An Approach for Retrieving Faster Query Results From Data Warehouse Using Synonymous Materialized Queries

Abstract

1. Introduction

2. Major Existing Approaches And Drawbacks

Complete Article List