Facilitate Effective Decision-Making by Warehousing Reduced Data: Is It Feasible?

Faten Atigui (Centre d'Etude et De Recherche en Informatique et Communications (CEDRIC), Conservatoire National des Arts et Métiers, Paris, France), Franck Ravat (Institut de Recherche en Informatique de Toulouse (IRIT), Université Toulouse I Capitole, Toulouse, France), Jiefu Song (Institut de Recherche en Informatique de Toulouse (IRIT), Université Toulouse I Capitole, Toulouse, France), Olivier Teste (Institut de Recherche en Informatique de Toulouse (IRIT), Université Toulouse II Jean Jaurès, Toulouse, France) and Gilles Zurfluh (Institut de Recherche en Informatique de Toulouse (IRIT), Université Toulouse I Capitole, Toulouse, France)
Copyright: © 2015 | Pages: 29
DOI: 10.4018/ijdsst.2015070103


The authors aim to provide a solution for reducing a multidimensional data warehouse according to analysts' needs: the solution specifies aggregated schemas, each applicable over a period of time, and retains only the data that is useful for decision support. First, they describe a conceptual model for multidimensional data warehouses. The schema of a multidimensional data warehouse is composed of a set of states. Each state is defined as a star schema composed of one fact and its related dimensions. One state is derived from another through a combination of reduction operators. Second, they present a meta-model for managing the different states of a multidimensional data warehouse. Both reduced and unreduced multidimensional data warehouse schemas can be defined by instantiating the meta-model. Finally, they describe their experimental assessments and discuss the results. The evaluation executes different queries in four contexts: an unreduced single fact table, an unreduced relational star schema, a reduced star schema and a reduced snowflake schema. The authors show that queries are computed more efficiently within a reduced star schema.
Article Preview

Reducing data allows us both to decrease the quantity of irrelevant data in decision making and to increase future analysis quality (Udo & Afolabi, 2011). In the context of decision support, data reduction is a technique originally used in the field of data mining (Okun & Priisalu, 2007; Udo & Afolabi, 2011).

In the data warehouse (DW) context, Garcia-Molina, Labio, and Yang (1998) were the first to define solutions for data deletion. More precisely, they study how data can expire from materialized views in such a way that the views are not affected and can still be maintained after updates, with the help of a set of standard predefined views.

In the multidimensional area, Chen et al. (2002) propose an architecture that integrates data streams into a multidimensional data warehouse (MDW) and reduces its size. The size reduction is predefined and executed automatically by partially aggregating the data cube, so that detailed information is available only during a given time interval. Nevertheless, this work focuses only on the fact table. Skyt et al. (2008) present a technique for progressive data aggregation of a fact. Their study specifies criteria for aggregating the data of a fact to higher levels of its dimensions. The authors also provide techniques to query reduced multidimensional objects. As noted by Iftikhar and Pedersen (2011), this work is highly theoretical and does not provide a concrete example of an implementation strategy. Iftikhar and Pedersen (2011) propose a gradual data aggregation solution that covers design, implementation and evaluation. This solution is based on a table containing different temporal granularities: second, minute, hour, month and year.
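
To make the idea of gradual aggregation concrete, the following is a minimal Python sketch, not the method of any of the cited papers. It keeps recent fact rows at full detail and rolls older rows up to coarser temporal granularities. The age thresholds, the function names (`granularity_key`, `reduce_facts`) and the choice of only two coarse levels (month and year) are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical age thresholds in days; these values are illustrative
# assumptions and do not come from the cited work.
MONTH_AFTER_DAYS = 90    # rows at least this old are kept per month
YEAR_AFTER_DAYS = 730    # rows at least this old are kept per year


def granularity_key(ts, now):
    """Return the (level, key) at which a fact timestamp is retained."""
    age_days = (now - ts).days
    if age_days >= YEAR_AFTER_DAYS:
        return ("year", ts.strftime("%Y"))
    if age_days >= MONTH_AFTER_DAYS:
        return ("month", ts.strftime("%Y-%m"))
    return ("detail", ts.isoformat())


def reduce_facts(facts, now):
    """Sum the measure of (timestamp, amount) rows per retained granularity.

    Detail rows that share the exact same timestamp are also summed.
    """
    totals = defaultdict(float)
    for ts, amount in facts:
        totals[granularity_key(ts, now)] += amount
    return dict(totals)


if __name__ == "__main__":
    now = datetime(2015, 7, 1)
    facts = [
        (datetime(2015, 6, 30, 10, 0, 0), 5.0),  # recent: kept as detail
        (datetime(2015, 1, 10), 2.0),            # older: rolled up to month
        (datetime(2015, 1, 20), 3.0),
        (datetime(2012, 3, 1), 4.0),             # oldest: rolled up to year
        (datetime(2012, 11, 5), 6.0),
    ]
    for key, total in sorted(reduce_facts(facts, now).items()):
        print(key, total)
```

Note that this sketch, like the approaches discussed above, reduces only the fact rows along the time axis; it does not touch the dimensions or the rest of the schema.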

These previous studies focus only on the fact table. Iftikhar and Pedersen (2010, 2011) use a temporal table for gradual data reduction. None of the previous work takes analysts' needs into account. Our goal is more ambitious: we aim to study the reduction of the complete multidimensional schema, driven only by the users' needs. We intend to provide a consistent analysis environment and thus facilitate the analyst's task by limiting the analysis to semantically consistent data.
