An Objective Function for Evaluation of Fragmentation Schema in Data Warehouse

Hacène Derrar (USTHB, Algeria), Omar Boussaid (University of Lyon 2, France) and Mohamed Ahmed-Nacer (USTHB, Algeria)
DOI: 10.4018/978-1-4666-5888-2.ch188

Chapter Preview



There are three fragmentation approaches: vertical fragmentation (Bouakkaz, 2012; Navathe, 1984), horizontal fragmentation (Ceri, 1982), and hybrid fragmentation (Gorla, 2012, pp. 559-576; Ziyati, 2006). Vertical Fragmentation (VF) divides a relation into partitions with different schemas by projection, duplicating the key in each partition. It groups together the attributes that are frequently accessed by the same queries.
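As an illustrative sketch (not taken from the chapter), vertical fragmentation can be expressed over rows represented as dictionaries, with the key replicated into every fragment; the relation, attribute names, and function name below are all hypothetical:

```python
def vertical_fragment(rows, key, attribute_groups):
    # One fragment per attribute group; the key is duplicated in each
    # fragment so the original relation can be rebuilt by joining on it.
    fragments = []
    for group in attribute_groups:
        fragments.append([{a: row[a] for a in (key, *group)} for row in rows])
    return fragments

sales = [
    {"id": 1, "product": "tv", "price": 400, "region": "east"},
    {"id": 2, "product": "pc", "price": 900, "region": "west"},
]

frags = vertical_fragment(sales, "id", [("product", "price"), ("region",)])
# frags[0] holds (id, product, price); frags[1] holds (id, region)
```

Grouping `product` and `price` together reflects the idea that attributes frequently accessed by the same queries should end up in the same fragment.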

Horizontal Fragmentation (HF) divides a relation into partitions that share the same schema, using query predicates. Each partition preserves a subset of the tuples according to restriction criteria, which reduces query processing costs by minimizing the number of irrelevant instances accessed. Two versions of HF are cited in the literature: primary HF and derived HF. Primary HF of a relation is performed using predicates defined on that relation. Derived HF, on the other hand, partitions a relation according to predicates defined on another relation.
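Both variants can be sketched in a few lines; this is an illustrative example, not the chapter's algorithm, and the `customers`/`orders` relations are invented. Primary HF applies predicates to a relation directly, while derived HF makes each order follow the fragment of the customer it references:

```python
def horizontal_fragment(rows, predicates):
    # Primary HF: each predicate selects the tuples of one fragment.
    return [[row for row in rows if pred(row)] for pred in predicates]

customers = [
    {"cid": 1, "region": "east"},
    {"cid": 2, "region": "west"},
]
orders = [
    {"oid": 10, "cid": 1},
    {"oid": 11, "cid": 2},
    {"oid": 12, "cid": 1},
]

# Primary HF of customers, using predicates defined on customers itself.
cust_frags = horizontal_fragment(
    customers,
    [lambda r: r["region"] == "east", lambda r: r["region"] == "west"],
)

# Derived HF of orders: each order is placed in the fragment that
# contains the customer it references.
order_frags = [
    [o for o in orders if o["cid"] in {c["cid"] for c in frag}]
    for frag in cust_frags
]
```

Every fragment keeps the full schema of its relation; only the set of tuples differs, which is what lets a query touching only "east" customers skip the "west" fragments entirely.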

Finally, hybrid fragmentation consists of either horizontal fragments that are subsequently vertically fragmented or vertical fragments that are subsequently horizontally fragmented.
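One of the two hybrid orderings (horizontal first, then vertical) can be sketched by composing the previous ideas; again the data and names are hypothetical, a minimal illustration rather than the chapter's method:

```python
def hybrid_fragment(rows, predicates, key, attribute_groups):
    # Horizontal pass first, then each horizontal fragment is
    # split vertically, with the key duplicated in every piece.
    horizontal = [[r for r in rows if p(r)] for p in predicates]
    return [
        [[{a: r[a] for a in (key, *g)} for r in frag] for g in attribute_groups]
        for frag in horizontal
    ]

sales = [
    {"id": 1, "price": 400, "region": "east"},
    {"id": 2, "price": 900, "region": "west"},
]

grid = hybrid_fragment(
    sales,
    [lambda r: r["region"] == "east", lambda r: r["region"] == "west"],
    "id",
    [("price",), ("region",)],
)
# grid[h][v] is the v-th vertical fragment of the h-th horizontal fragment
```

The result is a grid of fragments, which is why hybrid schemes can match a workload on both its predicates and its attribute-access patterns at the cost of a larger search space.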

Key Terms in this Chapter

OLAP: On-Line Analytical Processing is an approach to answering multi-dimensional analytical queries swiftly. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing, and data mining.

Data Fragmentation Schema: A division of the data of a table or relation into fragments such that, for any two fragments, the data set of one does not overlap with the data set of the other.

Workload: The set of queries performed in a given period of time. The term also refers to a computer system's capacity to handle and process work: components such as servers or database systems are often assigned an expected workload upon creation, and their performance is then analyzed over time against that expected workload.

Data Warehouse: A database used for reporting and data analysis. Integrating data from one or more disparate sources creates a central repository of data, a Data Warehouse (DW). Data warehouses store current and historical data and are used to create trending reports for senior management, such as annual and quarterly comparisons.

Square Error: One of many ways to quantify the difference between the values implied by an estimator and the true values of the quantity being estimated.
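As a concrete illustration of this definition (the vectors below are invented), the square error of a set of estimates is the sum of squared differences from the true values:

```python
def squared_error(estimates, true_values):
    # Sum of squared differences between estimated and true values.
    return sum((e - t) ** 2 for e, t in zip(estimates, true_values))

# (2.0 - 2.5)^2 + (3.5 - 3.0)^2 = 0.25 + 0.25 = 0.5
squared_error([2.0, 3.5], [2.5, 3.0])
```

Squaring penalizes large deviations more heavily than small ones and makes the measure independent of the sign of each error.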
