A Data Warehouse (DW) is a collection of historical data, built by gathering and integrating data from several sources, which supports decision-making processes (Inmon, 1992). On-Line Analytical Processing (OLAP) applications provide users with a multidimensional view of the DW and the tools to manipulate it (Codd, 1993). In this view, a DW is seen as a set of dimensions and cubes (Torlone, 2003). A dimension represents a business perspective under which data analysis is performed; it is organized in a hierarchy of levels that correspond to different ways of grouping its elements (e.g., the Time dimension is organized as a hierarchy with days at the lowest level and months and years at higher levels). A cube represents the factual data on which the analysis is focused and associates measures (e.g., in a store chain, a measure is the quantity of products sold) with coordinates defined over a set of dimension levels (e.g., product, store, and day of sale). Queries then aim at aggregating measures at various levels. DWs are often implemented using multidimensional or relational DBMSs. Multidimensional systems directly support the multidimensional data model, while a relational implementation typically employs star schemas (or variations thereof), where a fact table containing the measures references a set of dimension tables.
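The relationship between a cube, a dimension hierarchy, and aggregation can be illustrated with a minimal Python sketch. The names `TIME_LEVELS`, `SALES`, and `roll_up`, as well as the sample figures, are assumptions made up for illustration; they do not come from any particular OLAP product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical Time dimension: each level maps a day to its member at that level.
TIME_LEVELS = {
    "day": lambda d: d.isoformat(),
    "month": lambda d: f"{d.year}-{d.month:02d}",
    "year": lambda d: str(d.year),
}

# Hypothetical cube cells: coordinates (product, store, day) -> quantity sold.
SALES = {
    ("pen", "S1", date(2003, 1, 15)): 10,
    ("pen", "S1", date(2003, 1, 20)): 5,
    ("pen", "S2", date(2003, 2, 3)): 7,
    ("ink", "S1", date(2003, 2, 3)): 2,
    ("pen", "S1", date(2004, 1, 9)): 4,
}


def roll_up(cube, time_level):
    """Sum the measure after mapping each day to its member at time_level."""
    to_member = TIME_LEVELS[time_level]
    totals = defaultdict(int)
    for (product, store, day), quantity in cube.items():
        totals[(product, store, to_member(day))] += quantity
    return dict(totals)


by_month = roll_up(SALES, "month")
by_year = roll_up(SALES, "year")
```

Here only the Time coordinate is rolled up; a fuller sketch would let each dimension carry its own hierarchy and aggregate along any of them.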
Schema evolution has been studied for years. This section briefly reviews traditional work and then introduces the problem in the DW context.
Schema Evolution on Relational and Object-Oriented Databases
The problem of schema evolution has mainly been studied for relational and object-oriented (OO) databases. In the relational case, few concepts (relations and attributes) are used to describe a database schema. Possible changes are therefore limited to the addition or suppression of an attribute, the modification of its type, the addition or suppression of a relation, and so forth (McKenzie & Snodgrass, 1990). In the OO context, the model (classes, attributes, methods, is-a relationships) is richer, so schemas are more complex. Possible changes include the addition, suppression, or modification of attributes and methods in a class definition, and changes in superclass/subclass relationships (Banerjee et al., 1987).
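The relational change operations listed above can be sketched as pure functions over a schema. This is only an illustration, assuming a schema is represented as a dictionary mapping relation names to attribute/type dictionaries; the function names and the example schema are hypothetical and are not taken from the cited works.

```python
import copy


def add_relation(schema, relation, attributes):
    """Return a new schema with an extra relation."""
    if relation in schema:
        raise ValueError(f"relation {relation!r} already exists")
    new_schema = copy.deepcopy(schema)
    new_schema[relation] = dict(attributes)
    return new_schema


def drop_relation(schema, relation):
    """Return a new schema without the given relation."""
    new_schema = copy.deepcopy(schema)
    del new_schema[relation]  # KeyError if the relation is unknown
    return new_schema


def add_attribute(schema, relation, attribute, type_name):
    """Return a new schema where the relation has one more attribute."""
    if attribute in schema[relation]:
        raise ValueError(f"attribute {attribute!r} already exists")
    new_schema = copy.deepcopy(schema)
    new_schema[relation][attribute] = type_name
    return new_schema


def drop_attribute(schema, relation, attribute):
    """Return a new schema where the attribute has been suppressed."""
    new_schema = copy.deepcopy(schema)
    del new_schema[relation][attribute]  # KeyError if the attribute is unknown
    return new_schema


def change_attribute_type(schema, relation, attribute, type_name):
    """Return a new schema where an existing attribute has a new type."""
    if attribute not in schema[relation]:
        raise KeyError(attribute)
    new_schema = copy.deepcopy(schema)
    new_schema[relation][attribute] = type_name
    return new_schema


# Hypothetical starting schema and a short evolution sequence.
v0 = {"Product": {"id": "int", "name": "text"}}
v1 = add_attribute(v0, "Product", "price", "int")
v2 = change_attribute_type(v1, "Product", "price", "decimal")
v3 = add_relation(v2, "Store", {"id": "int", "city": "text"})
v4 = drop_attribute(v3, "Product", "name")
```

Each operation returns a new schema and leaves its input untouched, so every intermediate state (`v0` to `v4`) stays available. The sketch covers the schema only; how existing data is adapted to each change is a separate concern.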
Key Terms in this Chapter
Dimension: A set of values (members) organized in a hierarchy of levels.
Valid Time: Represents the moment when a fact exists in reality.
Database Schema: Description of the structure of the database, defined as a collection of data types.
Schema Evolution: The dynamic modification of the schema of a database.
Data Warehouse: Collection of historical data, built by gathering and integrating data from several sources, which supports decision-making processes.
Transaction Time: Indicates the moment when a fact was stored in the database.
Multidimensional Database: A collection of cubes and dimensions.
Multidimensional Database Version: State of a DW with an associated temporal pertinence.
Cube: Set of cells associating a collection of members, each one belonging to a dimension level, with one or more measure values.
Temporal Pertinence: Element of the Cartesian product of the domains of valid and transaction times.
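As an illustration of the last few terms, a temporal pertinence can be modelled as a pair drawn from the Cartesian product of a valid-time domain and a transaction-time domain, and a multidimensional database version as a state tagged with such a pair. The sketch below assumes dates as time points and tiny made-up domains; the class and variable names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from itertools import product


@dataclass(frozen=True)
class TemporalPertinence:
    valid_time: date        # when the fact exists in reality
    transaction_time: date  # when the fact was stored in the database


@dataclass(frozen=True)
class DatabaseVersion:
    state: str                      # placeholder label for a DW state
    pertinence: TemporalPertinence  # its associated temporal pertinence


# Assumed, deliberately tiny time domains.
VALID_DOMAIN = [date(2003, 1, 1), date(2003, 7, 1)]
TRANSACTION_DOMAIN = [date(2003, 2, 1), date(2003, 8, 1)]

# Every element of the Cartesian product is a possible temporal pertinence.
ALL_PERTINENCES = [
    TemporalPertinence(v, t) for v, t in product(VALID_DOMAIN, TRANSACTION_DOMAIN)
]

version = DatabaseVersion("initial state", ALL_PERTINENCES[0])
```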