A Survey of Data Warehouse Model Evolution

Cécile Favre (University of Lyon (ERIC Lyon 2), France), Fadila Bentayeb (University of Lyon (ERIC Lyon 2), France) and Omar Boussaid (University of Lyon (ERIC Lyon 2), France)
DOI: 10.4018/978-1-60566-242-8.ch015

A data warehouse allows the integration of heterogeneous data sources for analysis purposes. One of the key points for the success of the data warehousing process is the design of the model according to the available data sources and the analysis needs (Nabli, Soussi, Feki, Ben-Abdallah & Gargouri, 2005). However, as the business environment evolves, the content and structure of the underlying data sources may change. Analysis needs may also evolve, requiring the existing data warehouse model to be adapted. In this chapter, we provide an overview of the state of the art in data warehouse model evolution. We present a set of comparison criteria and use them to compare the various works. We also discuss future trends in data warehouse model evolution.

Key Terms in this Chapter

Data Warehouse Schema: Describes how the data in the data warehouse are structured; measures representing facts are analyzed according to dimensions.

Data Warehouse: Collection of historical data, built by gathering and integrating data from several data sources, structured in a multidimensional way to support decisional queries.

Model Updating: Making the same model evolve without keeping track of its evolution history; thus, the model corresponds to its current version.

Temporal Validity Label: A timestamp used to denote the valid time.

Analysis in a Consistent Time: Results are provided by taking into account the moment when a fact exists in reality.

Temporal Modeling: Providing temporal extensions to keep track of the model history.

Model Versioning: Building several versions of a model where each new version corresponds to a schema evolution or an evolution of data valid for a given period.

Data Warehouse Model: Comprises the schema and the data of the data warehouse.
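
The difference between model updating and model versioning, as defined in the key terms above, can be illustrated with a minimal Python sketch. It is not taken from the chapter: the class names, the dimension names ("time", "store", "customer"), the dates, and the half-open validity interval are all assumptions chosen for illustration, and the schema is reduced to a tuple of dimension names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    """One model version with its temporal validity label (hypothetical structure)."""
    schema: tuple                      # dimension names, simplified for illustration
    valid_from: date
    valid_to: Optional[date] = None    # None means the version is still valid

    def is_valid_at(self, day: date) -> bool:
        # Assumption: validity is the half-open interval [valid_from, valid_to).
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


class UpdatedModel:
    """Model updating: the same model evolves and no history is kept."""

    def __init__(self, schema):
        self.schema = tuple(schema)

    def evolve(self, new_schema):
        # The previous schema is overwritten and cannot be recovered.
        self.schema = tuple(new_schema)


class VersionedModel:
    """Model versioning: each evolution closes the current version and opens a new one."""

    def __init__(self, schema, valid_from):
        self.versions = [ModelVersion(tuple(schema), valid_from)]

    def evolve(self, new_schema, change_date):
        current = self.versions[-1]
        # Close the current version by setting the end of its validity period.
        self.versions[-1] = ModelVersion(current.schema, current.valid_from, change_date)
        self.versions.append(ModelVersion(tuple(new_schema), change_date))

    def version_at(self, day):
        """Return the version valid on the given day, or None if there is none."""
        for version in self.versions:
            if version.is_valid_at(day):
                return version
        return None


# Both models start with two dimensions and later gain a third one.
updated = UpdatedModel(["time", "store"])
updated.evolve(["time", "store", "customer"])

versioned = VersionedModel(["time", "store"], date(2005, 1, 1))
versioned.evolve(["time", "store", "customer"], date(2007, 6, 1))
```

After the evolution, `updated` exposes only the current schema, whereas `versioned.version_at(date(2006, 3, 1))` still returns the two-dimension version. Looking up the version that was valid when a fact occurred is one simple way to approximate analysis in a consistent time.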

Book editors: Viviana E. Ferraggine, Jorge Horacio Doorn, Laura C. Rivero