Schema evolution is an important research topic with an extensive literature built up over the years. However, databases remain resistant to change, and their evolution is difficult to achieve because it raises issues at several levels of the database schema, such as change management at the logical level. Several approaches have been proposed to evolve the schemas of a wide range of database types; versioning, modification and views are examples. In this paper, we present and discuss one of these approaches, the versioning approach to database evolution. Future trends in versioning are presented as well.
Databases are the core of information systems, and their role is essential within companies and organizations. These databases are subject to change for several reasons, including changes in the real world and the emergence of new user requirements. For example, Banerjee and Kim (1986) and Peters and Ozsu (1997) consider that the schema of a database changes for reasons such as: (a) changes to the real world, (b) changes necessitated by errors in the schema due to poor design, and (c) changes triggered by changes to the applications that access the schema (and data). In addition, we consider that the evolution of databases can depend on other factors, such as technology. When technology changes over time, the schema usually has to evolve as well; for example, when upgrading the DBMS (DataBase Management System), a database administrator may have to change the types or structures of some schema components to obtain better database system performance.
Schema evolution means modifying a schema within a populated database without loss of stored data, taking two problems into consideration: first, the semantics of change (i.e., the effects of the changes on the schema) and, second, change propagation (i.e., the propagation of the schema changes to the underlying existing instances).
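The two problems can be illustrated with a minimal Python sketch. It is an assumption-laden toy, not a method from the literature cited here: the `Table` class, the `add_attribute` function and the sample data are hypothetical names introduced only for illustration. Adding an attribute changes the schema (semantics of change) and then assigns a default value to every stored row (change propagation), so no existing data is lost.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Table:
    """A toy populated relation: an attribute list plus its stored rows."""
    attributes: list[str]
    rows: list[dict[str, Any]] = field(default_factory=list)


def add_attribute(table: Table, name: str, default: Any = None) -> None:
    """Add an attribute to the schema and propagate it to existing rows."""
    # Semantics of change: the schema gains exactly one new attribute.
    if name in table.attributes:
        raise ValueError(f"attribute {name!r} already exists")
    table.attributes.append(name)
    # Change propagation: every existing instance receives a default value,
    # so all stored rows conform to the new schema and nothing is lost.
    for row in table.rows:
        row[name] = default


# Hypothetical sample data.
employees = Table(
    ["id", "name"],
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Edgar"}],
)
add_attribute(employees, "department", default="unknown")
```

Dropping an attribute would be the lossy counterpart: the same propagation step would have to discard stored values, which is the risk that motivates keeping earlier schema versions.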
Considerable advances have been made in data structures, rules, constraints, schema models and meta-models in order to resolve the problem of schema evolution. Rahm and Bernstein (2006) proposed an online catalogue of bibliographic references to schema evolution and related fields, such as generic data management, evolution of ontologies, software evolution and workflow evolution. For lack of space, we cite only some important works taken from this online catalogue:
Four families of solutions: the solutions to schema evolution belong to one of four families, which are modification (Banerjee and Kim, 1986), versioning (Loomis and Chaudhri, 1997), views (Bellahsène, 1996) and combinations of these approaches. We observe that these families complement each other. For instance, with schema modification, changing the schema may lead to a loss of information. With schema versioning, replicating the schema avoids data loss, but it makes navigation through the generated versions complex and slows the DBMS. With views, changes can be simulated on the schema without changing the underlying database, and no conversion is needed; however, view updates raise several issues, such as the performance of a system that must compute views based on other views and the difficulty of updating an object returned from a view. Combining approaches avoids the problems mentioned above, but at the cost of complex and onerous mechanisms.
Solutions applied to the conceptual level: several studies have focused on the evolvability of database schemas at the conceptual level because conceptual models:
are the first artifacts to which a change is or should be applied (Borgida and Williamson, 1985);
increase the level of abstraction that influences the evolution of schemas (Verelst, 2004).
Key Terms in this Chapter
Multi-Representation: A strategy that enables one to create two or more points of view of the same real world and to combine them. The unit to be versioned can be one or several schema components or the whole schema.
Versioning-by-Difference: Is a delta compression technique that stores only the components of the schema that have changed from one schema version to the next.
Version: Is a unit that has a unique and immutable identity as well as an internal structure.
Versioning-View: Is a combined approach that alternates between the versioning and the view mechanisms to realize the evolution of the schema. Views are used to make the minor changes and versioning to carry out the complex ones.
Version Derivation: Is a directed acyclic graph (DAG) in which the relationships between all versions of an object are specified.
Schema Evolution: Is the ability of the database schema to change over time without loss of stored data.
Version of an Object: Is a snapshot of this object taken at a certain period of time.
Stamp: A stamp S is defined as a vector S = (s1, s2, ..., sn), where each component si of S represents the i-th representation of the real world.
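To make the Versioning-by-Difference and Version Derivation terms concrete, the following is a minimal Python sketch under stated assumptions. The `versions` dictionary, the `materialize` function and the component names are hypothetical. Each version stores only its delta relative to its parent, and for simplicity each version has a single parent, so the derivation graph here is a tree, which is a special case of a DAG.

```python
# Each version records only its difference from its parent:
# components that were added or changed ("set") and components removed.
versions = {
    "v1": {"parent": None, "set": {"Employee": ["id", "name"]}, "removed": []},
    "v2": {"parent": "v1", "set": {"Employee": ["id", "name", "dept"]}, "removed": []},
    "v3": {"parent": "v2", "set": {"Department": ["id", "title"]}, "removed": []},
    # A second branch derived from v1, independent of v2 and v3.
    "v2b": {"parent": "v1", "set": {"Project": ["id"]}, "removed": []},
}


def materialize(versions: dict, version_id: str) -> dict:
    """Rebuild the full schema of a version by replaying deltas from the root."""
    # Walk the derivation links from the requested version back to the root.
    chain = []
    current = version_id
    while current is not None:
        chain.append(current)
        current = versions[current]["parent"]
    # Apply the deltas in derivation order, oldest first.
    schema = {}
    for vid in reversed(chain):
        delta = versions[vid]
        schema.update(delta["set"])
        for name in delta["removed"]:
            schema.pop(name, None)
    return schema


schema_v3 = materialize(versions, "v3")
schema_v2b = materialize(versions, "v2b")
```

In this sketch `schema_v3` contains the three-attribute `Employee` and `Department`, while `schema_v2b` contains the original `Employee` and `Project`. Storing deltas saves space but makes reading a version cost a walk along its derivation path, which reflects the navigation overhead attributed to versioning earlier in the text.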