Many methods exist in data warehousing for defining the requirements from which the most appropriate conceptual model of a data warehouse is developed. There is no universal consensus about the best method, nor are there accepted standards for the conceptual modeling of data warehouses. Only a few conceptual models come with a formally described method for constructing them. Problems therefore arise when an appropriate development approach, and a corresponding method for requirements elicitation, must be chosen and applied in a particular data warehousing project. Sometimes it is necessary not only to use existing methods but also to provide new methods that are usable in particular development situations. These new methods must be represented formally to ensure their appropriate use in similar situations in the future. It is also necessary to define the contingency factors that describe the situation in which a method is usable. This chapter presents the use of the method engineering approach for the development of conceptual models of data warehouses. A set of contingency factors that determine the choice between using an existing method and developing a new one is defined. Three case studies are presented. Three new methods (user-driven, data-driven, and goal-driven) are developed according to the situation in each project, using the method engineering approach.
Data warehouses are based on multidimensional models which contain the following elements: facts (the goal of the analysis), measures (quantitative data), dimensions (qualifying data), dimension attributes, classification hierarchies, levels of hierarchies (dimension attributes which form hierarchies), and attributes which describe levels of hierarchies of dimensions.
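The elements listed above can be pictured as a small set of nested data structures. The following is only an illustrative sketch, not a formal conceptual model from the literature: the class names (`Fact`, `Dimension`, `Hierarchy`, `Level`), the helper `level_names`, and the "Sales" example with its measures and dimensions are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Level:
    """A hierarchy level: a dimension attribute, plus attributes that describe it."""
    name: str
    descriptive_attributes: List[str] = field(default_factory=list)


@dataclass
class Hierarchy:
    """A classification hierarchy: levels ordered from finest to coarsest."""
    name: str
    levels: List[Level]


@dataclass
class Dimension:
    """Qualifying data for a fact, organised into one or more hierarchies."""
    name: str
    hierarchies: List[Hierarchy] = field(default_factory=list)

    def level_names(self) -> List[str]:
        """Return the names of all levels across this dimension's hierarchies."""
        return [level.name for h in self.hierarchies for level in h.levels]


@dataclass
class Fact:
    """The goal of the analysis, with its measures and dimensions."""
    name: str
    measures: List[str]          # quantitative data
    dimensions: List[Dimension]  # qualifying data


# Hypothetical example: a "Sales" fact analysed by time and product.
time_dim = Dimension(
    "Time",
    [Hierarchy("Calendar", [Level("Day"), Level("Month", ["month_name"]), Level("Year")])],
)
product_dim = Dimension(
    "Product",
    [Hierarchy("Category", [Level("Product", ["product_name"]), Level("Category")])],
)
sales = Fact("Sales", ["quantity", "revenue"], [time_dim, product_dim])
```

In this sketch a level is simply a named dimension attribute that participates in a hierarchy, which mirrors the description of levels as "dimension attributes which form hierarchies".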
With regard to the conceptual models of data warehouses, many authors argue that the existing conceptual modelling methods used for relational or object-oriented systems do not sufficiently support the intuitive representation of multidimensional models. Using these methods also results in the loss of some of the semantics of multidimensional models. The necessary semantics must then be added to the model informally, which makes the model unsuitable for automatic transformation. The conceptual models proposed by authors such as Sapia et al. (1998), Tryfona et al. (1999) and Lujan-Mora et al. (2002) differ in their expressive power, as can be seen in the comparisons of models in works such as (Blaschka et al., 1998), (Pedersen, 2000) and (Abello et al., 2001). This means that when a particular conceptual model is used for the modelling of data warehouses, some essential features may be missing. Lujan-Mora et al. (2002) argue that problems also occur because of the inaccurate interpretation of elements and features in the multidimensional model. They say that this applies to nearly all conceptual models that have been developed for data warehousing. The variety of elements and features in the conceptual models reflects differences in opinion about the best model for data warehouses, which means that there is no universal agreement about the relevant standard (Rizzi et al., 2006).
There are two possible approaches to the development of a conceptual model. A model can be developed from scratch, which means additional work in terms of the formal description of the model’s elements. Alternatively, a model can be developed by modifying an existing model so that it expresses the concepts of the multidimensional paradigm.
The conceptual models of data warehouses can be classified into several groups in accordance with how they are developed (Rizzi et al., 2006):
Models based on the E/R model, e.g., ME/R (Sapia et al., 1998) or StarE/R (Tryfona et al., 1999);
Models based on the UML, e.g., those using UML stereotypes (Lujan-Mora et al., 2002);
Independent conceptual models proposed by different authors, e.g., Dimensional Fact Model (Golfarelli et al., 1998).
In the data warehousing field there is a metamodel standard for data warehouses, the Common Warehouse Metamodel (CWM). It is actually a set of several metamodels that describe various aspects of data warehousing. CWM is a platform-independent specification of metamodels (Poole et al., 2003), developed to ensure the exchange of metadata between different tools and platforms. The features of a multidimensional model are basically described via an analysis-level OLAP package; however, CWM cannot fully reflect the semantics of all conceptual multidimensional models (Rizzi et al., 2006).
Existing Methods For The Development Of Conceptual Models For Data Warehouses
There are several approaches to eliciting the requirements for a conceptual data warehouse model and to determining how the relevant model can be built. This section presents a classification of these approaches, along with an overview of the methods that exist within each approach. The weaknesses of the approaches are analysed to show the need to develop new methods. The positive aspects of the existing approaches and the existence of many methods within each approach, however, suggest that several method components can be reused in an appropriate situation.