In the database design, the advantages of using conceptual models for representing users’ requirements are well known. Nevertheless, even though data warehouses (DWs) are databases that store historical data for analytical purposes, they are usually represented at the logical level using the star and snowflake schemas. These schemas facilitate delivery of data for online analytical processing (OLAP) systems. In particular, hierarchies are important since traversing them, OLAP tools perform automatic aggregations of data using the roll-up and drill-down operations. The former operation transforms detailed data into aggregated ones (e.g., daily into monthly sales) while the latter does the opposite.
The star and snowflake schemas include relational tables known as fact and dimension tables. The fact table represents the focus of analysis (e.g., analysis of sales). It usually contains numeric data called measures (e.g., quantity). Dimension tables contain attributes that allow users to see measures from different perspectives (e.g., analyze sales in different stores). Since users usually start from a general view of data and then, if required, the detail explorations follow, dimensions may contain attributes that form hierarchies. OLAP tools allow users to traverse hierarchies, aggregating measures automatically. For example, “moving” (i.e., using the roll-up operation) from store to city, the quantity of sold products in each store will be added according to the cities where the stores are located.
Depending on whether hierarchies are represented using flat (Figure 1(a)) or normalized tables (Figure 1b)), the relational structure is called star or snowflake schemas, respectively. Nevertheless, both schemas are not adequate for representing different kinds of hierarchies existing in real-world situations. The star schema does not represent hierarchies clearly, and the hierarchy structure should be deduced based on the knowledge of the application domain. On the other hand, the snowflake schema only allows us to represent simple hierarchies such as Store, City, and State in Figure 1(b), even though there are different kinds of hierarchies in real-world applications.
A Store dimension represented as (a) star and (b) snowflake schemas
There are several proposals of conceptual multidimensional models1 that include hierarchies. Nevertheless, as we will see later, these models do not include all hierarchies as presented in this chapter. This lack of a general classification of hierarchies, including their characteristics at the schema and at the instance levels, leads to repeated research efforts in “rediscovering” hierarchies and providing solutions for managing them.Top
We first describe the MultiDim model, a conceptual multidimensional model used for representing requirements for DW and OLAP applications, including different kinds of hierarchies. Then, we present the hierarchy classification and refer in more detail to each hierarchy type. Last, we present mapping of these hierarchies to the relational model.
Key Terms in this Chapter
Dimension: An abstract concept for grouping data that shares a common semantic meaning within the domain being modeled.
Logical Representation (or Schema): A specification of data structures according to the features of the given logical model, such as relational or object-relational.
Conceptual Multidimensional Model: A set of objects and rules that facilitates an abstract representation of requirements for DW and OLAP applications. It usually includes dimensions with hierarchies and facts with associated measures.
Snowflake Schema: A variation of the star schema except that dimensions are normalized representing each hierarchy level in a separate table.
Star Schema: A relational schema consisting of a fact table, which links to other de-normalized tables called dimension tables.
Level: A set of elements representing the same data granularity.
Hierarchy: A sequence of levels providing data at different granularities for establishing meaningful aggregation paths.