Multidimensionality in Statistical, OLAP and Scientific Databases

Arie Shoshani (Lawrence Berkeley National Laboratory, USA)
The term “multidimensional databases” refers to data that can be viewed conceptually in a multidimensional space, where each dimension represents some attributes of the data. Viewing data in this form is natural for many applications, yet the concepts are not treated in a uniform way in the database literature. In this chapter, we show the commonality of concepts among three database areas: statistical, OLAP, and scientific databases. We show that these domains share two main structural concepts: the cross-product space of the dimensions, and the classification hierarchy structure associated with each dimension. In the first part of this chapter we describe how these structures are used to represent data in statistical and OLAP databases and how summarization operators can be applied to them. Further, we discuss how these structures can be extended to represent related information using federated database concepts. In the second part of the chapter we show that these concepts are common to many scientific database applications. In particular, we discuss the importance of supporting classification structures and the difficulty of representing them as tables in relational databases. We also discuss data structures to support multidimensional databases, emphasizing space-time representation, clustering in multidimensional space, indexing in multidimensional space, and supporting classification structures. We conclude by arguing that the concepts of multidimensionality and classification structures, as well as the operations over them, should be elevated to “first class” object types. These object types should be explicitly visible to the application user, both in the conceptual schemas and in the user interfaces.

