1. Introduction
Data Warehouses (DWs) and OLAP systems are business intelligence technologies, which aim to enable decision makers to formulate complex queries through visual and interactive interfaces (Kimball., 1996).
Today, more and more spatial data are collected through sensor networks, social media, etc. (Lee & Kang., 2015). These data are analyzed using GeoBusiness Intelligence systems such as Spatial Data Mining, Spatial Reporting and Spatial OLAP.
Although new technologies for storing and querying spatial data have recently emerged (e.g. MongoDB, CouchDB, Neo4j, etc.), relational Spatial Database Management Systems (SDBMSs), such as PostGIS and Oracle Spatial, remain effective solutions for warehousing and OLAPing classical spatial data (Kitchin & Lauriault., 2015).
Indeed, to benefit from OLAP technologies in the context of spatial data, some works introduce the Spatial OLAP (SOLAP) concept, which integrates the functionality of OLAP systems and Geographic Information Systems (GISs) in a single relational environment (Bédard, Merret & Han., 2001). SOLAP systems organize information according to the spatio-multidimensional model, which allows numerical and spatial data to be analyzed along several dimensions. They extend key OLAP concepts such as dimensions and measures by integrating the spatial component of the data (Bédard, Merret & Han., 2001). Spatio-multidimensional data are stored in a Spatial Data Warehouse (SDW). An SDW was defined as “…a collection of spatial and non-spatial data, subject oriented, integrated, time variant and non-volatile dedicated to spatial decision making…” (Stefanovic, Han & Koperski., 2000).
Spatial dimensions are organized into hierarchies. A spatial hierarchy is composed of several related levels, of which at least one is spatial, and is defined through topological relationships between levels, such as intersection or inclusion. Various kinds of spatial hierarchies have been defined (simple, balanced, generalized, non-strict, etc.). In particular, a non-strict spatial hierarchy is a spatial hierarchy in which a member may have several parent members; in other words, a many-to-many (n-n) relationship exists between spatial levels (Malinowski & Zimányi., 2008). Such hierarchies are needed in some real-life applications (mobile phone communication analysis, water pollution studies, etc.).
In the context of classical spatial data and relational SDBMSs, and using a traffic jam case study, we study the impact of non-strict spatial hierarchies on the logical modeling and performance of relational SDWs. Non-strict spatial hierarchies cause analytical problems because they are the source of the well-known double-counting problem, in which a measure attached to a member with several parents is aggregated once for each parent (Lechtenbörger & Vossen., 2003). Several solutions at different levels (conceptual (Malinowski & Zimányi., 2008) and visualization (Mansmann & Scholl., 2006)) have been proposed in the literature to allow correct aggregation of measures with non-strict hierarchies.
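The double-counting effect can be illustrated with a minimal sketch. The data, the segment and district names, and the weights below are hypothetical and are not taken from the case study; the weights play the role of distribution factors and are assumed to sum to 1 for each child member.

```python
from collections import defaultdict

# Hypothetical fact data: number of traffic jams recorded per road segment.
jams = {"s1": 10, "s2": 6, "s3": 4}

# Hypothetical non-strict hierarchy: segment -> {district: distribution factor}.
# Segment s2 crosses two districts, so it has two parent members.
parents = {
    "s1": {"north": 1.0},
    "s2": {"north": 0.5, "south": 0.5},
    "s3": {"south": 1.0},
}

def roll_up(facts, hierarchy, weighted):
    """Aggregate child measures to the parent level.

    With weighted=False every parent receives the full child value,
    which counts multi-parent children more than once.
    With weighted=True each parent receives only its share.
    """
    totals = defaultdict(float)
    for child, value in facts.items():
        for parent, factor in hierarchy[child].items():
            totals[parent] += value * factor if weighted else value
    return dict(totals)

true_total = sum(jams.values())                      # 20 jams in the fact data
naive = roll_up(jams, parents, weighted=False)       # s2 is counted in both districts
weighted = roll_up(jams, parents, weighted=True)     # s2 is split between districts

print(true_total, sum(naive.values()), sum(weighted.values()))
```

In this toy example the naive roll-up yields a grand total larger than the total in the fact data, whereas the weighted roll-up preserves it.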
However, to the best of our knowledge, no work studies the performance associated with this kind of hierarchy or offers an optimization at the physical layer of a relational SDBMS. In addition, the existing spatial multidimensional relational logical models are not suitable for non-strict hierarchies defined on several levels because they do not address the double-counting problem.
Therefore, motivated by the importance of non-strict spatial hierarchies in real SOLAP applications and by the lack of an adequate logical model and ad-hoc optimization techniques, we present in this paper:
1. A new logical model, called NN logical model (Kimball., 1996). It extends the model with bridge tables proposed by (Malinowski & Zimányi., 2008) by moving the bridge tables to the finest spatial level, which eliminates the double-counting problem related to the distributivity of the distribution factors.
2. NN index: a new index for SOLAP queries. The NN index has a hierarchical data structure based on the distribution factors and the bitmap index (Wu, Otoo & Shoshani., 2004).
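For readers unfamiliar with the bitmap index mentioned above, the following is a minimal sketch of a plain bitmap index only. It is not the NN index, whose hierarchical structure is not described in this section. The row values and function names are hypothetical, and Python integers are used as bitmaps purely for illustration.

```python
# Hypothetical fact rows: the list position is the row id,
# the value is the district attribute of that row.
rows = ["north", "south", "north", "east", "south", "north"]

def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is set
    when row i holds that value."""
    index = {}
    for i, v in enumerate(values):
        index[v] = index.get(v, 0) | (1 << i)
    return index

def matching_rows(bitmap):
    """Decode a bitmap into the ascending list of row ids whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

index = build_bitmap_index(rows)

# A disjunctive selection becomes a single bitwise OR over two bitmaps.
north_or_east = index["north"] | index["east"]
print(matching_rows(north_or_east))
```

The appeal of the structure is that selections on low-cardinality attributes reduce to bitwise operations, without scanning the fact rows.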