Spatial OLAP (SOLAP) technologies are dedicated to multidimensional analysis of large volumes of (spatial) data. Spatial data are subject to different types of uncertainty, in particular spatial vagueness. Although several studies propose new models to cope with spatial vagueness, their integration in SOLAP systems is still in an embryonic state. Moreover, analyzing multidimensional data together with the metadata introduced by these new models can be too complex and demanding for decision makers. To help reduce the consequences of spatial vagueness on the exactness of SOLAP analysis queries, the authors present a new approach for designing SOLAP datacubes based on end-users' tolerance to the risks of misinterpretation of fact data. An experiment applying the new approach to agri-environmental data is also presented.
Introduction
Spatial Online Analytical Processing (SOLAP) technologies are dedicated to multidimensional analysis of large volumes of (spatial) data (Bimonte, Boulil, Pradel, & Chanet, 2012). This type of OLAP system includes spatial measures or spatial dimensions.
Spatial data always suffer from some level of uncertainty. Not dealing with uncertainty when making high-level decisions based on SOLAP aggregated data increases the risk of data misinterpretation (Gervais, Bédard, Levesque, Bernier, & Devillers, 2009). This leads to faulty trend analysis, missed problems and inexact comparisons between regions or periods. The uncertainty results from semantic imprecision, logical inconsistency, temporal incoherence and similar imperfections, and/or from spatial vagueness (Lotfi Bejaoui, 2009). In particular, spatial vagueness refers to a frequent imperfection in the boundaries or spatial location of the represented geographical objects (e.g. forest, fire, lake). To deal with uncertainty in SOLAP systems, two main approaches have been investigated. The first one tries to reduce the uncertainty in the data (for example, through an overabundance of observations to increase spatial precision) or to provide decision-makers with visual feedback about the uncertainty (Bimonte, Nazih, Kang, Edoh-Alove, & Rizzi, 2013; Lévesque, 2008; Worboys, 1998). The second one proposes to handle uncertainty issues by using new uncertainty-aware spatio-multidimensional models and operators (Jadidi, Mostafavi, Bédard, & Long, 2012; Perez, Somodevilla, & Pineda, 2007; Siqueira, Aguiar Ciferri, Times, & Ciferri, 2012) that are based on the representation of vague objects with fuzzy or exact models. Nonetheless, their implementation is still in an embryonic state.
Motivated by the desire to offer a solution that presents a trade-off between theoretical accuracy regarding spatial vagueness, implementation feasibility in current technologies and usability by the intended end-users, we propose a third approach. Instead of dealing with the complexity of manipulating vague object models in SOLAP systems, we propose to manage the risks of SOLAP datacube misinterpretation, related to spatial vagueness, that end-users incur. In particular, we are interested in two types of risk of misinterpretation: the Risk-Geometry (related to the vagueness of the geometric members of a level) and the Risk-Aggregation (related to the aggregation formula used to compute measures for a given level).
To do so, we define a new SOLAP datacube design approach that takes those risks into account during the datacube modeling process. Such an approach leads to the development of a classical SOLAP datacube which not only fits the end-users’ usage, but can also be implemented in existing (commercial) SOLAP tools and explored with classical SOLAP operators.
Our approach extends existing methodologies with three main elements. First, it simultaneously takes into account available data sources, end-users’ needs and end-users’ tolerance levels to “well-identified risks of SOLAP datacubes misinterpretations due to spatial vagueness issues”. Second, it delivers to end-users different versions of SOLAP datacubes (according to their tolerance levels) in which the possibility of making erroneous SOLAP analyses is minimized. Third, it enriches the SOLAP datacube elements with visualization policies to properly communicate risks of misinterpretation to end-users when necessary.
The paper is organized as follows: Section 2 presents the state of the art on spatial vagueness management in SOLAP systems; Section 3 presents the motivation of our work using an agricultural case study; in Section 4, we define and classify the risks of misinterpretation; Section 5 defines the requirements of our new risk-aware design approach as well as the whole proposed design process; in Section 6, we detail our contributions regarding the assessment and management of the risks of misinterpretation in the new approach; finally, the approach is tested on the case study in Section 7.