XML OLAP Cube in the Cloud Towards the DWaaS

Rihane DKAICH (MISC Laboratory, Faculty of Sciences, Kenitra, Morocco), Ikram EL AZAMI (MISC Laboratory, Faculty of Sciences, Kenitra, Morocco) and Abdelaziz MOULOUDI (MISC Laboratory, Faculty of Sciences, Kenitra, Morocco)
Copyright: © 2017 | Pages: 10
DOI: 10.4018/IJCAC.2017010103

Abstract

Data warehouses can be extremely large and resource-demanding, which is not always affordable in a local environment. Cloud warehousing therefore appears to be a suitable way to handle the large volumes of data they hold. Meanwhile, many enterprises use data warehouses for data analysis and use XML both to handle semi-structured data and to take advantage of the web environment. Combining the two solutions in a parallel environment is thus a natural step. The goal of the study presented in this paper is to use XML to store and exchange data and to connect it to the distributed processing of multidimensional data. This article addresses the problem of storing documents in distributed environments such as cloud nodes, and discusses the possibility of combining data storage with decision-support analysis based on OLAP cubes in cloud environments, using the MapReduce model for query processing.

Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere (Quin, 2016). Consequently, many studies have focused on distributing XML documents in order to store XML data in distributed environments.

Distributed XML documents, which may be called distributed trees, are documents that have been partitioned and sent to various nodes and are linked together to form a complete XML document (Abiteboul, Gottlob, & Manna, 2008). The linking can be accomplished through embedded function calls, issued over a network from a centralized node in the distributed system, to the separate documents (Abiteboul et al., 2008). Some previous research has distributed XML documents based on data size, while other work has relied on data structure. The most efficient distributed systems consider both data size and structure. The system described in (Seyed-Abbassi & Gordon, 2015; Aljawarneh, 2011) distributes XML documents in a cloud services system using a kernel document and many distributed cloud nodes. Figure 1 describes this system, which distributes the XML document through a least-load algorithm (Seyed-Abbassi & Gordon, 2015) based on the number of cloud servers available and the size of the original XML document, while preserving the tree structure of the XML document. The algorithm splits each of the subtrees into parts based on which distribution node has the most space still available (Seyed-Abbassi & Gordon, 2015) and, once the load is determined, stores each subtree in the corresponding cloud. The kernel document indicates which cloud service holds which partitioned document. For backup, the system stores the same partitions with the same loads in a second cloud service. This redundancy makes it possible to recover lost data when a node is unavailable because of electrical issues, a security compromise such as hacking, or any other problem (Seyed-Abbassi & Gordon, 2015; Aljawarneh et al., 2015).

Figure 1. Distributed XML document
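
The least-load distribution described above can be illustrated with a minimal Python sketch. It is not the implementation of Seyed-Abbassi and Gordon (2015); it only shows the general idea under simplifying assumptions: each top-level child of the root is treated as one indivisible subtree, a subtree's load is approximated by its serialized size in bytes, and the root's own attributes and text are ignored. The function names (`distribute`, `reassemble`), the node names, and the layout of the kernel document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def subtree_size(elem):
    """Approximate a subtree's load by its serialized size in bytes."""
    return len(ET.tostring(elem))


def distribute(xml_text, capacities):
    """Place each top-level subtree on the node with the most free space.

    `capacities` maps a (hypothetical) cloud node name to its free space in
    bytes. Returns (kernel, stores): `kernel` is an XML element recording
    which node holds which partition, and `stores` maps each node name to a
    list of (index, subtree) pairs.
    """
    root = ET.fromstring(xml_text)
    free = dict(capacities)
    stores = {name: [] for name in capacities}
    kernel = ET.Element("kernel", {"root": root.tag})
    for index, child in enumerate(root):
        size = subtree_size(child)
        target = max(free, key=free.get)  # node with the most space left
        if free[target] < size:
            raise ValueError("no node has room for subtree %d" % index)
        free[target] -= size
        stores[target].append((index, copy.deepcopy(child)))
        ET.SubElement(kernel, "part", {"index": str(index), "node": target})
    return kernel, stores


def reassemble(kernel, stores):
    """Rebuild the document by following the kernel's partition entries."""
    root = ET.Element(kernel.get("root"))
    for part in kernel.findall("part"):
        partitions = dict(stores[part.get("node")])
        root.append(copy.deepcopy(partitions[int(part.get("index"))]))
    return root


# Usage: split a small document over two nodes, keep a mirrored backup
# (standing in for the second cloud service), then rebuild the document.
document = (
    "<sales>"
    "<region id='N'><sale>10</sale><sale>20</sale><sale>30</sale></region>"
    "<region id='S'><sale>5</sale></region>"
    "<region id='E'><sale>7</sale></region>"
    "</sales>"
)
kernel, stores = distribute(document, {"cloudA": 130, "cloudB": 120})
backup = copy.deepcopy(stores)
rebuilt = reassemble(kernel, stores)
```

Because the kernel records the original position of every partition, the tree structure survives the round trip, and the mirrored `backup` can be passed to `reassemble` in place of `stores` if a primary node is lost.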
