Analytical Processing Over XML and XLink

Paulo Caetano da Silva (Salvador University and Federal University of Pernambuco, Brazil), Valéria Cesário Times (Federal University of Pernambuco, Brazil), Ricardo Rodrigues Ciferri (Federal University of São Carlos, Brazil) and Cristina Dutra de Aguiar Ciferri (University of São Paulo, Brazil)
Copyright: © 2012 |Pages: 41
DOI: 10.4018/jdwm.2012010103

Current commercial and academic OLAP tools do not process XML data that contains XLink. To overcome this limitation, this paper proposes an analytical system built around LMDQL, an analytical query language. The XLDM metamodel is also presented; it models cubes of XML documents with XLink and deals with the syntactic, semantic and structural heterogeneities commonly found in XML documents. Because current W3C query languages for navigating XML documents do not support XLink, XLPath is discussed in this article to provide the features that LMDQL query processing requires. A prototype system that enables the analytical processing of XML documents that use XLink is also detailed. The prototype includes a driver, named sql2xquery, which maps SQL queries into XQuery. To validate the proposed system, a case study and its performance evaluation are presented to analyze the impact of analytical processing over XML/XLink documents.

XML (eXtensible Markup Language) documents are a rich source of information for organizational decision making. Similarly, Data Warehouses (DW) (Kimball, 2002) and OLAP (On-Line Analytical Processing) tools (Chaudhuri, 1997) allow the identification of trends and patterns, supporting better strategic decisions for a company's business. However, the combined use of these technologies is still under development.

In XML, semantically similar information can be represented in different ways. This leads to three kinds of data heterogeneity: (i) semantic, where similar information is represented through different names, e.g., enterprise and company, or dissimilar information through equal names, e.g., virus in computing and in health care; (ii) syntactic, where semantically equal content is represented in several ways, for example in different languages or in different units of measure, e.g., meters and feet; and (iii) structural, in which data is organized in different structures, e.g., in different kinds of hierarchies, in attributes, or in elements (Näppilä, 2008). This representational flexibility is important, but it makes applying OLAP concepts to XML data a complex task. Applications and technologies derived from XML use XLink (XML Linking Language) (DeRose, Maler, & Orchard, 2001) as an alternative for representing the semantics and structure of information, expressing relationships between concepts. An example of how data semantics are represented using XLink is XBRL (eXtensible Business Reporting Language) (Hernández-Ros, 2006), an international standard for representing financial reports that uses extended links to model financial concepts. A problem that arises when processing documents that contain XLink and form chains of links is that the query languages made available by the W3C (World Wide Web Consortium) (i.e., Boag, Chamberlin, Fernández, Florescu, Robie, & Siméon, 2007; Berglund, Boag, Chamberlin, Fernández, Kay, Robie, & Siméon, 2007) do not provide support for navigating them. Although XPath has been widely adopted as the standard for querying XML documents, it does not provide such navigation functionality. Several proposals have been developed for performing analytical (OLAP) queries over XML data (Beyer, 2005; Bordawekar, 2005; Näppilä, 2008; Wang, 2007; Jian, 2007; XBRL International, 2006). However, these proposals do not take the use of XLink in XML documents into account.
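
To make the navigation problem concrete, the following is a minimal sketch of what an application has to do by hand today in order to follow a chain of XLink arcs. It is not XLPath or LMDQL, whose syntax is not shown in this preview. The sample document, the element names (`linkbase`, `extendedLink`, `loc`, `arc`), the concept names and the function `follow_chain` are all hypothetical and only loosely modeled on an XBRL-style extended link; the sketch also assumes that each label identifies exactly one locator, which XLink itself does not require.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical extended link: Assets -> CurrentAssets -> Cash.
DOC = """
<linkbase xmlns:xlink="http://www.w3.org/1999/xlink">
  <extendedLink xlink:type="extended">
    <loc xlink:type="locator" xlink:label="assets" xlink:href="concepts.xml#Assets"/>
    <loc xlink:type="locator" xlink:label="current" xlink:href="concepts.xml#CurrentAssets"/>
    <loc xlink:type="locator" xlink:label="cash" xlink:href="concepts.xml#Cash"/>
    <arc xlink:type="arc" xlink:from="assets" xlink:to="current"/>
    <arc xlink:type="arc" xlink:from="current" xlink:to="cash"/>
  </extendedLink>
</linkbase>
"""


def xlink(name):
    """Return the namespace-qualified name of an XLink attribute."""
    return "{%s}%s" % (XLINK_NS, name)


def follow_chain(xml_text, start_label):
    """Follow arcs depth-first from start_label and return the hrefs visited.

    Simplifying assumption: one locator per label.
    """
    root = ET.fromstring(xml_text.strip())
    hrefs = {}  # label -> href of the locator carrying that label
    arcs = {}   # label -> labels reachable through one arc
    for element in root.iter():
        link_type = element.get(xlink("type"))
        if link_type == "locator":
            hrefs[element.get(xlink("label"))] = element.get(xlink("href"))
        elif link_type == "arc":
            source = element.get(xlink("from"))
            arcs.setdefault(source, []).append(element.get(xlink("to")))

    visited = set()
    chain = []
    stack = [start_label]
    while stack:
        label = stack.pop()
        if label in visited:
            continue  # guard against cyclic links
        visited.add(label)
        if label in hrefs:
            chain.append(hrefs[label])
        # Push targets in reverse so they are visited in document order.
        stack.extend(reversed(arcs.get(label, [])))
    return chain


if __name__ == "__main__":
    print(follow_chain(DOC, "assets"))
```

An XPath expression can select the `loc` and `arc` elements individually, but matching `xlink:from` and `xlink:to` labels across successive arcs, as the loop above does, has to be written by the application.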
