Introduction
XML (eXtensible Markup Language) documents are a rich source of information for organizational decision making. Similarly, Data Warehouses (DW) (Kimball, 2002) and OLAP (On-Line Analytical Processing) tools (Chaudhuri, 1997) allow the identification of trends and patterns that support better strategic business decisions. However, the combined use of these technologies is still under development.
In XML, semantically similar information can be represented in different ways. This leads to three kinds of data heterogeneity: (i) semantic, where similar information is represented by different names, e.g., enterprise and company, or dissimilar information by the same name, e.g., virus in computing and in health care; (ii) syntactic, where semantically equal content is represented in several ways, for example in different languages or in different units of measure, e.g., meters and feet; and (iii) structural, where data is organized in different structures, e.g., in different kinds of hierarchies, in attributes, or in elements (Näppilä, 2008). This flexibility of representation is important, but it makes applying OLAP concepts to XML data a complex task. Applications and technologies derived from XML use XLink (XML Linking Language) (DeRose, Maler, & Orchard, 2001) as an alternative for representing the semantics and structure of information by expressing relationships between concepts. One example of representing data semantics with XLink is XBRL (eXtensible Business Reporting Language) (Hernández-Ros, 2006), an international standard for financial reports that uses extended links to model financial concepts. A problem arises when processing documents whose XLink elements form chains of links: the query languages available from the W3C (World Wide Web Consortium) (i.e., Boag, Chamberlin, Fernández, Florescu, Robie, & Siméon, 2007; Berglund, Boag, Chamberlin, Fernández, Kay, Robie, & Siméon, 2007) do not support navigating them. Although XPath has been widely adopted as the standard for querying XML documents, it does not provide such navigation functionality. Several proposals have been developed for performing analytical (OLAP) queries over XML data (Beyer, 2005; Bordawekar, 2005; Näppilä, 2008; Wang, 2007; Jian, 2007; XBRL International, 2006). However, these proposals do not take the use of XLink in XML documents into account.
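To make the navigation problem concrete, the following minimal sketch shows what following a chain of XLink arcs involves when it has to be done outside the query language. It is an illustration only, not a method proposed in this article: the document, the element names (`report`, `conceptLink`, `loc`, `arc`), and the concept labels are hypothetical, loosely modeled on the way extended links relate concepts. Only the `xlink:*` attributes and the XLink namespace come from the XLink specification. The host program, written here in Python with the standard `xml.etree.ElementTree` module, has to collect the arcs and resolve each `xlink:from`/`xlink:to` label itself.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"  # XLink namespace URI

# Hypothetical document: element names and labels are invented for illustration.
DOC = f"""
<report xmlns:xlink="{XLINK}">
  <conceptLink xlink:type="extended">
    <loc xlink:type="locator" xlink:href="#assets" xlink:label="assets"/>
    <loc xlink:type="locator" xlink:href="#currentAssets" xlink:label="currentAssets"/>
    <loc xlink:type="locator" xlink:href="#cash" xlink:label="cash"/>
    <arc xlink:type="arc" xlink:from="assets" xlink:to="currentAssets"/>
    <arc xlink:type="arc" xlink:from="currentAssets" xlink:to="cash"/>
  </conceptLink>
</report>
"""


def q(name):
    """Return the namespace-qualified name of an XLink attribute."""
    return f"{{{XLINK}}}{name}"


def build_arcs(root):
    """Map each xlink:from label to the list of xlink:to labels it points at."""
    arcs = {}
    for el in root.iter():
        if el.get(q("type")) == "arc":
            arcs.setdefault(el.get(q("from")), []).append(el.get(q("to")))
    return arcs


def follow_chain(arcs, start):
    """Return the labels reachable from `start`, in depth-first order."""
    seen, stack, order = set(), [start], []
    while stack:
        label = stack.pop()
        if label in seen:  # guard against cycles in the link graph
            continue
        seen.add(label)
        order.append(label)
        # reversed() keeps sibling arcs in document order when popped
        stack.extend(reversed(arcs.get(label, [])))
    return order


root = ET.fromstring(DOC)
arcs = build_arcs(root)
chain = follow_chain(arcs, "assets")
print(chain)
```

In this sketch a path expression can locate the `arc` elements, but the traversal from one arc to the next happens in `follow_chain`, that is, in application code rather than in the query itself.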