Modeling XML Warehouses for Complex Data: The New Issues

Doulkifli Boukraa (University of Jijel, Algeria), Riadh Ben Messaoud (University of Nabeul, Tunisia) and Omar Boussaid (University Lumière Lyon 2, France)
DOI: 10.4018/978-1-60566-308-1.ch013

Abstract

Current data warehouses deal for the most part with numerical data. However, decision makers need to analyze data in all kinds of formats, which can collectively be described as complex data. Warehousing complex data is a new challenge for the scientific community. Indeed, it requires revisiting the whole warehousing process in order to take into account the complex structure of the data; as a result, many data warehousing concepts need to be redefined. In particular, modeling complex data in a single format for analysis purposes is a challenge. In this chapter, the authors present a complex data warehouse model at both the conceptual and logical levels. They show how XML is suitable for capturing the main concepts of their model, and present the main issues related to these data warehouses.
Chapter Preview

Background

A commonly accepted definition of a data warehouse is the one given by Inmon (2002): “A data warehouse is a subject-oriented, integrated, non-volatile, and time variant collection of data in support of management’s decisions.” Traditional data warehouses apply to structured data, and they have gained maturity, as witnessed by the number of related tools. Because structured data are not the only data needed for decision making, new generations of data warehouses have emerged that take into account different structures of data. In this chapter, we focus on one kind of these data warehouses: XML warehouses. Because this is a new research field, there is no common definition of an XML warehouse. This is due to the fact that XML is used differently in different contexts: as a format of data sources, as a means of data integration or exchange between traditional data warehouses, or as a language to describe the warehouse itself. In this section, we present the research work on XML warehouses. For the sake of clarity, we group this work according to the following concepts of data warehousing: data preparation, data modeling, data storage, data exchange and data analysis.

Data Preparation

In a data warehouse, data is physically integrated from different sources. Because the sources are usually independent of each other, the integration operation may cause many problems due to data redundancy, inconsistency, etc. Thus, data needs to be cleaned before it is integrated. In this context, Rusu et al. (2005) consider XML sources and provide a method for cleaning XML data. Their method consists of four steps: correcting XML schemas, eliminating redundancy, eliminating inconsistency and eliminating errors.
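
To give a feel for what a redundancy-elimination step might involve, the following is a minimal Python sketch. It is not the procedure of Rusu et al.; it assumes, purely for illustration, that "redundant" means sibling elements whose serialized content is identical. The function name `drop_duplicate_children` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def drop_duplicate_children(elem):
    """Remove exact duplicate siblings under elem, keeping the first copy.

    Assumption (not from the source): two siblings are redundant when
    their serialized XML is identical once nested duplicates are removed.
    """
    seen = set()
    for child in list(elem):  # iterate over a copy, since we remove items
        # Clean the subtree first so nested duplicates do not hide a match.
        drop_duplicate_children(child)
        key = ET.tostring(child, encoding="unicode").strip()
        if key in seen:
            elem.remove(child)
        else:
            seen.add(key)


# Hypothetical source document with one repeated customer record.
root = ET.fromstring(
    "<customers>"
    "<c id='1'><n>A</n></c>"
    "<c id='2'><n>B</n></c>"
    "<c id='1'><n>A</n></c>"
    "</customers>"
)
drop_duplicate_children(root)
print([c.get("id") for c in root])
```

Real cleaning would typically also need fuzzy matching (for example, records that differ only in formatting), which this sketch deliberately leaves out.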

Golfarelli et al. (2001) also consider XML data sources with related DTDs. Prior to the modeling activity, the authors require DTDs to be simplified by flattening their element definitions, grouping sub-elements that have the same name, and reducing chains of unary operators to a single one. A similar approach is found in Vrdoljak et al. (2003), but it applies to XML Schemas rather than DTDs.
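
Two of these simplifications can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' algorithm: each DTD unary operator (`?`, `+`, `*`, or none) is modelled as a pair of flags (optional, repeatable), and the merge rule for same-named sub-elements is an assumption made here. The names `reduce_unary` and `group_same_named` are hypothetical.

```python
# Each DTD unary operator is modelled as (optional, repeatable).
OPS = {"": (False, False), "?": (True, False),
       "+": (False, True), "*": (True, True)}
SYMBOL = {flags: op for op, flags in OPS.items()}


def reduce_unary(ops):
    """Collapse a chain of unary operators into one, e.g. '+?' -> '*'."""
    optional = any(OPS[o][0] for o in ops)
    repeatable = any(OPS[o][1] for o in ops)
    return SYMBOL[(optional, repeatable)]


def group_same_named(children):
    """Merge sub-elements sharing a name in a flat (name, ops) list.

    Assumption (not from the source): a name that occurs more than once
    becomes repeatable, and stays optional only if every occurrence was.
    """
    merged = {}
    for name, ops in children:
        op = reduce_unary(ops)
        if name in merged:
            optional = OPS[merged[name]][0] and OPS[op][0]
            op = SYMBOL[(optional, True)]
        merged[name] = op
    return list(merged.items())


# Hypothetical flattened content model: (author*, title, author?)
print(group_same_named([("author", "*"), ("title", ""), ("author", "?")]))
```

The flag encoding keeps the reduction rule compact: a chain is optional if any operator in it allows zero occurrences, and repeatable if any operator allows more than one.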
