Challenges on Modeling Hybrid XML-Relational Databases

Mirella M. Moro (Universidade Federal de Minas Gerais (UFMG)-Belo Horizonte, Brazil), Lipyeow Lim (IBM T.J. Watson Research Center, USA) and Yuan-Chi Chang (IBM T.J. Watson Research Center, USA)
DOI: 10.4018/978-1-60566-308-1.ch002

Abstract

XML has been widely adopted for its flexible and self-describing nature. However, relational data will continue to co-exist with XML for several reasons, one of which is the high cost of migrating everything to XML. In this context, data designers face the problem of modeling both relational and XML data within an integrated environment. This chapter highlights important questions on hybrid XML-relational database design and discusses use cases, requirements, and deficiencies in existing design methodologies, especially in light of data and schema evolution. The authors’ analysis results in several design guidelines and a series of challenges to be addressed by future research.

Introduction

Enterprise data design has become much more complex than modeling traditional data stores. The data flowing in and out of an enterprise is no longer just relational tuples, but also XML data in the form of messages and business artifacts such as purchase orders, invoices, contracts and other documents. Moreover, regulations (such as the Sarbanes-Oxley Act) require much of this data (both relational and XML) to be versioned and persisted for audit trails. Last but not least, the competitiveness of enterprises is often a function of their business agility, that is, their ability to change with the changing market. Consequently, enterprise data design needs to cope with different types of data, changing data, and data schema evolution.

Relational database management systems (RDBMSs) are a dominant technology for managing enterprise data stores. Even if the enterprise data are more suitably managed as XML, the cost of migrating to XML databases may be prohibitive. Therefore, relational data will continue to persist in the database. On the other hand, the widespread use of XML data requires the ability to manage and retrieve XML information. A simple solution is to store XML data as character large objects (CLOBs) in an RDBMS, but query processing is then inefficient because the XML CLOBs must be parsed for every query. Another solution, adopted by most commercial RDBMSs, is shredding XML data into relational tables, for example Florescu & Kossmann (1999) and Shanmugasundaram (2001). However, shredding does not handle XML schema changes efficiently. Hence, a native XML database that stores XML data in a hierarchical format is still required. Such specialized native XML databases have been developed, for example Jagadish (2002), and some even support relational data as well, for example Halverson (2004).
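The trade-off between the CLOB and shredding approaches can be made concrete with a small sketch. The following Python example is illustrative only and is not taken from any of the cited systems: it uses SQLite as a stand-in RDBMS, and the purchase-order document, the table names (`order_clob`, `orders`, `order_items`), and the helper functions are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical purchase order used only for illustration.
ORDER_XML = """<order id="42">
  <customer>ACME</customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>"""


def build_db():
    """Create an in-memory database with both storage layouts."""
    conn = sqlite3.connect(":memory:")
    # CLOB-style storage: the whole document is kept as one text value.
    conn.execute("CREATE TABLE order_clob (id INTEGER PRIMARY KEY, doc TEXT)")
    # Shredded storage: elements and attributes are mapped to columns.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)"
    )
    return conn


def store_as_clob(conn, xml_text):
    """Store the document unchanged; only the order id is extracted."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO order_clob VALUES (?, ?)", (int(root.get("id")), xml_text)
    )


def shred(conn, xml_text):
    """Decompose the document into rows of the relational tables."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (order_id, root.findtext("customer"))
    )
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        [(order_id, i.get("sku"), int(i.get("qty"))) for i in root.findall("item")],
    )


def total_qty_clob(conn, order_id):
    """Answer the query by re-parsing the stored document (per-query parsing)."""
    (doc,) = conn.execute(
        "SELECT doc FROM order_clob WHERE id = ?", (order_id,)
    ).fetchone()
    return sum(int(i.get("qty")) for i in ET.fromstring(doc).findall("item"))


def total_qty_shredded(conn, order_id):
    """Answer the same query with plain SQL over the shredded rows."""
    (total,) = conn.execute(
        "SELECT SUM(qty) FROM order_items WHERE order_id = ?", (order_id,)
    ).fetchone()
    return total
```

Both query functions return the same total for the sample order, but the CLOB version parses the document on every call, while the shredded version depends on a fixed table layout: if the XML schema gained a new element, the shredded tables and the `shred` mapping would have to be changed, whereas the CLOB table would not.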

Nevertheless, neither a pure relational nor a pure XML database meets all the needs of enterprise data management. Ideally, a hybrid database that supports both relational and XML data is the best solution to model, persist, manage, and query both kinds of data in a unified manner. Some commercial RDBMSs have begun to support such hybrid XML-relational data models (e.g. IBM’s DB2 v.9). Although employing a hybrid solution seems to be a straightforward idea, in reality it involves a complex system with many options that may easily confuse designers. Likewise, we noticed that most users are still uncertain about how exactly to model an XML database, not to mention a hybrid XML-relational one.

In this context, the focus of this chapter is to discuss how to design a hybrid XML-relational database. Note that we are not concerned with designing a database system, but rather a set of relations containing relational and XML data. The contributions and the organization of this chapter are as follows.

  • We present a methodology for designing XML databases (without considering any interaction with relational data).

  • We give an overview of some of the most relevant real-world scenarios that motivate the need for a hybrid XML-relational database.

  • We present and discuss the challenges in defining a hybrid XML-relational model. We present a set of modeling ideas that serve as an initial solution for such complex modeling issues. We also discuss what else is needed in order to have a more complete solution, that is, the open issues in the modeling phase.

Finally, we discuss some related work and conclude this chapter with an overview of open problems.

Background

This section presents a brief review of relational database design, which we assume is well-known in the computer science community. Traditionally, the design of relational databases is structured into three phases as follows.
