XML Benchmarking: The State of the Art and Possible Enhancements

Irena Mlynkova (Charles University, Czech Republic)
DOI: 10.4018/978-1-60566-308-1.ch014

Abstract

Since XML technologies have become a standard for data representation, numerous methods for processing XML data emerge every day. Consequently, it is necessary to compare the newly proposed methods with the existing ones, as well as to analyze the effect of a particular method when applied to various types of data. In this chapter, the authors provide an overview of existing approaches to XML benchmarking from the perspective of various applications and show that to date the problem has been highly marginalized. Therefore, in the second part of the chapter they discuss persisting open issues and their possible solutions.
Chapter Preview

Introduction

Since XML (Bray et al., 2006) became a de facto standard for data representation and manipulation, numerous methods have been proposed for efficiently managing, processing, exchanging, querying, updating and compressing XML documents, and new proposals emerge every day. Naturally, each author performs various experimental tests using the newly proposed method and describes its advantages and disadvantages. But it can be very difficult for a future user to decide, on the basis of these descriptions alone, which of the existing approaches is the most suitable for his/her particular requirements. The problem is that the methods are usually tested on different data sets, derived from diverse sources which either no longer exist or were created only for testing purposes, with the special requirements of particular applications, etc.

An author of a new method will encounter a similar problem whenever he/she wants to compare the new proposal with an existing one. This is possible only if the source or executable files of the existing method or, at least, identical testing data sets are available. But too often it is impossible to gain access to them. In addition, in the latter case, the performance evaluation is limited by the testing set, whose characteristics are often unknown. Hence, a reader finds it difficult to obtain a clear notion of the analyzed situation.

An analogous problem occurs if we want to test the behaviour of a particular method on various types of data, or determine the correlation between the efficiency of the method and the changing complexity of the input data. Even the process of gathering the testing data sets is not simple. First, real-world XML data usually contain a huge number of errors (Mlynkova et al., 2006) which need to be corrected. And what is worse, real-world data sets are usually surprisingly simple and do not cover all constructs allowed by the XML specifications.
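
To make the two difficulties above concrete, the following is a minimal sketch, not taken from the chapter, of how a collected document might be checked for errors and described by simple structural characteristics. It assumes that "errors" means well-formedness violations detectable by Python's standard `xml.etree.ElementTree` parser, and the helper name `describe` and the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return (well_formed, element_count, max_depth) for one document.

    Hypothetical helper: only well-formedness is checked, not validity
    against a DTD or schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # The document is not well-formed, so no statistics are available.
        return (False, 0, 0)

    def depth(elem):
        # Depth of the subtree rooted at elem; a leaf has depth 1.
        return 1 + max((depth(child) for child in elem), default=0)

    count = sum(1 for _ in root.iter())
    return (True, count, depth(root))


# Two made-up samples: one well-formed, one with a mismatched end tag.
samples = {
    "ok": "<a><b><c/></b><b/></a>",
    "broken": "<a><b></a>",
}
report = {name: describe(text) for name, text in samples.items()}
```

Characteristics such as element count and depth are only two of many possible measures; they are chosen here because they are cheap to compute and can be related to the running time of a tested method.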

Currently, there exist several projects which provide a set of testing XML data collections (usually together with a set of testing XML operations) that are publicly available and well described. We can find either fixed (or gradually extended) databases of real-world XML data (e.g. project INEX (INEX, 2007)) or projects which enable us to generate synthetic XML data on the basis of user-specified characteristics (e.g. project XMark (Busse, 2003)). But in the former case we are limited by the characteristics of the testing set, whereas in the latter case the characteristics of the generated data that can be specified are trivial (such as the amount and size of the data).
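
As an illustration of generating synthetic data from user-specified characteristics, here is a minimal sketch under stated assumptions. It is not the XMark generator or any other existing tool; the function `generate` and its parameters `depth`, `fanout` and `tag` are hypothetical, and the only characteristics it controls are the depth and fan-out of a complete tree.

```python
import xml.etree.ElementTree as ET


def generate(depth, fanout, tag="item"):
    """Build a complete element tree with the given depth and fan-out.

    Hypothetical generator: every non-leaf element gets exactly
    `fanout` children, and the root counts as level 1.
    """
    root = ET.Element("root")
    level = [root]
    for _ in range(depth - 1):
        next_level = []
        for parent in level:
            for i in range(fanout):
                # The id attribute records the child's position under its parent.
                next_level.append(ET.SubElement(parent, tag, id=str(i)))
        level = next_level
    return root


doc = generate(depth=3, fanout=2)
xml_text = ET.tostring(doc, encoding="unicode")
element_count = sum(1 for _ in doc.iter())
```

Even this small example shows the limitation noted above: two numeric parameters determine the amount of data, but say nothing about richer characteristics such as mixed content, recursion or the distribution of element names.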
