Issues in Personalized Access to Multi-Version XML Documents

Fabio Grandi (Università di Bologna, Italy), Federica Mandreoli (Università di Modena e Reggio Emilia, Italy) and Riccardo Martoglia (Università di Modena e Reggio Emilia, Italy)
DOI: 10.4018/978-1-60566-308-1.ch010

Abstract

In several application fields, including the legal and medical domains, XML documents are “versioned” along different dimensions of interest, such as time, space and security, whose nature depends on the application needs. Specifically, temporal and semantic versioning is particularly demanding in a broad range of application domains, where temporal versioning can be used to maintain histories of the underlying resources along various time dimensions, and semantic versioning can then be used to model the limited applicability of resources to individual cases or contexts. Selecting and reconstructing the version(s) of interest for a user means retrieving those document fragments that match both the implicit and explicit user needs, which can be formalized as what we call personalization queries. In this chapter, the authors focus on the design and implementation issues of a personalization query processor. They consider different design options and, among them, present an in-depth study of a native solution, showing, also through experimental evaluation, how some of the best performing technological solutions available today for XML data management can be successfully extended and optimally combined in order to support personalization queries.
Chapter Preview

Overview and Motivation

Nowadays, XML has become ubiquitous, with an ever-increasing number of computer applications exchanging and storing information in XML format. In particular, a large number of organizations, including private companies and public institutions, place rich collections of documents at the disposal of internet users. Generally, such collections are large XML repositories containing millions of semi-structured documents, each one containing thousands of nodes. Portals and websites which allow users to access such repositories are usually equipped with classic keyword-based search engines, which are not adequate to retrieve all and only the information that is relevant for the user, as the tree structure of documents must also be taken into account. As a consequence, in recent years many research efforts have been devoted to supporting structural querying in XML repositories, and discovering the occurrences of labelled tree patterns, or twig query patterns (Amer-Yahia et al., 2001), has become a core operation for XML query processing.
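To make the notion of a twig query pattern more concrete, the following is a minimal sketch of naive twig matching over a small XML tree. It is not the algorithm of the cited work or of the chapter: the pattern encoding (a label plus a list of sub-patterns), the function names `matches` and `find_twig`, and the sample document are all assumptions made for illustration, and every pattern edge is read as an ancestor-descendant relationship.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of a twig pattern: (label, [sub-patterns]).
# Every edge is interpreted as an ancestor-descendant relationship.

def matches(node, pattern):
    """Return True if `node` is the root of an occurrence of `pattern`."""
    label, children = pattern
    if node.tag != label:
        return False
    # Each sub-pattern must match at least one proper descendant of `node`.
    return all(
        any(matches(desc, child) for desc in node.iter() if desc is not node)
        for child in children
    )

def find_twig(root, pattern):
    """Naively test every node of the tree as a candidate pattern root."""
    return [node for node in root.iter() if matches(node, pattern)]

# Illustrative document (invented for this sketch).
DOC = """
<law>
  <article num="1">
    <title>Scope</title>
    <paragraph><text>Text of the first article</text></paragraph>
  </article>
  <article num="2">
    <title>Penalties</title>
  </article>
</law>
"""

root = ET.fromstring(DOC)
# Twig: an article that contains a title and a paragraph that contains a text.
twig = ("article", [("title", []), ("paragraph", [("text", [])])])
nums = [node.get("num") for node in find_twig(root, twig)]
print(nums)
```

This brute-force version revisits the same subtrees many times; it only illustrates what an occurrence of a twig pattern is, not how such occurrences are found efficiently in large repositories.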

Moreover, in several application fields including legal and medical domains, management of bills of materials and catalogue data, accounting and finance, XML documents are “versioned” along different dimensions of interest, whose nature depends on the application needs (e.g. time, space, security). In this chapter, we consider time pertinence and applicability as versioning dimensions, which give rise to multidimensional temporal and semantic versioning. Indeed, temporal and semantic versioning is particularly demanding in a broad range of application domains where temporal versioning can be used to maintain histories of the underlying resources along various time dimensions, and semantic versioning can then be used to model limited applicability of resources (or resource portions) to individual cases or contexts. In all these cases, while the most important version is the “current” one with respect to the temporal dimensions (and with generic applicability), past versions are also very important for applications and cannot be discarded.

For instance, in the legal domain, norm texts, including Laws, Acts, Decrees, Provisions, Regulations, etc., are a clear example of such multi-version resources. Norm texts are continually subject to amendments and modifications, and multiple temporal versions coexist as a consequence of the dynamics of the legislative activity. In particular, several temporal dimensions are involved in the representation and management of norm texts, including transaction, validity, efficacy, applicability, publication and enactment times (Grandi et al., 2005; Palmirani & Brighi, 2006). The most important version of a norm is the consolidated version, which is the one produced by the application of all the modifications the norm has undergone so far, as it is the one which is currently part of the regulations in force and generically applicable to all citizens. However, past versions (even with limited applicability) may still be needed. For instance, considering validity time and semantic applicability to individual cases, a court might be called to judge a case involving a crime C committed at a time T on the basis of the (versions of the) laws which were valid at time T and applicable to crime C.
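The court scenario above can be sketched as a simple selection over versions. This is only an illustration under simplifying assumptions, not the personalization query processor studied in the chapter: it considers a single temporal dimension (validity time), it models applicability as a plain set of context labels, and the `Version` class, the `select` function and all sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Optional

@dataclass(frozen=True)
class Version:
    text: str
    valid_from: date
    valid_to: Optional[date]                 # None means still in force
    applies_to: FrozenSet[str] = frozenset() # empty set means generic applicability

def select(versions: List[Version], at: date, context: str) -> List[Version]:
    """Return the versions valid at time `at` and applicable to `context`."""
    return [
        v for v in versions
        if v.valid_from <= at
        and (v.valid_to is None or at < v.valid_to)   # half-open validity interval
        and (not v.applies_to or context in v.applies_to)
    ]

# Invented versions of one article of a norm.
versions = [
    Version("Original wording", date(2000, 1, 1), date(2005, 1, 1)),
    Version("Amended wording", date(2005, 1, 1), None),
    Version("Special wording", date(2003, 1, 1), None, frozenset({"tax_fraud"})),
]

# Crime C = "tax_fraud" committed at time T = 1 June 2004.
for v in select(versions, date(2004, 6, 1), "tax_fraud"):
    print(v.text)
```

The sketch returns every version that satisfies both constraints and does not decide whether a version with limited applicability should take precedence over a generic one; that choice depends on the application.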

Another interesting example is the medical domain, where multi-version resources of interest are, for instance, clinical guidelines, which are definitions of “best practices” encoding and standardizing clinical procedures for a given disease. Clinical guidelines are also subject to continuous development and revision by committees of expert physicians and health authorities, and multiple temporal versions coexist as a consequence of the clinical and healthcare activity. Several temporal dimensions are also involved in the representation and management of clinical guidelines, including valid, transaction, event, availability, proposal and acceptance times (Combi & Montanari, 2001; Terenziani et al., 2005). In the medical domain, too, past versions continue to be relevant, as a physician might be called upon to justify his/her actions for a given patient P at a time T on the basis of the (versions of the) clinical guidelines which were valid at time T and applicable to the pathology of patient P.
