Integrity Constraint Checking for Multiple XML Databases

Integrity Constraint Checking for Multiple XML Databases

Praveen Madiraju (Marquette University, USA), Rajshekhar Sunderraman (Georgia State University, USA), Shamkant B. Navathe (Georgia Institute of Technology, USA) and Haibin Wang (Winship Cancer Institute, USA)
DOI: 10.4018/978-1-60566-172-8.ch010
OnDemand PDF Download:


Global semantic integrity constraints ensure the integrity and consistency of data spanning distributed databases. In this chapter, we discuss a novel representation technique for expressing semantic integrity constraints for XML databases. We also provide the details of XConstraint Checker, a general framework for checking global semantic constraints for XML databases. The framework is augmented with an efficient algorithm for checking these global XML constraints. The algorithm is efficient for three reasons: 1) the algorithm does not require the update statement to be executed before the constraint check is carried out; hence, we avoid any potential problems associated with rollbacks, 2) sub constraint checks are executed in parallel, and 3) most of the processing of algorithm could happen at compile time; hence, we save time spent at run-time. As a proof of concept, we present a prototype of the system implementing the ideas discussed in this paper.
Chapter Preview


XML (eXtensible Markup Language) has now been adopted as a standard for representation and exchange of data on the web. XML based data exchange occurs in many applications such as finance, health, e-commerce and other application areas. A major goal of a database is to ensure consistency of the data. Integrity constraints are rules which guarantee the consistency of a database. We consider XML constraints in the setting of distributed XML databases. A single update (XUpdate (Tatarinov et al., 2001), (Laux & Martin, 2000)) on one site might cause a global constraint (global XConstraint) to be violated. By global XConstraints, we mean global semantic integrity constraints affecting multiple XML databases. We need an approach to check for such constraint violations. In the XML database setting, the majority of the times, users are interested in generating (updating), integrating and exchanging data. So, frequent updates on XML data may cause frequent global constraint violations. Hence we need an approach that will efficiently and speedily check for such global constraint violations.

There are two major approaches to this problem. The first would be to translate the XML document into relational data using methods such as those found in Shanmugasundaram et al. (1999), Chen et al. (2003) and Fong and Wong (2004). And then, map the updates and constraints on the XML data to corresponding updates and constraints on the relational data (Chen et al., 2002a). Now the problem of constraint checking on XML data is pushed to the problem of constraint checking on relational data. There are well established models for constraint checking in the relational world. However, this approach suffers from the overhead cost involved in transforming XML data into relational data (Kane, Su & Rundensteiner, 2002). The second approach would be to check for constraint violations on the XML data without transforming to relational data. It should be noted that using the first approach vs. second depends on the application being considered. If the application contains millions of records and if it benefits to use relational database features such as querying, fast indexing, etc., it is worth while to consider the first one otherwise the second approach suffices for a normal sized application. In this chapter, we consider the second approach.

A naïve solution would first update an XML document and then check for constraint violations. If a constraint is violated, we can rollback. However, such a naïve solution suffers from the overhead of time and resources spent on rollback. Also, the update statement is checked against all the constraints with the total new updated database state. However, in an incremental constraint checking strategy (Fan, 2005), (Bouchou et al., 2005), constraints are checked incrementally only on the updated document. Hence, we need an approach that would check for constraint violations before updating the database and therefore obviates the need for rollback situations.

In our constraint checking procedure, constraint violations are checked at compile time, before updating the database. Our approach centers on the design of the XConstraint Checker. Given an XUpdate (Tatarinov et al., 2001), (Laux & Martin, 2000) statement and a list of global XConstraints, we generate sub XConstraint checks corresponding to local sites. Sub XConstraint is an XML constraint, expressed as an XQuery, local to a single site (more details in Section 4). The results gathered from these sub XConstraints determine if the XUpdate statement violates any global XConstraints. Our approach is efficient; since we do not require the update statement to be executed before the constraint check is carried out and hence, we avoid any rollback situations. Our approach achieves speed as the sub constraint checks can be executed in parallel.

Complete Chapter List

Search this Book:
Editorial Advisory Board
Table of Contents
Chapter 1
Hong Zhang, Rajiv Kishore, Ram Ramesh
A conceptual modeling grammar should be based on the theory of ontology and possess clear ontological semantics to represent problem domain... Sample PDF
Semantics of the MibML Conceptual Modeling Grammar: An Ontological Analysis Using the Bunge-Wand-Weber Framework
Chapter 2
Henry M. Kim, Arijit Sengupta, Mark S. Fox, Mehmet Dalkilic
This paper introduces a measurement ontology for applications to semantic Web applications, specifically for emerging domains such as microarray... Sample PDF
A Measurement Ontology Generalizable for Emerging Domain Applications on the Semantic Web
Chapter 3
Zhiyuan Chen
Environmental research and knowledge discovery both require extensive use of data stored in various sources and created in different ways for... Sample PDF
Semantic Integration and Knowledge Discovery for Environmental Research
Chapter 4
Vijayan Sugumaran, Gerald DeHondt
Software reuse has been discussed in the literature for the past three decades and is widely seen as one of the major areas for improving... Sample PDF
Towards Code Reuse and Refactoring as a Practice within Extreme Programming
Chapter 5
Miguel I. Aguiree-Urreta, George M. Marakas
Requirements elicitation has been recognized as a critical stage in system development projects, yet current models prescribing particular... Sample PDF
Requirements Elicitation Technique Selection: A Theory-Based Contingency Model
Chapter 6
VenuGopal Balijepally, Sridhar Nerur, RadhaKanta Mahapatra
Software development in organizations is evolving and increasingly taking a socio-technical hue. While empirical research guided by common sense... Sample PDF
IT Value of Software Development: A Multi-Theoretic Perspective
Chapter 7
Amel Mammar
UB2SQL is a tool for designing and developing database applications using UML and B formal method. The approach supported by UB2SQL consists of two... Sample PDF
UB2SQL: A Tool for Building Database Applications Using UML and B Formal Method
Chapter 8
Juliette Gutierrez
Crime reports are used to find criminals, prevent further violations, identify problems causing crimes and allocate government resources.... Sample PDF
Using Decision Trees to Predict Crime Reporting
Chapter 9
Karen Corral, David Schuff, Robert D. St. Louis, Ozgur Turetken
Inefficient and ineffective search is widely recognized as a problem for businesses. The shortcomings of keyword searches have been elaborated upon... Sample PDF
A Model for Estimating the Savings from Dimensional vs. Keyword Search
Chapter 10
Praveen Madiraju, Rajshekhar Sunderraman, Shamkant B. Navathe, Haibin Wang
Global semantic integrity constraints ensure the integrity and consistency of data spanning distributed databases. In this chapter, we discuss a... Sample PDF
Integrity Constraint Checking for Multiple XML Databases
Chapter 11
Russel Pears
Data Warehouses are widely used for supporting decision making. On Line Analytical Processing or OLAP is the main vehicle for querying data... Sample PDF
Accelerating Multi Dimensional Queries in Data Warehouses
Chapter 12
Vikas Agrawal, P. S. Sundararaghavan, Mesbah U. Ahmed, Udayan Nandkeolyar
Data warehouse has become an integral part in developing a DSS in any organization. One of the key architectural issues concerning the efficient... Sample PDF
View Materialization in a Data Cube: Optimization Models and Heuristics
Chapter 13
Athman Bouguettaya, Zaki Malik, Xumin Liu, Abdelmounaam Rezgui, Lori Korff
The ubiquity of the World Wide Web facilitates the deployment of highly distributed applications. The emergence of Web databases and applications... Sample PDF
WebFINDIT: Providing Data and Service-Centric Access through a Scalable Middleware
Chapter 14
James E. Wyse
Location-based mobile commerce (LBMC) incorporates location-aware technologies, wire-free connectivity, and server-based repositories of business... Sample PDF
Retrieval Optimization for Server-Based Repositories in Location-Based Mobile Commerce
Chapter 15
Shing-Han Li, Shi-Ming Huang, David C. Yen, Cheng-Chun Chang
The lifecycle of information system (IS) became relatively shorter compared with earlier days as a result of information technology (IT) revolution... Sample PDF
Migrating Legacy Systems to Web Services Architecture
Chapter 16
Myeong Ho Lee
The trend toward convergence, initiated by advances in ICT, entails the creation of new value chain networks, made up by partnerships between actors... Sample PDF
A Socio-Technical Interpretation of IT Convergence Services: Applying a Perspective from Actor Network Theory and Complex Adaptive Systems
Chapter 17
T. Ariyachandra, L. Dong
Past evidence suggests that organizational transformation from IT implementations is rare. Data warehousing promises to be one advanced information... Sample PDF
Understanding Organizational Transformation from IT Implementations: A Look at Structuration Theory
Chapter 18
Yuan Long, Keng Siau
Drawing on social network theories and previous studies, this research examines the dynamics of social network structures in Open Source Software... Sample PDF
Social Networks Structures in Open Source Software Development Teams
Chapter 19
Susanta Mitra, Aditya Bagchi, A. K. Bandyopadhyay
A social network defines the structure of a social community like an organization or institution, covering its members and their... Sample PDF
Design of a Data Model for Social Networks Applications
About the Contributors