XML Object Identification

Copyright: © 2014 |Pages: 31
DOI: 10.4018/978-1-4666-5198-2.ch007

Abstract

Because it can represent data from a wide variety of sources, XML is rapidly emerging as the new standard for data representation and exchange on the Web and in e-government. To use XML data effectively in practice, entity resolution, which has proven extremely useful in data fusion, inconsistency detection, and data repairing, must be in place to improve the quality of the XML data. In this chapter, the authors deal specifically with object identification on XML data, whose applications include XML document management in highly dynamic settings such as the Web and peer-to-peer systems, detection of duplicate elements in nested XML data, and finding similar identities among objects from multiple Web sources. The authors survey techniques for pairwise and groupwise entity resolution on XML data. These techniques use structural information to describe the similarity or distance between XML data items, such as XML documents and the elements within a document, and then either find the matching pairs that describe the same object or classify the items into separate groups, each group corresponding to one real-world object. The structure and content of XML can be described in many ways, for example as a tree, a Bayesian network, or a set. The authors introduce some well-known algorithms based on these structures for solving XML data matching problems. Finally, the authors discuss directions for future research.

Main Focus Of The Chapter

In pairwise entity resolution, also called XML document matching or element matching, the main work concentrates on the similarity or distance of XML data. Compared with structured or unstructured data, the outstanding feature of XML data is its abundant structural information. For this reason, the most widely used matching approach describes the similarity or distance between XML documents in terms of structural information. There are many ways to describe the similarity of XML document structure: tree similarity, when a tree is used to model the document structure; Bayesian network similarity, as in the XMLDup system (Leitão et al., 2007); and set similarity, when XML documents are extracted into sets.
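
As an illustration of the set-based view only, the following minimal Python sketch flattens each XML document into a set of (tag path, text) pairs and compares two documents with the Jaccard coefficient. It is not an algorithm taken from the chapter: the helper names `path_set` and `jaccard`, the sample movie documents, and the 0.5 threshold are all assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET


def path_set(xml_text):
    """Flatten an XML document into a set of (tag path, text) pairs.

    Hypothetical helper: one of many possible ways to extract a set.
    """
    root = ET.fromstring(xml_text)
    items = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        # Normalise the text so trivial case/whitespace differences are ignored.
        text = (node.text or "").strip().lower()
        items.add((path, text))
        for child in node:
            walk(child, path)

    walk(root, "")
    return items


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Illustrative documents (made up for this sketch).
doc_a = "<movie><title>Alien</title><year>1979</year><director>Ridley Scott</director></movie>"
doc_b = "<movie><title>Alien</title><year>1979</year><director>R. Scott</director></movie>"
doc_c = "<movie><title>Heat</title><year>1995</year><director>Michael Mann</director></movie>"

THRESHOLD = 0.5  # assumed cut-off; in practice it would be tuned on labelled data

sim_ab = jaccard(path_set(doc_a), path_set(doc_b))
sim_ac = jaccard(path_set(doc_a), path_set(doc_c))

print(f"a vs b: {sim_ab:.3f} match={sim_ab >= THRESHOLD}")
print(f"a vs c: {sim_ac:.3f} match={sim_ac >= THRESHOLD}")
```

Here `doc_a` and `doc_b` share three of five distinct pairs (similarity 0.6) and would be reported as a matching pair, while `doc_a` and `doc_c` share only the root path (similarity about 0.14). Exact matching on text values is a deliberate simplification. A tree-based or Bayesian-network-based measure would also take into account where in the hierarchy the differences occur.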
