Structural Similarity Measures in Sources of XML Documents
Giovanna Guerrini (Università di Genova, Italy), Marco Mesiti (Università degli Studi di Milano, Italy) and Elisa Bertino (Purdue University, USA)
Copyright: © 2006
This chapter discusses existing approaches to evaluating and measuring structural similarity in sources of XML documents. A distinctive feature of XML documents is that information on the document structure is available in the document itself. The chapter presents approaches that evaluate structural similarity at three levels: among documents, between a document and a schema, and among schemas. The most relevant applications of such measures are document classification, schema extraction, and the structural clustering of documents and schemas. Other applications, such as document change detection and structural querying, can also be devised and are discussed throughout the chapter.