Web Retrieval of XML Documents: Practice and Challenges

Barbara Catania, Elena Ferrari
Copyright: © 2003 | Pages: 30
DOI: 10.4018/978-1-59140-041-7.ch009

The Web is characterized by a huge number of highly heterogeneous data sources that differ in both media support and format representation. In this scenario, an integrated approach to querying heterogeneous Web documents is needed. To this end, XML can play an important role, since it is becoming a standard for data representation and exchange over the Web. Owing to its flexibility, XML is currently used as an interface language over the Web, through which (parts of) document sources are represented and exported. Under this assumption, the problem of querying heterogeneous sources reduces to the problem of querying XML data sources. In this chapter, we first survey the most relevant query languages for XML data proposed by both the scientific community and standardization committees (e.g., the W3C), focusing mainly on their expressive power. We then investigate how typical Information Retrieval concepts, such as ranking, similarity-based search, and profile-based search, can be applied to XML query languages. Commercial products based on these approaches are then briefly surveyed. Finally, we conclude the chapter with an overview of the most promising research trends in the field.
