Modeling and Querying XML-Based P2P Information Systems: A Semantics-Based Approach

Alfredo Cuzzocrea

doi:10.4018/978-1-60566-028-8.ch009

Source Title: The Semantic Web for Knowledge and Data Management

Abstract

Knowledge representation and management techniques can be used effectively to improve the data modeling and IR functionalities of P2P Information Systems, which have recently attracted considerable attention from both industrial and academic research communities. These improvements can be achieved by pushing semantics into both data and queries, and by exploiting the resulting expressiveness to improve the file sharing primitives and lookup mechanisms made available by first-generation P2P systems. XML-based P2P Information Systems are a more specific instance of this class of systems, in which the overall data domain is composed of very large, Internet-like distributed XML repositories, from which users extract useful knowledge by means of IR methods implemented on top of XML join queries against the repositories. In this chapter, we first define and formalize the class of XML-based P2P Information Systems and derive interesting properties of such systems. We then present a semantics-enriched framework, based on knowledge representation and management, that allows us to efficiently process knowledge and support advanced IR techniques in XML-based P2P Information Systems, thus arriving at the definition of the so-called Semantically-Augmented XML-based P2P Information Systems. Finally, we complete our analytical contribution with an experimental evaluation of our framework against state-of-the-art IR techniques for P2P networks, and with a theoretical analysis comparing it with other similar semantics-based proposals.

Chapter Preview

Introduction

Motivations

In recent years, there has been growing interest in P2P Information Systems (Aberer, 2001; Aberer & Despotovic, 2001), mainly because they fit a large number of real-life IT applications. Digital libraries over P2P networks are just one significant instance of such systems, but it is easy to foresee how large the impact of P2P Information Systems on innovative and emerging IT scenarios, such as e-government and e-procurement, will be in the coming years.

P2P networks are natively built on top of a very large repository of data objects (e.g., files), which is intrinsically distributed, fragmented, and partitioned among participant peers. P2P users are usually interested in (i) retrieving data objects containing information of interest, such as video and audio files, and (ii) sharing information with other (participant) users/peers. From the Information Retrieval (IR) perspective, P2P users (i) typically submit short, loose queries by means of keywords derived from natural-language-style questions (e.g., “find all the music files containing Mozart’s compositions” is posed by means of the keywords “compositions” and “Mozart”), and, because their purpose is resource sharing, (ii) are usually interested in retrieving a set of data objects as a result rather than a specific one. As a consequence, well-founded IR methodologies (e.g., ranking), which have already reached a significant degree of maturity, can be successfully applied in the context of P2P systems in order to improve the capabilities of these systems in retrieving useful information (i.e., knowledge), and to achieve better performance than more traditional database-like query schemes. Indeed, the latter schemes are quite inadequate in the absence of fixed, rigorously structured data schemas, as happens in P2P networks.
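The keyword-based retrieval pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the peer names, the file names, the textual descriptions, and the scoring rule (a plain count of keyword occurrences) are hypothetical and merely stand in for a real ranking method; this is not the framework proposed in the chapter.

```python
from collections import Counter

# Hypothetical peers, each holding a few data objects described by free text.
PEERS = {
    "peer_a": {"k551.mp3": "Mozart symphony compositions Jupiter",
               "notes.txt": "shopping list"},
    "peer_b": {"requiem.mp3": "Mozart Requiem",
               "bach.mp3": "Bach compositions for organ"},
}

def score(keywords, description):
    """Count how many times the query keywords occur in a description."""
    terms = Counter(description.lower().split())
    return sum(terms[k.lower()] for k in keywords)

def keyword_search(keywords, peers=PEERS):
    """Return (peer, object, score) triples for all matching objects, best first."""
    hits = [
        (peer, name, score(keywords, text))
        for peer, objects in peers.items()
        for name, text in objects.items()
    ]
    # Keep only objects that match at least one keyword.
    hits = [h for h in hits if h[2] > 0]
    # Sort by descending score; ties are broken by peer and object name.
    return sorted(hits, key=lambda h: (-h[2], h[0], h[1]))

# The example query from the text, posed as two keywords.
results = keyword_search(["compositions", "Mozart"])
```

Note that the query returns a ranked set of data objects spread over several peers rather than a single object, and that no schema is required beyond the free-text descriptions.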

Furthermore, the consolidated IR mechanism naturally supports the self-feeding nature of P2P systems, since intermediate results can be (re-)used to share new information, or to set up and specialize new search activities. As regards schemas, from the database perspective, P2P users typically adopt a semi-structured data model, rather than a structured one, to query data objects. This feature also poses previously unrecognized problems for the integration of heterogeneous data sources over P2P networks. In addition, efficiently accessing data in P2P systems, another aspect directly related to our work, is still a research challenge (Aberer et al., 2002).

Basically, P2P IR techniques extend the traditional functionalities of P2P systems (i.e., file sharing primitives and simple lookup mechanisms based on partial or exact string matching) by enhancing them with useful, more complex knowledge extraction features. The goal underlying the integration of IR techniques into the core layers of P2P networks is to define and develop innovative knowledge delivery paradigms over such networks. In fact, P2P networks fit naturally with the IR philosophy, allowing us to (i) successfully exploit self-feeding mechanisms of knowledge production, and (ii) take advantage of innovative knowledge representation and extraction models based on semantics, metadata management, probability, etc. Therefore, without loss of generality, we can claim that IR techniques can be effectively used to support even complex processes such as knowledge representation, discovery, and management over P2P networks, with the retrieval of information in the form of appropriate sets of data objects being the basic issue to be addressed.
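For reference, the first-generation lookup mechanisms mentioned above amount to plain string comparison. The sketch below is a hypothetical illustration only: the file names and the function names `exact_lookup` and `partial_lookup` are assumptions, not part of any specific P2P system.

```python
# Hypothetical file names shared by peers; purely illustrative data.
SHARED_FILES = ["Mozart_Requiem.mp3", "mozart_k551.mp3", "Bach_Toccata.mp3"]

def exact_lookup(name, files=SHARED_FILES):
    """Exact match: the whole file name must equal the query string."""
    return [f for f in files if f == name]

def partial_lookup(fragment, files=SHARED_FILES):
    """Partial match: the query may occur anywhere in the name (case-insensitive)."""
    needle = fragment.lower()
    return [f for f in files if needle in f.lower()]
```

Neither function uses any knowledge of what the files contain, which is the gap that the knowledge extraction features discussed in the text are meant to fill.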

Nevertheless, several characteristics of P2P networks pose important limitations to the accomplishment of this goal. Among these, we recall: (i) the completely decentralized nature of P2P networks, which enables peers and data objects to come and go at will; (ii) the absence of global or mediated schemas of data sources, which is very common in real-life P2P networks; and (iii) the excessive computational overheads that could be introduced when traditional IR methodologies (such as those developed in the context of distributed databases) are applied as they are to P2P systems. To overcome these limitations, P2P IR research is devoted to designing innovative search strategies over P2P networks, with the goal of making these strategies as efficient and sophisticated as possible. A possible solution consists in looking at semantics-based techniques, which are the focus of this chapter.
