On the Efficiency of Querying and Storing RDF Documents

Maria-Esther Vidal, Amadís Martínez, Edna Ruckhaus, Tomas Lampo, Javier Sierra

Source Title: Graph Data Management: Techniques and Applications

DOI: 10.4018/978-1-61350-053-8.ch016

Abstract

In the context of the Semantic Web, different approaches have been defined to represent RDF documents, and the selected representation affects the storage and time complexity of RDF data recovery and query processing tasks. This chapter addresses the problem of efficiently querying and storing RDF documents, and presents an alternative representation of RDF data, Bhyper, which is based on hypergraphs. Additionally, access and optimization techniques for executing queries at low cost are defined on top of this hypergraph-based representation. The chapter’s authors have empirically studied the performance of the Bhyper-based techniques, and their experimental results show that the proposed hypergraph-based formalization reduces the RDF data access time as well as the space needed to store the Bhyper structures, while the query execution time of state-of-the-art RDF engines can be sped up by up to two orders of magnitude.

Introduction

Emerging infrastructures such as the Semantic Web, the Semantic Grid, Service-Oriented Architectures, and the Cloud of Linked Data support on-line access to a wealth of ontologies, data sources, and Web services. Ontologies play an important role in these infrastructures, and provide the basis for the definition of concepts and relationships that make the recovery and integration of Web data and resources possible. In particular, in the context of the Cloud of Linked Data, a large number of diverse datasets have become available, and they have grown exponentially in recent years. In October 2007, datasets consisted of over 2 billion RDF triples, which were interlinked by over 2 million RDF links. By May 2009 this had grown to 4.2 billion RDF triples interlinked by around 142 million RDF links. At the time this chapter was written, there were 13,112,409,691 triples in the Cloud of Linked Data; these datasets cover medical publications, airport data, drugs, diseases, and clinical trials, among others.

Furthermore, the number of available Web services has rapidly increased during the last few years. For example, the molecular biology databases collection currently includes 1,078 databases (Galperin, 2008), which is 110 more than the previous year (Galperin, 2007). Tools and services, as well as the number of instances published by these resources, follow a similar progression (Benson, 2007). In addition, thanks to this wealth, users increasingly rely on digital tasks such as data retrieval from public data sources or from the Cloud of Linked Data, as well as data analysis with Web tools or services organized in complex workflows. Thus, Web architectures need to be tailored to provide efficient storage structures and to process large numbers of resources and instances, in order to scale up to user requests.

In the context of the Semantic Web, several query engines have been developed to access RDF documents efficiently (e.g., AllegroGraph; Harth et al., 2007; Ianni et al., 2009; JENA; JENATDB; Neumann & Weikum, 2008; Wielemaker, 2005). The majority of these approaches have developed techniques to generate evaluation plans, together with execution engines that run these plans in a way that reduces processing time (e.g., AllegroGraph; Neumann & Weikum, 2008; Lampo et al., 2009; Vidal et al., 2010). Additionally, some of these approaches have implemented structures to efficiently store and access RDF data. Tuple Database or TDB (JENATDB) is a persistent graph storage layer for Jena. TDB works with the Jena SPARQL query engine (ARQ) to support SPARQL together with a number of extensions (e.g., property functions, aggregates, arbitrary length property paths).

YARS2 (Yet Another RDF Store, Version 2) (Harth et al., 2007) is a repository for queries against an indexed federation of RDF documents; three types of in-memory indices are used to scan keywords, perform atomic operations on RDF documents, and speed up combinations of patterns or values. RDF-3X (Neumann & Weikum, 2008) focuses on an index system, and its optimization techniques were developed to explore the space of plans that benefit from these index structures. Hexastore (Weiss et al., 2008) is a main-memory indexing technique that uses the triple nature of RDF as an asset: RDF data is indexed in six possible ways, one for each permutation of the subject, predicate, and object of a triple. Finally, secondary-memory index-based representations for large RDF datasets have also been presented (e.g., Fletcher & Beck, 2009; McGlothlin & Khan, 2009; Weiss & Bernstein, 2009).
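
The idea of indexing triples once per ordering of subject, predicate, and object can be illustrated with a small sketch. The code below is not the Hexastore implementation; it is a toy in-memory structure, assuming plain Python dictionaries and a hypothetical class name `SixWayIndex`, that shows why six orderings let any triple pattern be answered by key lookups on its bound positions.

```python
from collections import defaultdict
from itertools import permutations


class SixWayIndex:
    """Toy in-memory index: one nested map per ordering of (s, p, o).

    Illustrative only; names and layout are assumptions, not Hexastore's API.
    """

    def __init__(self):
        # Six two-level dictionaries keyed "spo", "sop", "pso", "pos", "osp", "ops".
        self.indexes = {
            "".join(order): defaultdict(lambda: defaultdict(set))
            for order in permutations("spo")
        }

    def add(self, s, p, o):
        """Insert one triple into all six orderings."""
        triple = {"s": s, "p": p, "o": o}
        for order, index in self.indexes.items():
            a, b, c = (triple[k] for k in order)
            index[a][b].add(c)

    def match(self, s=None, p=None, o=None):
        """Return the set of triples matching a pattern; None marks a variable."""
        pattern = {"s": s, "p": p, "o": o}
        # Choose the ordering whose leading positions are the bound ones,
        # so that bound terms are resolved by dictionary lookups.
        order = "".join(sorted("spo", key=lambda k: pattern[k] is None))
        index = self.indexes[order]
        k1, k2, k3 = order
        results = set()
        level1 = [pattern[k1]] if pattern[k1] is not None else list(index)
        for a in level1:
            inner = index.get(a, {})
            level2 = [pattern[k2]] if pattern[k2] is not None else list(inner)
            for b in level2:
                for c in inner.get(b, ()):
                    if pattern[k3] is None or pattern[k3] == c:
                        found = dict(zip(order, (a, b, c)))
                        results.add((found["s"], found["p"], found["o"]))
        return results


idx = SixWayIndex()
idx.add("drugA", "treats", "diseaseX")
idx.add("drugB", "treats", "diseaseX")
idx.add("drugA", "testedIn", "trial1")
print(idx.match(p="treats", o="diseaseX"))
```

The cost of this design is visible in `add`: every triple is stored six times, which is the space overhead that alternative representations aim to reduce.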

All these approaches may reduce the execution time of RDF queries; however, for some queries, the solution identified can be far from optimal. For instance, as we will show in this chapter, some queries can be reordered and grouped into small-sized star-shaped groups, and the execution time can be orders of magnitude less than the execution time of the original query. However, because these approaches are not tailored to identify plans of this type or to use their storage structure properties to exploit the benefits of such plans, the performance of these state-of-the-art RDF engines can be poor for queries of this type.
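
As a rough illustration of what grouping a query into small star-shaped groups might look like, the sketch below partitions triple patterns by their shared subject term. This excerpt does not define star-shaped groups precisely, so the shared-subject criterion, the function name `star_groups`, and the `max_size` cap are all assumptions made for the example rather than the chapter's technique.

```python
def star_groups(patterns, max_size=3):
    """Partition triple patterns into small groups that share a subject term.

    Assumption: a "star" is a set of patterns with the same subject; the
    chapter's own definition may differ. max_size keeps each group small.
    """
    by_subject = {}
    for s, p, o in patterns:
        # dict preserves insertion order, so stars appear in query order.
        by_subject.setdefault(s, []).append((s, p, o))

    groups = []
    for star in by_subject.values():
        # Split an oversized star into consecutive chunks of at most max_size.
        for i in range(0, len(star), max_size):
            groups.append(star[i:i + max_size])
    return groups


# Hypothetical query: patterns are interleaved in the original order.
query = [
    ("?drug", "rdf:type", "Drug"),
    ("?trial", "rdf:type", "ClinicalTrial"),
    ("?drug", "treats", "?disease"),
    ("?trial", "tests", "?drug"),
]
for group in star_groups(query):
    print(group)
```

Each group can then be evaluated on its own and the partial results joined on the variables the groups have in common, here `?drug`.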
