Special Offers
- IGI Global’s New Emerging Topic e-Book Collections
  Acquire highly focused and affordable Cutting-Edge Peer-Reviewed Research Content through a selection of 17 topic-focused e-Book Collections discounted up to 90%, compared to list prices. Collection topics include Artificial Intelligence, Data Science, Language Learning, Marketing and Customer Relations, Sustainability, and many more. Hosted on the InfoSci^® platform, these collections feature no DRM, no additional cost for multi-user licensing, no embargo of content, full-text PDF & HTML format, and more.
  Learn More
- Open Access Book (Free Access) - Encyclopedia of Information Science and Technology, Sixth Edition (ISBN: 9781668473665)
  The Encyclopedia of Information Science and Technology, Sixth Edition) continues the legacy set forth by the first five editions by providing comprehensive coverage and up-to-date definitions of the most important issues, concepts, and trends pertaining to technological advancements and information management within a variety of settings and industries. The entire book is being published under open access.
  Read Now
- Open Access Book (Free Access) - Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries (ISBN: 9781668456293)
  Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries provides information on the recent technology, mitigation, and environmental protection that must be applied for food sustainability in developing countries. This book is being published under Platinum Open Access through funding from Diponegoro University, Indonesia.
  Read Now
- Open Access Book (Free Access) - New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY (ISBN: 9781668438091)
  The Walmart Corporation and the Lumina Foundation have provided funding to make New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY fully open access, completely removing any paywall between scholars in education and the latest research on new models for the future of higher education.
  Read Now
- Open Access Book (Free Access) - Handbook of Research on the Global View of Open Access and Scholarly Communications (ISBN: 9781799898054)
  Through a collaboration between IGI Global and the University of North Texas, the Handbook of Research on the Global View of Open Access and Scholarly Communications has been published as fully open access, completely removing any paywall between researchers of any field, and the latest research on the equitable and inclusive nature of Open Access and all of its complications.
  Read Now
Books
- - Books by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Books by Field
Journals
- - Journals
  - OnDemand Journal Articles
  - Journals by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Journals by Field
e-Collections
Open Access
- View All Open Access Opportunities
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Find an Open Access Journal for Your Next Manuscript
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Submit an Open Access Book Proposal
  Learn more about open access book publishing and how it can propel your research forward in the field.
  Convert Your Work to Open Access
  Already published? You can convert your work to open access to increase its impact through IGI Global’s Restrospective Open Access Program.
  Utilize Open Access Collection Database
  Open up your research potential by utilizing our open access content or integrating the open access collection into your library
  Consider Open Access Agreements
  For Libraries: consider no-cost or investment-level open access agreements with IGI Global to support your faculty's research endeavors.
  Search Funding Resources
  Looking for additional funding resources to support your open accesss endeavors? View industry resources compiled by our open access team.
  Review Open Access Policies & Ethical Guidelines
  Considering IGI Global to publish your work under open access? Review IGI Global’s open access policies and ethical guidelines
Publish with Us
Resources
- - Instructors
  - Course Adoption
  - Teaching Cases
  - K-12 Online Learning Collection
  - Authors and Editors
  - eEditorial Discovery^® System
  - Peer Review Process
  - Ethics and Malpractice
  - COPE Membership
  - Fair Use Policy
  - Open Access Publishing
  - FAQ
Catalogs
About Us
Newsroom

A Performance Analysis of Semantic Caching for XML Query Processing

Boris Novikov, Alice Pigul, Anna Yarygina

Source Title: International Journal of Knowledge-Based Organizations (IJKBO) 3(4)

DOI: 10.4018/ijkbo.2013100103

OnDemand:

(Individual Articles)

Available

$37.50

Current Special Offers

No Current Special Offers

Abstract

Caching is important for any system attempting to achieve high performance. The semantic caching is an approach trying to benefit from the certain knowledge of data semantics. The authors expect that this information might enable reuse of semantically close data rather than exactly equal to cached data in the traditional system. However, the major obstacle for extensive application of semantic caching for any data model or query language is the computational complexity of the query containment problem, which is, in general, undecidable. In this article the authors introduce and compare three approximate conservative query matching algorithms for semantic caching of semi-structured queries. The authors then analyze their applicability for distributed query processing. Based on this analysis, the authors outline few scenarios where semantic caching can be beneficial for query processing in a distributed system of heterogeneous semi-structured information resources.

Article Preview

Top

Introduction

The need to uniformly access heterogeneous information resources triggered a huge amount of research in the area of information integration performed in last decades. Several industrial strength integration software tools are currently available. However, the query evaluation performance delivered by these tools is still far from practically acceptable for any complex queries and is often unpredictable. Common techniques for performance improvement, such as indexing, query optimization, and caching, are well understood and developed for traditional databases but should be revised and re-evaluated in the integrated distributed heterogeneous environment.

The focus of this research is on advanced semantic caching techniques which are expected to be especially good in a distributed heterogeneous environment. In contrast with usual caching techniques, in addition to the usage patterns, semantic caching takes into account relationships and inter dependencies between data items to improve cache effectiveness and extend the ability of local query processing of cached data instead of distributed query processing.

The goal of this research is to evaluate the potential impact of different approaches to semantic caching and obtain quantitative estimations of semantic caching suitability for XML query processing.

The caching is known as important technique for any system attempting to achieve high performance. The idea of any caching is to reduce the number of repetitive accesses to the data items located in relatively slow large storage. Instead, copies of data items are temporarily stored in a fast (and relatively small) memory.

In almost all implementations cache is completely transparent for the application or system that uses the cached data. That is, the presence of cache affects performance, but not the functionality of the application, and, moreover, the application cannot explicitly control the behavior of the cache.

In this research, the data items copied from the primary data source into a cache are called cache units. Depending on the nature of cache units, physical, logical, and semantic caching are distinguished (Figure 1).

Figure 1.

Physical, logical, and semantic cache

In a common physical caching the cache units are pages, blocks, or other low-level data structures defined for a physical storage. Physical caching is hardly applicable for distributed environment because low level units are not related to the units transferred over a network, and page sizes may vary from site to site.

When a logical cache is used, high level objects, such as web pages, are cached instead of physical blocks. A potential disadvantage of a logical caching is that the cache unit is considered atomic and can be used only for exactly the same request.

The caching is called semantic caching if the cache units are meaningful items or sets in the high-level application data model. Thus, parts of a cache unit may be used to process queries other than original one, or data from several cache units may be used together to evaluate a new query.

The semantics of data in a semantic cache is usually represented with queries (Chidlovskii & Borgho, 2000; Dar, Franklin, Jonsson, Srivastava, & Tan, 1996; Qun Ren & Dunham, 2003). To find data in semantic cache one has to match queries instead of page addresses used in a physical cache.

The expectation is that the above mentioned should improve reuse of the cached data. On the other hand, the extraction of cached data becomes more complex and may result in certain performance penalties. Hence, our goal is to find circumstances and conditions under which the semantic caching is beneficial.

A significant amount of work in the area of semantic caching for different data models has been done by research community during the last two decades. It turns out that the major restricting factor for effective use of semantic caching is the query containment problem which is proven to be decidable only for limited class of queries and has polynomial complexity for much smaller class of queries.

To make the semantic caching efficient, one has to restrict the class of cached queries and use an approximate algorithm checking if a query can be evaluated based on the cache contents only. If an approximate algorithm can produce false positive answers, the use of cache may result in data loss in the query responses.

Complete Article List

Search this Journal:

Reset

Volume 14: 1 Issue (2024): Forthcoming, Available for Pre-Order

Volume 13: 1 Issue (2023)

Volume 12: 4 Issues (2022): 3 Released, 1 Forthcoming

Volume 11: 4 Issues (2021)

Volume 10: 4 Issues (2020)

Volume 9: 4 Issues (2019)

Volume 8: 4 Issues (2018)

Volume 7: 4 Issues (2017)

Volume 6: 4 Issues (2016)

Volume 5: 4 Issues (2015)

Volume 4: 4 Issues (2014)

Volume 3: 4 Issues (2013)

Volume 2: 4 Issues (2012)

Volume 1: 4 Issues (2011)

View Complete Journal Contents Listing

MLA

APA

Chicago

Export Reference

A Performance Analysis of Semantic Caching for XML Query Processing

Abstract

Introduction

Complete Article List