Querying Multimedia Data by Similarity in Relational DBMS

Maria Camila Nardini Barioni (Federal University of ABC, Brazil), Daniel dos Santos Kaster (University of Londrina, Brazil), Humberto Luiz Razente (Federal University of ABC, Brazil), Agma J.M. Traina (University of São Paulo at São Carlos, Brazil) and Caetano Traina Júnior (University of São Paulo at São Carlos, Brazil)
DOI: 10.4018/978-1-60960-475-2.ch014

Abstract

Multimedia objects, such as images, audio, and video, have no total ordering relationship, so the relational operators (‘<’, ‘≤’, ‘≥’, ‘>’) are not suitable for comparing them. Therefore, similarity queries are the most useful, and often the only, type of query adequate for searching multimedia objects stored in a database. Unfortunately, SQL, the most widely employed query language in Database Management Systems (DBMS), does not provide effective support for similarity queries. This chapter presents a previously validated strategy that adds similarity queries to SQL, supporting a powerful set of similarity operators. The chapter also describes techniques to store and retrieve multimedia objects efficiently and shows existing DBMS alternatives for executing similarity queries over multimedia data.

Introduction

With the increasing availability and capacity of recording equipment, managing the huge amount of multimedia data generated has become more and more challenging. Without a proper retrieval mechanism, such data is usually forgotten on a storage device, and most of it is never touched again.

As the information embedded in multimedia data is intrinsically complex and rich, the retrieval approaches for such data usually rely on its content. However, Multimedia Objects (MO) are seldom compared directly, because their binary representation is of little help in understanding their content. Rather, a set of predefined features is extracted from the MO and is thereafter used in place of the original object to perform the retrieval. For example, in Content-Based Image Retrieval (CBIR), images are preprocessed by specific feature extraction algorithms to obtain their color or texture histograms, polygonal contours of the pictured objects, etc. The extracted features define a mathematical signature that represents the content of the image with respect to specific criteria, and it is this signature that is employed in the search process.
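The feature extraction step can be illustrated with a minimal sketch. It assumes a toy image given as a flat list of 8-bit gray levels and uses a normalized gray-level histogram as the signature. The function name `gray_histogram`, the bin count, and the sample pixels are illustrative assumptions, not the chapter's extractor.

```python
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 4,
                   max_value: int = 255) -> List[float]:
    """Return a normalized gray-level histogram used as a feature vector.

    Assumption: `pixels` is a flat sequence of integer gray levels
    in the range 0..max_value.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * bins
    bin_width = (max_value + 1) / bins
    for value in pixels:
        # Clamp so that max_value always falls in the last bin.
        index = min(int(value / bin_width), bins - 1)
        counts[index] += 1
    total = len(pixels)
    # Normalizing makes signatures of different-sized images comparable.
    return [count / total for count in counts]


# Hypothetical 8-pixel "image"; the signature replaces it in later searches.
image = [0, 10, 70, 130, 200, 250, 255, 64]
signature = gray_histogram(image)
```

Once the signature is stored, the search works on this short vector instead of the raw pixels.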

Although much progress has been achieved in recent years in handling multimedia content, the development of large-scale applications has faced problems because existing Database Management Systems (DBMS) lack support for such data. The operators usually employed to compare numbers and short texts in traditional DBMS are not useful for comparing MO. Moreover, MO demand specific indexing structures and other advanced resources, for example, maintaining the query context during a user's interaction with a multimedia database.

The most promising approach to overcome these issues is to add support for similarity-based data management inside the DBMS. Similarity can be defined through a function that compares pairs of MO and returns a value stating how similar (close) they are. As shown later in this chapter, employing similarity as the basis of the retrieval process allows writing very elaborate queries using a reduced set of operators and developing a consistent and efficient query execution mechanism.
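The idea of a function that compares pairs of objects can be sketched as follows. The example assumes Euclidean distance over feature vectors and answers two commonly used kinds of similarity query, range and k-nearest-neighbor, with a plain sequential scan. The choice of distance, the function names, and the sample data are assumptions made for illustration; this preview does not specify them.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two feature vectors; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(dataset: Dict[str, Sequence[float]],
                query: Sequence[float],
                radius: float) -> List[Tuple[float, str]]:
    """Return (distance, key) for every object within `radius` of `query`."""
    hits = []
    for key, features in dataset.items():
        distance = euclidean(features, query)
        if distance <= radius:
            hits.append((distance, key))
    return sorted(hits)


def knn_query(dataset: Dict[str, Sequence[float]],
              query: Sequence[float],
              k: int) -> List[Tuple[float, str]]:
    """Return the k objects closest to `query`, nearest first."""
    scored = ((euclidean(features, query), key)
              for key, features in dataset.items())
    return heapq.nsmallest(k, scored)


# Hypothetical feature vectors keyed by object identifier.
dataset = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [6.0, 8.0], "d": [1.0, 0.0]}
within_five = range_query(dataset, [0.0, 0.0], 5.0)
two_nearest = knn_query(dataset, [0.0, 0.0], 2)
```

A sequential scan compares the query against every stored object, which is why dedicated indexing structures matter for large collections.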

Although a number of works have been reported in the literature describing the basic algorithms to execute similarity retrieval operations on multimedia and other complex object datasets (Roussopoulos et al., 1995, Hjaltason and Samet, 2003, Bohm et al., 2001), there are few works on how to integrate similarity queries into the DBMS core. Some DBMS provide proprietary modules to handle multimedia data and perform a limited set of similarity queries (IBM Corp., 2003, Oracle Corp., 2005, Informix Corp., 1999). However, such approaches are generalist and do not allow including domain-specific resources, which prevents many applications from using them. Moreover, it is important to support similarity queries in SQL as native predicates, so that queries mixing traditional and similarity-based predicates can be represented and executed efficiently in a Relational DBMS (RDBMS) (Barioni et al., 2008).
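To make the notion of mixing traditional and similarity-based predicates concrete, the sketch below registers a distance function as a user-defined function in SQLite through Python's standard `sqlite3` module. This is not the SQL extension the chapter proposes. The table `exam`, its columns, the JSON encoding of the features, and the function name `feature_distance` are all assumptions made for illustration.

```python
import json
import math
import sqlite3


def feature_distance(a_json: str, b_json: str) -> float:
    """Euclidean distance between two JSON-encoded feature vectors."""
    a = json.loads(a_json)
    b = json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL as a two-argument scalar function.
conn.create_function("feature_distance", 2, feature_distance)

conn.execute(
    "CREATE TABLE exam (id INTEGER PRIMARY KEY, modality TEXT, features TEXT)"
)
conn.executemany(
    "INSERT INTO exam VALUES (?, ?, ?)",
    [
        (1, "MRI", json.dumps([0.0, 0.0])),
        (2, "MRI", json.dumps([3.0, 4.0])),
        (3, "CT", json.dumps([0.5, 0.0])),
        (4, "MRI", json.dumps([1.0, 0.0])),
    ],
)

query_features = json.dumps([0.0, 0.0])
# One traditional predicate (modality) and one similarity predicate (range).
rows = conn.execute(
    "SELECT id FROM exam "
    "WHERE modality = ? AND feature_distance(features, ?) <= ? "
    "ORDER BY id",
    ("MRI", query_features, 2.0),
).fetchall()
```

In this workaround the engine treats the function as an opaque scalar: the distance is computed for each candidate row and no similarity index is involved. That limitation is one reason the text argues for similarity predicates that the DBMS understands natively.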

This chapter presents the key foundations toward supporting similarity queries as a native resource in RDBMS, addressing the fundamental aspects related to the representation of similarity queries in SQL. It also describes case studies showing how it is possible to perform similarity queries within existing DBMS (Barioni et al., 2006, Kaster et al., 2009).

In the following sections, we describe related work and fundamental concepts, including the general strategy usually adopted to represent and to compare MO, the kinds of similarity queries that can be employed to query multimedia data and some adequate indexing methods. We also discuss issues regarding the support of similarity queries in relational DBMS, presenting the current alternatives and also an already validated approach to seamlessly integrate similarity queries in SQL. There is also a description of case studies for the enhancement of existing DBMS with appropriate techniques to store multimedia data and algorithms to efficiently execute similarity queries over them. Finally, we conclude the chapter and give future research directions on multimedia retrieval support in DBMS.
