Advanced Techniques for Object-Based Image Retrieval

Yu-Jin Zhang (Tsinghua University, China)
DOI: 10.4018/978-1-60566-026-4.ch011

Introduction

With advances in imaging modalities and the widespread use of digital images (including video) in various fields, many potential content producers have emerged, and many image databases have been built. Because images require large amounts of storage space and processing time, quickly and efficiently accessing and managing these databases, which are large in both information content and data volume, has become an urgent problem. Research on solving this problem with content-based image retrieval (CBIR) techniques began in the last decade (Kato, 1992). An international standard for multimedia content description, MPEG-7, was established in 2001 (MPEG). Because it offers comprehensive descriptions of image content and is consistent with human visual perception, research in this direction is considered one of the most active areas of the new century (Castelli, 2002; Zhang, 2003; Deb, 2004).

Many practical retrieval systems have been developed; a survey of nearly 40 systems can be found in Veltkamp (2000). Most of them rely mainly on low-level image features, such as color, texture, and shape, to represent image content. However, there is a considerable difference between what users are actually interested in and the image content that these low-level features alone can describe. In other words, there is a wide gap between image content descriptions based on low-level features and the way human beings understand images. As a result, systems based on low-level features often return unsatisfactory query results in practical applications.
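The gap described above can be illustrated with a minimal sketch of retrieval by a single low-level feature. It assumes a quantized RGB color histogram as the feature and histogram intersection as the similarity measure; both are common textbook choices, not necessarily what any surveyed system uses. All function names and the toy pixel data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram (a low-level color feature)."""
    if not pixels:
        raise ValueError("image has no pixels")
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..n-1.
        rb, gb, bb = r * n // 256, g * n // 256, b * n // 256
        hist[(rb * n + gb) * n + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query: Sequence[Pixel],
             database: Dict[str, Sequence[Pixel]],
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank database images by color similarity to the query image."""
    q = color_histogram(query)
    scored = [(name, histogram_intersection(q, color_histogram(px)))
              for name, px in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy "images" as flat pixel lists (hypothetical data, for illustration only).
database = {
    "red_apple": [(220, 30, 30)] * 80 + [(250, 250, 250)] * 20,
    "red_car": [(210, 20, 40)] * 70 + [(40, 40, 40)] * 30,
    "green_apple": [(40, 200, 40)] * 80 + [(250, 250, 250)] * 20,
}
query = [(230, 25, 35)] * 75 + [(250, 250, 250)] * 25  # another red apple

ranking = retrieve(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With this toy data, the red car ranks above the green apple for a red-apple query. The color feature captures appearance, not the object the user has in mind, which is the kind of mismatch that motivates object-based retrieval.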

To address this challenge, many approaches have been proposed to represent and describe image content at a higher level, one more closely related to human understanding. These approaches fall into three broad categories: synthetic, semantic, and semiotic (Bimbo, 1999; Djeraba, 2002). From the point of view of understanding, the semantic approach is the natural one. Human beings often describe image content in terms of objects, which can be defined at different levels of abstraction. In this article, objects are considered not only as carriers of semantic information in images, but also as suitable building blocks for further image understanding.

The rest of the article is organized as follows. “Background” briefly reviews early object-based techniques and surveys current research on them. “Main Techniques” describes a general paradigm for object-based image retrieval and discusses individual techniques for extracting meaningful regions, identifying objects, matching semantics, and conducting feedback. “Future Trends” points out some potential directions for further research, and “Conclusion” presents several final remarks.

Key Terms in this Chapter

Content-Based Image Retrieval (CBIR): A process framework for efficiently retrieving images from a collection by similarity. The retrieval relies on extracting the appropriate characteristic quantities describing the desired contents of images. In addition, suitable querying, matching, indexing, and searching techniques are required.

Semiotics: The science that analyzes signs and sign systems and puts them in correspondence with particular meanings. It provides formal tools for image knowledge acquisition, generation, representation, organization, and utilization in the context of CBIR.

MPEG-7: An international standard formally named “Multimedia Content Description Interface” (ISO/IEC 15938). It provides a set of audiovisual description tools, descriptors, and description schemes for effective and efficient access (search, filtering, and browsing) to multimedia content.

Metadata: A data format that may contain a wide range of information, including information obtained indirectly from the image as well as information related to the actual description of the image content. At the highest level, images are often accompanied by associated metadata.
