An Innovative Multiple-Object Image Retrieval Framework Using Hierarchical Region Tree

Wei-Bang Chen (Department of Mathematics and Computer Science, Virginia State University, Petersburg, VA, USA) and Chengcui Zhang (Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL, USA)
DOI: 10.4018/jmdem.2013070101
Inaccurate image segmentation often has a negative impact on object-based image retrieval. Researchers have attempted to alleviate this problem by using hierarchical image representation. However, these attempts suffer from the inefficiency in building the hierarchical image representation and the high computational complexity in matching two hierarchically represented images. This paper presents an innovative multiple-object retrieval framework named Multiple-Object Image Retrieval (MOIR) on the basis of hierarchical image representation. This framework concurrently performs image segmentation and hierarchical tree construction, producing a hierarchical region tree to represent the image. In addition, an efficient hierarchical region tree matching algorithm is designed for multiple-object retrieval with a reasonably low time complexity. The experimental results demonstrate the efficacy and efficiency of the proposed approach.

1. Introduction

The evolution of digital technology has shifted information storage from analogue to digital form, enabling convenient information sharing and distribution (Li et al., 2000). Since the 1980s, the digital revolution has driven an explosion of digital devices on the market, and digital imaging has matured rapidly over the past decade. As the adage suggests, “a picture is worth a thousand words”: the information embedded in an image often conveys an idea more clearly and succinctly than a substantial amount of text. The growing need to retrieve information from images has drawn researchers’ attention, and image retrieval has therefore been an extremely active research area over the past decade. The many efforts made to address this challenging problem can be classified into two categories: (1) text-based image search, and (2) content-based image retrieval (CBIR).

In most conventional text-based image search systems, such as Flickr, all images in the search scope must first be annotated. Annotations such as the file name, caption, keywords, tags, and other textual descriptions are stored in the associated metadata. A text-based database management system (DBMS) then retrieves images based on these stored annotations (Luo et al., 2003). The major problems of text-based image retrieval systems are: (1) they rely heavily on image annotations or surrounding text rather than on semantic content, and thus cannot disambiguate homonyms; (2) it is difficult to describe all of the visual content in an image precisely with a limited set of words (Luo et al., 2003), and the perception and interpretation of visual content varies from person to person.

In contrast to text-based image search, content-based image retrieval (CBIR) was introduced to cope with these issues. CBIR systems search images based on their visual content. The concept of CBIR was first introduced by Kato in 1992 to describe the automatic process of retrieving images from an image database according to visual features extracted from the images (Kato, 1992). A CBIR system views the query image and all target images in the database as collections of primitive visual features such as color, texture, shape, and spatial location. On the basis of these primitive visual features, it measures the similarity between the query image and each target image in the database, and then ranks the target images in decreasing order of their similarity to the query image (Chen et al., 2004). From this retrieval process, three fundamental components of a content-based image retrieval framework can be identified: primitive visual feature extraction, multi-dimensional indexing, and retrieval system design (Rui et al., 1997).
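The query-by-similarity loop described above can be sketched in a few lines. This is a minimal illustration, not the method of any cited system: the feature vectors (standing in for, say, color histograms) and the Euclidean distance metric are simplifying assumptions chosen for clarity.

```python
def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def rank_images(query_features, database):
    """Rank target images by increasing distance to the query,
    i.e., by decreasing similarity."""
    scored = [(name, euclidean(query_features, feats))
              for name, feats in database.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Hypothetical 3-bin color histograms for two target images.
db = {"img1": [0.2, 0.5, 0.3], "img2": [0.9, 0.05, 0.05]}
ranking = rank_images([0.25, 0.45, 0.3], db)  # "img1" ranks first
```

Real CBIR systems replace the linear scan with multi-dimensional indexing so that the database is not exhaustively compared against every query.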

Content-based image retrieval systems can be further divided into two major approaches: full image search and object-based image retrieval. Full image search retrieves images based on global visual features extracted from the whole image (Samadani et al., 1993; Pentland et al., 1994; Kelly and Cannon, 1995; Stone and Li, 1996; Wong and Po, 2004). In general, full image search is relatively simple and efficient but less human-centered, because humans find images based on high-level concepts such as objects or scenes, whose properties global visual features cannot capture.

In contrast to full image search, object-based image retrieval attempts to capture the high-level concepts embedded in images, such as objects. Performing object-based search requires extracting the objects embedded in an image. This extraction process, called image segmentation, splits an image into meaningful regions, each of which represents a constituent object.
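As a rough sketch of the kind of hierarchical region representation the abstract refers to, a segmented image can be stored as a tree in which the root is the whole image and each child node is a finer sub-region. The class and field names below are illustrative assumptions, not the authors' actual data structure.

```python
class RegionNode:
    """One segmented region; children are finer regions nested inside it."""

    def __init__(self, region_id, feature):
        self.region_id = region_id
        self.feature = feature   # e.g., the region's mean color (assumed)
        self.children = []       # sub-regions produced by finer segmentation

    def add_child(self, node):
        self.children.append(node)
        return node

    def count_regions(self):
        """Total number of regions in the subtree rooted at this node."""
        return 1 + sum(c.count_regions() for c in self.children)

# Toy example: whole image split into "sky" and "ground",
# with "sky" further split into one finer region.
root = RegionNode(0, [0.5, 0.5, 0.5])
sky = root.add_child(RegionNode(1, [0.3, 0.5, 0.9]))
root.add_child(RegionNode(2, [0.4, 0.3, 0.2]))
sky.add_child(RegionNode(3, [0.2, 0.4, 1.0]))
```

Matching two such trees then amounts to comparing nodes across levels, which is why the efficiency of the tree construction and matching steps matters for retrieval time.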
