1. Introduction
With the rapidly growing number of digital images found on the Internet and housed in digital libraries, the need for effective and efficient tools to manage large image databases has grown dramatically. Specifically, the development of efficient image retrieval systems to find images of interest in this haystack of data has become an active research area in recent years (Thomee, 2010).
Content-based image retrieval (CBIR) techniques (Datta et al., 2008; Lew et al., 2006) are viable solutions for finding desired images in multimedia databases and have evolved significantly since the early 1990s. They use low-level visual image features (e.g., color, texture, and shape) instead of keywords to represent images, where each feature can be automatically and consistently extracted without human intervention. Consequently, they overcome the limitations of text-based image retrieval, which include the large amount of manual labor required to annotate each image in the database and the inconsistency among different annotators in perceiving the same image. However, because the ranking of retrieved images is computed from the selected image features, the retrieval accuracy may be unsatisfactory due to the semantic gap between low-level visual features and high-level semantic concepts. This semantic gap exists because objects of the same type do not share the same visual representation: images of similar semantic content may be scattered far from each other in the feature space, while images of dissimilar semantic content may share similar low-level features.

To bridge the semantic gap, a great deal of research has focused on developing effective relevance feedback (RF) techniques (Liu et al., 2007; Zhou & Huang, 2003), which exploit user interaction to learn better representations of images as well as of the query concept. RF, as an interactive search technique, has been used in CBIR systems to repeatedly modify the query's descriptive information (features, matching models, metrics, or any meta-knowledge) in response to the user's feedback on the retrieved results. It thereby moves the query closer to its optimum and returns more user-desired images (i.e., improves retrieval precision) after each round.
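The query-update loop described above can be illustrated with one classic instantiation, Rocchio-style query-point movement, in which the query vector is shifted toward the centroid of images marked relevant and away from the centroid of those marked non-relevant. This is a minimal sketch, not the specific RF scheme of any system cited here; the function name and the weights `alpha`, `beta`, and `gamma` are illustrative assumptions.

```python
def rocchio_update(query, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """One round of Rocchio-style relevance feedback.

    query        : current query feature vector (list of floats)
    relevant     : feature vectors of images the user marked relevant
    nonrelevant  : feature vectors of images the user marked non-relevant
    alpha/beta/gamma are illustrative default weights (an assumption,
    not values from the article).
    """
    dim = len(query)

    def centroid(vectors):
        # Mean vector of the feedback set; zero vector if the set is empty.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    # Keep part of the old query, pull toward relevant images,
    # push away from non-relevant ones.
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
            for i in range(dim)]
```

In an interactive session, this update would be applied once per feedback round, with the returned vector used as the query for the next retrieval pass.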