Content-based image retrieval (CBIR) can be described as a process framework for efficiently retrieving images from a collection by similarity. The retrieval relies on extracting appropriate characteristic quantities that describe the desired contents of images. Content-based video retrieval (CBVR) emerged to treat video in a manner similar to the way CBIR treats images. Content-based visual information retrieval (CBVIR) combines CBIR and CBVR (Zhang, 2003).
With the progress of electronic equipment and computer techniques for capturing and processing visual information, a huge number of image and video records have been collected. Visual information has become a common information format and a popular element in all aspects of society. The large volume of visual data has focused active research on how to efficiently capture, store, access, process, represent, describe, query, search, and retrieve its contents. In recent years, CBVIR has experienced significant growth and progress, resulting in a virtual explosion of published information. It has attracted much interest from the image engineering, computer vision, and database communities. The current focus of CBVIR is on capturing high-level semantics, that is, so-called semantic-based visual information retrieval (SBVIR).
This article first presents some statistics on research publications on SBVIR in recent years to give an idea of its development status. It then gives an overview of several current centers of attention by summarizing results on subjects such as image and video annotation, human-computer interaction, models and tools for semantic retrieval, and miscellaneous techniques in applications. Finally, some future research directions are pointed out: domain knowledge and learning, relevance feedback and association feedback, and research at even higher levels, such as the cognitive level and the affective level.
To get a general idea of the scale and progress of research on CBVIR and SBVIR over the past years, several searches were made in the EI Compendex database (http://www.ei.org) for papers published in English from 1995 through 2005. Table 1 lists the results of two title-field searches for the numbers of papers (records) published in English: one search term is “image retrieval (IR)” and the other is “semantic image retrieval (SIR).” The papers found with the second term should be a subset of those found with the first. Both numbers show an overall increase over that period, as seen from Table 1.
Table 1. List of English records found in the title field of EI Compendex
|Search term|1995|1996|1997|1998|1999|2000|2001|2002|2003|2004|2005|Total|
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|(1) Image Retrieval|70|89|80|131|155|161|191|233|241|358|417|2126|
|(2) Semantic Image Retrieval|0|1|1|2|4|5|4|8|11|18|30|84|
|Ratio of (2) over (1) (%)|0|1.12|1.25|1.53|2.58|3.11|2.09|3.43|4.56|5.03|7.19|3.95|
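The ratio row of Table 1 can be recomputed from the two count rows. The short Python sketch below does this under the assumption that the eleven columns correspond to the years 1995 through 2005 and that the ratio is expressed as a percentage rounded to two decimals; the variable and function names are illustrative only.

```python
# Counts copied from Table 1 (title-field hits per year).
# Assumption: the eleven columns map to the years 1995 through 2005.
years = list(range(1995, 2006))
image_retrieval = [70, 89, 80, 131, 155, 161, 191, 233, 241, 358, 417]
semantic_image_retrieval = [0, 1, 1, 2, 4, 5, 4, 8, 11, 18, 30]


def ratio_percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Per-year share of "semantic image retrieval" titles among "image retrieval" titles.
ratios = [
    ratio_percent(sir, ir)
    for sir, ir in zip(semantic_image_retrieval, image_retrieval)
]

# Totals over the whole period and the overall share.
total_ir = sum(image_retrieval)
total_sir = sum(semantic_image_retrieval)
total_ratio = ratio_percent(total_sir, total_ir)

for year, ratio in zip(years, ratios):
    print(year, ratio)
print("Total", total_ir, total_sir, total_ratio)
```

Recomputing the row this way also makes it easy to see that the share of semantic titles grows over the period even though it is not strictly monotonic from year to year.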
Key Terms in this Chapter
Image Engineering: An integrated discipline/subject comprising the study of all the different branches of image and video techniques.
Content-Based Visual Information Retrieval (CBVIR): A combination of CBIR and CBVR.
Content-Based Image Retrieval (CBIR): A process framework for efficiently retrieving images from a collection by similarity. The retrieval relies on extracting the appropriate characteristic quantities describing the desired contents of images. In addition, suitable querying, matching, indexing and searching techniques are required.
MPEG-7: An international standard named “Multimedia content description interface” (ISO/IEC 15938). It provides a set of audiovisual description tools, descriptors and description schemes for effective and efficient access (search, filtering and browsing) to multimedia content.
Content-Based Video Retrieval (CBVR): A process framework for efficiently retrieving the required clips from video. The retrieval relies on the organization of video and on nonlinear search techniques.
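The CBIR definition above (extract characteristic quantities, then match by similarity) can be illustrated with a toy example. The sketch below is not the method of any particular system: it assumes images are given as lists of 8-bit RGB pixel tuples, uses a coarse color histogram as the feature, and uses histogram intersection as the similarity measure. All function names and the toy data are hypothetical, and no indexing structure is included.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize 8-bit RGB pixels into a normalized coarse color histogram.

    This plays the role of the 'characteristic quantity' describing content.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())


def retrieve(query_pixels, collection, top_k=2):
    """Rank images in `collection` (name -> pixel list) by similarity to the query."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in collection.items()
    ]
    # Highest similarity first; ties broken by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:top_k]]


# Toy collection (hypothetical data): each "image" is just ten pixels.
collection = {
    "mostly_red": [(250, 10, 10)] * 9 + [(10, 10, 250)],
    "mostly_blue": [(10, 10, 250)] * 9 + [(250, 10, 10)],
    "green": [(10, 250, 10)] * 10,
}
query = [(240, 20, 20)] * 10  # a reddish query image

results = retrieve(query, collection)
print(results)
```

A low-level color feature like this one captures no semantics, which is the gap that semantic-based retrieval tries to close.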