Image Retrieval Practice and Research

JungWon Yoon (University of South Florida, USA)
Copyright: © 2015 | Pages: 10
DOI: 10.4018/978-1-4666-5888-2.ch587

Background

Image Type and Features

Visual materials have long been a form of written communication. Images can communicate information that cannot be delivered through a phonetic alphabet; however, compared with text documents, images have historically been difficult to generate, distribute, and access (Jörgensen, 2003). Recently, the development of digital technologies has enhanced the usability and accessibility of images. The use of images is pervasive: they are used for face and fingerprint identification, medical purposes, trademark or logo searching, education, art and historical research, journalistic work, entertainment, online shopping, and so on. Just as the uses of images are diverse, the types of images also vary. Enser (2008) suggested the following image taxonomy: direct pictures, which can be viewed within the normal human visible spectrum; indirect pictures, which require equipment for viewing (e.g., images used in medicine, molecular biology, archeology, etc.); hybrid pictures, which integrate text (e.g., posters or cartoons); and visual surrogates, including drawings, diagrams, maps/charts/plans, and devices. Smeulders, Worring, Santini, Gupta, and Jain (2000) categorized image domains into broad and narrow domains. In a broad domain, images are polysemic and the interpretation of an image is not unique, whereas in a narrow domain, images can be interpreted in a limited and predictable way. Among the various types of images, this chapter focuses on the indexing and retrieval of general photographic images, which belong to the direct picture category in Enser’s taxonomy and to the broad domain in Smeulders et al.’s categorization. Therefore, special types of images, such as medical images, faces, trademarks, drawings, maps, symbols, and cartoons, are beyond the scope of this chapter.

Images, especially those in Smeulders et al.’s broad domain, have unique features that make image indexing and retrieval difficult. First, an image has multi-layered meanings, and image interpretation is subjective and personal. An image may convey different meanings to different people, depending on their socio-cultural background, image need and usage purpose, discipline, and other contextual factors. Therefore, any set of index terms assigned by one or more indexers or by the creator may differ from viewers’ interpretations of that image, and this has been the main problem of text-based image retrieval. Second, visual similarity does not always match conceptual similarity. In other words, one conceptual object (e.g., glass) may have very different visual appearances, and two different conceptual objects (e.g., a starfish and the Statue of Liberty) may have similar visual appearances (Johansson, 2000). Third, some concepts do not have specific visual features, such as places (e.g., Florida), events (e.g., party), and abstract, symbolic, conceptual, and emotional concepts (e.g., poverty, celebrity, stylish). The gap between the visual features of an image and the semantic meanings that people recognize from the image is called the “semantic gap” in the Content-Based Image Retrieval (CBIR) research field. The CBIR community recognizes that semantic meanings cannot be extracted solely from visual features, because the human image retrieval process draws on contextual background (Johansson, 2000; Jörgensen, 2007).
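
The second point can be illustrated with a minimal sketch, assuming a single low-level feature (a coarse color histogram) and histogram intersection as the similarity measure. Neither choice is taken from the chapter; they stand in for whatever primitive features a CBIR system might use. The three "images" are hypothetical flat lists of RGB pixels invented for illustration, not real photographs, and the function names and bin count are likewise assumptions.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram over quantized (r, g, b) values in 0..255."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())


# Hypothetical toy data: each "image" is just a list of RGB pixels.
images = {
    "starfish": [(200, 120, 40)] * 30 + [(20, 60, 160)] * 70,   # orange shape on blue
    "statue": [(210, 110, 50)] * 25 + [(30, 50, 170)] * 75,     # warm-lit shape on blue
    "glass": [(240, 240, 240)] * 80 + [(100, 100, 100)] * 20,   # pale object on grey
}

query = color_histogram(images["starfish"])
scores = {
    name: histogram_intersection(query, color_histogram(pixels))
    for name, pixels in images.items()
}

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.2f}")
```

Under these assumptions the "statue" image scores about 0.95 against the "starfish" query, although the two depict unrelated concepts, while the "glass" image scores 0. The color feature alone carries no information about what the images mean, which is the semantic gap in miniature.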

Key Terms in this Chapter

Text-Based Image Retrieval: Image indexing and retrieval techniques that are based on textual information, such as titles, keywords, captions, and image descriptions. Also known as Concept-Based or Description-Based Image Retrieval.

Tags (Social Tagging): Keywords or descriptions that are usually provided by the creator or viewers of an information object, especially in a Web 2.0 environment. Because a group of users participates in describing information objects, social tagging is considered a collaborative indexing mechanism, and it can represent user-oriented index terms.

Content-Based Image Retrieval (CBIR): Image indexing and retrieval techniques that use image contents, that is, low-level (primitive) features of an image, such as color, shape, and texture. Queries are also provided in the form of images (sketches or image examples).

Semantic Gap: The gap between primitive (low-level) features of an image and semantic meanings that people recognize from the image.

Semantic Image Retrieval: Image retrieval approaches that focus on automatic methods of extracting semantically meaningful representations from low-level features of images. Reducing the semantic gap is a main concern of semantic image retrieval efforts.

Controlled Vocabulary: A list of terms that represent selected concepts and the semantic relations among those concepts. For subject access, synonyms, broader and narrower concepts, and related concepts are usually provided. Subject headings, thesauri, and ontologies are types of controlled vocabulary.
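
As a rough sketch of how a controlled vocabulary can support text-based image retrieval, the example below expands a query term with its synonyms and narrower terms before matching it against image keywords. The thesaurus entries, image identifiers, and function names are all hypothetical and chosen only for illustration; a real subject heading list or thesaurus would be far larger and would follow its own standard.

```python
# Hypothetical mini-thesaurus: terms and relations are illustrative only.
THESAURUS = {
    "dog": {
        "synonyms": {"canine"},
        "broader": {"animal"},
        "narrower": {"puppy"},
        "related": {"pet"},
    },
}

# Hypothetical text-based index: image identifier -> assigned keywords.
IMAGE_INDEX = {
    "img1.jpg": {"puppy", "park"},
    "img2.jpg": {"canine", "beach"},
    "img3.jpg": {"cat", "sofa"},
}


def expand_query(term, thesaurus, include=("synonyms", "narrower")):
    """Return the query term plus the terms linked to it by the chosen relations."""
    entry = thesaurus.get(term, {})
    expanded = {term}
    for relation in include:
        expanded |= entry.get(relation, set())
    return expanded


def search(term, index, thesaurus):
    """Return the images whose keywords share at least one term with the expanded query."""
    terms = expand_query(term, thesaurus)
    return sorted(image_id for image_id, keywords in index.items() if terms & keywords)


print(search("dog", IMAGE_INDEX, THESAURUS))
```

In this toy index no image carries the literal keyword "dog", so an exact-match search would return nothing, whereas the expanded query retrieves img1.jpg and img2.jpg. Expanding only with synonyms and narrower terms is a design choice made here to favor precision; including broader or related terms would retrieve more images at the cost of more loosely matching ones.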
