Revisiting the Feature and Content Gap for Landmark-Based and Image-to-Image Retrieval in Medical CBIR


Hayit Greenspan (Tel-Aviv University, Israel)
Copyright: © 2011 | Pages: 21
DOI: 10.4018/978-1-60960-780-7.ch005


Medical image content-based retrieval entails several possible scenarios. One scenario relates to retrieval based on image landmarks. In this scenario, quantitative image primitives are extracted from the image content in an extensive pre-processing phase, after which these quantities serve as metadata in the archive for any future search. A second scenario is one in which image-to-image matching is desired. In this scenario, the query input is an image or part of an image, and the search is conducted by a comparison at the image level. In this paper we review both retrieval scenarios via example systems developed in recent years in our lab. An example of image landmark retrieval for cervix cancer research is described, based on a joint collaboration with the National Cancer Institute (NCI) and the National Library of Medicine (NLM) at NIH. The goal of the system is to facilitate training and research via a large archive of uterine cervix images.
Chapter Preview

Defining The Image Retrieval Task

Medical content-based image retrieval (CBIR) deals with retrieving visual information from medical images. For an extended overview of the role of CBIR within medical information retrieval and health systems such as Picture Archiving and Communication Systems (PACS), see Smeulders, Worring, Santini, Gupta, and Jain (2000); Müller, Michoux, Bandon, and Geissbühler (2004); and Lehmann, Antani, and Long (2004). In this paper we focus on visual retrieval tasks.

An important retrieval scenario is one in which quantitative parameters are of interest. Examples include: “Retrieve all images from the image archive that contain more than 10% of a certain tissue category” or “Retrieve all images that have a stenosis above 70%.” For such retrieval objectives, the image content must be analyzed and processed to extract the quantitative metadata of interest. Once the quantitative parameters are extracted, the indexing and search tasks are closely related to text search and are immediate. Image landmarks are high-level features that comprise the image content. Such features translate easily into text-based indices for high-level search, and the query for them is likewise text-based, comprised of a descriptor or a quantity of interest.

The main challenge in image-landmark based retrieval is therefore not the search, but the indexing of the image content and its storage as metadata alongside the image data. Where manual indexing is possible, it requires an extended amount of time and is therefore extremely expensive in both manpower and cost. For automated indexing schemes, the challenge is to automatically detect, segment, and quantify the image content into an a priori defined set of landmarks. Tools need to be developed that deal with each specific landmark.
In the domain of medical images, incorporating experts’ explicit domain knowledge as a priori information may facilitate the task by providing anatomical constraints on spatial layouts, sizes, and more.
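The key point above is that once quantitative landmarks are stored as metadata, a landmark-based query reduces to a simple numeric filter, much like text search. A minimal sketch in Python, where the archive records and field names (lesion fraction, stenosis) are purely illustrative and not from any specific system:

```python
# Hypothetical sketch: the archive holds per-image metadata produced by an
# (unspecified) pre-processing phase; a landmark query is just a filter.

def query_archive(archive, field, threshold):
    """Return IDs of images whose metadata value for `field` exceeds `threshold`."""
    return [record["id"] for record in archive if record.get(field, 0.0) > threshold]

# Toy archive with illustrative quantitative landmarks per image.
archive = [
    {"id": "img001", "lesion_fraction": 0.15, "stenosis": 0.40},
    {"id": "img002", "lesion_fraction": 0.05, "stenosis": 0.80},
    {"id": "img003", "lesion_fraction": 0.22, "stenosis": 0.65},
]

# "Retrieve all images that contain more than 10% of a certain tissue category"
print(query_archive(archive, "lesion_fraction", 0.10))  # ['img001', 'img003']

# "Retrieve all images that have a stenosis above 70%"
print(query_archive(archive, "stenosis", 0.70))  # ['img002']
```

The search itself is trivial; as the text notes, the expensive part is the detection, segmentation, and quantification that populate these metadata fields in the first place.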

In a second retrieval scenario, the query is an image, and the task is to find similar-looking images in the database. This retrieval task is often termed “image-to-image” (or, alternatively, “query-by-example”) retrieval. The output of the task is an ordered set of images, ranked by similarity to the input image. In an image comparison task, two key components need to be addressed: the feature representation of the image and the similarity measure used for ordering the images. Feature extraction may be pursued at several levels of automation, from fully manual extraction to complete automation (Deserno, Antani, & Long, 2008). A manual process may be labor-intensive and error-prone, yet systems that enable manual intervention may in fact extract more of the high-level image characteristics and landmarks that are a key part of landmark-based retrieval.

Fully automated systems need to operate at several levels of granularity in the image representation, from the global to the more localized. This is defined in Deserno et al. (2008) as the structure gap, also termed the feature gap. Global parameters that describe an entire image, such as grayscale histograms, are often insufficient for medical applications. More localized features can be extracted for individual regions of interest (ROIs), for example, color and texture measures computed from specific tissue regions. Additional features may be needed to capture spatial information, from individual pixel coordinates to relational features that define the layout of several regions or objects within the image.
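The two components named above (feature representation and similarity measure) can be sketched at the coarsest feature level the text mentions, a global grayscale histogram. This is a minimal illustration, not any particular system’s method; the toy pixel lists stand in for real image data:

```python
# Hypothetical query-by-example sketch: represent each image by a global
# grayscale histogram and rank database images by histogram distance.

def histogram(pixels, bins=8, max_val=256):
    """Normalized grayscale histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_val, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def l1_distance(h1, h2):
    """L1 (city-block) distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def retrieve(query_pixels, database):
    """Return database image IDs ordered by increasing distance to the query."""
    q = histogram(query_pixels)
    ranked = sorted(database.items(),
                    key=lambda item: l1_distance(q, histogram(item[1])))
    return [img_id for img_id, _ in ranked]

# Toy "images": flat lists of grayscale values.
database = {
    "dark":   [10, 20, 15, 30, 25, 12],
    "bright": [220, 240, 200, 250, 230, 210],
    "mixed":  [10, 240, 20, 230, 15, 220],
}
print(retrieve([18, 22, 14, 28, 20, 16], database))  # ['dark', 'mixed', 'bright']
```

Because the histogram discards all spatial layout, two very different images can share a histogram, which is exactly why the text argues that global features alone are often insufficient and that localized ROI and relational features are needed.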
