1. Introduction
The growth of medical image databases has accelerated in the past few years. In the medical field, digital images such as computed tomography (CT), magnetic resonance imaging (MRI), ultrasound, nuclear medicine imaging, endoscopy, and microscopy, used for diagnosis or therapy, are produced in medical centers at an ever-increasing rate and have resulted in large volumes of data (Smeulders, Worring, Santini, Gupta, & Jain, 2000).
To deal with these data, it is necessary to develop appropriate information systems that efficiently manage such collections. Image searching is one of the most important services these systems need to support. In general, two different approaches have been applied to searching image collections: one based on textual image metadata and the other based on image content information.
The first retrieval approach attaches textual metadata to each image and uses traditional techniques to retrieve images by keywords (Ogle & Stonebraker, 1995; Lieberman, Rosenzweig, & Singh, 2001). However, these systems require prior annotation of the database images, a laborious and time-consuming task. Furthermore, the annotation process is usually inconsistent because users generally do not annotate in a systematic way: different users tend to use different words to describe the same image characteristic. This lack of systematization in the annotation process degrades the performance of keyword-based image search.
These shortcomings have been addressed by so-called Content-Based Image Retrieval (CBIR) systems (Smeulders, Worring, Santini, Gupta, & Jain, 2000; Flickner et al., 1995; Rui, Huang, & Chang, 1999), which were introduced in the early 1990s (Muller, Michoux, Bandon, & Geissbuhler, 2004). In these systems, image processing algorithms extract feature vectors that represent image properties such as color, texture, and shape, making it possible to retrieve images similar to one chosen by the user (query-by-example). One of the main advantages of this approach is that the retrieval process can be fully automatic, in contrast to the effort needed to annotate images. Generally speaking, CBIR aims at developing techniques that support effective searching and browsing of large digital image libraries on the basis of automatically derived image features (Chen, Wang, & Krovetz, 2003).
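The query-by-example idea above can be sketched in a few lines. The following is a minimal illustration, not a production CBIR system: it assumes images are given as NumPy arrays, uses a simple per-channel color histogram as the feature vector, and ranks database images by Euclidean distance to the query's vector. The function names (`color_histogram`, `query_by_example`) are hypothetical.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Extract a simple color feature vector: a per-channel
    intensity histogram, normalized to sum to 1."""
    features = []
    for channel in range(image.shape[-1]):
        hist, _ = np.histogram(image[..., channel],
                               bins=bins, range=(0, 256))
        features.append(hist)
    vec = np.concatenate(features).astype(float)
    return vec / vec.sum()

def query_by_example(query_img, database_imgs):
    """Rank database images by feature-vector distance to the query."""
    q = color_histogram(query_img)
    dists = [np.linalg.norm(q - color_histogram(img))
             for img in database_imgs]
    return np.argsort(dists)  # indices, most similar first

# Tiny synthetic "database" of random RGB images.
rng = np.random.default_rng(0)
imgs = [rng.integers(0, 256, (32, 32, 3)) for _ in range(5)]
ranking = query_by_example(imgs[0], imgs)
print(ranking[0])  # the query itself ranks first (distance 0)
```

Real systems replace the color histogram with richer texture and shape descriptors, but the pipeline (extract features, compare vectors, rank by distance) is the same.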
Images are particularly complex to manage. Besides the storage volume they occupy, retrieval is an application- and context-dependent task (Rui, Huang, & Chang, 1999): it requires translating high-level user perceptions into low-level image features (the so-called "semantic gap" problem). Moreover, image indexing is not just a matter of string processing, as it is in standard textual databases. To index visual features, it is common to assign numerical values to the n features and then represent the image or object as a point in an n-dimensional space (Aslandogan & Yu, 1999). Multi-dimensional indexing techniques (Gaede & Gunther, 1998; Bohm, Berchtold, & Keim, 2001) and suitable similarity metrics (Weber, Schek, & Blott, 1998) must therefore be taken into account. In this context, the main challenges are the specification of indexing structures that speed up image retrieval and the specification of queries as a whole. Furthermore, query processing also depends on cognitive aspects of visual interpretation. Several other problems, such as query languages and data mining, continue to attract computer scientists to this area.
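The point-in-n-dimensional-space representation described above is what multi-dimensional indexing structures operate on. As a small sketch (assuming a database already reduced to feature vectors; the sizes and data here are synthetic), a k-d tree can answer nearest-neighbor queries without a linear scan over the whole collection:

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical database of 1000 images, each reduced to a
# 16-dimensional feature vector (a point in n-dimensional space).
rng = np.random.default_rng(42)
features = rng.random((1000, 16))

# Build a multi-dimensional index (here a k-d tree) over the points.
index = cKDTree(features)

# Query-by-example with an image already in the database:
# retrieve its 3 nearest neighbors under the Euclidean metric.
query = features[7]
dist, idx = index.query(query, k=3)
print(idx[0], dist[0])  # the query itself, at distance 0.0
```

Note that tree-based indexes like this degrade toward linear-scan performance as dimensionality grows, which is precisely why specialized high-dimensional indexing techniques such as those surveyed by Bohm, Berchtold, and Keim (2001) are needed.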