In the past decade, there has been rapid growth in the use of digital media, such as images, video, and audio. As the use of digital media increases, retrieval and management techniques become more important for effective searching and browsing of large multimedia databases. Before the emergence of content-based retrieval, media was annotated with text, allowing it to be accessed by text-based searching. Through textual description, media is managed and retrieved based on the classification of subject or semantics. This hierarchical structure, like the yellow pages, allows users to navigate and browse easily, or to search using standard Boolean queries. However, with the emergence of massive multimedia databases, traditional text-based search suffers from the following limitations (Wei, Li, & Wilson, 2006):
Manual annotations require too much time and are expensive to implement. As the number of media in a database grows, the difficulty of finding desired information increases. It becomes infeasible to manually annotate all attributes of the media content. Annotating a 60-minute video, which contains more than 100,000 images, consumes a vast amount of time and expense.
Manual annotations fail to deal with the discrepancy of subjective perception. The phrase "an image says more than a thousand words" implies that a textual description is insufficient for depicting subjective perception. Capturing all the concepts, thoughts, and feelings associated with the content of any media is almost impossible.
Some media contents are difficult to describe concretely in words. For example, a melody without lyrics or an irregular organic shape cannot easily be expressed in textual form, yet people expect to search for media with similar contents based on examples they provide.
In an attempt to overcome these difficulties, content-based retrieval employs content information to automatically index data with minimal human intervention.
Content-based retrieval has been proposed by different communities for various applications. These applications include:
Medical diagnosis: The medical community is currently developing picture archiving and communication systems (PACS), which integrate imaging modalities and interfaces with hospital and departmental information systems in order to manage the storage and distribution of images to radiologists, physicians, specialists, clinics, and imaging centers. A crucial requirement in PACS is an efficient search function for accessing desired images. Because images with similar pathology-bearing regions can be found and interpreted, they can be used to aid diagnosis through image-based reasoning. For example, Wei and Li (2006) proposed a content-based retrieval system for locating mammograms with similar pathological characteristics.
Intellectual property: Trademark image registration has applied content-based retrieval techniques to compare a new candidate mark with existing marks to ensure that there is no repetition. Copyright protection can also benefit from content-based retrieval as copyright owners are able to search and identify unauthorized copies of images on the Internet. For example, Jiang, Ngoa, and Tana (2006) developed a content-based system using adaptive selection of visual features for trademark image retrieval.
Broadcasting archives: Every day, broadcasting companies produce large volumes of audio-visual data. To deal with these large archives, which can contain millions of hours of video and audio data, content-based retrieval techniques are used to annotate their contents and to summarize the audio-visual data, drastically reducing the volume of raw footage. For example, Lopez and Chen (2006) developed a content-based video retrieval system to support news and sports retrieval.
Multimedia searching on the Internet: Although a large amount of multimedia has been made available on the Internet for retrieval, existing search engines mainly perform text-based retrieval. To access the various media on the Internet, content-based search engines can help users find the media whose contents are most similar to their queries. For example, Khan (2007) designed a framework for image annotation and used ontology to enable content-based image retrieval on the Internet.
Key Terms in this Chapter
Query by Example: A method of searching a database using example media as search criteria. This mode allows users to select predefined examples without requiring them to learn a query language.
Boolean Query: A query that uses Boolean operators (AND, OR, and NOT) to formulate a complex condition. A Boolean query example can be “university” OR “college.”
Feature Extraction: A subject of multimedia processing which involves applying algorithms to calculate and extract some attributes for describing the media.
Content-Based Retrieval: An application that directly makes use of the contents of media, rather than annotations entered by humans, to locate the desired data in large databases.
Relevance Feedback: A technique that requires users to identify positive results by labeling those which are relevant to the query, and subsequently analyzes the user’s feedback using a learning algorithm.
Semantic Gap: The difference between the high-level user perception of the data and the lower-level representation of the data used by computers. As high-level user perception involves semantics which cannot be directly translated into logic context, bridging the semantic gap is considered a challenging research problem.
Similarity Measure: A measure that compares the similarity of any two objects represented in the multidimensional space. The general approach is to represent the data features as multidimensional points and then to calculate the distances between the corresponding multidimensional points.
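To make several of the key terms above concrete, the following is a minimal sketch in Python of how feature extraction, a similarity measure, and query by example might fit together. It is an illustration under stated assumptions, not the method of any system cited in this chapter: the feature is a coarse intensity histogram, the similarity measure is Euclidean distance, and the function names (gray_histogram, euclidean, query_by_example) and the toy pixel data are hypothetical.

```python
import math


def gray_histogram(pixels, bins=4, max_value=256):
    """Feature extraction (assumed feature): a normalized intensity histogram.

    `pixels` is a flat, non-empty list of integers in [0, max_value).
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for p in pixels:
        # Map each intensity to one of `bins` equal-width buckets.
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def euclidean(a, b):
    """Similarity measure: distance between two feature points.

    A smaller distance means the two media items are more similar.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(example_pixels, database, top_k=2):
    """Query by example: rank stored items by distance to the example."""
    query = gray_histogram(example_pixels)
    scored = [
        (euclidean(query, gray_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort()  # closest (most similar) items first
    return [name for _, name in scored[:top_k]]


# Hypothetical toy "images", each given as a short list of pixel intensities.
database = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 100, 180, 250],
}

# The user supplies a dark example image instead of a textual query.
print(query_by_example([5, 15, 25, 60], database))  # ['dark', 'mixed']
```

In this sketch the user never writes a query expression; the example image is converted into a point in a four-dimensional feature space and compared with every stored item. A practical system would typically use richer features (color, texture, shape) and an index structure rather than a linear scan.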