Semantics-based retrieval is a growing trend in Content-Based Multimedia Retrieval (CBMR). Multimedia databases typically offer two kinds of clues for a query: perceptive features and semantic classes. In this chapter, we propose a novel framework for multimedia database organization and retrieval that integrates perceptive features and semantic classes. To this end, we develop a semantics supervised cluster-based index organization approach (SSCI for short): the entire data set is divided hierarchically into clusters until the objects within each cluster are not only close in the perceptive feature space but also belong to the same semantic class; an index entry is then built for each cluster. In particular, the perceptive feature vectors of a cluster are stored contiguously on disk. Furthermore, SSCI supports a relevance feedback approach in which users mark positive and negative examples at the granularity of a cluster rather than a single object. Our experiments show that the proposed framework can significantly improve the retrieval speed and precision of CBMR systems.
Advances in data capture, storage, and communication technologies have made vast amounts of multimedia data available to consumer and enterprise applications (Smeulders, 2002). The earliest method for finding data in multimedia databases was to categorize and label the data according to human semantic understanding and then retrieve them by matching keywords against the labels. Organizing a data collection by semantic classification is efficient and matches the way people habitually think. However, labeling a large data set with semantic concepts is a difficult and expensive manual task, and the labeling process is subjective, inaccurate, and incomplete. Moreover, a single class may contain too much data to browse through. Researchers therefore proposed CBMR. In a CBMR system, multimedia objects are usually represented by high-dimensional perceptive feature vectors (for example, an image is represented by a visual perceptive feature vector with some number of dimensions), and the similarity between two objects is defined by a distance function, e.g., Euclidean distance, between the corresponding perceptive feature vectors. A CBMR query is thus a similarity query, usually implemented by finding the k feature vectors most similar to the feature vector of the query example, namely k-nearest neighbor (k-NN) search. CBMR has achieved a degree of success, and a number of techniques for automatically extracting low-level perceptive features from multimedia have been developed. However, on the one hand, there are no efficient index methods for large-scale perceptive feature data represented by high-dimensional vectors. On the other hand, users of multimedia search engines are generally interested in retrieving data based on semantics, such as a video clip for “shoot events in football games”.
Yet data with related semantics may not lie close together in the perceptive feature space, and conversely, objects with similar perceptive features may come from different semantic classes. The difficulty in supporting semantics lies in the gap between perceptive features and semantic concepts, the so-called semantic gap (Smeulders, 2002). Thus, indexing multimedia data based only on perceptive features sometimes cannot provide satisfactory solutions.
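To make the similarity query and the semantic gap concrete, the following is a minimal sketch of a brute-force k-NN search under Euclidean distance. The feature vectors, the class labels, and the function name `knn` are hypothetical and chosen only for illustration; they are not data or code from this chapter.

```python
import math

def knn(query, vectors, k):
    """Return the indices of the k vectors closest to query, nearest first."""
    order = sorted(range(len(vectors)),
                   key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

# Hypothetical 2-D perceptive feature vectors with made-up semantic labels.
features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
labels = ["football", "basketball", "football", "football", "news"]

query = (0.04, 0.0)
neighbors = knn(query, features, k=3)

# The three nearest vectors mix two semantic classes, while another
# "football" object (index 3) lies far away in the feature space.
for i in neighbors:
    print(i, labels[i], round(math.dist(query, features[i]), 3))
```

In this toy data the nearest neighbors in the feature space do not all share a semantic class, and one object of the queried class is never reached, which is the situation the semantic gap describes.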
A large-scale multimedia database typically offers two kinds of clues for a query: 1) semantic classes and 2) perceptive features. Intuitively, it is reasonable to develop techniques that combine the advantages of semantic and perceptive feature indexing.
In this chapter, we propose a semantics supervised cluster-based index approach (SSCI for short) to achieve this goal. We model the relationship between the semantic classes and the perceptive feature distributions of the data set with a Gaussian mixture model (GMM). The SSCI method proceeds as follows: the entire data set is divided hierarchically by a modified clustering technique into clusters until the objects within each cluster are not only close in the perceptive feature space but also belong to the same semantic class; such a cluster is called an index cluster. In particular, the perceptive feature vectors of an index cluster are stored contiguously on disk. An index entry (cluster index), comprising a semantic clue and a perceptive feature clue, is then built for each index cluster.
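The hierarchical division described above can be sketched as follows. This is only an illustration under stated assumptions, not the chapter's modified clustering technique: a plain 2-means split stands in for it, a centroid and radius stand in for the GMM-based perceptive feature clue, and the names `build_index`, `split_in_two`, and `max_radius` are hypothetical.

```python
import math

def centroid(items):
    vectors = [v for v, _ in items]
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))

def split_in_two(items, iters=10):
    """Plain 2-means split (a stand-in, not the chapter's clustering method)."""
    c0 = items[0][0]
    c1 = max(items, key=lambda it: math.dist(it[0], c0))[0]  # farthest point
    groups = None
    for _ in range(iters):
        g0, g1 = [], []
        for it in items:
            near_c0 = math.dist(it[0], c0) <= math.dist(it[0], c1)
            (g0 if near_c0 else g1).append(it)
        if not g0 or not g1:
            break
        groups = (g0, g1)
        c0, c1 = centroid(g0), centroid(g1)
    return groups  # None when all vectors coincide

def build_index(items, max_radius):
    """Split recursively until each cluster is compact and semantically pure."""
    c = centroid(items)
    radius = max(math.dist(v, c) for v, _ in items)
    classes = {label for _, label in items}
    if len(classes) == 1 and radius <= max_radius:
        # Index entry: perceptive clue (centroid, radius) plus semantic clue.
        return [{"centroid": c, "radius": radius,
                 "label": items[0][1], "members": items}]
    groups = split_in_two(items)
    if groups is None:  # identical vectors, different labels: split by label
        first = items[0][1]
        groups = ([it for it in items if it[1] == first],
                  [it for it in items if it[1] != first])
    return [entry for g in groups for entry in build_index(g, max_radius)]

# Hypothetical (feature vector, semantic class) pairs.
data = [((0.0, 0.0), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "B"),
        ((5.0, 5.0), "B"), ((5.2, 5.1), "B")]
index = build_index(data, max_radius=1.0)
print([(e["label"], len(e["members"])) for e in index])
```

Keeping the members of each entry in one list mirrors, in memory, the idea of storing the feature vectors of an index cluster contiguously on disk.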
Based on SSCI, we develop an approximate nearest neighbor (NN) search technique that consists of two phases: the first phase computes the distance between the query example and each cluster index and returns the clusters with the smallest distances, the so-called candidate clusters; the second phase then retrieves the original feature vectors within the candidate clusters to obtain the approximate nearest neighbors. The main characteristic of our technique is that it markedly improves both the speed and the semantic precision of CBMR.
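A minimal sketch of the two-phase search is given below. It assumes that the distance between a query and a cluster index is the Euclidean distance to the cluster centroid, which is a simplification made here for illustration; the index contents, the function name `search`, and the parameter `n_candidates` are likewise hypothetical.

```python
import math

# Hypothetical cluster index: each entry keeps a centroid (perceptive clue),
# a class label (semantic clue), and its member vectors stored together.
index = [
    {"centroid": (0.1, 0.0), "label": "A",
     "members": [(0.0, 0.0), (0.2, 0.0)]},
    {"centroid": (0.1, 0.3), "label": "B",
     "members": [(0.1, 0.3)]},
    {"centroid": (5.1, 5.05), "label": "B",
     "members": [(5.0, 5.0), (5.2, 5.1)]},
]

def search(query, index, k, n_candidates):
    # Phase 1: rank clusters by query-to-centroid distance and keep the
    # closest ones as candidate clusters.
    candidates = sorted(
        index, key=lambda c: math.dist(query, c["centroid"]))[:n_candidates]
    # Phase 2: scan only the original vectors inside the candidate clusters.
    pool = [(math.dist(query, v), v, c["label"])
            for c in candidates for v in c["members"]]
    pool.sort(key=lambda t: t[0])
    return pool[:k]

result = search((0.15, 0.05), index, k=2, n_candidates=2)
for d, v, label in result:
    print(v, label, round(d, 3))
```

The result is approximate because vectors in clusters that were not selected as candidates are never examined; raising `n_candidates` trades speed for a better chance of finding the true nearest neighbors.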