Efficient Face Retrieval Based on Bag of Facial Features

Yuanjia Du (Trident Microsystems Europe B.V., The Netherlands), Ling Shao (The University of Sheffield, United Kingdom), P. Archambeau (University of Liège, Belgium), S. Erpicum (University of Liège, Belgium) and M. Pirotton (University of Liège, Belgium)
Copyright: © 2011 | Pages: 15
DOI: 10.4018/978-1-61520-991-0.ch005


In this chapter, the authors present an efficient retrieval technique for human face images based on a bag of facial features. A visual vocabulary is built beforehand using an invariant descriptor computed on detected image regions. The vocabulary is refined in two ways to make the retrieval system more efficient. First, the visual vocabulary is reduced by using only facial features taken from face regions, which are located by an accurate face detector. Second, three criteria, namely Inverted-Occurrence-Frequency Weights, Average Feature Location Distance and Reliable Nearest-Neighbors, are calculated in advance to make the on-line retrieval procedure more efficient and precise. The proposed system is evaluated on the Caltech Human Face Database. The results show that the technique is both effective and efficient for face image retrieval.
Chapter Preview


The last decade has witnessed great interest in research on content-based image retrieval. This has paved the way for a large number of new techniques and systems, and for growing interest in the associated fields that support such systems. At the same time, digital imagery has expanded in many directions, resulting in an explosion in the volume of image data that must be organized. As a result, image retrieval techniques are becoming increasingly important in multimedia information systems (Datta, et al. 2005).

Early image retrieval techniques focused on the retrieval of entire images (Smeulders, et al. 2000). Given a query image, the goal was to retrieve entire scenes, or it was assumed that each image contained a single object occupying most of the image. Background clutter and partial occlusions were not explicitly handled, and intra-class variations, camera viewpoint and illumination variations were usually not explicitly modeled. Later, some approaches tried to extract ‘objects’ from images by segmenting them into regions with coherent image properties such as color or texture. However, such systems enjoyed only limited success, since the segments and their descriptions are crude (Sivic, 2006).

Recently, retrieval based on local invariant regions has emerged as a cutting-edge methodology in specific object retrieval. The first influential image retrieval algorithm using local invariant regions was introduced by Schmid and Mohr (1997). Local regions that are invariant to rotation, translation and scaling are detected around Harris corner points (Harris & Stephens, 1988). Differential greyvalue invariants are used to characterize the detected regions in a multi-scale way to ensure invariance under similarity transformations and scale changes. Semi-local constraints and a voting algorithm are then applied to reduce the number of mismatches.

Van Gool et al. (2001) described a method for finding occurrences of the same object or scene in a database using local invariant regions. Both geometry-based and intensity-based regions are employed. The geometry-based regions are extracted by first selecting Harris corner points as ‘anchor points’ and then finding nearby edges, detected by Canny’s detector (Canny, 1986), to construct invariant parallelograms. The intensity-based regions are defined around local extrema in intensity. The intensity function along rays emanating from a local extremum is evaluated, and an invariant region is constructed by linking the points on the rays where the intensity function reaches extrema. Color moment invariants introduced in (Mindru, et al. 1998) are adopted as the region descriptor for characterizing the extracted regions. A voting process is carried out by comparing the descriptor vectors of the query and test images to select the images most relevant to the query. False positives are further rejected using geometric and photometric constraints.

An image retrieval technique based on matching of distinguished regions is presented in (Obdrzalek & Matas, 2002). The distinguished regions are the Maximally Stable Extremal Regions introduced in (Matas, et al., 2002). An extremal region is a connected area of pixels that are all brighter or all darker than the pixels on the boundary of the region. Local invariant frames are then established on the detected regions by studying the properties of the covariance matrix and the bi-tangent points. Correspondences between local frames are evaluated by directly comparing the normalized image intensities. Matching between query and database images is then done based on the number and quality of established correspondences.
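
The voting idea that recurs in the methods above (compare descriptor vectors, then rank database images by how many correspondences they collect) can be illustrated with a minimal sketch. This is not the implementation of any of the cited systems: the function name `rank_by_votes`, the brute-force nearest-neighbour search, the Euclidean distance and the `max_distance` threshold are all assumptions made for illustration, and the geometric or photometric verification steps are left out.

```python
from collections import Counter
from math import dist


def rank_by_votes(query_descriptors, database, max_distance=0.5):
    """Rank database images by the number of descriptor votes they receive.

    query_descriptors: list of descriptor vectors (sequences of floats).
    database: dict mapping an image id to its list of descriptor vectors.
    max_distance: hypothetical threshold; a nearest neighbour farther
        away than this casts no vote.
    """
    votes = Counter()
    for q in query_descriptors:
        best_image, best_distance = None, float("inf")
        # Brute-force nearest neighbour over every database descriptor.
        for image_id, descriptors in database.items():
            for d in descriptors:
                distance = dist(q, d)  # Euclidean distance (assumed metric)
                if distance < best_distance:
                    best_image, best_distance = image_id, distance
        # Only sufficiently close matches count as correspondences.
        if best_image is not None and best_distance <= max_distance:
            votes[best_image] += 1
    # Most-voted images first; images that received no vote are omitted.
    return votes.most_common()


if __name__ == "__main__":
    # Toy 2-D descriptors; real descriptors have many more dimensions.
    database = {
        "face_a": [(0.0, 0.0), (1.0, 1.0)],
        "face_b": [(5.0, 5.0)],
    }
    query = [(0.1, 0.0), (0.9, 1.0), (5.2, 5.0), (9.0, 9.0)]
    print(rank_by_votes(query, database))
```

In this toy run, two query descriptors fall close to descriptors of `face_a`, one falls close to `face_b`, and the last has no neighbour within the threshold, so `face_a` is ranked first. A practical system would replace the exhaustive search with an index and would add a verification stage of the kind described above to reject false positives.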
