Performing face recognition under extreme poses and lighting conditions remains a challenging task for current state-of-the-art biometric algorithms. The recognition task is even more challenging when insufficient training data is available in the gallery, or when the gallery dataset originates from one side of the face while the probe dataset originates from the other. The authors present a new method for computing the distance between two biometric signatures acquired under such challenging conditions. This method improves upon an existing Semi-Coupled Dictionary Learning method by computing a jointly-optimized solution that incorporates the reconstruction cost, the discrimination cost, and the semi-coupling cost. The use of a semi-coupling term allows the method to handle partial 3D face meshes where, for example, only the left side of the face is available for the gallery and only the right side of the face is available for the probe. The method also extends to 2D signatures under varying pose and lighting by using 3D signatures as a coupling term. The experiments show that this method improves the recognition performance of existing state-of-the-art wavelet signatures used in 3D face recognition and provides excellent recognition results in the 3D-2D face recognition application.
1. Introduction
With the proliferation of newer sensors capable of capturing higher quality data, the problem of face recognition remains ever interesting for its wide applicability to numerous real-life applications (e.g., law enforcement, aid distribution, health care administration, personnel management, video game augmentation, virtual personal assistance). While the problem of face recognition from a frontal view without occlusions under various lighting conditions has been extensively investigated, few have addressed the problem by using a bridging three-dimensional (3D) representation of the data.
Three-dimensional data of the face is an empowering medium. The Face Recognition Vendor Test (FRVT) 2006 (Phillips et al., 2007) demonstrated the usefulness of 3D data in the task of recognition. In this large-scale test, algorithms using 3D data performed as well as algorithms using very high-resolution 2D data in the task of face recognition. However, algorithms using 3D shape data have an inherent advantage in dealing with variations in pose and illumination. Although the cost of 3D sensors is still high, the success of household entertainment devices capable of producing depth information (e.g., Microsoft Kinect) suggests that 3D sensors will be widely available in the near future. Thus, it is especially beneficial to take advantage of 3D data in addressing current recognition challenges involving either 3D or 2D data.
One of these challenges is incomplete or partial 3D data due to occlusion. For example, when the subject in front of the camera looks sideways, the sensor can capture only part of the face due to the face's convexity. Because there is a significant visual difference between a partial face and a complete face, the ability to recognize subjects from only a partial face is particularly desirable.
Existing methods address this challenge by taking advantage of the intrinsic symmetry of the face (Passalis & Perakis, 2011), splitting the face into multiple parts (Li, Imai, & Kaneko, 2010), or learning the variation of the face through robust descriptors (Liao & Jain, 2011). These methods, however, suffer from specific limitations: an exhaustive weight-search step needed to reduce the size of the signature (Passalis & Perakis, 2011), applicability restricted to frontal faces (Li et al., 2010), or a descriptor dictionary optimized neither for the task of recognition nor for reconstruction of the original descriptors (Liao & Jain, 2011).
Another of these challenges is to handle 2D face recognition under strong changes in both pose and lighting conditions. These variables result in great variations in the observed data (Figure 13) and make the recognition task particularly challenging. Previous works have addressed this problem by fitting appearance models, by using pose- and illumination-robust descriptors, or by hallucinating 3D views of the 2D subject.
Figure 13. Data available per subject in UHDB11. Each column is one lighting condition. Each row is one subject's pose.
In contrast to existing methods, our approach tackles both the problem of face recognition using partial 3D data and the problem of 2D face recognition under pose and lighting variations by learning a dictionary that jointly optimizes the reconstruction cost, the discrimination cost, and the semi-coupling cost. The semi-coupling cost encourages the encodings produced by dictionaries learned for different partial and 2D faces to be linearly mappable to the encodings of 3D frontal faces. Our contributions include the following:
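To make the semi-coupling idea concrete, the sketch below illustrates only the mapping step in isolation: given sparse codes A for partial or 2D face signatures and codes B for frontal 3D signatures, a linear mapping W is fit by minimizing the Frobenius-norm coupling term ||B - WA||^2. All names, dimensions, and the synthetic data are illustrative assumptions, not the paper's notation or implementation; in the full method this step would alternate with sparse coding and dictionary updates.

```python
import numpy as np

# Illustrative sketch (not the paper's code): the semi-coupling term
# encourages codes B (frontal 3D faces) to be linearly predictable from
# codes A (partial/2D faces) via a mapping W, i.e. min_W ||B - W A||_F^2.

rng = np.random.default_rng(0)
k, n = 8, 50                              # code dimension, number of training pairs
A = rng.standard_normal((k, n))           # codes of partial-face signatures (synthetic)
W_true = rng.standard_normal((k, k))      # ground-truth mapping used to generate B
B = W_true @ A                            # codes of frontal 3D signatures (synthetic)

# Closed-form least-squares solution of the mapping step:
#   W = B A^T (A A^T)^{-1}
W = B @ A.T @ np.linalg.inv(A @ A.T)

residual = np.linalg.norm(B - W @ A)
print(residual)
```

Because A has full row rank in this noiseless synthetic setup, the least-squares step recovers the mapping exactly; with real signatures the residual would be nonzero and the term would simply be one part of the joint objective alongside reconstruction and discrimination.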