Multimodal Biometrics Fusion for Human Recognition in Video

Xiaoli Zhou (University of California - Riverside, USA) and Bir Bhanu (University of California - Riverside, USA)
DOI: 10.4018/978-1-60566-725-6.ch019
Abstract

This chapter introduces a new video-based recognition system that recognizes non-cooperating individuals at a distance who expose a side view to the camera. Information from two biometric sources, side face and gait, is utilized and integrated for recognition. For the side face, an enhanced side face image (ESFI), an image of higher resolution than one obtained directly from a single video frame, is constructed by integrating face information from multiple video frames. For gait, the gait energy image (GEI), a compact spatio-temporal representation of gait in video, is used to characterize human walking properties. The features of face and gait are extracted from ESFI and GEI, respectively, and are integrated at both the match score level and the feature level using different fusion strategies. The system is tested on a database of video sequences, corresponding to 45 people, collected over several months, and the performance of different fusion methods is compared and analyzed. The experimental results show that (a) constructing ESFI from multiple frames is promising for human recognition in video, and better face features are extracted from ESFI than from the original side face images; (b) synchronization of face and gait is not necessary for the face template ESFI and the gait template GEI; and (c) integrated information from side face and gait is effective for human recognition in video. The feature level fusion methods achieve better performance than the match score level fusion methods overall.
Introduction

Biometrics is the study of methods for uniquely recognizing humans based upon one or more intrinsic physical or behavioral traits, such as fingerprint, face, voice, gait, iris, signature, hand geometry and ear. The biometric trait of an individual is characterized by a set of discriminatory features or attributes, so the performance of a single biometric system is constrained by the intrinsic factors of that trait. This inherent limitation of a single biometric can be alleviated, however, by fusing the information presented by multiple sources. A system that consolidates the evidence presented by multiple biometric sources is expected to be more reliable.

It is difficult to recognize a person from arbitrary views when he or she is walking at a distance. For optimal performance, a system should use as much information as possible from the observations. Because face and gait have different inherent characteristics, a fusion system that combines face and gait cues from video sequences is a promising approach to human recognition at a distance. The general solution for analyzing face and gait video data from arbitrary views is to estimate 3-D models. However, building reliable 3-D models of the non-rigid face, with its flexible neck, and of the articulated human body from low-resolution video data remains a hard problem. This chapter addresses integrated face and gait recognition approaches that do not resort to 3-D models. Experimental results show the effectiveness of the proposed system for human recognition at a distance in video. The contributions of this chapter are as follows:

  • A system that integrates side face and gait information from video data in a single camera scenario is presented. The experimental results demonstrate the feasibility and effectiveness of the proposed system for human recognition at a distance.

  • Both the face and gait recognition systems integrate information over multiple frames of a video sequence for improved performance. To overcome the limited resolution of a face at a distance, an Enhanced Side Face Image (ESFI), an image of higher resolution than one obtained directly from a single video frame, is constructed by fusing face information from multiple video frames. Experiments show that better face features can be extracted from the constructed ESFI than from the original side face images. For gait, the Gait Energy Image (GEI), a compact spatio-temporal representation of gait in video, is used to characterize human walking properties.

  • The fusion of side face and gait biometrics is explored at both the match score level and the feature level. The match score level fusion schemes include the Sum and Max rules, and their performance is analyzed using the Q statistic. The feature level fusion schemes are implemented in two ways. In the first approach, feature concatenation is conducted directly on the face and gait features, which are obtained from ESFI and GEI, respectively, using the combined PCA and MDA method. In the second approach, MDA is applied after, not before, the concatenation of the face and gait features obtained from ESFI and GEI using PCA.

  • Various experiments are performed on 45 people with data from 100 video sequences collected over several months. Performance comparisons between different biometrics and different fusion methods are presented. The experimental results demonstrate the effectiveness of the fusion at the feature level in comparison to the match score level. Besides the recognition rates, the performance is also compared using CMC curves. They further demonstrate the strength of the proposed fusion system.
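The GEI template mentioned above is, at its core, a temporal average of size-normalized, aligned binary silhouettes over a gait cycle. The sketch below illustrates that computation; the function name and the toy frames are illustrative, and it assumes silhouette extraction and alignment have already been done, as the chapter's full preprocessing pipeline is not reproduced here.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Compute a Gait Energy Image (GEI) from size-normalized,
    horizontally aligned binary silhouette frames of one gait cycle.

    silhouettes: sequence of 2-D binary arrays (H x W), values in {0, 1}.
    Returns an H x W float array; each pixel holds the fraction of
    frames in which that pixel belonged to the walking silhouette.
    """
    frames = np.asarray(silhouettes, dtype=np.float64)
    return frames.mean(axis=0)

# Toy example: three tiny 2x2 "silhouettes" standing in for real frames.
frames = [np.array([[1, 0], [1, 1]]),
          np.array([[1, 0], [0, 1]]),
          np.array([[1, 0], [1, 1]])]
gei = gait_energy_image(frames)
# Pixels that stay on in every frame (static body parts) approach 1.0;
# pixels that toggle (moving limbs) take intermediate values.
```

Bright (near-1) regions of a GEI thus capture the static body shape, while gray regions encode limb dynamics, which is what makes it a compact spatio-temporal representation.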

The chapter is organized as follows. Section 2 reviews recent work on the integration of face and gait. Section 3 presents the overall technical approach: it introduces the construction of the Enhanced Side Face Image (ESFI) and the Gait Energy Image (GEI), describes feature extraction from ESFI and GEI, and explains the proposed schemes for fusing side face and gait at both the match score level and the feature level for human recognition. A number of dynamic video sequences are tested using the approaches presented, and the experimental results are compared and analyzed. Section 4 concludes the chapter.
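The two fusion levels described above can be sketched as follows. This is a minimal illustration, not the chapter's exact implementation: it assumes match scores are first brought to a common range with min-max normalization (a common choice; the chapter's normalization step is not specified here), and it normalizes feature vectors to unit length before concatenation. All names are hypothetical.

```python
import numpy as np

def min_max_normalize(scores):
    # Map raw match scores to [0, 1] so that scores from different
    # matchers (face vs. gait) are comparable before fusion.
    s = np.asarray(scores, dtype=np.float64)
    return (s - s.min()) / (s.max() - s.min())

def fuse_sum(face_scores, gait_scores):
    # Sum rule: add the normalized per-class scores.
    return min_max_normalize(face_scores) + min_max_normalize(gait_scores)

def fuse_max(face_scores, gait_scores):
    # Max rule: keep the larger normalized score for each class.
    return np.maximum(min_max_normalize(face_scores),
                      min_max_normalize(gait_scores))

def fuse_features(face_feat, gait_feat):
    # Feature-level fusion: scale each feature vector to unit length,
    # then concatenate into a single joint feature vector (on which a
    # discriminant transform such as MDA could then be applied).
    f = np.asarray(face_feat) / np.linalg.norm(face_feat)
    g = np.asarray(gait_feat) / np.linalg.norm(gait_feat)
    return np.concatenate([f, g])

# Toy similarity scores of a probe against 3 gallery classes.
face = np.array([0.9, 0.2, 0.5])
gait = np.array([0.4, 0.8, 0.3])
decision = int(np.argmax(fuse_sum(face, gait)))  # class chosen by Sum rule
```

In this toy case the Sum rule favors class 0, because its strong face score outweighs its middling gait score; the Max rule would instead tie classes 0 and 1, which hints at why the chapter compares the rules' performance with the Q statistic.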
