State-of-the-Art on Video-Based Face Recognition

Yan Yan (Tsinghua University, Beijing, China) and Yu-Jin Zhang (Tsinghua University, Beijing, China)
Copyright: © 2009 | Pages: 7
DOI: 10.4018/978-1-59904-849-9.ch213


Introduction

Over the past few years, face recognition has attracted considerable interest and has become a popular area of research in computer vision and pattern recognition. The problem attracts researchers from disciplines such as image processing, pattern recognition, neural networks, computer vision, and computer graphics (Zhao, Chellappa, Rosenfeld & Phillips, 2003).

Face recognition is a typical computer vision problem. The goal of computer vision is to understand images of scenes: to locate and identify objects and to determine their structures, spatial arrangements, and relationships with other objects (Shah, 2002). The main task of face recognition is to locate the people in a scene and determine their identities. Face recognition is also a challenging pattern recognition problem. The number of training samples for each face class is usually so small that it is hard to learn the distribution of each class. In addition, the within-class difference may sometimes be larger than the between-class difference because of variations in illumination, pose, expression, age, etc.
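The last point, that the within-class difference can exceed the between-class difference, can be illustrated with a toy example. The Python fragment below is only a sketch under invented assumptions: the two-dimensional feature vectors, the names `alice` and `bob`, and the Euclidean nearest-neighbour matcher are all hypothetical and are not taken from any method reviewed in this article.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_identity(probe, gallery):
    """Return the gallery identity whose stored vector is closest to the probe."""
    return min(gallery, key=lambda name: euclidean(probe, gallery[name]))


# Hypothetical features: [identity cue, illumination cue].
alice_indoor = [1.0, 0.0]
alice_outdoor = [1.0, 5.0]   # same person, strong illumination change
bob_indoor = [2.0, 0.0]      # different person, same illumination

within_class = euclidean(alice_indoor, alice_outdoor)
between_class = euclidean(alice_indoor, bob_indoor)

# The gallery holds one stored sample per person (the small-sample case).
gallery = {"alice": alice_outdoor, "bob": bob_indoor}
predicted = nearest_identity(alice_indoor, gallery)

print(within_class, between_class, predicted)
```

In this toy setting the illumination change separates two images of the same person more than identity separates two different people, so a simple distance-based matcher assigns the indoor image of `alice` to `bob`.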

The availability of feasible technologies opens up many potential applications for face recognition, such as face ID, access control, security, surveillance, smart cards, law enforcement, face databases, multimedia management, and human-computer interaction (Li & Jain, 2005).

Traditional still image-based face recognition has achieved great success in constrained environments. However, once the conditions (including illumination, pose, expression, and age) change too much, the performance declines dramatically. The recent FRVT2002 (Face Recognition Vendor Test 2002) (Phillips, Grother, Micheals, Blackburn, Tabassi & Bone, 2003) shows that recognition performance on face images captured outdoors and on different days is still unsatisfactory. Current still image-based face recognition algorithms remain far from the capability of the human perception system (Zhao, Chellappa, Rosenfeld & Phillips, 2003). On the other hand, psychology and physiology studies have shown that motion can help people recognize faces better (Knight & Johnston, 1997; O'Toole, Roark & Abdi, 2002). Torres (2004) pointed out that traditional still image-based face recognition confronts great challenges and difficulties. There are two potential ways to address them: video-based face recognition technology and multi-modal identification technology. During the past several years, many research efforts have concentrated on video-based face recognition. Compared with still image-based face recognition, work on true video-based face recognition algorithms, which use both spatial and temporal information, started only a few years ago (Zhao, Chellappa, Rosenfeld & Phillips, 2003).

This article gives an overview of most existing methods in the field of video-based face recognition and analyses their respective pros and cons. First, a general statement of the face recognition problem is given. Then, most existing methods for video-based face recognition are briefly reviewed. Finally, some future trends and conclusions are given.

Background

From a general point of view, a complete video-based face recognition system consists of a face detection module, a face tracking module, a feature extraction module, and a face recognition module. Face detection is at the bottom layer. Its task is to determine the spatial position and pose of the face(s). Face tracking is at the middle layer. It follows the continuous change of face position over time. Feature extraction is at a higher layer. Its task is to locate facial features such as the eyes and nose and to extract the related information. The face recognition module is at the top layer. It identifies or verifies the input face(s) with the help of databases. Figure 1 gives the general framework of a video-based face recognition system, with a flowchart and some examples.
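The four layers described above can be sketched as a simple processing chain. The Python fragment below is only an illustrative sketch, not the system shown in Figure 1: every name in it (`Frame`, `detect_face`, `track_face`, `extract_features`, `recognise`, `run_pipeline`) is hypothetical, each module is a placeholder that stands in for a real algorithm, and the nearest-neighbour matching against a gallery is an assumption made only to keep the example runnable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


@dataclass
class Frame:
    """Stand-in for one video frame (no real pixel data is used here)."""
    face_box: Optional[Box]      # where a detector would find the face, if at all
    appearance: List[float]      # placeholder for the face's visual content


def detect_face(frame: Frame) -> Optional[Box]:
    """Bottom layer: return the face position, or None if nothing is found."""
    return frame.face_box


def track_face(previous: Optional[Box], detected: Optional[Box]) -> Optional[Box]:
    """Middle layer: follow the face over time; keep the last box on a miss."""
    return detected if detected is not None else previous


def extract_features(frame: Frame, box: Box) -> List[float]:
    """Higher layer: placeholder for locating facial features inside the box."""
    return list(frame.appearance)


def recognise(features: List[float], gallery: Dict[str, List[float]]) -> str:
    """Top layer: return the closest identity in the stored database."""
    def squared_distance(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda name: squared_distance(features, gallery[name]))


def run_pipeline(frames: List[Frame],
                 gallery: Dict[str, List[float]]) -> List[Tuple[Box, str]]:
    """Run detection, tracking, feature extraction and recognition per frame."""
    results = []
    track: Optional[Box] = None
    for frame in frames:
        track = track_face(track, detect_face(frame))
        if track is None:
            continue  # no face has been seen yet
        features = extract_features(frame, track)
        results.append((track, recognise(features, gallery)))
    return results


gallery = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
frames = [
    Frame((10, 10, 50, 50), [0.9, 0.1]),
    Frame(None, [0.8, 0.2]),              # detector misses; tracker bridges the gap
    Frame((12, 10, 50, 50), [0.85, 0.15]),
]
for box, label in run_pipeline(frames, gallery):
    print(box, label)
```

The second frame shows why the tracking layer sits between detection and recognition in this sketch: when detection fails, the previous face position is carried forward so the higher layers still receive an input.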

Key Terms in this Chapter

Biometric Authentication: Technologies that rely on physical characteristics unique to each person to ascertain the identity of an individual.

Face Tracking: A computer technology that determines the location of the face(s) in each frame of an image sequence.

Sequential Importance Sampling: A very common particle filter algorithm that approximates probability density functions by a set of random samples with associated weights.

Face Recognition: Given still or video images of a scene, identify or verify one or more persons in the scene using a stored database of faces.

Video-Based Face Recognition: Given a video containing face(s), identify or verify one or more persons using a stored database.

Particle Filters: Also known as Sequential Monte Carlo (SMC) methods, these are sophisticated model estimation techniques based on simulation.

Face Detection: A computer technology that determines the locations and sizes of human faces in digital images.

Human Face Based Video Retrieval: A process in which video sequences are searched to find the face shots that match a query face image.
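The Sequential Importance Sampling and Particle Filters entries above can be made concrete with a minimal sketch. The Python fragment below is an illustration under stated assumptions, not an algorithm from this chapter: it tracks a single one-dimensional position, uses a random-walk motion model as the proposal and a Gaussian observation model, and all function names and parameter values (`sis_step`, `run_sis`, `motion_std`, `obs_std`, the range 0 to 10) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def sis_step(particles, weights, observation, motion_std, obs_std, rng):
    """One sequential importance sampling update.

    The proposal is the motion model itself (a random walk), so each weight
    is simply multiplied by the likelihood of the new observation.
    """
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    unnormalised = [w * gaussian_pdf(observation, p, obs_std)
                    for p, w in zip(moved, weights)]
    total = sum(unnormalised)
    return moved, [w / total for w in unnormalised]


def run_sis(observations, n_particles=500, seed=0):
    """Approximate the position density with weighted random samples."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for z in observations:
        particles, weights = sis_step(particles, weights, z,
                                      motion_std=0.1, obs_std=1.0, rng=rng)
    # The weighted mean of the samples is the point estimate of the position.
    estimate = sum(p * w for p, w in zip(particles, weights))
    return estimate, weights


estimate, weights = run_sis([5.0, 5.2, 5.1])
print(round(estimate, 2))
```

This sketch has no resampling step, so over long sequences most of the weight tends to concentrate on a few samples. Practical particle filters usually add resampling to counter this degeneracy.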
