Robust Human Face Tracking in Eigenspace for Perceptual Human-Robot Interaction

Richard Jiang (Loughborough University, UK) and Abdul Sadka (Brunel University, UK)
Copyright: © 2011 | Pages: 13
DOI: 10.4018/978-1-60960-024-2.ch004

This chapter introduces a robust human face tracking scheme for vision-based human-robot interaction, in which face-like regions detected in the video sequence are tracked using an unscented Kalman filter (UKF), and face occlusions are handled by an online appearance-based scheme using principal component analysis (PCA). Experiments on standard test video validate that the proposed PCA-based face tracking attains robust performance in handling face occlusions.
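The occlusion-handling idea described above — an online appearance model built with PCA — can be illustrated by measuring how well a candidate face patch is explained by a learned face subspace. The following is a minimal sketch, not the chapter's actual algorithm: the function names, the fixed basis size `k`, and the use of batch SVD (rather than an incremental, online update) are illustrative assumptions.

```python
import numpy as np

def fit_pca_basis(patches, k=8):
    """Fit a PCA basis to a set of flattened face patches.

    `patches` is an (n, d) array of n vectorized patches; returns the
    mean patch and the top-k principal components as a (k, d) array.
    """
    mean = patches.mean(axis=0)
    centered = patches - mean
    # SVD of the centered data: rows of vt are principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(patch, mean, basis):
    """Project a patch onto the PCA subspace and measure the residual.

    A large residual means the patch is poorly explained by the learned
    face appearance -- a cue that the face may be occluded.
    """
    centered = patch - mean
    coeffs = basis @ centered          # subspace coordinates
    recon = basis.T @ coeffs           # back-projection into image space
    return float(np.linalg.norm(centered - recon))
```

In a tracker, a reconstruction error above a threshold can flag occluded frames, during which the appearance model is not updated and the motion filter's prediction is trusted instead.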
Chapter Preview

1. Introduction

Visual content analysis has become an active topic in machine vision research due to its use in a number of practical robotic applications (Jensen et al, 2005; Hartley & Zisserman, 2004; Bucher et al, 2003; Lang et al, 2003; Alam & Bal, 2007). Intelligent machine vision analysis can equip a machine or computer with at least two capabilities in practical applications: one is simultaneous localization and mapping (SLAM) (Hartley & Zisserman, 2004) for scene structure understanding; the other is the intelligent understanding of human activity in the visual scene for perceptual human-computer interaction. The recognition of salient human objects (Bucher et al, 2003; Lang et al, 2003; Alam & Bal, 2007) present in the visual scene thus becomes a primary task in such applications.

As for human motion tracking and analysis (Isard & Blake, 1998; Zhou et al, 2008; Vadakkepat et al, 2008; Heuring & Murray, 1999; Boheme et al, 1998; Feyrer & Zell, 1999; Hong & Jain, 1999), human head tracking is a useful technique for human-robot interaction applications, where computer or robotic systems need to detect humans in the scene, understand human intention or behavior, and adapt their responsive actions to human activity promptly. From this viewpoint, an intelligent vision-based system needs to track human motion and understand the events happening in the scene. Figure 1 shows an example of a walking robot controlled by computer, which may need the capability of smart perception in assigned tasks, such as finding the right person in the visual scene.

Figure 1.

A robot with camera eyes needs smart perception in assigned tasks, such as finding the right person’s head in the visual scene


Vision-based head tracking usually involves two tasks in its technical implementation. The first is to detect the head, a specific visual object, in a cluttered scene. The second is to track the head's motion through that scene. The combination of these two tasks makes the problem challenging, and a range of methods have been investigated.
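For the second task — following the detected head over time — the chapter adopts an unscented Kalman filter. A generic UKF predict/update cycle can be sketched as below; the constant-velocity motion model, the noise covariances, and the scaling parameters are illustrative assumptions, not the chapter's actual settings.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Generate 2n+1 sigma points and their mean/covariance weights."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    sqrt_cov = np.linalg.cholesky((n + lam) * cov)
    pts = [mean]
    for i in range(n):
        pts.append(mean + sqrt_cov[:, i])
        pts.append(mean - sqrt_cov[:, i])
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1 - alpha**2 + beta)
    return np.array(pts), wm, wc

def ukf_step(mean, cov, z, f, h, Q, R):
    """One UKF cycle: predict with motion model f, update with measurement z."""
    # Predict: push sigma points through the (possibly nonlinear) motion model.
    pts, wm, wc = sigma_points(mean, cov)
    pred = np.array([f(p) for p in pts])
    m_pred = wm @ pred
    P_pred = Q + sum(w * np.outer(p - m_pred, p - m_pred)
                     for w, p in zip(wc, pred))
    # Update: push new sigma points through the measurement model and fuse z.
    pts, wm, wc = sigma_points(m_pred, P_pred)
    zs = np.array([h(p) for p in pts])
    z_mean = wm @ zs
    S = R + sum(w * np.outer(s - z_mean, s - z_mean) for w, s in zip(wc, zs))
    Pxz = sum(w * np.outer(p - m_pred, s - z_mean)
              for w, p, s in zip(wc, pts, zs))
    K = Pxz @ np.linalg.inv(S)            # Kalman gain
    return m_pred + K @ (z - z_mean), P_pred - K @ S @ K.T
```

In a face tracker the state would typically hold the face region's position (and perhaps scale), `f` the assumed motion model, and `h` the mapping from state to the detector's measurement.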

Head tracking can be carried out by motion segmentation and tracking (Alam & Bal, 2007; Isard & Blake, 1998), where the moving object is segmented from its background. Beyond visual features alone, multimodal tracking (Zhou et al, 2008; Vadakkepat et al, 2008) with stereo audio signals from an audiovisual sensor network can also locate the position of a speaker, with the motion blob serving as inference assistance. However, not all moving objects are human. Therefore, this kind of approach requires classification of moving objects to discriminate a human face from other objects.
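The motion-segmentation step mentioned above can be sketched in its simplest form with absolute frame differencing; this is a deliberately crude illustration (the function names and the fixed threshold are assumptions, and real systems would add background modeling and morphological cleanup):

```python
import numpy as np

def motion_mask(prev_frame, frame, thresh=25):
    """Segment moving pixels by absolute frame differencing.

    Frames are 2-D uint8 grayscale arrays; returns a boolean mask that
    is True where intensity changed by more than `thresh`.
    """
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > thresh

def bounding_box(mask):
    """Return (top, left, bottom, right) of the mask's True region,
    or None if nothing moved -- a crude motion-blob locator."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())
```

The resulting blob is only a candidate region; as the paragraph notes, a classifier must still decide whether the blob contains a human face.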

Finding a human face in a scene seems extremely easy for the human visual system. It is, however, a complex problem for computer-based systems. The difficulty resides in the fact that faces are non-rigid objects. Face appearance may vary not only between two different persons but also between two photographs of the same person, depending on the lighting conditions, the emotional state of the subject, and the pose. Faces also vary noticeably with added features, such as glasses, hats, moustaches, beards and hair styles. A number of approaches have been developed for this task. Among them, the Viola-Jones approach (Viola & Jones, 2001; Meynet, 2007) has been reported to attain great success in face detection. However, to be applied in practical applications, the detected faces need to be tracked consistently, which demands robustness to challenges such as face occlusion.
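A key ingredient behind the speed of the Viola-Jones detector is the integral image (summed-area table), which lets any rectangular pixel sum — and hence any Haar-like feature — be evaluated from just four array lookups. A minimal sketch of that building block (the helper names here are illustrative, not from the original paper):

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] from four table lookups,
    the constant-time rectangle sum that makes Haar features cheap."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total
```

A two-rectangle Haar feature is then just the difference of two such `rect_sum` calls, independent of the rectangles' sizes.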
