Facial Expression Recognition

Raymond Ptucha (Rochester Institute of Technology, USA) and Andreas Savakis (Rochester Institute of Technology, USA)
Copyright: © 2015 | Pages: 12
DOI: 10.4018/978-1-4666-5888-2.ch051

Chapter Preview


Background

The study of the six universal expressions, i.e., fear, sadness, happiness, anger, disgust, and surprise, has made great strides in recent years, progressing from constrained, frontal, posed faces to unconstrained faces in natural conditions (Pantic & Rothkrantz, 2000; Shuai-Shi, Yan-Tao, & Dong, 2009b; Zhihong, Pantic, Roisman, & Huang, 2009). Figure 1 shows the basic steps of a facial expression recognition system. Face detection is often accomplished with the Viola-Jones approach because of its low computational requirements and high detection rates (Viola & Jones, 2001); more recent face detection methods further improve accuracy, efficiency, and robustness (Zhu & Ramanan, 2012). Following face detection, faces are normalized to a reference shape and size. Typically, the eye and mouth corners are localized, and an affine warp to a canonical frontal face is defined.

Figure 1.

Flowchart of fundamental operations used for facial expression recognition
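The normalization step described above can be sketched in a few lines of NumPy: given three localized landmarks (e.g., the two eye corners and the mouth center), the 2×3 affine matrix mapping them to canonical frontal positions is found by least squares. All coordinate values below are illustrative assumptions, not values from the chapter.

```python
import numpy as np

# Hypothetical landmark positions (pixels) detected in an input face:
# left eye corner, right eye corner, mouth center.
src = np.array([[36.0, 45.0], [78.0, 43.0], [57.0, 95.0]])

# Assumed canonical positions of the same landmarks in a normalized crop.
dst = np.array([[38.0, 40.0], [74.0, 40.0], [56.0, 90.0]])

def estimate_affine(src_pts, dst_pts):
    """Solve for the 2x3 affine matrix A mapping src_pts -> dst_pts
    via least squares on the linear system [x y 1] @ A.T = [x' y']."""
    n = src_pts.shape[0]
    X = np.hstack([src_pts, np.ones((n, 1))])        # n x 3 homogeneous coords
    A, *_ = np.linalg.lstsq(X, dst_pts, rcond=None)  # 3 x 2 solution
    return A.T                                       # 2 x 3 affine matrix

A = estimate_affine(src, dst)
# Applying A to the source landmarks recovers the canonical positions,
# and the same warp is applied to every pixel of the face crop.
warped = np.hstack([src, np.ones((3, 1))]) @ A.T
```

With exactly three non-collinear landmarks the affine fit is exact; with more landmarks the least-squares solution gives the best-fitting warp.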

Facial expression methods can be broadly categorized as geometric or appearance-based (Fasel & Luettin, 2003; Shuai-Shi, Yan-Tao, & Dong, 2009a). Geometric methods localize facial landmarks such as the outlines of the eyes, lips, and nose (Martin, Werner, & Gross, 2008; Yeongjae & Daijin, 2009). Appearance-based methods work holistically with facial pixels, enabling the capture of facial-muscle subtleties such as nose wrinkles or dimple formation (Shan, Gong, & McOwan, 2009).
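As a concrete illustration of an appearance-based descriptor, the sketch below computes a basic 8-neighbor Local Binary Pattern (LBP) histogram over a grayscale face crop. LBP-style texture features are used in the work of Shan et al. (2009), though the specific variant, neighborhood, and binning here are simplified assumptions for illustration.

```python
import numpy as np

def lbp_histogram(gray):
    """Basic 8-neighbor LBP: each pixel is encoded by comparing its 8
    neighbors against it, and the codes are pooled into a normalized
    256-bin histogram (a simple appearance-based descriptor)."""
    # Neighbor offsets, clockwise from the top-left pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    center = gray[1:-1, 1:-1]
    code = np.zeros_like(center, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= center).astype(np.uint8) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

In practice the face is usually divided into a grid of cells, with one histogram per cell concatenated into the final feature vector, so that the descriptor retains spatial information.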

Geometric methods require computing the size, shape, and location of key facial features such as the eyes, mouth, and eyebrows. The Active Shape Model (ASM) and the Active Appearance Model (AAM) are two of the most popular facial landmark localization methods (Cootes, Edwards, & Taylor, 2001). Given enough training data and accurate facial landmark localization, shape models perform very well for expression classification (Martin et al., 2008; Yeongjae & Daijin, 2009).
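A minimal sketch of geometric expression classification, under strong simplifying assumptions: the localized landmarks are flattened into a shape vector and assigned to the nearest class centroid. All landmark values below are synthetic placeholders; real systems use many classes and richer classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 68                                  # landmarks per face (a typical count)
neutral = rng.normal(0.0, 1.0, 2 * K)   # synthetic "neutral" mean shape
smile = neutral + 0.5                   # synthetic "happy" mean shape

# Synthetic training shapes: small perturbations around each class mean.
train = {
    "neutral": neutral + rng.normal(0, 0.05, (10, 2 * K)),
    "happy":   smile + rng.normal(0, 0.05, (10, 2 * K)),
}
centroids = {label: v.mean(axis=0) for label, v in train.items()}

def classify(shape_vec):
    """Assign the expression whose class centroid is nearest in shape space."""
    return min(centroids,
               key=lambda c: np.linalg.norm(shape_vec - centroids[c]))
```

The same shape vectors feed naturally into the dimensionality-reduction and classification machinery discussed elsewhere in the chapter.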

Key Terms in this Chapter

Posed: When a person is directed to behave a certain way in a controlled setting.

Affective Computing: Computing that recognizes, interprets, and influences human emotions.

Dimensionality Reduction: The process of reducing the dimensionality of an input space, usually with the intent of both improving computational performance and increasing classification accuracy.
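As a generic illustration of this term (not the chapter's specific pipeline), the sketch below performs principal component analysis (PCA), a standard dimensionality-reduction step for face features, by projecting centered data onto its top-k principal directions via SVD.

```python
import numpy as np

def pca_project(X, k):
    """Center X (n samples x d features) and project onto the first
    k principal components found by singular value decomposition."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T        # n x k reduced representation

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))   # synthetic feature vectors
Z = pca_project(X, 3)           # reduced from 10 to 3 dimensions
```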

Static Predictors: The use of a single image or video frame to estimate a person’s expression or emotion.

Expression Recognition: The estimation of a person’s expression using automated computer vision and machine learning techniques.

Temporal Predictors: The use of many contiguous static frames over time to estimate a person’s expression or emotion.

Spontaneous: When a person naturally or involuntarily behaves a certain way in an unconstrained setting.

Expression: Involuntary and voluntary movement of facial muscles evoked by emotion, mood, or nonverbal communication.
