Background
The study of the six universal expressions, i.e., fear, sadness, happiness, anger, disgust, and surprise, has made great strides in recent years, moving from constrained, frontal, posed faces to unconstrained faces in natural conditions (Maja Pantic & Rothkrantz, 2000; Shuai-Shi, Yan-Tao, & Dong, 2009b; Zhihong, Pantic, Roisman, & Huang, 2009). Figure 1 shows the basic steps of a facial expression recognition system. Face detection is often accomplished with the Viola-Jones approach because of its low computational requirements and high detection rates (Viola & Jones, 2001); more recent face detection methods improve accuracy, efficiency, or robustness (Zhu & Ramanan, 2012). Following face detection, faces are normalized to a reference shape and size. Typically, the eye and mouth corners are localized, and an affine warp to a canonical frontal face is computed.
Figure 1. Flowchart of fundamental operations used for facial expression recognition
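The normalization step described above can be sketched as a least-squares fit: given detected landmarks (e.g., eye and mouth corners) and their target positions in a canonical reference face, solve for the 2×3 affine matrix that maps one to the other. The landmark coordinates below are illustrative values, not from the source.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src points to dst points.
    src, dst: (N, 2) arrays of corresponding landmarks, N >= 3."""
    n = src.shape[0]
    # Homogeneous design matrix: one row [x, y, 1] per landmark
    A = np.hstack([src, np.ones((n, 1))])
    # lstsq solves A @ M_cols = dst; each column of unknowns gives one
    # output coordinate (x' or y') of the affine map
    M_cols, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M_cols.T  # 2x3 affine matrix

# Hypothetical detected landmarks (pixels): eye corners, then mouth corners
detected = np.array([[120.0, 90.0], [180.0, 92.0],
                     [135.0, 160.0], [168.0, 161.0]])
# Their canonical positions in a 128x128 reference face (illustrative)
canonical = np.array([[40.0, 40.0], [88.0, 40.0],
                      [48.0, 96.0], [80.0, 96.0]])

M = estimate_affine(detected, canonical)
# Applying M to the detected points approximates the canonical layout;
# in practice the same M warps every pixel of the face region.
warped = np.hstack([detected, np.ones((4, 1))]) @ M.T
```

With four point pairs and only six affine parameters the fit is generally over-determined, so the warp is the best least-squares alignment rather than an exact one.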
Facial expression methods can be broadly categorized as geometric or appearance-based (Fasel & Luettin, 2003; Shuai-Shi, Yan-Tao, & Dong, 2009a). Geometric methods localize facial landmarks such as the outlines of the eyes, lips, and nose (Martin, Werner, & Gross, 2008; Yeongjae & Daijin, 2009). Appearance-based methods work holistically with facial pixels, enabling them to capture facial muscle subtleties such as nose wrinkles or dimple formation (Shan, Gong, & McOwan, 2009).
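As a concrete appearance-based descriptor, the cited work of Shan et al. uses Local Binary Patterns (LBP). A minimal numpy sketch of the basic 3×3 LBP histogram is below; it encodes each pixel by comparing its eight neighbors to the center, which is what lets such features respond to fine texture like wrinkles.

```python
import numpy as np

def lbp_histogram(gray):
    """Basic 3x3 Local Binary Pattern histogram (256 bins) of a 2-D
    grayscale image -- a simple appearance-based texture descriptor."""
    c = gray[1:-1, 1:-1]  # interior pixels (centers)
    # Eight neighbors in a fixed clockwise order, one bit each
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        # Set the bit where the neighbor is at least as bright as the center
        code |= (nb >= c).astype(np.uint8) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()  # normalized 256-bin histogram
```

In expression recognition the face is usually divided into a grid of regions, with one such histogram per region concatenated into the final feature vector; this sketch shows only the per-region descriptor.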
Geometric methods require computing the size, shape, and location of key facial features such as the eyes, mouth, and eyebrows. The Active Shape Model (ASM) and Active Appearance Model (AAM) are two of the most popular facial landmark localization methods (Cootes, Edwards, & Taylor, 2001). Given enough training data and accurate facial landmark localization, shape models perform very well for expression classification (Martin et al., 2008; Yeongjae & Daijin, 2009).
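Once landmarks are localized, a simple way to turn them into a geometric feature vector is to take pairwise distances normalized by a reference length. The convention below (indices 0 and 1 being the eye centers) is an illustrative assumption, not a detail from the source.

```python
import numpy as np

def geometric_features(landmarks):
    """Pairwise Euclidean distances between facial landmarks, normalized
    by the inter-ocular distance so the features are scale-invariant.
    `landmarks` is an (N, 2) array; rows 0 and 1 are assumed to be the
    two eye centers (an illustrative convention)."""
    diffs = landmarks[:, None, :] - landmarks[None, :, :]
    d = np.sqrt((diffs ** 2).sum(axis=-1))  # (N, N) distance matrix
    iod = d[0, 1]                           # inter-ocular distance
    iu = np.triu_indices(len(landmarks), k=1)
    return d[iu] / iod                      # upper triangle, normalized

# Toy example: two "eyes" and one "mouth corner"
feats = geometric_features(np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]]))
```

Real geometric systems typically add angles, ratios, and landmark displacements over time, but this fixed-length distance vector is the usual starting point fed to a classifier.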