Facial Action Recognition in 2D and 3D

Michel Valstar (University of Nottingham, UK), Stefanos Zafeiriou (Imperial College London, UK) and Maja Pantic (Imperial College London, UK and University of Twente, EEMCS, The Netherlands)
Copyright: © 2014 |Pages: 20
DOI: 10.4018/978-1-4666-5966-7.ch008

Abstract

Automatic Facial Expression Analysis systems have come a long way since the earliest approaches in the early 1970s. We are now at a point where the first systems are commercially applied, most notably smile detectors included in digital cameras. As one of the most comprehensive and objective ways to describe facial expressions, the Facial Action Coding System (FACS) has received significant and sustained attention within the field. Over the past 30 years, psychologists and neuroscientists have conducted extensive research on various aspects of human behaviour using facial expression analysis coded in terms of FACS. Automating FACS coding would make this research faster and more widely applicable, opening up new avenues to understanding how we communicate through facial expressions. Mainly due to the cost effectiveness of existing recording equipment, until recently almost all work conducted in this area involved 2D imagery, despite its inherent problems relating to pose and illumination variations. In order to deal with these problems, 3D recordings are increasingly used in expression analysis research. In this chapter, the authors give an overview of 2D and 3D FACS recognition, and summarise current challenges and opportunities.

Introduction

Scientific work on facial expressions can be traced back to at least 1872 when Charles Darwin (1872) published The Expression of the Emotions in Man and Animals. He explored the importance of facial expressions for communication and described variations in facial expressions of emotions. Today, it is widely acknowledged that facial expressions serve as the primary nonverbal means for human beings to regulate interactions with each other (Ekman & Rosenberg, 2005). They communicate emotions, clarify and emphasise what is being said, and signal comprehension, disagreement and intentions (Pantic, 2009).

Two main approaches for facial expression measurement can be distinguished: message judgement and sign judgement (Cohn & Ekman, 2005). Message judgement aims to directly decode the meaning conveyed by a facial display (such as being happy, angry or sad), while sign judgement aims to study the physical signal used to transmit the message (such as raised cheeks or depressed lip corners). A mainstay of message-judgement approaches is the theory of six basic expressions, first suggested by Darwin (1872) and later extended by Paul Ekman (2003). They suggested that the six basic emotions, namely anger, fear, disgust, happiness, sadness and surprise, are universally transmitted through prototypical facial expressions. This direct relation underpins all message-judgement approaches. As a consequence, and helped by the simplicity of this discrete representation, prototypic facial expressions of the six basic emotions are the most commonly studied expressions and represent the main message-judgement approach.

There are two major drawbacks to message-judgement approaches. Firstly, they cannot explain the full range of facial expressions, as the set of expressions that can be explained is restricted by the set of messages. Secondly, message-judgement systems often assume that facial expression and target behaviour (e.g. emotion) have an unambiguous many-to-one correspondence, which is not the case according to studies in psychology (Ambady & Rosenthal, 1992). More generally, relations between messages and their associated displays are not universal: facial displays and their interpretation vary from person to person, and even from one situation to another.

The most commonly used set of descriptors in sign-judgement approaches is that specified by the Facial Action Coding System (FACS) (Ekman & Friesen, 1978; Ekman, Friesen, & Hager, 2002). The FACS is a taxonomy of human facial expressions. It was originally developed by Ekman and Friesen in 1978, and revised in 2002. The revision specifies 32 atomic facial muscle actions, named Action Units (AUs), and 14 additional Action Descriptors (ADs) that account for miscellaneous actions, such as jaw thrust, blow and bite. Unlike message-judgement approaches, the FACS is comprehensive and objective. Since any facial expression results from the activation of a set of facial muscles, every possible facial expression can be comprehensively described as a combination of AUs (as shown in Figure 1). While it is objective in that it describes the physical appearance of any facial display, it can still be used in turn to infer the subjective emotional state of the subject, which cannot be directly observed and depends instead on personality traits, context and subjective interpretation (Pantic, Nijholt, Pentland, & Huang, 2008).

Figure 1.

Examples of combinations of upper and lower face AUs defined in the FACS, resulting in two prototypical facial expressions
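The idea of describing an expression as a combination of AUs can be sketched in a few lines of code. The AU numbers and names below are standard FACS codes, but the expression-to-AU mappings are illustrative examples only (e.g. a smile of enjoyment is commonly coded as AU6 + AU12), not a complete or authoritative specification.

```python
# Sketch: representing facial expressions as sets of FACS Action Units.
# AU numbers/names are standard FACS codes; the prototype mappings below
# are common illustrative examples, not a full specification.

AU_NAMES = {
    1: "Inner brow raiser",
    2: "Outer brow raiser",
    4: "Brow lowerer",
    5: "Upper lid raiser",
    6: "Cheek raiser",
    12: "Lip corner puller",
    15: "Lip corner depressor",
    26: "Jaw drop",
}

# Prototypical expressions as AU combinations (illustrative).
PROTOTYPES = {
    "happiness": frozenset({6, 12}),
    "surprise": frozenset({1, 2, 5, 26}),
}

def describe(aus):
    """Return a FACS-style description of a set of active AUs."""
    return " + ".join(f"AU{n} ({AU_NAMES[n]})" for n in sorted(aus))

def match_prototypes(aus):
    """Return expression labels whose AU prototype is contained in `aus`."""
    active = set(aus)
    return [label for label, proto in PROTOTYPES.items() if proto <= active]
```

For example, `match_prototypes({6, 12})` returns `["happiness"]`, and `describe({6, 12})` renders the combination as `"AU6 (Cheek raiser) + AU12 (Lip corner puller)"`. An automatic FACS coder would produce such AU sets (and their intensities) from images, after which message-level labels, if desired, are inferred as a separate step.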
