Advocating a Componential Appraisal Model to Guide Emotion Recognition

Marcello Mortillaro (University of Geneva, Switzerland), Ben Meuleman (Swiss Center for Affective Sciences - University of Geneva, Switzerland) and Klaus R. Scherer (Swiss Center for Affective Sciences - University of Geneva, Switzerland)
Copyright: © 2012 | Pages: 15
DOI: 10.4018/jse.2012010102
Most models of automatic emotion recognition take a discrete perspective and a black-box approach: they output an emotion label chosen from a limited pool of candidate terms on the basis of purely statistical methods. Although these models are successful in emotion classification, a number of practical and theoretical drawbacks limit the range of possible applications. In this paper, the authors suggest adopting an appraisal perspective in modeling emotion recognition. The authors propose using appraisals as an intermediate layer between expressive features (input) and emotion labels (output). The model would then consist of two parts: first, expressive features would be used to estimate appraisals; second, the resulting appraisals would be used to predict an emotion label. While the second part of the model has already been the object of several studies, the first remains unexplored. The authors argue that this model should be built on the basis of both theoretical predictions and empirical results about the link between specific appraisals and expressive features. For this purpose, the authors suggest using the component process model of emotion, which includes detailed predictions of the efferent effects of appraisals on facial expression, voice, and body movements.
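The two-part structure described in the abstract can be pictured with a small sketch. This is only an illustration under strong assumptions, not the authors' model: the feature names, the three appraisal dimensions, every weight, and the emotion profiles are hypothetical placeholders. The weighted sum in stage one and the nearest-profile rule in stage two are simple stand-ins chosen for readability.

```python
from math import sqrt

# Hypothetical subset of appraisal dimensions (placeholder names).
APPRAISALS = ("novelty", "pleasantness", "control")

# Stage 1 parameters: how much each expressive feature contributes to each
# appraisal. Feature names and weights are invented for illustration only.
FEATURE_WEIGHTS = {
    "brow_raise": {"novelty": 1.0, "pleasantness": 0.0, "control": 0.0},
    "lip_corner_pull": {"novelty": 0.0, "pleasantness": 1.0, "control": 0.0},
    "brow_lower": {"novelty": 0.0, "pleasantness": -0.5, "control": 0.5},
}

# Stage 2 parameters: an appraisal profile per emotion label.
# These numbers are also placeholders, not empirical estimates.
EMOTION_PROFILES = {
    "joy": {"novelty": 0.2, "pleasantness": 1.0, "control": 0.5},
    "surprise": {"novelty": 1.0, "pleasantness": 0.0, "control": 0.0},
    "anger": {"novelty": 0.2, "pleasantness": -0.5, "control": 0.8},
}


def estimate_appraisals(features):
    """Stage 1: map expressive feature intensities to appraisal estimates."""
    appraisals = {name: 0.0 for name in APPRAISALS}
    for feature, intensity in features.items():
        # Features without an entry in the weight table are ignored.
        for name, weight in FEATURE_WEIGHTS.get(feature, {}).items():
            appraisals[name] += weight * intensity
    return appraisals


def predict_label(appraisals):
    """Stage 2: pick the emotion whose profile is closest (Euclidean)."""
    def distance(profile):
        return sqrt(sum((appraisals[a] - profile[a]) ** 2 for a in APPRAISALS))

    return min(EMOTION_PROFILES, key=lambda label: distance(EMOTION_PROFILES[label]))


def recognize(features):
    """Run both stages and return the intermediate appraisals with the label."""
    appraisals = estimate_appraisals(features)
    return appraisals, predict_label(appraisals)
```

With these placeholder numbers, `recognize({"lip_corner_pull": 1.0})` returns the label `"joy"` together with the estimated appraisals. Returning the intermediate layer alongside the label is the point of the sketch: unlike a black-box classifier, each stage can be inspected and checked against theoretical predictions separately.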
Three Theoretical Perspectives on Emotion

Most research on emotion expression has implicitly or explicitly used a discrete emotion perspective (Scherer, Clark-Polner, & Mortillaro, 2011), and the same is true for automatic emotion recognition systems. Discrete emotion theory was formulated on the basis of findings concerning a few intense emotions, called basic emotions, that are expected to have prototypical facial expressions and emotion-specific physiological signatures (Ekman, 1992, 1999; Ekman, Levenson, & Friesen, 1983; Ekman, Sorenson, & Friesen, 1969). This theory dominated the field for decades and is still the most widely used. There is robust evidence for the existence of some facial configurations that are cross-culturally labeled with the same emotion terms. However, several studies show that people frequently report experiencing emotional states that are not part of this set of basic emotions (Scherer & Ceschi, 2000; Scherer, Wranik, Sangsue, Tran, & Scherer, 2004), and, more importantly, that in spontaneous and enacted emotional expression these complete prototypical expressions rarely occur (Naab & Russell, 2007; Russell & Fernandez-Dols, 1997; Scherer & Ellgring, 2007a). Discrete emotion theorists tried to solve these problems by proposing the concept of emotion families (Ekman, 1992). An emotion family includes several lexically marked variations of a basic emotion label; all terms within a family share a common theme (characteristics unique to the family) and differ through variations due to individual, cultural, and contextual factors. Nevertheless, it is not clear how these variations should be modeled and whether they can be identified from vocal or facial expressions.

