Finding Useful Features for Facial Expression Recognition and Intensity Estimation by Neural Network

Naoki Imamura (Kyoto Institute of Technology, Kyoto, Japan), Hiroki Nomiya (Graduate School of Information Science, Kyoto Institute of Technology, Kyoto, Japan) and Teruhisa Hochin (Graduate School of Information Science, Kyoto Institute of Technology, Kyoto, Japan)
Copyright: © 2020 |Pages: 17
DOI: 10.4018/IJSI.2020040105

Abstract

Facial expression intensity has been proposed to quantify the degree of a facial expression in order to retrieve impressive scenes from lifelog videos. The intensity is calculated from the correlation between the facial features and each facial expression. However, this correlation has not been determined objectively; it should be determined statistically, based on the contribution score of the facial features used for expression recognition. The proposed method therefore recognizes facial expressions with a neural network and calculates the contribution score of each input toward the output. The authors first improve several facial features. They then verify that the score is computed correctly by comparing how the accuracy changes as useful and useless features are removed, and they process the score statistically. As a result, they extract the facial features that are useful to the neural network.

Introduction

In recent years, information technology has developed rapidly. Thanks to the higher performance, larger storage, and lower price of multimedia recording devices, many people can create multimedia data easily. This has motivated the collection and effective use of various kinds of daily data, called lifelog (Aizawa, 2009). In particular, people can record videos, such as home videos, which are expected to contain useful information.

Nomiya et al. (2013) studied impressive scene retrieval in lifelog videos using facial expressions. Their method makes it possible to search for scenes containing a specific facial expression.

Morikuni et al. (2015) observed that people in lifelog videos show facial expressions of varying strength, and proposed a measure, called "expression intensity," that digitizes the degree of a facial expression. Sakaue et al. (2017) extended the method to the six basic facial expressions (anger, disgust, fear, happiness, sadness, and surprise) and improved the estimation accuracy of the expression intensity. Shinohara et al. (2018) reduced the manual work and evaluated the expression intensity as an absolute value.

The previous research by Shinohara et al. has a problem: the actual expression is not always the expression with the highest intensity. Because the range of possible intensity values differs for each facial expression, an intensity value cannot be compared across expressions. For this reason, it is impossible to search for multiple facial expressions at the same time, which is inconvenient. In addition, the correlation of the facial features was determined subjectively, whereas it should be determined from statistical data. To solve these problems, we consider it necessary to recognize the facial expression, and effective to weight each feature value by its contribution score.

For facial expression recognition, there are methods using a neural network (Kobayashi & Hara, 1995) or a convolutional neural network (Nishime et al., 2017; Khorrami et al., 2015), with recognition accuracies of about 90% (Kobayashi & Hara, 1995; Khorrami et al., 2015) or 57% (Nishime et al., 2017). By masking specific input data and analyzing the relationship between the input and the output, two of these studies (Kobayashi & Hara, 1995; Nishime et al., 2017) confirmed which facial features strongly influence recognition. However, the former requires fixing the face position because it uses the movements of facial points relative to the neutral face, and the latter determines the relationship manually because it splits a facial image into sixteen regions and masks one of them regardless of the face angle. The third study (Khorrami et al., 2015) qualitatively analyzed the network by visualizing the spatial patterns that maximally excite different neurons in the convolutional layers, but did not compare the expressions with one another.
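The masking idea above can be illustrated with a minimal sketch: zero out one input feature at a time and measure how much the winning class's probability drops. The network here is a hypothetical single-layer classifier with random weights (the papers' actual trained networks and feature sets are not reproduced), so only the mechanism, not the numbers, is meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained expression classifier: a fixed
# random linear layer mapping a facial feature vector to 6 class scores.
n_features, n_classes = 10, 6
W = rng.normal(size=(n_features, n_classes))

def predict(x):
    # Softmax over the class scores.
    z = x @ W
    e = np.exp(z - z.max())
    return e / e.sum()

def masking_influence(x):
    """Estimate each input's influence on the predicted class by
    masking (zeroing) one feature at a time and measuring the drop
    in the winning class's probability."""
    base = predict(x)
    c = int(np.argmax(base))
    influence = np.empty(n_features)
    for i in range(n_features):
        masked = x.copy()
        masked[i] = 0.0                      # mask feature i
        influence[i] = base[c] - predict(masked)[c]
    return influence

x = rng.normal(size=n_features)
scores = masking_influence(x)
ranking = np.argsort(scores)[::-1]           # features ranked by influence
```

A large positive entry in `scores` means the prediction relies heavily on that feature; near-zero entries mark features the network effectively ignores.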

In this paper, we verify facial expression recognition and whether the contribution score can be calculated correctly. We improve some facial features and recognize facial expressions with a neural network, aiming for high accuracy. We also calculate the contribution score of each input toward the output, and then extract useful facial features defined on the basis of that score.
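The verification strategy, comparing how accuracy changes when useful versus useless features are removed, can be sketched as follows. The data and classifier are synthetic assumptions (only the first three of eight features carry class information), chosen so that the expected outcome is unambiguous: removing the useful features should hurt accuracy, while removing the useless ones should not.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical synthetic data: labels depend only on features 0-2.
n, n_feat = 500, 8
X = rng.normal(size=(n, n_feat))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

# A classifier that (correctly) weights only the informative features.
w = np.zeros(n_feat)
w[:3] = 1.0

def accuracy(X, w):
    return np.mean(((X @ w) > 0).astype(int) == y)

def ablate(X, idx):
    Xa = X.copy()
    Xa[:, idx] = 0.0                 # "remove" features by zeroing them
    return Xa

full         = accuracy(X, w)
drop_useful  = accuracy(ablate(X, [0, 1, 2]), w)   # remove useful features
drop_useless = accuracy(ablate(X, [5, 6, 7]), w)   # remove useless features
```

If the contribution scores are correct, the accuracy transition behaves like `drop_useful < full` while `drop_useless` stays at the full-feature level, which is the pattern the experiment in this paper looks for.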

The remainder of this paper is organized as follows. Section 2 presents related work. Section 3 presents the previous research. Section 4 describes our approach. Section 5 reports an experiment comparing the accuracy transitions when useful and useless facial features are removed. Finally, Section 6 concludes this paper.
