Unobtrusive Academic Emotion Recognition Based on Facial Expression Using RGB-D Camera Using Adaptive-Network-Based Fuzzy Inference System (ANFIS)

James Purnama (Swiss German University, Indonesia) and Riri Fitri Sari (University of Indonesia, Depok, Indonesia)
DOI: 10.4018/IJSSCI.2019010101


The quality of learning in the classroom is influenced by many factors, one of which is the academic emotions of the students. Emotion detection in the classroom cannot rely on sensors attached to the students' bodies, because these would disturb their concentration. The proposed solution is unobtrusive emotion detection, e.g. by placing inconspicuous video capture equipment at the front of the student's desk. In this study, an RGB-Depth Microsoft Kinect camera is used to record facial expressions, taking into account the convenience of the students, response time, and cost efficiency. A combination of the Cohn-Kanade dataset and the EURECOM dataset is used as the training set for machine learning with the Adaptive-Network-Based Fuzzy Inference System (ANFIS) algorithm, with a sample of 8 Asian students (4 male and 4 female).

In the era of information technology, teaching and learning in the classroom is still the main method of learning. E-learning methods and Intelligent Tutoring Systems serve as supplements to classroom learning. The quality of learning in the classroom is influenced by many aspects, one of which is academic emotions.

Pekrun et al. (2002) describe how academic emotions can affect the quality of learning. These emotions include confidence, excitement, frustration, interest, flow/engagement, boredom, confusion, and anxiety (Arroyo et al., 2009; Azcarraga et al., 2011; Burleson, 2006; Azcarraga et al., 2010; Kapoor et al., 2007; Zeidner, 2007).

Previously, emotion detection used sensor devices plugged into or affixed to the body of the research subject. These devices are physiological signal (biosignal) sensors that measure body signals, such as Electromyography (EMG), Electro Dermal Activity (EDA), or Galvanic Skin Response (GSR). This method is commonly referred to as the Hidden Ways (covertly). It was perceived as suitable only for testing in the laboratory and is not effective in real conditions, since mounting the sensor devices obviously disturbs the concentration of the research subjects and is likely to affect their feelings and emotions (Guo et al., 2013).

The concept of unobtrusive emotion detection is to ensure that the detection process is not conspicuous and does not attract undue attention from the research subjects. The main objective of our work is to create an unobtrusive environment that is comfortable for the user, so that, as far as possible, the user is not aware of the sensor device being used (Broek et al., 2009).

Haq and Jackson (2011) concluded that unobtrusive emotion detection analysis combines multiple modal channels, such as audio (speech expressions) and video (facial expressions and body gestures). Reisenzein et al. (2013) examined the coherence between facial expression and emotion. Based on their experiments, we know that an emotion such as "amusement" and the expression of "smiling" have high coherence, while other positive emotions and smiling have comparatively low coherence. Jack et al. (2012) published revolutionary findings that facial expressions of emotion are not culturally universal: Western Caucasian people use different facial expressions than Eastern Asian people to express their emotions. These findings have an impact on, and pose a challenge to, researchers of emotion detection.

Research on facial-expression-based emotion detection has focused mainly on the recognition of posed facial expressions. In the last few years, the research direction has shifted to spontaneous expressions (Bettadapura, 2012). This new direction is due to the rise of low-cost and powerful RGB-D camera sensors, such as the Microsoft Kinect (Zhang, 2012; Andersen et al., 2012), and other sensors such as the PlayStation Eye (Sonny, 2014).

Research in emotion detection has attracted a lot of interest for decades, as shown by the large number of publications in this area. With the arrival in the last few years of cheap and affordable RGB-D sensors (depth cameras), research in the field of emotion detection has started a new chapter, namely unobtrusive emotion detection.

Mahmoud et al. (2011) described the collection and annotation of a 3D multimodal corpus of naturalistic complex mental states. The corpus consists of 108 videos and 12 mental states collected from a Microsoft Windows Kinect depth camera. The main contribution of their research is a multimodal 3D corpus that combines facial expressions with hand and body gestures.
