Audiovisual Facial Action Unit Recognition using Feature Level Fusion

Zibo Meng, Shizhong Han, Min Chen, Yan Tong
DOI: 10.4018/IJMDEM.2016010104

Abstract

Recognizing facial actions is challenging, especially when they are accompanied by speech. Instead of employing information solely from the visual channel, this work exploits information from both the visual and audio channels to recognize speech-related facial action units (AUs). Two feature-level fusion methods are proposed. The first is based on handcrafted visual features; the other utilizes visual features learned by a deep convolutional neural network (CNN). In both methods, features are extracted independently from the visual and audio channels and temporally aligned to handle the difference in time scales and the time shift between the two signals. These temporally aligned features are then integrated via feature-level fusion for AU recognition. Experimental results on a new audiovisual AU-coded dataset demonstrate that both fusion methods outperform their visual-only counterparts in recognizing speech-related AUs. The improvement is more pronounced when the facial images are occluded, since occlusions do not affect the audio channel.

Introduction

Facial activity is one of the most powerful and natural means of human communication (Pantic & Bartlett, 2007a). Driven by recent advances in human-centered computing, there is an increasing need for accurate and reliable characterization of displayed facial behavior. The Facial Action Coding System (FACS) developed by Ekman and Friesen (Ekman, Friesen, & Hager, 2002) is the most widely used and objective system for facial behavior analysis. Under the FACS, facial behavior is described by a small set of facial Action Units (AUs), each of which is anatomically related to the contraction of a set of facial muscles. With different interpretation rules or systems, e.g., the Emotion FACS rules (Ekman et al., 2002), AUs have been used to infer various human affective states. Beyond human behavior analysis, automatic facial AU recognition is desirable for interactive games, online/remote learning, and other human-computer interaction (HCI) applications.

As demonstrated in the survey papers (Pantic, Pentland, Nijholt, & Huang, 2007b; Zeng, Pantic, Roisman, & Huang, 2009; Sariyanidi, Gunes, & Cavallaro, 2015), great progress has been made over the years on automatic AU recognition from posed/deliberate facial displays. Recognizing facial AUs from spontaneous facial displays, however, is challenging due to subtle and complex facial deformations, frequent head movements, the temporal dynamics of facial actions, etc. It is especially challenging to recognize AUs involved in speech. As discussed in (Ekman et al., 2002), AUs responsible for producing speech are usually activated at low intensities, yielding only subtle changes in facial appearance and geometry. In addition, speech-related AUs often introduce ambiguities, e.g., occlusions, into the recognition of other AUs.

For example, pronouncing the phoneme /b/ involves two consecutive phases, i.e., a Stop phase and an Aspiration phase. In the Aspiration phase, the lips are apart and the oral cavity between the teeth is visible, as shown in Figure 1(b); these are the major facial appearance cues for recognizing AU25 (lips part) and AU26 (jaw drop), respectively. In the Stop phase, the lips are pressed together due to the activation of AU24 (lip presser), as shown in Figure 1(a). Consequently, the oral cavity is occluded by the lips and AU26 is "invisible" in the visual channel.

Figure 1. Example images of speech-related facial behaviors, where different combinations of AUs are activated to pronounce the phoneme /b/

Existing approaches to facial AU recognition extract information solely from the visual channel. In contrast, this paper proposes a novel approach that exploits information from both the visual and audio channels to recognize speech-related AUs. The work is motivated by the fact that facial AUs and voice are highly correlated in natural human communication. Specifically, voice/speech has strong physiological relationships with some lower-face AUs such as AU25 (lips part), AU26 (jaw drop), and AU24 (lip presser), because the movements of the jaw and lower-face muscles, together with the soft palate, tongue, and vocal cords, produce the voice.

These relationships are well recognized and are exploited in natural human communication. For example, without looking at the face, people know that the other person is opening his/her mouth when they hear laughter. Returning to the example of recognizing AU26 (jaw drop) in the Stop phase of pronouncing the phoneme /b/, we can infer that AU26 has been activated upon hearing the sound /b/, even though it is "invisible" in the visual channel.
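As a rough illustration of the feature-level fusion pipeline summarized above, the Python sketch below concatenates temporally aligned per-frame visual and audio features and trains one binary classifier per AU. The feature dimensions, the nearest-neighbor temporal alignment, and the linear SVM are illustrative assumptions, not the authors' exact implementation.

import numpy as np
from sklearn.svm import LinearSVC

def align_audio_to_video(audio_feats, n_video_frames):
    # Resample audio feature frames to the video frame rate by nearest-neighbor
    # indexing -- a simple stand-in for the temporal alignment described above.
    idx = np.linspace(0, len(audio_feats) - 1, n_video_frames).round().astype(int)
    return audio_feats[idx]

def fuse_features(visual_feats, audio_feats):
    # Feature-level fusion: concatenate temporally aligned visual and audio
    # features frame by frame.
    aligned_audio = align_audio_to_video(audio_feats, len(visual_feats))
    return np.hstack([visual_feats, aligned_audio])

# Toy usage with random arrays standing in for real features (hypothetical sizes).
rng = np.random.default_rng(0)
visual = rng.normal(size=(120, 256))    # 120 video frames, 256-D visual features
audio = rng.normal(size=(400, 13))      # 400 audio frames, 13-D acoustic features
labels = rng.integers(0, 2, size=120)   # toy per-frame labels for one AU, e.g., AU25

fused = fuse_features(visual, audio)    # shape: (120, 256 + 13)
clf = LinearSVC().fit(fused, labels)    # one binary classifier per AU

In practice, the visual features would come from handcrafted descriptors or a CNN and the audio features from an acoustic front end, with one such classifier trained per target AU.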
