Audio and Visual Speech Recognition Recent Trends

Lee Hao Wei, Seng Kah Phooi, Ang Li-Minn
ISBN13: 9781466639584 | ISBN10: 146663958X | EISBN13: 9781466639591
DOI: 10.4018/978-1-4666-3958-4.ch002
MLA

Wei, Lee Hao, et al. "Audio and Visual Speech Recognition Recent Trends." Intelligent Image and Video Interpretation: Algorithms and Applications, edited by Jing Tian and Li Chen, IGI Global, 2013, pp. 42-86. https://doi.org/10.4018/978-1-4666-3958-4.ch002

APA

Wei, L. H., Phooi, S. K., & Li-Minn, A. (2013). Audio and Visual Speech Recognition Recent Trends. In J. Tian & L. Chen (Eds.), Intelligent Image and Video Interpretation: Algorithms and Applications (pp. 42-86). IGI Global. https://doi.org/10.4018/978-1-4666-3958-4.ch002

Chicago

Wei, Lee Hao, Seng Kah Phooi, and Ang Li-Minn. "Audio and Visual Speech Recognition Recent Trends." In Intelligent Image and Video Interpretation: Algorithms and Applications, edited by Jing Tian and Li Chen, 42-86. Hershey, PA: IGI Global, 2013. https://doi.org/10.4018/978-1-4666-3958-4.ch002

Abstract

This chapter gives a brief introduction to the origins of audio-visual speech recognition and to the techniques commonly used by researchers in the field. It reviews background theory on widely used feature extraction and classification methods for both audio and visual processing, with emphasis on Mel-Frequency Cepstral Coefficients (MFCCs) and on contour- and geometry-based lip feature extraction with the corresponding tracking methods (Yingjie, Haiyan, Yingjie, & Jinyang, 2011; Liu & Cheung, 2011). The proposed solution concepts include time derivatives of MFCCs for audio feature extraction, chroma-colour-based (YCbCr) face segmentation, feature point extraction, a localized active contour tracking algorithm, and Hidden Markov Models incorporating the Viterbi algorithm. The chapter is intended to inform newcomers to speech processing but is not sufficient on its own for mastery of the field; the suggested additional reading should help readers reach that level more quickly.
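
As a rough illustration of the time derivatives of mel-frequency cepstral coefficients named in the abstract, the sketch below computes delta features with the commonly used regression formula d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²), summed over n = 1..N. It is not taken from the chapter: the function name `delta_features`, the default window of N = 2, the edge handling (repeating the first and last frame), and the sample input are all assumptions made for illustration.

```python
def delta_features(frames, window=2):
    """Return per-frame time derivatives (deltas) of a feature sequence.

    frames: list of frames, each a list of coefficients (e.g. MFCCs).
    window: number of neighbouring frames N used on each side (assumed 2).
    """
    num_frames = len(frames)
    if num_frames == 0:
        return []
    num_coeffs = len(frames[0])
    # Normalisation term of the regression formula: 2 * sum(n^2)
    denom = 2.0 * sum(n * n for n in range(1, window + 1))

    def frame_at(t):
        # Clamp the index so frames beyond either end repeat the edge frame
        return frames[min(max(t, 0), num_frames - 1)]

    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(num_coeffs):
            numerator = sum(
                n * (frame_at(t + n)[k] - frame_at(t - n)[k])
                for n in range(1, window + 1)
            )
            row.append(numerator / denom)
        deltas.append(row)
    return deltas


# Hypothetical input: 10 frames with a single coefficient rising by 1 per frame
example_frames = [[float(t)] for t in range(10)]
example_deltas = delta_features(example_frames)
# Applying the same function to the deltas gives second-order features
example_delta_deltas = delta_features(example_deltas)
```

In a typical front end the delta and delta-delta vectors would be appended to the static coefficients of each frame before classification; whether the chapter uses this exact window or edge treatment is not stated in the abstract.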
