Applications of Kernel Theory to Speech Recognition
Joseph Picone (Mississippi State University, USA), Aravind Ganapathiraju (Mississippi State University, USA) and Jon Hamaker (Mississippi State University, USA)
Copyright: © 2007
Automated speech recognition is traditionally defined as the process of converting an audio signal into a sequence of words. Over the past 30 years, simplistic techniques based on the design of smart feature-extraction algorithms and physiological models have given way to powerful statistical methods based on generative models. Such approaches suffer from three basic problems: discrimination, generalization, and sparsity. In the last decade, the field of machine learning has grown tremendously, generating many promising new approaches to this problem based on principles of discrimination. These techniques, though powerful when given vast amounts of training data, often suffer from poor generalization. In this chapter, we present a unified framework in which both generative and discriminative models are motivated from an information-theoretic perspective. We introduce the modern statistical approach to speech recognition and discuss how kernel-based methods are used to model knowledge at each level of the problem. Specific methods discussed include kernel PCA for feature extraction and support vector machines for discriminative modeling. We conclude with some emerging research on the use of kernels in language modeling.