Second Language Learners' Spoken Discourse: Practice and Corrective Feedback through Automatic Speech Recognition

Catia Cucchiarini (Radboud University, The Netherlands) and Helmer Strik (Radboud University, The Netherlands)
DOI: 10.4018/978-1-4666-6042-7.ch029

Abstract

This chapter examines the use of Automatic Speech Recognition (ASR) technology in the context of Computer Assisted Language Learning (CALL) and language learning and teaching research. A brief introduction to ASR is first provided to make clear why and how this technology can benefit learning and development in second language (L2) spoken discourse. This is followed by an overview of the state of the art in research on ASR-based CALL. Subsequently, a number of relevant projects on ASR-based CALL conducted at the Centre for Language and Speech Technology of the Radboud University in Nijmegen (the Netherlands) are presented. Possible solutions and recommendations are then discussed in light of the current state of the technology, together with an explanation of how such systems can benefit Discourse Analysis research. The chapter concludes with a discussion of possible perspectives for future research and development.

Background: Automatic Speech Recognition (ASR)

Standard ASR systems are generally employed to recognize words. Such a system consists of a decoder (the search algorithm) and three 'knowledge sources': the language model, the lexicon, and the acoustic models. The language model (LM) contains probabilities of words and sequences of words. The acoustic models represent how the sounds of a language are pronounced; in most cases so-called hidden Markov models (HMMs) are used, although artificial neural networks (ANNs) can also be used. The lexicon is the connection between the language model and the acoustic models. It contains information on how the words are pronounced, in terms of sequences of speech sounds. Each entry in the lexicon therefore has two representations: an orthographic transcription representing how a word is written and a phonological transcription representing how a word is pronounced. Since words can be pronounced in different ways, lexicons often contain more than one entry for some words. These entries are the pronunciation variants, which indicate possible pronunciations of one and the same word.
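
The interplay between the decoder and the three knowledge sources can be illustrated with a minimal Python sketch. Everything in it is a hypothetical toy rather than the architecture of any real recognizer: the lexicon entries, the ARPAbet-like phone symbols, the unigram probabilities in `LM`, and the function names are assumptions made for illustration, and `acoustic_log_score` is only a stand-in for a trained HMM or ANN acoustic model. The input is assumed to be an already segmented sequence of phone labels for a single word.

```python
import math

# Hypothetical toy lexicon: orthographic form -> pronunciation variants,
# each variant being a sequence of (ARPAbet-like) phone symbols.
LEXICON = {
    "the": [("dh", "ah"), ("dh", "iy")],  # two pronunciation variants
    "cat": [("k", "ae", "t")],
    "cap": [("k", "ae", "p")],
}

# Hypothetical unigram language model: probability of each word.
LM = {"the": 0.6, "cat": 0.3, "cap": 0.1}


def acoustic_log_score(observed, variant, p_match=0.9):
    """Stand-in for the acoustic models (not a real HMM or ANN).

    Each observed phone is assumed to match the expected phone with
    probability p_match; sequences of different length cannot match.
    """
    if len(observed) != len(variant):
        return float("-inf")
    p_mismatch = 1.0 - p_match
    return sum(
        math.log(p_match if o == v else p_mismatch)
        for o, v in zip(observed, variant)
    )


def recognize_word(observed):
    """Toy decoder: search all lexicon entries for the best-scoring word.

    The score of a hypothesis is the acoustic log score of one
    pronunciation variant plus the log probability of the word in the LM.
    Returns (score, word, variant), or None if nothing can match.
    """
    best = None
    for word, variants in LEXICON.items():
        for variant in variants:
            score = acoustic_log_score(observed, variant) + math.log(LM[word])
            if score == float("-inf"):
                continue  # this variant cannot explain the observation
            if best is None or score > best[0]:
                best = (score, word, variant)
    return best


if __name__ == "__main__":
    print(recognize_word(("dh", "iy")))
    print(recognize_word(("k", "ae", "t")))
```

In this sketch the lexicon plays the linking role described above: the decoder looks up each word's pronunciation variants, scores them against the observed sounds, and weights the result by the word's probability in the language model. A real decoder would search over word sequences and time-aligned acoustic frames rather than over isolated, pre-segmented words.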
