The Potential of Text-to-Speech Synthesis in Computer-Assisted Language Learning: A Minority Language Perspective

Neasa Ní Chiaráin (Trinity College Dublin, Ireland) and Ailbhe Ní Chasaide (Trinity College Dublin, Ireland)
DOI: 10.4018/978-1-7998-1097-1.ch007


This chapter describes the potential of text-to-speech synthesis (TTS) as a tool that can transform CALL platforms. Illustrating this point, a specific platform, An Scéalaí, is presented. By incorporating TTS, this platform facilitates the training of literacy skills, both writing and reading, with an emphasis at all times on the spoken language. The platform is described, as is the way in which it functions as a personalised tutor, prompting the learner towards self-correction. The prompts are delivered in both spoken/auditory form (using TTS voices) and in written form. The auditory feedback enables prooflistening and also provides spoken instructions pertaining to specific errors not picked up in the prooflistening process. The learner's progress is monitored throughout and records of the process are harvested for future research. The language in focus is Irish, and the linguistic complexities being targeted in the present implementation are explained, along with the relevant sociolinguistic context.
This chapter deals with the use of text-to-speech (TTS) technology in language learning. The vast potential of TTS technology in language pedagogy has been relatively little explored to date. It is argued here that TTS is a uniquely valuable tool as it allows the spoken language to be placed at the centre of all language learning activities, even activities such as writing and grammatical training. TTS can bring learners’ own choice of materials and compositions to life, rather than depending on prerecorded materials. It is a particularly valuable technology in the case of language learning where the learner has limited access to native speaker models of the language. This last is especially true of minority languages, such as Irish, on which the work reported here is based.

An iCALL platform for Irish is described here which illustrates some of the potential advantages that accrue with the integration of TTS. The TTS systems being used are developed as part of the ABAIR initiative (ABAIR, 2019; Ní Chasaide et al., 2017). One feature of the present iCALL development is that it is carried out in close collaboration with the development of the speech technology itself, a fact which confers many advantages, as will be discussed below.

Irish is a minority language which is spoken as a community language in pockets, mostly on the western seaboard. However, as an official language of the State, it is widely taught in Ireland. The social and linguistic context is therefore quite different to what typically pertains in the major languages and there are many specific challenges that arise in teaching a minority language where access to native speakers is limited. These issues are discussed in some detail as they provide the context for the present work.

Although developed for the minority language context, the type of educational application and the advantages of the approach would also hold in other language learning contexts. It is therefore presented as a possible model for other languages as the advantages of the incorporation of speech technology have widespread applicability.

The specific iCALL application being described here, An Scéalaí (the Storyteller), aims to integrate the skills of writing, listening, speaking and reading in a speech-based application. It is intended to promote a holistic approach to the acquisition of different language skills so that the connections are intuitively grasped by learners.

An Scéalaí, currently under development in collaboration with the ABAIR initiative, is a web-based iCALL language learning platform. It is an open-ended, modularly constructed platform to which components are being added incrementally. It provides the learner with language learning tools and content as well as feedback prompts for the correction of user-generated content, and it encourages autonomous language learning. At the same time, it is intended that this platform, by harvesting learner data, will be used by researchers and by platform developers for further refinements of this and other systems. Possibly the greatest strength of the current system derives from the symbiotic relationship between the speech technology and the iCALL researchers and developers and their close collaboration with learners and teachers.

Natural Language Processing: A field of study which brings together different disciplines such as linguistics, computer science and artificial intelligence in order to computationally process human language for specific purposes, e.g. generating ‘intelligent’ feedback in language learning activities based on unpredictable learner input.

Initial Mutation: A feature common to all Celtic languages where initial consonants undergo ‘mutations’ (consonantal alternations, e.g. of stops and fricatives). Specific grammatical contexts trigger this process, e.g. the formation of a question or a negative statement. Take, for example, the case of the verb ‘dún’ (to close) in the present tense: ‘Dúnann sí’ (statement: ‘she closes’); ‘An ndúnann sí?’ (question: ‘does she close?’); ‘Ní dhúnann sí’ (negative statement: ‘she does not close’). A verb in Irish can have as many as 42 inflected forms.
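The alternation in the ‘dún’ example can be sketched in a few lines of Python. This is a simplified illustration only, not the method used in An Scéalaí: the function names (`eclipse`, `lenite`, `present_forms`) are hypothetical, and the rules cover only the regular case in which the question particle ‘An’ triggers eclipsis and the negative particle ‘Ní’ triggers lenition. Irregular verbs, dialect variation and other mutation types are deliberately ignored.

```python
# Simplified sketch of two Irish initial mutations (assumed, regular cases only).

# Eclipsis: the initial consonant is prefixed by its eclipsing letter(s).
ECLIPSIS = {"b": "mb", "c": "gc", "d": "nd", "f": "bhf",
            "g": "ng", "p": "bp", "t": "dt"}

# Lenition: an 'h' is written after the initial consonant.
LENITABLE = set("bcdfgmpst")
# 's' is not lenited when followed by one of these consonants (sc, sm, sp, st).
S_BLOCKERS = set("cmpt")


def eclipse(word: str) -> str:
    """Return the eclipsed form of a word, or the word unchanged."""
    if not word:
        return word
    first = word[0].lower()
    if first in ECLIPSIS:
        # Prepend only the added letters so the original spelling is kept.
        return ECLIPSIS[first][:-1] + word
    return word  # vowels and other consonants are left unchanged here


def lenite(word: str) -> str:
    """Return the lenited form of a word, or the word unchanged."""
    if not word:
        return word
    first = word[0].lower()
    if first not in LENITABLE:
        return word
    if len(word) > 1 and word[1].lower() == "h":
        return word  # already lenited
    if first == "s" and len(word) > 1 and word[1].lower() in S_BLOCKERS:
        return word
    return word[0] + "h" + word[1:]


def present_forms(verb_form: str, pronoun: str) -> dict:
    """Build statement, question and negative for a regular present-tense form."""
    return {
        "statement": f"{verb_form.capitalize()} {pronoun}",
        "question": f"An {eclipse(verb_form)} {pronoun}?",
        "negative": f"Ní {lenite(verb_form)} {pronoun}",
    }


if __name__ == "__main__":
    for label, sentence in present_forms("dúnann", "sí").items():
        print(f"{label}: {sentence}")
```

Keeping the two mutations as separate functions mirrors the fact that each is triggered by a different grammatical context; a fuller treatment would need an exception list for irregular verbs.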

Morphophonemics: The branch of linguistics that studies the relationship between written (morphological) and aural (phonological or phonetic) processes.

Text-to-Speech Synthesis: Refers to artificially generated speech produced by computer, usually on the basis of text input, i.e., a text-to-speech system (TTS).

Gaeltacht Areas: Regions in which the Irish language is the predominant vernacular. These areas are mainly located on the western seaboard of Ireland. These zones have official recognition from the Irish Government as particular cultural and socioeconomic zones.

Intelligent Computer-Assisted Language Learning (iCALL): The intersection of Speech Technology and Natural Language Processing with Computer-Assisted Language Learning. iCALL uses techniques from the field of artificial intelligence in order to bring ‘understanding’ of language to the development of CALL tools, resulting in the semblance of ‘genuine’ interactivity between the learner and the system.

Personalisation in CALL: The process by which a language learning environment tailors itself to meet an individual learner’s needs or preferences, e.g. providing individualised feedback on a particular learner’s written input.

