Development of Word Recognition across Speakers and Accents

Karen E. Mulak (University of Western Sydney, Australia) and Catherine T. Best (University of Western Sydney, Australia & Haskins Laboratories, USA)
DOI: 10.4018/978-1-4666-2973-8.ch011
The pronunciation of a given word can contain considerable phonetic variation both within and between speakers, affects, and accents. For reliable word recognition, children must learn to hear through the variation that does not change a word’s identity, while still detecting variation that falls outside that word’s identity. This requires knowledge of phonologically specified word invariants above the level of phonetic specification. Reviewing developmental accounts and empirical evidence, this chapter discusses the emergence of children’s ability to attend to speaker- and accent-independent invariants. The authors focus particularly on changes in two age ranges: between 7.5 and 10.5 months, where evidence points to a developing ability to recognize speech across within-speaker and within-group variation, and between 14 and 19 months, where increasing evidence suggests a shift from phonetically to more phonologically specified word forms. They propose a framework that describes the attentional shifts involved in this progression, with emphasis on methodological concerns surrounding the interpretation of existing research.
Variation in word pronunciation is ubiquitous in natural speech, which would seem at first glance to pose difficulties for young children’s ability to learn and recognize spoken words. Variation arises naturally from the process of speech production, a complicated, dynamic task in which many variables combine to create myriad phonetic-acoustic variations, so that no two pronunciations of a given word are ever exactly the same. Variations in productions of the same word by the same speaker range from minute changes in tongue position, jaw position, amplitude envelope, and temporal coordination of articulators, through to grosser changes in voice quality when whispering or shouting, or when sad or excited, and articulatory changes when speaking in formal versus casual contexts. Speech variability is magnified across speakers, where vocal tract characteristics and differences in articulatory style add further dimensions of variation beyond those that can occur within a speaker. Between-speaker variation is greater still when the speakers have different regional accents, which differ more in consonants and vowels, as well as in meter, stress patterning, and intonation, than do speakers who share an accent. These multiple sources of variation in word pronunciation are known to affect the speed and accuracy of adults’ spoken word recognition (e.g., Nygaard, Burt, & Queen, 2000), and to cause notable difficulties for Automatic Speech Recognition (ASR) systems (e.g., Benzeghiba et al., 2007; Henton, 2006). Thus, we might well expect young children learning their first language to have great difficulty handling variation in word pronunciation, given their much more restricted experience with, and knowledge about, spoken language.

Yet both common experience and controlled laboratory studies demonstrate that adults do understand spoken words across this wide range of variation, usually quite easily. Recent research shows that young children, too, can accommodate at least some forms of variation in speech. This chapter discusses how perceivers resolve these intra- and inter-speaker variations in speech segments and spoken words, with a specific focus on the development of these skills in infants during the first two years of life. For theoretical and technical clarity, we begin with a brief overview of key terms and concepts in speech research.

