Culturally Determined Preferences: Automatic Speech Recognition (ASR) Systems vs. Live Help

Osamuyimen Stewart (IBM T.J. Watson Research Labs, USA) and Joyram Chakraborty (State University of New York, USA)
DOI: 10.4018/978-1-61520-883-8.ch004

Abstract

Theoretical models for the study of cross-cultural variables in communication abound. However, there are very few empirical studies to validate any of these models in the Human-Computer Interaction (HCI) literature involving Automatic Speech Recognition (ASR). This chapter seeks to fill that gap by addressing the broad and foundational question of whether a framework for cross-cultural dimensions can be used to investigate how people use (or are likely to use) ASR systems versus Live (human) help. In particular, the authors focus on one of Hofstede’s (1991) five factors: individualism-collectivism. They show that using Hofstede’s questionnaire does not yield the expected results in the HCI domain involving ASR. Consequently, the authors propose a new set of questions derived from cultural and psycholinguistic factors surrounding how people might tackle some common problems. This new questionnaire proves effective in deriving cross-cultural distinctions congruent with benchmarked predictions, while also providing empirical evidence for culturally determined preferences for the use of ASR systems. Furthermore, the authors explore one implication of this study by discussing the cross-cultural correlation between the nature of a task (simple or complex) and the evolution or adoption of ASR systems for self-help.

Introduction

One of the critical success criteria of any technology is its effective use and adoption by large sections of the user population across all cultures. Thus, any systematic cross-cultural evaluation of Human-Computer Interaction (HCI) technology is significant, because each culture or environment poses a unique set of challenges that must be dealt with in order for a particular technology to gain a foothold and be used successfully. Subtle properties of, or changes in, the design can have major implications in the culture-context in which the technology is deployed. For example, Russo and Boor (1993) discuss many cases of cross-cultural blunders in product launches, two of which will suffice. When the British manufacturer Rolls Royce introduced the Silver Mist in Germany, the adverts proved disastrous because the word ‘mist’ in German means manure. Likewise, when the Italian car maker Fiat introduced the Uno in Finland, it proved equally embarrassing because the word ‘Uno’ in Finnish translates as garbage. Thus, mere language translation without a proper understanding of the culture-context can prove costly for cross-cultural usability. In general, language is regarded as a major vehicle of culture (Marcus and Gould, 2000; Kaplan, 1966; Kluckholn, 1950; Triandis et al., 1988; Ting Toomey, 1999). Therefore, Automatic Speech Recognition (ASR) systems, in which humans use language to interact with computers, offer an excellent opportunity for studying cross-cultural issues. Human interaction with computer systems involving ASR is becoming globally pervasive. Even in remote parts of the developing world, such as in India and Africa, interacting with computer systems using speech is gaining in popularity. This has been attributed to the penetration of the ubiquitous mobile phone, which allows various non-governmental organizations (NGOs) and technology service providers to offer telephone-based automated services and information to the populace (e.g., checking mobile phone minutes used or rolling electricity blackout times).

However, while the promise of using our own voice to interact with computer systems is being realized, the impact of culture on such systems is generally acknowledged but not systematically studied (Nielson, 1990; Chakraborty et al., 2008; Stewart & Chakraborty, 2008; Stewart et al., 2009). The study of cross-cultural issues in technology becomes more important given that ASR systems are fraught with usability problems, including speech recognition errors, the cognitive burden on users of having to respond quickly to the system (which is exacerbated in the event of an error), and constraints on what the system can understand. All of these increase the difficulty of using such systems and underscore the need to investigate how different cultures perceive their use and/or usefulness in the context of HCI. Theoretical models for the study of cross-cultural variables in communication abound (Aykin, 2005; Chakraborty et al., 2008; Chakraborty, 2009; Hofstede, 1991; Marcus & Gould, 2000; Nielson, 1990; Yeo, 1996). Unfortunately, there is very little empirical data to validate any of the models in ASR. This is the issue we address in this chapter by focusing on the broad and foundational question of whether a framework for cross-cultural research like the one proposed by Hofstede (1991) can be applied directly to ASR. More specifically, in order to investigate culturally determined user preferences for the use of ASR systems, we examine the applicability of Hofstede’s proposed questionnaire based on one major cultural variable: individualism versus collectivism. We address the following related questions: (a) Does individualism affect or influence how people interact with ASR systems? (b) Does collectivism in any way affect or influence how people interact with ASR systems?

Key Terms in this Chapter

Continuum: This expresses a locus or center as a range of attributes running from one end of a scale (e.g., point A) to the other end (e.g., point G), instead of as one exact or specific point. Any point between A and G can therefore qualify as the center.

Live (Human) Help: These are the people who work as customer service agents in call centers, tasked with helping customers solve problems over the telephone, on the Web, or sometimes in person.

Psycholinguistics: The systematic study of how psychology or aspects of human behavior affect the use of language.

Individualism: This refers to a behavior where a person focuses on themselves and seeks to get things done on their own, because that is how they are wired—the do-it-yourself person who relishes individual accomplishment.

Extrinsic Context: The external environment or conditions that regulate or influence the behavior of an individual or an object.

Culture Context: This refers to the environment or conditions that underlie a behavior or surround an event.

Culture: This refers to a property, object, behavior, language, etc. that is commonly shared by a group of people and that sufficiently distinguishes them from others.

Intrinsic Context: The inner workings of an individual or object that regulate or influence its external behavior.

Collectivism: This describes a behavior in which a person always desires to work in a group or be linked to a group rather than doing things all by themselves.

ASR: Automatic Speech Recognition (ASR) refers to a computer-aided system designed to automate tasks that are traditionally done by a human. In the process, it converts human speech to text, interprets the recognized text, and then speaks a response back to the user.
