Developing Enculturated Agents: Pitfalls and Strategies

Matthias Rehm (Aalborg University, Denmark)
DOI: 10.4018/978-1-61520-883-8.ch016
Embodied Conversational Agents (ECAs) are complex multimodal systems with rich verbal and nonverbal repertoires. Their human-like appearance raises high expectations on the user's side regarding natural communicative behavior. But what is regarded as “natural” depends to a large degree on our cultural profiles, which provide us with heuristics of behavior and interpretation. Integrating cultural aspects of communicative behavior into virtual agents, and thereby enculturating such systems, therefore seems inevitable. But culture is a domain with many competing definitions, so a number of pitfalls arise that have to be avoided in this endeavor. This chapter presents some of the pitfalls in enculturating interactive systems and strategies for avoiding them in relation to the standard development process of Embodied Conversational Agents.
Enculturated Agents: A Definition

This chapter argues that Embodied Conversational Agents (ECAs) (Cassell, Sullivan, Prevost & Churchill, 2000) are prototypical devices for enculturating the human-computer interface. It examines the standard development process for ECA systems and, at each step, discusses strategies to avoid the pitfalls that arise from integrating culture as a computational parameter into the process. The chapter does not argue for or against specific cultural theories, but relies on Hofstede’s (2001) dimensional theory of culture as a widely used example.

Embodied Conversational Agents can be regarded as a special case of multimodal dynamic interactive systems (see Figure 1 for some examples). They promote the idea that humans, rather than interacting with tools, prefer to interact with an artifact that possesses some human-like qualities. If it is true, as Reeves and Nass’ (1996) media equation suggests, that people respond to computers as if they were humans, then there is a good chance that people are also willing to form social relationships with virtual agents. As a consequence, it seems inevitable to take cultural aspects into account when creating such agents. Due to their embodiment, agents are complex multimodal systems with rich verbal and nonverbal repertoires. Additionally, the appearance of the agent might play an important role when taking cultural aspects into account.

Figure 1. Examples of Embodied Conversational Agents. Top row: affective spectators (Damian, Janowski & Sollfrank, 2009), an autonomous bot in Second Life (Rehm & Rosina, 2008), the Gamble multiuser dice game (Rehm, 2008), interacting with virtual dancers (Rehm, Vogt, Bee & Wissner, 2008). Bottom row: collaborating agents in edutainment (Rehm, André, Conradi, Hammer, Iversen, Lösch, Pajonk & Stamm, 2006), a virtual tourist guide, the FearNot! anti-bullying system (Hall, Woods, Aylett, Newall & Paiva, 2005).

Embodied Conversational Agents as an interface metaphor have great potential to realize cultural aspects of behavior in several fields of human-computer interaction:

  1. Information presentation: By adapting their communication style to the culturally dominant persuasion strategy, agents become more efficient in delivering information or selling a point or a product.

  2. Entertainment: Endowing characters in games with their own cultural background has two advantages. It makes the game more entertaining (i) by providing coherent behavior modifications based on the cultural background and (ii) by letting the characters react in a believable way to behavior of other agents and the user that seems weird to them.

  3. Education: For educational purposes, experience-based role-plays become possible, e.g. for increasing cultural awareness of users or for augmenting the standard language textbook with behavioral learning.

The following issues for enculturating Embodied Conversational Agents are discussed in this chapter:

  1. Enculturating agents opens up a challenging research field because culture penetrates most of the above-mentioned features of an agent (verbal and nonverbal behavior, appearance). Thus, enculturating such a system has to rely on a solid theoretical framework that is able to describe or even predict these influences.

  2. Another critical issue that has to be discussed but is not easily solved is the following: apart from a specific cultural theory, different levels of culture have to be considered, such as national culture, regional culture, and the culture of the agents vs. the culture of the developer.

  3. Moreover, the developers’ own cultural background provides them with implicit design heuristics for the system, which have to be challenged actively at every step of the process.

Key Terms in this Chapter

Cultural Heuristics: Our cultural backgrounds largely determine how we interpret interactions with others, which aspects we find relevant, and what kind of behavior is deemed annoying or insulting. We use the term cultural heuristics to denote such behavioral patterns related to cultural backgrounds.

Enculturating Interactive Systems: The challenge of integrating cultural aspects of human interaction into an interactive system. These cultural aspects include those mentioned in this chapter, such as proxemics, gaze behavior, and appearance.

Embodied Conversational Agent (ECA): A virtual character that serves as a communication partner for the user. Apart from verbal interactions, the embodiment allows realizing non-verbal interaction channels like gaze, facial expressions, or gestures. Interaction modeling concentrates on the communicative functions of verbal and non-verbal behavior.

Multimodal Interaction: The use of more than one input and output channel is called multimodal interaction, e.g. speech and gestures for input and text and sound for output.

Cultural Dimensions: A prominent cultural theory by Hofstede (2001) that defines culture as a concept consisting of five value dimensions. For a concrete culture, a value can be given for each dimension. The dimensions are hierarchy, gender, identity, uncertainty avoidance, and orientation.
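
To illustrate how a dimensional theory can turn culture into a computational parameter, the following minimal Python sketch stores one value per dimension and compares two profiles. It is only an illustration under stated assumptions: the class name CultureProfile, the function profile_distance, the use of plain numeric scores, and the mean absolute difference as a comparison measure are all hypothetical and not taken from the chapter or from Hofstede (2001). The two profiles use placeholder numbers, not published scores for any real culture.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CultureProfile:
    """One numeric value per dimension (hypothetical representation)."""
    hierarchy: float
    gender: float
    identity: float
    uncertainty_avoidance: float
    orientation: float


def profile_distance(a: CultureProfile, b: CultureProfile) -> float:
    """Mean absolute difference across the five dimensions.

    This measure is an assumption chosen for simplicity; it is not
    prescribed by the dimensional theory itself.
    """
    diffs = [abs(getattr(a, f.name) - getattr(b, f.name)) for f in fields(a)]
    return sum(diffs) / len(diffs)


# Placeholder values for two made-up cultures (not real survey data).
culture_a = CultureProfile(30, 60, 70, 40, 50)
culture_b = CultureProfile(80, 40, 20, 90, 50)

print(profile_distance(culture_a, culture_b))
```

An agent architecture could consult such a profile when selecting behavior parameters, but how dimension values map to concrete behavior is exactly the kind of modeling decision that needs empirical grounding.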

Proxemics: Spatial behavior in face-to-face interactions has been termed proxemics by Hall (1966). He distinguished different spatial areas like personal and social that are linked to different routine behavior.

Multimodal Corpus: A multimodal corpus is a collection of video data that is annotated along the timeline in order to code information in the video, such as gestural expressivity, emotions, or dialogue functions. The annotation serves as an empirical foundation for modeling the behavior of an agent.
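
As a rough illustration of timeline annotation, the sketch below represents each annotation as a labeled time interval on a named track and offers two simple queries. The names Annotation, annotations_at, and total_duration, the track names, and the sample labels are hypothetical and chosen only for this example; real annotation tools use their own formats.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    track: str    # coding layer, e.g. "gesture", "emotion" (assumed names)
    start: float  # seconds from the beginning of the video
    end: float    # seconds; the interval is treated as [start, end)
    label: str    # the coded value for this interval


def annotations_at(corpus, t):
    """Return all annotations whose interval contains time t."""
    return [a for a in corpus if a.start <= t < a.end]


def total_duration(corpus, track, label):
    """Sum the duration of all intervals with the given track and label."""
    return sum(a.end - a.start for a in corpus
               if a.track == track and a.label == label)


# Made-up sample annotations for one short video clip.
corpus = [
    Annotation("gesture", 0.0, 2.0, "high_expressivity"),
    Annotation("emotion", 1.0, 3.0, "joy"),
    Annotation("dialogue_function", 2.5, 4.0, "greeting"),
]

print([a.label for a in annotations_at(corpus, 1.5)])
print(total_duration(corpus, "emotion", "joy"))
```

Aggregates such as the total duration of a label are one simple way in which annotations could feed into a behavior model for an agent.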

