Designing ECAs to Improve Robustness of Human-Machine Dialogue

Beatriz López Mencía (Universidad Politécnica de Madrid, Spain), David D. Pardo (Universidad Politécnica de Madrid, Spain), Alvaro Hernández Trapote (Universidad Politécnica de Madrid, Spain) and Luis A. Hernández Gómez (Universidad Politécnica de Madrid, Spain)
DOI: 10.4018/978-1-60960-617-6.ch003

One of the major challenges for dialogue systems deployed in commercial applications is to improve robustness when common low-level problems related to speech recognition occur. We first discuss this important family of interaction problems, and then the features of non-verbal, visual communication that Embodied Conversational Agents (ECAs) bring ‘into the picture’ and that may be tapped to improve spoken dialogue robustness and the general smoothness and efficiency of the interaction between human and machine. Our approach is centred around the information provided by ECAs. We deal with all stages of the conversation system development process, from scenario description to gesture design and evaluation with comparative user tests. We conclude that ECAs can help improve both the robustness of a dialogue system and the users’ subjective experience with it. However, they may also make users more demanding and intensify privacy and security concerns.
Chapter Preview


Spoken Language Dialogue Systems (SLDSs) are being introduced into the user interfaces of an ever-growing range of devices and applications. This is happening in spite of pervasive problems, familiar to almost anyone who has used these systems, that result in low-quality, inefficient communication or even its complete breakdown.

A widely held view among those interested in improving the users’ experience with SLDSs is that Embodied Conversational Agents (ECAs) can benefit the communication process by fostering a more effective, affective and user-friendly interaction (Foster, 2007). Adequate, broadly accepted cognitive and operational models explaining how humans combine speech and gesture have not yet been found (Kopp et al., 2008). Nevertheless, there is growing empirical evidence that ECAs provide a supra-linguistic communication channel that, through gestures and other visual cues, may be used to add semantic richness, improve intelligibility by adding redundancy, and display attitudes and moods that open the scope for emotional and empathic dialogue strategies (Buisine et al., 2004; Cassell et al., 2000b; Picard, 2003).

As regards dialogue design, ECAs in research have been approached mainly from linguistic and social interaction perspectives, and from Artificial Intelligence-related fields (a snapshot of efforts containing elements along these lines was offered in the Functional Markup Language workshop at the AAMAS conference in 2008, from which the example references in this paragraph have been taken). The latter (AI) approaches deal with constructing appropriate, “intelligent” responses to users or other agents in a particular context of interaction (as in the ICT virtual human project by Lee et al., 2008). The former are concerned with a variety of elements in a hierarchy of linguistic and, more generally, conversational functions (Heylen & ter Maat, 2008), and they tend to focus primarily on high-level aspects such as interpersonal stance (Bickmore, 2008) and social relations (see, e.g., Samtani et al., 2008).

Surprisingly, however, relatively little attention has been given to studying how ECAs can affect dialogue when miscommunication occurs at the lower yet most common levels, and in particular at the speech recognition level. In this chapter we present a relatively simple dialogue system we have built for the purpose of identifying a set of basic dialogue situations that either arise when problems occur (for instance, a no-input or a no-match) or are prone to lead to interaction problems (e.g., turn-taking), and we show the process of designing ECA behaviour for each of these situations.

Our goal is to illustrate how ECAs can help in these low-level interaction problem cases, and also to show the overall design, validation and user-centred testing process as a common-sense example of the general approach we believe can be applied to interaction problems higher in the hierarchy. As we shall see, some of the strategies involve an affective element, introduced for two reasons: to influence the users’ emotions during the interaction so as to obtain a desired result, such as the success of an error recovery strategy, and, ultimately, to try to improve the users’ subjective opinion of the system, even (and especially) when problems occur, recovery strategies do not work well, and the task is accomplished inefficiently or fails altogether.

The chapter is structured as follows. We begin with background from the literature on embodied conversational systems, robustness issues in dialogue systems, and how ECAs can help. We then enter the main body of the chapter, a case study on ECA behaviour design applied to improving (mostly ASR-related) robustness in SLDSs. First we present a guiding application and interaction scenario and explain the behavioural strategies devised for the ECAs in that scenario. Next, we describe the user tests we performed to compare interaction performance and the users’ subjective experience between a version of the system with ECAs and a version with speech output only, and we present and discuss a few of the more interesting results. We then offer some thoughts on current and desirable future research directions. The concluding section summarises and highlights the main points made in the chapter.
