Natural Human-System Interaction Using Intelligent Conversational Agents
Yacine Benahmed (Université de Moncton, Canada), Sid-Ahmed Selouani (Université de Moncton, Canada) and Habib Hamam (Université de Moncton, Canada)
Copyright: © 2009
In the context of the prodigious growth of network-based information services, messaging and edutainment, we introduce new tools that enable information management through efficient multimodal interaction based on natural language and speech processing. These tools allow the system to respond to near-natural-language queries by means of pattern matching. We present a new approach that gives the system the ability to learn new phrasings of natural language queries from the user. This automatic learning process is initiated when the system encounters an unknown command. It relieves users of the burden of learning a fixed grammar and enables the system to respond better to spontaneous queries. This work investigates how an information system can benefit from conversational agents to substantially decrease the cognitive load of the user. For this purpose, Automated Service Agents and Artificial Intelligence Markup Language (AIML) are used to make the dialogs between users and machines more natural.
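The two mechanisms summarized above, wildcard pattern matching and learning triggered by an unknown command, can be illustrated with a minimal Python sketch. This is not an AIML interpreter and not the authors' implementation: the class `PatternAgent`, its methods, and the action names are hypothetical, and the `*` wildcard only loosely imitates AIML-style patterns.

```python
import re


class PatternAgent:
    """Toy agent: matches utterances against wildcard patterns (hypothetical design)."""

    def __init__(self):
        self.patterns = []  # list of (compiled regex, action name)

    @staticmethod
    def _normalize(text, wildcard=False):
        # Upper-case, replace punctuation with spaces, collapse whitespace.
        # The '*' character is kept only when normalizing a pattern.
        allowed = r"[^A-Z0-9* ]" if wildcard else r"[^A-Z0-9 ]"
        return " ".join(re.sub(allowed, " ", text.upper()).split())

    def add_pattern(self, pattern, action):
        # A '*' token matches one or more words; other tokens match literally.
        tokens = self._normalize(pattern, wildcard=True).split()
        regex = " ".join("(.+)" if t == "*" else re.escape(t) for t in tokens)
        self.patterns.append((re.compile(regex), action))

    def respond(self, utterance):
        # Return (action, captured wildcard text) or None for an unknown command.
        text = self._normalize(utterance)
        for regex, action in self.patterns:
            match = regex.fullmatch(text)
            if match:
                return action, match.groups()
        return None

    def learn(self, new_utterance, known_utterance):
        # Assumed learning step: the user rephrases an unknown command with a
        # known one, and the new phrasing is bound to the same action.
        known = self.respond(known_utterance)
        if known is None:
            return False
        self.add_pattern(new_utterance, known[0])
        return True


agent = PatternAgent()
agent.add_pattern("OPEN *", "open_page")
agent.add_pattern("CHECK MAIL", "check_mail")

if agent.respond("show my mail") is None:      # unknown command
    agent.learn("show my mail", "check mail")  # user supplies a known equivalent
```

After the learning step, `agent.respond("Show my mail?")` resolves to the same action as `"check mail"`, so the user does not have to memorize the original phrasing.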
Spoken dialogue is the most natural and powerful means of interacting with computers. Systems based on natural spoken dialogue are beginning to appear feasible thanks to recent improvements in computer engineering and in speech and language processing. Speech-based interfaces are increasingly penetrating environments that can benefit from hands-free and/or eyes-free operation. In the context of the growth of network-based information services, messaging and edutainment, and of the demand for personalized real-time services, automatic speech recognition and speech synthesis are highly promising. They are considered sufficiently mature to be included as effective modalities in both telephony and multimodal Human-Computer Interaction (HCI) (Deng & Huang, 2004). Several voice applications for Internet searching and navigation already exist. Opera version 9 has a basic voice interface (Opera, 2007); however, its recognition engine is rudimentary and cannot be trained, and its synthesized voice sounds robotic. Google provides this kind of functionality through Google Voice Search, which is still in a demo state. To use it, a user must place a phone call, which is inconvenient (Google, 2006). Another application, developed by Dr. Meirav Taieb-Maimon and colleagues at Ben-Gurion University of the Negev, lets car drivers consult the Internet through voice commands (Sommer, 2005). Lyons et al. (2004) introduced the concept of dual-purpose speech interaction, in which the user's speech also provides meaningful input to the computer for managing calendars and other communication tools. In these applications, the limitations of automatic speech recognition engines are identified as the main obstacle to efficient interaction.