Robotic Hardware and Software Integration for Changing Human Intentions

Akif Durdu, Ismet Erkmen, Aydan M. Erkmen, Alper Yilmaz
Copyright: © 2014 | Pages: 26
DOI: 10.4018/978-1-4666-4607-0.ch067


Estimating and reshaping human intentions are among the most significant research topics in the field of human-robot interaction. This chapter provides an overview of the intention estimation literature on human-robot interaction and introduces an approach by which robots can voluntarily reshape estimated intentions. The human intention is reshaped by robots moving in certain directions that have been observed a priori from the interactions of humans with the objects in the scene. In one of the few studies on intention reshaping, the authors of this chapter exploit spatial information by learning a Hidden Markov Model (HMM) of motion, which is tailored for intelligent robotic interaction. The algorithmic design consists of two phases. First, the approach detects and tracks the human to estimate the current intention. This information is then used by autonomous robots that interact with the detected human to change the estimated intention. In the tracking and intention estimation phase, postures and locations of the human are monitored by applying low-level video processing methods. In the reshaping phase, learned HMMs are used to reshape the estimated human intention. This two-phase system is tested on video frames taken from a real human-robot environment. The results obtained using the proposed approach show promising performance in reshaping detected intentions.
Chapter Preview


Human-robot interaction, which is generally used for cooperation tasks, requires reliable and full communication capabilities. When the human and the robot do not communicate, cooperation between them can be established by estimating the intention of the human and/or the robot. If the human is able to express his intention clearly for a specific task, then estimating the intention reduces to how the communication is performed and the budget necessary to perform it (K. A. Tahboub, 2006).

As observed in nature and studied in engineering, biomimetic interactions between two intelligent agents require each agent to estimate the intention of the other, which eventually results in either “morphing their actions to that of the other’s intention” or “changing their actions based on a certain strategy to adapt to the other’s intention”. We term this strategic change the reshaping of intentions. Intention reshaping is an emerging area of research, and there are only a handful of studies on plan/goal recognition and intention estimation in recent years. The first study to introduce the reshaping of human intention via robots is by Terada et al. (2007), which was limited to examining the psychological statuses of humans when they interact with robots. As described by Dennett in 1987, these psychological statuses include the physical, design and intentional stances. Our problem design, however, is significantly different. We demonstrate how autonomous robots intelligently estimate the human intention in realistic scenarios and use that estimate to guide their motion and reshape the intention. In particular, intention estimation and recognition problems are characterized by prototypical phases of the defined problem domain. For instance, when the problem is the interpretation of human actions in a video, researchers use classification and learning algorithms trained on a database of known human actions. For example, Miyake et al. (2010) estimate changes in facial expressions in a video to infer human intentions. In a similar vein, other researchers study gesture recognition and action recognition for estimating human intentions (E. Daprati et al., 2007; F. Loula et al., 2005; V. Sevdalis et al., 2009, 2010; J.E. Cutting et al., 1977).

In this study, we adopt Aristotle’s description of vision, “what is where by looking”, and emphasize the two most essential problems: estimation (identifying “what”) and localization (identifying “where”) (O.M.C. Williams, 2005). We realize this by following the conjecture that a human’s intention is reflected in his posture and in the objects in the direction of motion; for example, a person heading to a café will probably drink coffee. In our design, a human walks in a room that contains a set of autonomous robots and other objects, such as a bookshelf, a computer, cameras and a table supporting the experiment. The motion of the human and the interactions are processed and estimated by computers connected to two cameras. Figure 1 schematically illustrates the software and hardware integration. For each captured image, the vision-based inference (VIS) software detects and tracks the human and generates a feature descriptor, which is mapped to a low-dimensional vector representing the characteristics of the human intention. As will be discussed later, this reduced space provides a simple yet powerful representation.

Figure 1. The flow diagram of the prototyping system. A digital image from the camera is acquired and processed to generate a feature vector. This feature vector is then translated to a lower-dimensional descriptor. With no prior model of how images are formed, the vision-based inference system mapping is learned from training data containing example input–output pairs.
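The idea of learning a mapping from example input-output pairs can be illustrated with a minimal sketch. The preview does not state which model the VIS system uses, so everything here is an assumption made for illustration: the mapping is taken to be linear, it is fit by batch gradient descent, and the training pairs are synthetic. The names `project`, `fit_linear_map`, and `W_TRUE` are hypothetical and do not come from the chapter.

```python
import random


def project(W, x):
    """Apply the linear map: return the low-dimensional descriptor W @ x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def fit_linear_map(inputs, targets, lr=0.5, epochs=500):
    """Fit W (k x d) by batch gradient descent on half the mean squared error."""
    n, d, k = len(inputs), len(inputs[0]), len(targets[0])
    W = [[0.0] * d for _ in range(k)]
    for _ in range(epochs):
        # Accumulate the gradient over all training pairs.
        grad = [[0.0] * d for _ in range(k)]
        for x, y in zip(inputs, targets):
            pred = project(W, x)
            for i in range(k):
                err = pred[i] - y[i]
                for j in range(d):
                    grad[i][j] += err * x[j] / n
        # Take one descent step.
        for i in range(k):
            for j in range(d):
                W[i][j] -= lr * grad[i][j]
    return W


# Synthetic training pairs (assumption: real pairs would come from
# labeled video frames; here a hidden matrix W_TRUE generates the targets).
random.seed(0)
W_TRUE = [[0.5, -1.0, 0.0, 2.0],
          [1.5, 0.0, -0.5, 0.0]]
features = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]
descriptors = [project(W_TRUE, x) for x in features]

# Learn the 4-D -> 2-D mapping and apply it to a new feature vector.
W = fit_linear_map(features, descriptors)
print(project(W, [1.0, 0.0, 0.0, 0.0]))
```

Because the synthetic targets are noise-free and the inputs span the feature space, the learned matrix converges to the generating one. With real image features, a nonlinear model and a held-out validation set would likely be needed.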

