Computer-Controlled Graphical Avatars and Reinforcement Learning

Yuesheng He (Hong Kong Baptist University, Hong Kong) and Yuan Yan Tang (Hong Kong Baptist University, Hong Kong)
DOI: 10.4018/978-1-4666-1806-0.ch018


Controlling graphical avatars intelligently in real-time applications such as 3D simulation environments has become important as the storage and computational power of computers have increased. Such avatars are usually controlled by Finite State Machines (FSMs), in which each state represents a status of the avatar. FSMs are usually designed by hand, so the number of states and transitions is limited. A more sophisticated approach is needed, one in which the avatar’s actions are generated automatically and adapt to different situations. Two elements are essential to meeting this requirement: the levels at which missions are defined, and the algorithms used for control. Reinforcement Learning can be used to control the avatar intelligently in the 3D environment. Simulating the interactions between avatars and changing environments is more difficult than working in a fixed, unchanging situation. A specific framework and methods should be created for controlling the behaviors of avatars, such as using a hierarchical structure to describe their actions. This approach leaves many problems to solve, such as where the levels of the missions should be defined and how the learning algorithm should be used to control the avatars. This chapter discusses these problems.
Chapter Preview


Designing an intelligent 3D avatar (virtual human) is a challenging task for researchers in the areas of 3D graphics and machine learning.

In a 3D graphical animation environment, background 3D objects are also usually controlled by the computer. If the virtual humans' movements are unrealistic because of their poor intelligence, the animator must edit them manually, which incurs substantial extra cost. Traditional techniques such as decision trees and flocking have been used to control such avatars (Norman I. Badler C. B., 1993). However, those techniques can only generate reactive movements; they cannot produce strategic movements that benefit the avatars in the future.

For applications of virtual humans, such as simulating human actions in buildings or cities, or accomplishing a given job in a given environment, this requirement is further amplified by the fact that the user is generally not a skilled engineer and therefore cannot be expected to be able or willing to provide constant, detailed instructions (T. Conde, 2006). Any high-level plan for the virtual human must be built on low-level motion control and must support optimization approaches that could be integrated into the motion control mechanism to simulate walking in different environments. Thus, one of the key issues is to describe the actions using hierarchical reinforcement learning and to use reinforcement learning algorithms to solve the underlying Markov Decision Process (MDP) (Prabhu, 2007) problem.

To this end, an RL framework is proposed for bridging the semantic gap effectively and achieving intelligent avatar behaviors automatically. First, the semantic gap between the low-level computable geometric features and the avatar's real physical actions is partitioned into small fragments, and multiple RL approaches are proposed to bridge these small gaps effectively. Second, hierarchical reinforcement learning algorithms are proposed that incorporate concept ontology and multi-task learning to achieve more complex behaviors, such as accomplishing a whole mission.

Thus, a framework for RL based on the average reward optimality criterion will be presented. Formulations of RL based on the average-reward MDP model, for both discrete time and continuous time, will be investigated.
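As an illustration of the average reward criterion in discrete time, the following is a minimal tabular sketch in the style of R-learning, a standard average-reward RL algorithm. It is not taken from the chapter: the algorithm choice, the two-state toy environment `toy_step`, and all parameter values are assumptions made only for this example. The learner keeps relative action values `q` and an estimate `rho` of the average reward per step.

```python
import random

def r_learning(step, n_states, n_actions, n_steps=20000,
               alpha=0.1, beta=0.01, epsilon=0.1, seed=0):
    """R-learning-style tabular sketch for the average-reward criterion.

    `step(s, a)` is a hypothetical environment function returning
    (next_state, reward). Returns the relative action values and the
    estimated average reward per step (rho).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    rho = 0.0
    s = 0
    for _ in range(n_steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: q[s][i])
        was_greedy = q[s][a] == max(q[s])
        s_next, r = step(s, a)
        # Relative value update: rewards are measured against rho.
        q[s][a] += alpha * (r - rho + max(q[s_next]) - q[s][a])
        # The average-reward estimate is updated only after greedy actions.
        if was_greedy:
            rho += beta * (r - rho + max(q[s_next]) - max(q[s]))
        s = s_next
    return q, rho

def toy_step(s, a):
    """Assumed toy task: action 0 stays, action 1 moves to the other state.
    Only staying in state 1 is rewarded."""
    s_next = s if a == 0 else 1 - s
    r = 1.0 if (s == 1 and a == 0) else 0.0
    return s_next, r

def greedy_policy(q):
    """Greedy action index for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]

q, rho = r_learning(toy_step, n_states=2, n_actions=2)
print(greedy_policy(q), round(rho, 2))
```

In this toy task the best long-run behavior is to move from state 0 to state 1 and then stay there, which earns an average reward of 1 per step; the sketch is only meant to show how the average reward estimate replaces discounting in the update.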

The contents of this chapter will be:

  1. The behavioral features of avatars in the 3D graphical environments;

  2. Efficient algorithms of Reinforcement learning to achieve intelligent control;

  3. A framework of achieving autonomous actions of intelligent avatars;

  4. Relationship between the simple actions and complex behaviors of intelligent avatars.



Intelligent computer-controlled avatars (virtual humans) (Norman I. Badler C. B., 1993) (Anguelov, 2005) aim to provide virtual characters with realistic behaviors, which implies endowing them with autonomy in inhabited virtual environments. Autonomous behavior consists of interacting with users or the environment and reacting to stimuli or events. These reactions are intelligent behaviors that are often produced by the virtual humans themselves.

Reinforcement Learning (RL) (Sutton, 1998) (Theodore J. Perkins, 2002) is an effective way to control avatars in 3D graphical environments in real time. Moreover, hierarchical reinforcement learning (HRL) is a general framework for scaling RL to problems with large state and action spaces by using the task (or action) structure to restrict the space of policies.
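How a task structure restricts the space of policies can be sketched with a small task hierarchy. Everything in this example is hypothetical: the task names, the primitive actions, and the helper functions are assumptions for illustration and do not come from the chapter. Each task may only choose among its listed children, so a policy for a navigation subtask never has to consider actions such as picking up an object.

```python
# Hypothetical task hierarchy: each composite task lists the children it may
# invoke. Names that do not appear as keys are treated as primitive actions.
MOVES = ["step_north", "step_south", "step_east", "step_west"]
TASKS = {
    "deliver_item": ["go_to_item", "pick_up", "go_to_goal", "put_down"],
    "go_to_item": MOVES,
    "go_to_goal": MOVES,
}

def is_primitive(name):
    """A name is primitive if it has no entry in the hierarchy."""
    return name not in TASKS

def allowed_children(task):
    """Choices available to the policy of a composite task."""
    return list(TASKS[task])

def primitives_under(task):
    """Set of primitive actions reachable from a task."""
    if is_primitive(task):
        return {task}
    reachable = set()
    for child in TASKS[task]:
        reachable |= primitives_under(child)
    return reachable

flat_actions = primitives_under("deliver_item")
print(len(flat_actions))                      # size of the flat action set
print(sorted(primitives_under("go_to_item")))  # choices inside one subtask
```

A flat learner would choose among all six primitive actions in every state, whereas the navigation subtasks here choose among four and the root task chooses among four children. This is the sense in which the hierarchy narrows the policies that need to be searched.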

Meanwhile, avatars are essential in computer animation (Norman I. Badler C. B., 1993) (Norman I. Badler J. A., 2002) (T. Conde, 2006), simulation, and games. In many computer games, users control an avatar that interacts with other computer-controlled avatars. The intelligence of the computer-controlled avatars is important because it affects the quality of the animation. At the same time, it provides a strong way to test the effectiveness of RL algorithms.
