A Modular Framework for Vision-Based Human Computer Interaction

Giancarlo Iannizzotto (University of Messina, Italy) and Francesco La Rosa (University of Messina, Italy)
Copyright: © 2013 |Pages: 22
DOI: 10.4018/978-1-4666-3994-2.ch060

This chapter introduces the VirtualBoard framework for building vision-based Perceptual User Interfaces (PUIs). While most vision-based Human Computer Interaction applications developed over the last decade focus on the technological aspects of image processing and computer vision, our main effort is directed towards ease and naturalness of use, ease of integration and compatibility with existing systems and software, portability, and efficiency. VirtualBoard is based on a modular architecture that allows the implementation of several classes of gestural and vision-based human-computer interaction approaches: it is extensible and portable and requires relatively few computational resources, which also helps reduce energy consumption and hardware costs. Particular attention is also devoted to robustness to environmental conditions (such as illumination and noise level). We believe that current technologies can easily support vision-based PUIs and that PUIs are strongly needed by modern applications. With the exception of the gaming industry, where vision-based PUIs are already being intensively studied and in some cases exploited, more effort is needed to merge the knowledge of the HCI and computer vision communities to develop realistic and industrially appealing products. This work is intended as a stimulus in this direction.
Chapter Preview


Computer technology is rapidly saturating all aspects of our life and producing an epochal shift towards digital information management and communication; nevertheless, in many offices, laboratories and meeting rooms, when it’s time for creativity, people prefer to share their ideas and present their theses by using traditional media – felt pens on whiteboards, paper sheets and even combinations of everyday-life objects like ashtrays or pencil holders, mugs, and so on, usually placed and moved on a desk to represent a graph or a schema and its evolution.

This choice is so common due to the prompt availability of such objects and to the naturalness and immediacy of interaction provided by physical objects. Unfortunately, the information generated during such interactions cannot be easily recorded in digital format and can easily be lost.

Perceptual User Interfaces (PUIs) adopt alternate sensing modalities to replace or complement traditional mouse and keyboard input: specialized devices are exploited to provide the user with alternate interaction channels, such as speech, hand gesture, eye-gaze tracking and even face or full-body gesture. Often those devices resemble everyday-life objects, both in shape and in usage. The computer can therefore be hidden, disappearing in the environment (Want, Borriello, Pering, & Farkas, 2002).

A number of points should be considered when building a new PUI:

  • When collaborative use of the media is required, it might be necessary to discriminate between two or more users interacting with each medium. Moreover, when a user moves from one medium to another (for example, from one desk to another or to a whiteboard), the user interface of the first medium must be able to understand that the user has gone, and the user interface of the second medium must be able to detect that the user is approaching.

  • Even when invisibility is not strictly required, the user interface should be as unobtrusive as possible. Wires, gloves, heavy or large head-mounted displays, and any user-perceivable sensors that make the user feel clumsy must be avoided. As Norman points out (Norman, 1998), the aim of the user interface of a computerized system should be to let the user concentrate on the task at hand and forget about the tool being exploited (the computer).

  • Initiating an interaction session (engagement) with a user interface should be natural and fast. For example, initiating an interaction with a mouse only requires the user to pick up the device, while initiating an interaction with a sensorized glove requires the user to put on the glove (which takes more time and effort). Closing a session (disengagement) should be equally easy and straightforward.

  • The user-perceived predictability of the system is a main issue: a lack of predictability in responding to user input rapidly leads to user frustration. A good level of predictability requires adequate feedback, as the user needs to know whether the interface correctly decoded a command, whether the command was executed or an error occurred, and even whether the command was received at all. For the system to be predictable, every user input must always have a corresponding response: if a user input does not have a matching response, i.e. it is an unexpected input, then an adequate improvement in sensing accuracy and, if necessary, error detection, correction, or recovery processes must be triggered (Bellotti et al., 2002). There must be no undefined states, and each response must be perceived as coherent and intuitive by the user.

  • The novel user interface should also support, at an initial stage (i.e. until the alternative interaction paradigm is fully accepted by software application developers), legacy WIMP (Windows, Icons, Menus, Pointer) applications, without forcing the user to revert to mouse and keyboard.

  • For a user interface to be of interest to industry, it should be reasonably cheap. This does not mean, of course, that we can expect an innovative perceptual interface to be as cheap as a mouse or a keyboard (at least in the initial phases of its commercial launch). But it should not cost as much as, or more than, the computer it connects to.
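The predictability and engagement/disengagement requirements above can be made concrete with a small sketch. The following is a hypothetical session manager, not part of VirtualBoard: the state names, event strings, and feedback messages are assumptions chosen for illustration. It shows the two properties the list insists on, that engaging and disengaging are single fast events, and that every input (even an unrecognized one) produces an explicit response, leaving no undefined states.

```python
from enum import Enum, auto


class State(Enum):
    DISENGAGED = auto()
    ENGAGED = auto()


class PUISession:
    """Toy PUI session manager: every input yields explicit feedback,
    and engagement/disengagement are single, fast events."""

    def __init__(self):
        self.state = State.DISENGAGED

    def handle(self, event: str) -> str:
        # Engagement: detecting the approaching user starts the session.
        if self.state is State.DISENGAGED:
            if event == "user_detected":
                self.state = State.ENGAGED
                return "feedback: session started"
            return "feedback: ignored (no active session)"
        # Disengagement: the user leaving the sensed area ends the session.
        if event == "user_left":
            self.state = State.DISENGAGED
            return "feedback: session ended"
        # Recognized commands are acknowledged explicitly.
        if event.startswith("gesture:"):
            return f"feedback: executed {event[len('gesture:'):]}"
        # No undefined states: unexpected input still gets a response,
        # which can trigger error recovery upstream.
        return "feedback: unrecognized input, please retry"


session = PUISession()
print(session.handle("user_detected"))  # feedback: session started
print(session.handle("gesture:click"))  # feedback: executed click
print(session.handle("wave???"))        # feedback: unrecognized input, please retry
print(session.handle("user_left"))      # feedback: session ended
```

In a real vision-based interface the event strings would of course be replaced by the output of gesture-recognition modules, but the same invariant applies: one response per input, and a clearly perceivable transition in and out of the session.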
