Abstract
Many modern museum exhibits employ interactive digital installations that display content on large public surfaces, such as tabletops, walls, and floors. Recently, such displays have begun to include dedicated devices that track the user's position and thus offer a rendering personalized to the user's point of view. While many qualitative evaluations of such systems exist, little effort has been made to define a quantitative testing framework. This is mainly due to the subjective nature of this kind of experience, which makes it difficult to produce objective data with standardized and repeatable procedures. In this chapter, the authors introduce a metric and a practical setup that can be adopted to evaluate a wide range of viewer-dependent displays.
Introduction
Most stereoscopic displays, regardless of their physical implementation, work by showing a different image to each eye of the user. Specifically, these should be exactly the images that would have impressed his retinas had he really been in front of the depicted scene.
In many common 3D applications, such as movies or games, the stereo pair is obtained by using two separate physical cameras or by producing two independent scene renderings as viewed from two slightly different points of view. It is clear that the 3D illusion holds if and only if the optical system of the user exactly replicates the one that produced the images. This ultimately means that the optical centers of the user's eyes must overlap exactly with the optical centers of the original cameras.
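As a minimal illustration of this camera geometry (a sketch of our own, not taken from any of the cited systems; the function name and the 63 mm default are assumptions), the two viewpoints of a rendered stereo pair can be derived from a single reference point:

```python
# Minimal sketch: the two virtual cameras of a stereo pair are obtained by
# offsetting the viewpoint by half the interocular distance along the
# viewer's horizontal axis. Names and defaults are illustrative assumptions.
import numpy as np

def stereo_eye_positions(head_pos, right_axis, ipd=0.063):
    """Return (left_eye, right_eye) world positions for a stereo pair.

    head_pos   -- cyclopean (mid-eye) point of the viewer
    right_axis -- unit vector pointing to the viewer's right
    ipd        -- interocular distance in meters (~63 mm on average)
    """
    head_pos = np.asarray(head_pos, dtype=float)
    offset = 0.5 * ipd * np.asarray(right_axis, dtype=float)
    return head_pos - offset, head_pos + offset

left_eye, right_eye = stereo_eye_positions([0.0, 1.6, 2.0], [1.0, 0.0, 0.0])
```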
Otherwise, the perceived scene will be distorted, as the 3D objects reconstructed by the brain will diverge from the originals in size, position, and proportions (see Fig. 1). Additionally, from a perception point of view, the effect can be further aggravated by the fact that points that would originally project onto intersecting lines of sight will likely fall on skew lines when observed from the wrong point of view. This, in turn, supplies the brain with data that cannot be correctly interpreted in any way, resulting in an unpleasant sensation for most people. These shortcomings (not widely advertised by the entertainment industry) are in fact responsible for the fluctuating quality of the user experience in 3D theaters.
Figure 1. The stereo inconsistency problem addressed by this work. Any stereo pair, when observed from a location different from the position of the capturing cameras, will result in impaired perception. Under these conditions any observer will see an unpredictably distorted 3D object.
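The geometry of Figure 1 can be made concrete with a small numerical experiment (our own construction, not the chapter's code; all coordinates are illustrative assumptions). The on-screen projections are computed for the intended camera positions and then re-observed from a displaced head, reconstructing the perceived point as the closest approach of the two, generally skew, lines of sight:

```python
# Illustrative sketch of the distortion: project a scene point onto the
# screen for the intended eye positions, then reconstruct it from a
# displaced head. All coordinates and names are our own assumptions.
import numpy as np

def ray_screen_hit(eye, point):
    """Intersect the ray from `eye` through `point` with the screen plane z = 0."""
    t = eye[2] / (eye[2] - point[2])
    return eye + t * (point - eye)

def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two (possibly skew) rays."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = o1 - o2
    b, d, e = d1 @ d2, d1 @ w, d2 @ w
    denom = 1.0 - b * b                      # rays assumed non-parallel
    t = (b * e - d) / denom
    s = (e - b * d) / denom
    return 0.5 * ((o1 + t * d1) + (o2 + s * d2))

ipd = 0.063
true_point = np.array([0.0, 1.5, -2.0])      # scene point, 2 m behind the screen
cam_l = np.array([-ipd / 2, 1.6, 2.0])       # intended capture positions
cam_r = np.array([ ipd / 2, 1.6, 2.0])
scr_l = ray_screen_hit(cam_l, true_point)    # fixed on-screen projections
scr_r = ray_screen_hit(cam_r, true_point)

shift = np.array([0.4, 0.0, 0.0])            # observer standing 40 cm off-axis
eye_l, eye_r = cam_l + shift, cam_r + shift
perceived = closest_point_between_rays(eye_l, scr_l - eye_l, eye_r, scr_r - eye_r)
print(true_point, "->", perceived)           # point appears 0.4 m off in x
```

In this configuration a 40 cm lateral displacement makes the point appear 40 cm away from its true position, in the direction opposite to the head movement; head poses that tilt the eye baseline additionally produce genuinely skew lines of sight, which is the uninterpretable stimulus described above.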
The only viable solution to this limitation is to provide a rendering that depends on the user's position and interocular distance. Of course, such an approach hinders the ability to offer a shared experience to several users without resorting to personal displays.
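In practice, this kind of viewer-dependent rendering is usually implemented with the off-axis (asymmetric) frustum construction adopted by CAVE-like systems, where the tracked eye is the center of projection and the physical screen rectangle is the image plane. The following is a hedged sketch of that standard construction; the function name, corner layout, and near value are our assumptions:

```python
# Hedged sketch of the standard off-axis frustum for a viewer-dependent
# display: the tracked eye is the center of projection and the physical
# screen rectangle is the image plane, yielding an asymmetric frustum.
import numpy as np

def off_axis_frustum(eye, screen_ll, screen_lr, screen_ul, near=0.1):
    """Asymmetric frustum extents (left, right, bottom, top) at distance `near`.

    eye       -- tracked eye position in world coordinates
    screen_ll -- lower-left corner of the physical screen
    screen_lr -- lower-right corner
    screen_ul -- upper-left corner
    """
    eye = np.asarray(eye, dtype=float)
    ll, lr, ul = (np.asarray(p, dtype=float)
                  for p in (screen_ll, screen_lr, screen_ul))
    vr = lr - ll; vr /= np.linalg.norm(vr)   # screen right axis
    vu = ul - ll; vu /= np.linalg.norm(vu)   # screen up axis
    vn = np.cross(vr, vu)                    # screen normal, towards the viewer
    dist = -(vn @ (ll - eye))                # eye-to-screen distance
    scale = near / dist
    return (vr @ (ll - eye) * scale,         # left
            vr @ (lr - eye) * scale,         # right
            vu @ (ll - eye) * scale,         # bottom
            vu @ (ul - eye) * scale)         # top

# Example: a 2 m x 1.2 m wall display on the z = 0 plane, viewer off to the right.
l, r, b, t = off_axis_frustum([0.2, 1.6, 1.5],
                              [-1.0, 0.8, 0.0], [1.0, 0.8, 0.0], [-1.0, 2.0, 0.0])
```

Recomputed every frame from the tracked eye position (once per eye on a stereoscopic display), the returned extents can be handed to glFrustum or an equivalent projection-matrix builder; this per-frame update is what keeps the rendered object anchored to physical space as the viewer moves.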
Nevertheless, viewer-dependent displays have been extensively proposed in the recent scientific literature, since they offer many other advantages. For starters, they can guarantee that the viewed objects exhibit a correct size within the Euclidean space where the user resides, thus allowing him to interact with them naturally and to make meaningful comparisons between virtual and physical prototypes. Moreover, viewer-dependent rendering lets the user walk around the scene, viewing it from different angles and enabling the same inspection dynamic that would be possible with a real object. Such ideas are not new at all and have been widely developed in the literature since their early implementations in the first immersive virtual reality and CAVE environments (Deering, 1992; Cruz-Neira, Sandin, & DeFanti, 1993).
More recently, Harish and Narayanan (2009) combined several independent monitors arranged as a polyhedron to create a multiple-angle display, with a fiducial marker system to track the user's pose. In their system the object is visualized as if it were inside the solid space defined by the monitors. Garstka and Peters (2011) used a single planar surface to display non-stereoscopic content according to the pose of the user's head, obtained with a Kinect sensor.
A combination of Kinect devices and traditional range scanners has been adopted in a very similar approach by Pierard, Pierlot, Lejeune and Van Droogenbroeck (2012). It should be noted that, albeit implementing view-dependent solutions, the aforementioned approaches do not exploit stereoscopy. In fact, their primary goal was to enable the user to walk around the object rather than to offer realistic depth perception.
Stereo vision is exploited, for instance, by Hoang, Hoang and Kim (2013), who used standard head tracking techniques to allow slight head movements when looking at a 3D scene on a monitor. The concept is very similar to the non-stereoscopic technique proposed a few years earlier by Buchanan and Green (2008). In these cases, while the correct projection is always offered to the user, he is not allowed to inspect the object by moving around it.
Key Terms in this Chapter
Interactive Multimedia: An interaction paradigm in which multimedia content is displayed in response to user input.
Multimedia Technology: Technology that relates to the reproduction of multimedia content, such as images, videos, and audio files.
Interactive Tables: A table or surface that is able to display content and accepts user input by means of some interaction paradigm.
Tracking System: A system composed of hardware and software capable of tracking an object in 2D/3D space.
Machine Learning: A branch of artificial intelligence concerned with the construction and study of systems that can learn from acquired data.
Mobile Devices: Small, handheld computing devices, typically featuring a display screen with touch input and/or a miniature keyboard (e.g., smartphones, tablets).