Modeling Binocular and Motion Transparency Processing by Local Center-Surround Interactions

Florian Raudies (Boston University, USA & Center of Excellence for Learning in Education, Science and Technology (CELEST), USA & Center for Computational Neuroscience and Neural Technology (CompNet), USA) and Heiko Neumann (Institute of Neural Information Processing, University of Ulm, Germany)
DOI: 10.4018/978-1-4666-2539-6.ch006

Abstract

Binocular transparency is perceived when two surfaces are seen at the same spatial location but at different depths. Similarly, motion transparency occurs when two surfaces move differently over the same spatial location. Most models of motion or stereo processing incorporate uniqueness assumptions to resolve ambiguities in disparity or motion estimates and thus cannot represent multiple features at the same spatial location. Unlike these previous models, the authors of this chapter suggest a model with local center-surround interactions that operates upon analogs of cell populations in the disparity and velocity domains of the ventral second visual area (V2) and the dorsal middle temporal area (MT) in primates, respectively. These modeled cell populations can encode binocular and motion transparency. Model simulations demonstrate the successful processing of scenes containing opaque and transparent materials, which has not been reported previously. The results suggest that motion and stereo processing both employ local center-surround interactions to resolve the noisy and ambiguous disparity or motion input delivered by initial correlations.
Chapter Preview

Introduction

In real-world situations, transparency and semi-transparency occur about as often as opaque surfaces lying at different depths or moving differently. For instance, reflections in the window of a moving vehicle overlay the background outside the car. Driving with a dirty windshield, where dust on the windshield moves independently of the outside world, creates two layers of independent motion in the same region of the visual field. Semi-transparency also occurs, for instance, when viewing from an elevated position crowds of people moving in streams of opposite directions. Here, through spatial and temporal integration of motion signals in the visual system, two motions are perceived at the same spatial locations of the visual field.

Although transparency and semi-transparency occur in real-world situations, common models of stereo and motion processing typically do not support the processing of transparent surfaces. Instead, these models assume a single computational solution that generates a single depth representation or motion estimate for each spatial location. Thus, the majority of neural models cannot explain the processing of both transparent and opaque surfaces. This motivated the development of a new, generalized model for binocular and motion transparency processing that uses essentially the same computational mechanisms operating upon different feature domains. The proposed mechanisms and their representations are inspired by recent reports of neural populations in different visual areas and their selectivity to orientation, spatial frequency, disparity, and visual motion. Such neural populations can encode multiple stimulus disparities or motions (Treue et al., 2000). Our objectives for the newly proposed model are threefold. First, we suggest that the perception of transparent (as well as opaque) motion and stereo stimuli can be explained by the same mechanism of local center-surround interaction. Second, the mechanisms of our model architecture are in accordance with the general physiology of disparity and motion processing, and the tuning properties of the model's populations coincide with recorded data. Finally, the model is probed with realistic image sequences, and the results demonstrate its successful application to computational vision.
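The core idea can be sketched numerically. The following minimal example (our own illustration, not the chapter's exact equations; all tuning widths, kernel parameters, and noise levels are assumed values chosen for the demonstration) applies a center-surround interaction in the velocity domain of a 1-D population of velocity-tuned cells at one spatial location. Because the competition is local in the feature domain, two well-separated motions survive side by side, so transparency remains representable, whereas a global uniqueness constraint would force a single winner.

```python
import numpy as np

# Hypothetical 1-D population of velocity-tuned cells (tuning centers in deg/s)
# covering a single spatial location.
velocities = np.linspace(-8.0, 8.0, 33)

def gaussian(x, sigma):
    return np.exp(-x ** 2 / (2.0 * sigma ** 2))

# Input: two motions (-3 and +3 deg/s) at the same location, i.e. motion
# transparency, plus noise standing in for ambiguous initial correlations.
rng = np.random.default_rng(0)
activity = gaussian(velocities + 3.0, 1.0) + gaussian(velocities - 3.0, 1.0)
activity += 0.1 * rng.random(velocities.size)

# Center-surround kernel over the velocity domain: narrow excitatory center,
# broader inhibitory surround (a difference of Gaussians). Inhibition acts
# locally in feature space, so distant peaks do not suppress each other.
offsets = velocities[:, None] - velocities[None, :]
kernel = gaussian(offsets, 0.8) - 0.5 * gaussian(offsets, 3.0)

# One interaction step with half-wave rectification.
sharpened = np.maximum(kernel @ activity, 0.0)

# Read out the surviving peaks: both motions remain represented.
is_peak = (sharpened[1:-1] > sharpened[:-2]) & (sharpened[1:-1] > sharpened[2:])
strong = sharpened[1:-1] > 0.5 * sharpened.max()
peaks = velocities[1:-1][is_peak & strong]
print(peaks)
```

After the interaction step, local maxima remain near both -3 and +3 deg/s, while the broadband noise between them is suppressed by the surround inhibition and the rectification.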

Our modeling effort is inspired by the functions and representations of brain areas that contribute mainly to visual processing. For example, in the primary visual area (V1), simple cells are selective to light-dark and dark-light intensity edges, whereas complex cells are sensitive to oriented edges independent of local contrast polarity (Hubel & Wiesel, 1962). Further investigations led to a refinement of models by describing motion and disparity selectivity in primary visual cortex based on specific filtering mechanisms, such as Gabor functions (Priebe et al., 2006; Prince et al., 2002; Ringach, 2002; Cumming & Parker, 1999). These Gabor functions characterize the selectivity of neurons in terms of spatial frequency, size, orientation, and temporal frequency. This selectivity forms the initial feature representation in the primary visual area upon which motion and disparity processing mechanisms, among others, operate. The secondary visual area (V2) is known to encode binocular disparity (Hubel & Wiesel, 1970; Hubel & Livingstone, 1987; Thomas et al., 2002). The middle temporal area (MT) can be seen as a major stage for the integration and segregation of visual image motion, but it also encodes binocular disparity (Rodman & Albright, 1987; Born & Bradley, 2005; Bradley et al., 1995; Maunsell & Van Essen, 1983; Nover et al., 2005; Treue & Andersen, 1996). Taken together, this evidence inspired the biologically plausible model for the processing of binocular and motion transparency presented below.
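To make the Gabor characterization concrete, the sketch below (an illustrative example with arbitrary parameter values, not a fit to recorded neurons or the chapter's specific filters) builds a 2-D spatial Gabor function, a sinusoidal carrier defining preferred orientation and spatial frequency under a Gaussian envelope defining receptive-field size. Pairing even- and odd-symmetric filters and summing their squared responses yields a phase-invariant, complex-cell-like energy measure.

```python
import numpy as np

# 2-D Gabor function of the kind used to model V1 simple-cell receptive
# fields: a cosine grating (carrier) under a Gaussian envelope. Parameters
# (size, wavelength, theta, phase, sigma) are illustrative assumptions.
def gabor(size=21, wavelength=6.0, theta=np.pi / 4, phase=0.0, sigma=4.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate so the carrier grating has orientation theta.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_t / wavelength + phase)
    return envelope * carrier

# Even- and odd-symmetric filters model simple cells of opposite symmetry;
# their squared, summed responses give a phase-invariant energy signal.
even = gabor(phase=0.0)
odd = gabor(phase=np.pi / 2)

patch_pref = gabor(phase=0.3)                        # grating at preferred orientation
patch_orth = gabor(theta=np.pi / 4 + np.pi / 2, phase=0.3)  # orthogonal grating

def energy(patch):
    return (even * patch).sum() ** 2 + (odd * patch).sum() ** 2

e_pref = energy(patch_pref)
e_orth = energy(patch_orth)
```

As expected for an orientation-selective unit, the energy response to the preferred-orientation grating far exceeds the response to the orthogonal one, regardless of the stimulus phase.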
