Mapping with Monocular Vision in Two Dimensions

Nicolau Leal Werneck (Universidade de São Paulo, Brazil) and Anna Helena Reali Costa (Universidade de São Paulo, Brazil)
Copyright: © 2012 | Pages: 11
DOI: 10.4018/978-1-4666-1574-8.ch022


This article presents the problem of building two-dimensional maps of environments when the only sensor available is a camera used to detect edges crossing a single line of pixels, and motion is restricted to a straight line along the optical axis. The camera position over time must be provided or assumed. Mapping algorithms for these conditions can be built with the landmark parameters estimated from sets of matched detections from multiple images. This article shows how maps that are correct up to scale can be built without knowledge of the camera's intrinsic parameters or its speed during uniform motion, and how an inverse parameterization of the image coordinates turns the mapping problem into the fitting of line segments to a group of points. The resulting technique is a simplified form of visual SLAM that may be better suited for applications such as obstacle detection in mobile robots.
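A hypothetical sketch of the inverse parameterization mentioned above may help make the idea concrete. The pinhole projection model and all numeric values here (focal length, landmark position, camera displacements) are illustrative assumptions, not taken from the chapter: for a camera translating along its optical axis, the inverse of a landmark's image coordinate is linear in the camera displacement, so the detections of one landmark fall on a line segment that can be fitted to recover the landmark parameters.

```python
import numpy as np

# Assumed setup: camera moves along its optical axis; a vertical-edge
# landmark sits at lateral offset x and depth z (arbitrary units).
f = 500.0          # focal length in pixels (assumed value)
x, z = 0.8, 10.0   # ground-truth landmark position (assumed values)

# Camera displacement t along the optical axis at each frame.
t = np.linspace(0.0, 4.0, 9)

# Pinhole projection onto the single sensor line: u = f*x / (z - t).
u = f * x / (z - t)

# Inverse parameterization: 1/u = z/(f*x) - t/(f*x) is linear in t,
# so matched detections of one landmark lie on a straight line segment.
slope, intercept = np.polyfit(t, 1.0 / u, 1)

# Landmark parameters recovered from the fitted line. Note that the
# depth estimate does not depend on f; the lateral offset is only
# recoverable up to scale when f is unknown.
z_est = -intercept / slope
x_est = -1.0 / (f * slope)
print(z_est, x_est)  # ≈ 10.0, 0.8
```

Because the depth falls out of the ratio of the fitted line's coefficients, this toy example also illustrates why the resulting maps can be correct up to scale without knowing the intrinsic parameters.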
Chapter Preview


Estimating the location of a camera and a map of the environment at the same time is a problem that has been approached by researchers from two directions (Strasdat et al., 2010). In Computer Vision it is the Structure from Motion (SFM) problem, and in Mobile Robotics it is a case of Simultaneous Localization and Mapping (SLAM) where the sensor is a camera. Most existing techniques rely on some form of image feature extraction to produce landmark observations, which are further analyzed to estimate the camera track and the landmark positions. This article demonstrates how simple image processing methods, similar to those used in robot navigation and localization, can be used to perform environment mapping. The proposal takes into account restrictions on the environment and motion that lead to simpler processes throughout the mapping system, from the image analysis to the data association and parameter estimation. Tests demonstrate the possibility of applying this technique to obstacle avoidance based on monocular mapping, and also reveal the steps necessary for the development of a more elaborate system for 3D reconstruction of indoor environments.

This research concerns images taken in indoor and other man-made environments, from which long edges can be extracted. The localization of robots in two-dimensional maps under these conditions is a well-studied problem (Borenstein, 1996). The landmarks used are vertical edges, and their positions on the ground plane constitute the map. While localization has been well studied, the problem of building maps using only cameras under these or other restrictions, without the help of metric sensors, has only recently received more attention. Many researchers have studied systems where a stereoscopic apparatus perceives visual landmarks with three-dimensional locations and corresponding visual descriptors to perform visual SLAM (Sünderhauf & Protzel, 2007). These landmarks are usually detected with point-feature extraction algorithms, and the stereoscopic rig is used as a metric sensor.

Stereo rigs are inherently wide, hard to build, and have limited precision. These problems motivate research on monocular SLAM techniques, an area that has grown lately. Recent developments include the use of edge landmarks (Eade & Drummond, 2009), the use of alternative parameterizations (Solà, 2010), and the creation of maps with uncalibrated cameras that are correct up to a scale parameter (Civera et al., 2007). Edge landmarks are important in many application scenarios, especially indoor environments, and the alternative parameterizations make the problem better suited to filtering techniques such as the extended Kalman filter. The creation of scaled or otherwise transformed maps also makes the problem easier to approach and the technique more generally applicable.

The research presented in this article relates to these recent trends. The features extracted from the input images are intended to be edges, and they are detected by searching for peaks in the image gradient along a line orthogonal to their direction. This kind of detection is very simple but has great application potential, and was inspired by an idea from Nourbakhsh (1997). One of its main advantages is that it makes use of information that is frequently ignored by systems that only work with features having low auto-correlation in all directions, a requirement that excludes edges.
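The scanline detection described above can be sketched in a few lines. The intensity values, threshold, and peak criterion below are illustrative assumptions: edges are reported wherever the absolute intensity gradient along a single row of pixels has a local maximum above a threshold.

```python
import numpy as np

# Assumed example: intensity values along one row of pixels, with three
# sharp transitions (the edges to be detected).
row = np.array([10, 10, 11, 80, 82, 81, 20, 19, 19, 60, 61], dtype=float)

# Gradient magnitude along the scanline.
grad = np.abs(np.diff(row))

# A sample is an edge detection if its gradient magnitude exceeds an
# (assumed) threshold and is not smaller than either neighbor.
thresh = 15.0
peaks = [i for i in range(1, len(grad) - 1)
         if grad[i] > thresh
         and grad[i] >= grad[i - 1]
         and grad[i] >= grad[i + 1]]
print(peaks)  # → [2, 5, 8], the three intensity transitions
```

Applied to the row of pixels crossed by vertical edges, detections like these provide the image coordinates that the mapping stage then matches across frames.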
