A Survey of 3D Rigid Registration Methods for RGB-D Cameras


Vicente Morell-Gimenez, Marcelo Saval-Calvo, Victor Villena-Martinez, Jorge Azorin-Lopez, Jose Garcia-Rodriguez, Miguel Cazorla, Sergio Orts-Escolano, Andres Fuster-Guillo
Copyright: © 2018 | Pages: 25
DOI: 10.4018/978-1-5225-5628-2.ch004

Abstract

Registration of multiple 3D data sets is a fundamental problem in many areas. Many research efforts and applications use low-cost RGB-D sensors for 3D data acquisition. In general terms, the registration problem consists in finding the transformation between two coordinate systems that best aligns the point sets. In order to review and describe the state of the art of rigid registration approaches, the authors classify methods into coarse and fine. Given the wide variety of methods, the study is restricted to registration techniques that can use RGB-D sensors in static scenarios. This chapter covers most of the aspects to consider when a registration technique has to be used with RGB-D sensors. Moreover, in order to establish a taxonomy of the different methods, the authors classify them according to a set of characteristics. As a result, they present a classification that aims to serve as a guide to help researchers and practitioners select a method based on the requirements of a specific registration problem.

Introduction

Registration of multiple 3D data sets is a fundamental problem in many areas such as computer vision, medical imaging (Yang et al., 2013), object reconstruction (Pottmann et al., 2002), mobile robotics (Tamas and Goron, 2012), and augmented reality (Duan et al., 2009). Many research efforts and applications in these areas use low-cost RGB-D sensors for 3D data acquisition.

In general terms, the registration problem consists in finding the transformation between two coordinate systems that best aligns the point sets. In computer vision, this is typically addressed by finding (commonly with iterative techniques) the transformation that minimizes the distance between two data sets that share an overlapping region. In this context, low-cost RGB-D cameras (e.g. Microsoft Kinect, Primesense Carmine) provide both depth and color data simultaneously at a good frame rate, which fits the requirements of a wide range of applications. Several reviews related to the registration problem can be found in the literature. In (Zitová and Flusser, 2003) a complete color image registration survey is presented. Tam et al. (2013) surveyed registration methods for rigid and non-rigid point clouds and meshes. In (Rusinkiewicz and Levoy, 2001) a comparison among different ICP (Iterative Closest Point) variants is presented, while (Pomerleau et al., 2013) proposes a similar study with real-world data sets.
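Formally, for the rigid case with point-to-point correspondences, the problem is usually written as the least-squares objective below (a standard formulation, not notation taken from the chapter):

\[
(\mathbf{R}^{*}, \mathbf{t}^{*}) = \arg\min_{\mathbf{R} \in SO(3),\ \mathbf{t} \in \mathbb{R}^{3}} \sum_{i=1}^{N} \left\lVert \mathbf{R}\,\mathbf{p}_{i} + \mathbf{t} - \mathbf{q}_{i} \right\rVert^{2}
\]

where p_i and q_i are corresponding points in the overlapping region of the two sets. ICP-style methods alternate between re-estimating the correspondences (closest points) and solving this minimization in closed form, e.g. via SVD. A minimal point-to-point ICP sketch along these lines, assuming two roughly pre-aligned Nx3 NumPy point sets, is shown next; function names and parameters are illustrative, not taken from any specific surveyed method.

# Minimal point-to-point ICP sketch (illustrative only).
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Closed-form least-squares rigid alignment (SVD) of matched point sets."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # avoid reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(source, target, iterations=30):
    """Alternate closest-point matching and closed-form rigid alignment."""
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    current = source.copy()
    for _ in range(iterations):
        _, idx = tree.query(current)                    # closest-point correspondences
        R, t = best_rigid_transform(current, target[idx])
        current = current @ R.T + t                     # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t # accumulate it
    return R_total, t_total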

In this chapter, we focus on a review and classification of the state-of-the-art rigid registration methods for RGB-D sensors. Therefore, our main contributions are:

  • A study of the most used approaches to register data obtained from RGB-D sensors.

  • A classification and presentation of the different proposals based on a set of characteristics.

Different techniques can be used to estimate 3D information from the real world: 3D lasers, stereo cameras, time-of-flight cameras, RGB-D cameras, etcetera. Each kind of sensor has advantages for specific purposes, but many of them are used in a wide range of problems. Some 3D laser systems do not provide color information, so algorithms that need visual features are not suitable. Other 3D laser systems provide color (using different approaches to incorporate color into the depth information), but their cost is prohibitive. Stereo cameras suffer from the lack of texture: image areas without texture do not provide depth information. The visual information of time-of-flight cameras, such as the SR4000, is infrared; it is affected by natural light and is normally noisy. In our previous work (Cazorla et al., 2010) we carried out experiments using the SIFT visual feature method (Lowe, 2004) with this kind of camera; since the SR4000 camera provides noisy images, the repeatability of the SIFT features was low.

Low-cost RGB-D sensors (we will refer to the Kinect as an RGB-D sensor) provide both color and depth information at about 15 frames per second using structured light. They are composed of three sensors: an IR (infrared) projector, an IR CMOS camera, and an RGB camera. The IR pair provides the depth information: the IR projector sends out a fixed pattern of bright and dark speckles, and depth is calculated by triangulation against a known pattern from the projector. The pattern is memorized at a known depth; then, for each pixel, a correlation between this reference pattern and the current pattern is computed, providing the current depth at that pixel. The Kinect camera has a resolution of 640 x 480 (307,200 pixels) and works in a range between 1 and 8 meters approximately. A detailed analysis of the accuracy and resolution of this camera can be found in (Khoshelham and Elberink, 2012) and (Lee et al., 2012). Other cameras based on the same RGB-D sensor are the Primesense Carmine and the Asus Xtion.
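As an illustration of how such a sensor's output is turned into the point sets that registration methods consume, the following sketch back-projects a depth map into a 3D point cloud using the standard pinhole model. It is a minimal example of ours; the intrinsic parameters shown are typical approximate values for a Kinect-class camera, not calibrated values from the chapter.

# Minimal sketch (ours): back-project a Kinect-style depth map into a 3D point cloud.
# The intrinsics below are approximate, illustrative values, not calibrated ones.
import numpy as np

def depth_to_point_cloud(depth_m, fx=580.0, fy=580.0, cx=319.5, cy=239.5):
    """Convert a 480x640 depth image (in meters) to an Nx3 array of 3D points."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    valid = z > 0                      # pixels without a depth reading are discarded
    x = (u - cx) * z / fx              # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)

Two such point clouds, obtained from different viewpoints of the same static scene, form the input to the coarse and fine registration methods classified in the remainder of the chapter.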
