Object Recognition Pipeline: Grasping in Domestic Environments
John Alejandro Castro Vargas (University of Alicante, Spain), Alberto Garcia Garcia (University of Alicante, Spain), Sergiu Oprea (University of Alicante, Spain), Sergio Orts Escolano (University of Alicante, Spain) and Jose Garcia Rodriguez (University of Alicante, Spain)
Copyright: © 2019 |Pages: 13
DOI: 10.4018/978-1-5225-8060-7.ch021

Abstract

Object grasping in domestic environments using social robots has enormous potential to help dependent people with a certain degree of disability. In this chapter, the authors use the well-known Pepper social robot to carry out this task. They provide an integrated solution based on ROS to recognize and grasp simple objects. The system was deployed on an accelerator platform (Jetson TX1) so that object recognition could be performed in real time using RGB-D sensors attached to the robot. With this system, the authors show that the Pepper robot has great potential for such domestic assistance tasks.
Chapter Preview

Background

Object recognition is the process by which objects are detected in images, obtaining information about their position and orientation in the scene (Garcia-Garcia et al., 2016). There are several approaches for this purpose, but we emphasize those based on local features, since they are more robust in unstructured environments with occlusions. These techniques extract representative local features from both the scene and the models, and then identify the model objects by matching the extracted features.

The classical pipeline for object recognition based on local features is three-staged (Guo et al., 2014). First, several keypoints are detected in order to extract representative information from the scene. This reduces the computational cost of the pipeline by discarding ambiguous regions that provide little information. Next, the neighborhoods of those keypoints are encoded into descriptors for the training or matching stage (Bronstein et al., 2010). Finally, correspondences between descriptors from the scene and the model objects are obtained. This last stage of the recognition pipeline can be further divided into three steps (Guo et al., 2014): (1) matching or correspondence search, frequently using techniques like Nearest Neighbor (NN) to match the descriptors, (2) hypothesis generation, obtaining a transformation from the object model to the possibly detected object in the scene, and (3) verification, to determine whether the obtained transformation is valid for the model and the hypothesis.
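The correspondence-search step above can be sketched as a brute-force nearest-neighbor matcher over descriptor vectors. The following is a minimal illustration, not the authors' implementation: descriptors are assumed to be fixed-length NumPy arrays, and Lowe's ratio test is used as one common way to reject ambiguous matches.

```python
import numpy as np

def match_descriptors(scene_desc, model_desc, ratio=0.8):
    """Nearest-neighbor correspondence search with a ratio test.

    scene_desc: (N, D) array of descriptors extracted from the scene.
    model_desc: (M, D) array of descriptors from the object model (M >= 2).
    Returns a list of (scene_index, model_index) correspondences.
    """
    matches = []
    for i, d in enumerate(scene_desc):
        # Euclidean distance from this scene descriptor to every model descriptor
        dists = np.linalg.norm(model_desc - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        # Keep the match only if it is clearly better than the second best,
        # which discards ambiguous correspondences
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, nearest))
    return matches
```

The resulting correspondences would then feed the hypothesis-generation step, where a rigid transformation from model to scene is estimated (e.g., with RANSAC) and later verified.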

In the literature, grasping methods are classified into two categories (Bohg et al., 2014): analytical and empirical. In analytical approaches, physical formulations are applied in order to synthesize grasp points (Bicchi and Kumar, 2000). In contrast, empirical approaches allow the system to learn grasps from simulations or from experience with a real robot (Kamon et al., 1996).
