Perceiving the World With Sound: An Overview to Robot Audition

Usama Saqib, Robin Kerstens
Copyright: © 2023 |Pages: 30
DOI: 10.4018/978-1-6684-5381-0.ch003

Abstract

Robot perception is the ability of a robotic platform to perceive its environment by means of sensor inputs, e.g., laser, IMU, motor encoders, and so on. Much like humans, robots are not limited to perceiving their environment through vision-based sensors, e.g., cameras. Robot perception, within the scope of this chapter, encompasses acoustic signal processing techniques that locate a sound source, e.g., a human speaker, within an environment for human-robot interaction (HRI), a topic that has gained great interest within the scientific community. This chapter serves as an introduction to acoustic signal processing within robotics, starting with passive acoustic localization and building up to contemporary active sensing methods, such as the use of neural networks and spatial map generation. The origins of active acoustic localization, which finds its roots in biomimetics, are also discussed.

Introduction

The detection and localization of acoustic reflectors such as walls, objects, or people within an environment is a popular topic within robotics. Traditionally, camera- and laser-based technologies are used to detect these landmarks, generate a spatial map of a 3D space, and aid robotic platforms in navigating their environment. However, these light-based sensing modalities often face challenges such as a lack of light, overexposure (glare), an inability to detect transparent surfaces such as windows, false reflections, and sensitivity to occlusion. These issues can be addressed by incorporating sound-based sensing modalities. Research into animal auditory systems has inspired researchers to develop technologies that locate sound sources within an environment.

This chapter will serve as an introduction to robot audition, starting with its subdomains and building up to more contemporary active sensing methods, such as applying Artificial Intelligence (AI) to recorded data to detect and track acoustic sources and to generate spatial maps. The origins of active acoustic sensing, which finds its roots in biomimetics, are also discussed. The chapter is written as a reference for people working on robot perception using sound and aims to contribute to future work by bringing new challenges to the field. It will begin with an introduction to biomimicry in robotics, which seeks to mimic an animal's auditory system to localize sound sources in the nearby environment. Biomimicry facilitates intelligent robot designs that achieve high performance and robustness when navigating between, and localizing, acoustic sources in a dynamic environment. Designers of such robots make use of new materials, sensors, and actuators that allow robots to mimic biological processes such as hearing.

Furthermore, this chapter will review techniques in the scientific literature associated with passive acoustic localization and active acoustic localization, the two important sub-domains of robot audition for sound source localization (SSL). Passive acoustic localization involves detecting sound generated by objects present in an environment, while active acoustic localization techniques probe an environment with a known sound to detect the position of objects within it. Both sub-domains have their advantages and disadvantages. For example, active acoustic localization is useful in a quiet environment, which is normally the case when a robot explores an underground environment such as caves, tunnels, and sewers. Bats, rats, and even some aquatic mammals are known to use these techniques to navigate and hunt in complete darkness. These animals probe the environment with a unique sound, or call, and use acoustic echoes to distinguish flora and fauna, different types of animals/prey, and everything needed for their survival. Therefore, a discussion of the different types of probe signals that can be used in robotics to acquire spatial information from the environment is also an important highlight of this chapter. More specifically, additive white Gaussian noise (AWGN), coded emissions, and chirp signals will be analyzed in detail. The application of spatial mapping using echolocation, which incorporates spatial filtering techniques such as beamforming, is another important highlight of this chapter.
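To make the active-sensing idea concrete, the following is a minimal numpy sketch of probing with a linear chirp and recovering the range of a single reflector by matched filtering (pulse compression). All parameters (48 kHz sample rate, 2-10 kHz sweep, a reflector at 1.5 m, the noise and attenuation levels) are illustrative assumptions, not values from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed for a reproducible sketch
fs = 48_000                          # sample rate (Hz), assumed
dur = 0.01                           # 10 ms probe signal
t = np.arange(int(fs * dur)) / fs
f0, f1 = 2_000.0, 10_000.0           # chirp sweep band (Hz), assumed

# Linear chirp: instantaneous frequency sweeps from f0 to f1 over dur
probe = np.sin(2 * np.pi * (f0 * t + (f1 - f0) / (2 * dur) * t**2))

# Simulate one acoustic reflector: an attenuated echo after the round-trip delay
c = 343.0                            # speed of sound in air (m/s)
distance = 1.5                       # true reflector range (m), assumed
delay = int(round(2 * distance / c * fs))
recording = np.zeros(delay + len(probe))
recording[delay:] += 0.3 * probe             # echo, attenuated
recording += 0.01 * rng.standard_normal(len(recording))  # measurement noise

# Matched filter: cross-correlate the recording with the known probe;
# the correlation peak marks the echo's time of flight
corr = np.correlate(recording, probe, mode="valid")
est_delay = int(np.argmax(corr))
est_distance = est_delay * c / (2 * fs)
print(f"estimated range: {est_distance:.2f} m")
```

Chirps are favored as probe signals precisely because this correlation step concentrates the echo's energy into a sharp peak, making the time of flight easy to detect even when the echo is weak or noisy.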

Finally, the chapter will review data-driven approaches that use contemporary methods, such as neural networks, to perceive the environment in an artificially intelligent way. This is a relatively new approach that combines physics-based models of sound with machine learning, teaching robotic platforms to classify and predict their surroundings.
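As a toy illustration of combining a physics-based sound model with learning, the sketch below generates interaural time difference (ITD) features from the standard far-field geometry and fits a nearest-centroid classifier over a few candidate directions of arrival. The microphone spacing, angle grid, and noise level are all hypothetical, and the classifier is deliberately the simplest possible stand-in for the neural networks discussed in the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
c, d = 343.0, 0.2          # speed of sound (m/s); mic spacing (m), assumed

def itd(angle_deg):
    """Physics-based model: ITD of a far-field source at a given azimuth."""
    return d * np.sin(np.deg2rad(angle_deg)) / c

# Synthesize noisy training ITDs from the physical model
angles = np.array([-60, -30, 0, 30, 60])     # candidate DOA classes (deg)
X_train = np.concatenate([itd(a) + 1e-5 * rng.standard_normal(50)
                          for a in angles])
y_train = np.repeat(angles, 50)

# "Learning" step: one centroid per DOA class
centroids = np.array([X_train[y_train == a].mean() for a in angles])

def predict(measured_itd):
    """Classify a measured ITD by its nearest learned centroid."""
    return angles[np.argmin(np.abs(centroids - measured_itd))]

print(predict(itd(30)))    # classify a clean measurement
```

The design point is the division of labor: the physical model supplies cheap, unlimited training data, while the learned component handles the inverse mapping from measurements back to source direction, which is the same pattern the data-driven methods in this chapter follow at much larger scale.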

Key Terms in this Chapter

ML: Machine learning

DOA: Direction of arrival

MISO: Multiple input and single output

ROV: Remotely operated vehicle

MVDR: Minimum variance distortionless response

SONAR: Sound navigation and ranging

RIR: Room impulse response

IPD: Interaural phase difference

ITD: Interaural time difference

CASA: Computational auditory scene analysis

MIMO: Multiple input and multiple output

SIMO: Single input and multiple output

AIR: Acoustic impulse response

CFAR: Constant false alarm rate

DL: Deep learning

DSB: Delay and sum beamformer

RNN: Recurrent neural network