Spatial and 3-D Audio Systems

Hüseyin Hacıhabiboğlu (Graduate School of Informatics, Middle East Technical University (METU), Turkey)
Copyright: © 2015 |Pages: 10
DOI: 10.4018/978-1-4666-5888-2.ch594

Background

The earliest system designed to record and reproduce sound was the paleophone, conceived by the French inventor Charles Cros. Better known is Edison’s phonograph. Both devices relied on physical means to record and reproduce sound (i.e., by cutting a physical trace of the sound into a soft material and then using a stylus coupled to a diaphragm and a horn to recover the recorded sound). The advent of electroacoustic transducers made it possible to amplify recorded signals by electrical means, which made monophonic systems widely available by the early 20th century. Regardless of their reproduction quality, these systems lacked the spatial context that every recorded soundscape embodies. Even when a symphony orchestra is recorded and played back over a monophonic system, the perceived sound scene is constrained to the aperture of a single loudspeaker.

Blumlein is credited with the invention of stereophonic sound in 1931, which made it possible to record and partially reproduce the spatial context of a sound scene over a pair of loudspeakers. Innovative stereophonic recording and reproduction techniques have since been proposed, and stereophony is still one of the most popular methods for rendering recorded or synthesized sound scenes with their spatial contexts.

Steinberg asserted in 1934 that the accurate reproduction of the auditory perspective requires at least three independently recorded and reproduced audio channels (Steinberg, 1934). While the early subjective experiments supported this assertion, it was not feasible to record, distribute and reproduce three channels of audio using the technology of the time.

The basic mechanisms of human sound source localization and spatial perception were known as early as the late 19th century (Rayleigh, 1877). The renewed interest in spatial hearing in the 1950s made it possible to design recording and reproduction systems that allow a listener to experience three-dimensional sound scenes over a pair of headphones. This technology is now known as binaural audio. Binaural audio places stringent requirements on the processing and reproduction chain in order to deliver full 3-D audio.

Gerzon’s work on the spatial harmonic decomposition of a sound field, and on a special microphone to achieve such a decomposition, resulted in Ambisonics (Gerzon, 1973). First-order Ambisonics typically uses four to eight loudspeakers to deliver 3-D audio to a single listener positioned at the center of the loudspeaker rig. The formulation of Ambisonics allows higher-order expansions of the sound field, which are used in higher-order Ambisonics (HOA).
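
As a rough illustration of what a first-order decomposition looks like in practice, the sketch below encodes a single mono sample arriving from a given direction into the four first-order signals (W, X, Y, Z). It assumes the traditional B-format convention, in which the omnidirectional W component is scaled by 1/√2; the function name and the use of degrees are illustrative choices, not something specified in this chapter.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumes the traditional convention: azimuth is measured
    counter-clockwise from the front, elevation upwards from the
    horizontal plane, and W carries a 1/sqrt(2) gain.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z
```

A source straight ahead (azimuth 0, elevation 0) contributes only to W and X, whereas a source directly to the left (azimuth 90) contributes only to W and Y.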

Berkhout applied the principles of acoustic holography for the purpose of acoustic control (Berkhout, 1988). It was later realized that the same fundamental idea could also be applied to the physically accurate reconstruction of sound fields. Practical systems have since been developed, and the technology is now known as wave-field synthesis (WFS).

The following section explains these systems in more detail, discusses their existing problems and presents the state of the art.

Key Terms in this Chapter

Cross-Talk: The reception of the contralateral binaural signal at an ear due to the reproduction of binaural audio over loudspeakers.

Head-Related Transfer Function (HRTF): The direction-dependent frequency response of the pinna, the head and the torso.

Binaural Audio: A method of 3-D audio reproduction over headphones.

Wave Field Synthesis (WFS): A horizontal acoustic holography method based on the Kirchhoff-Helmholtz integral equation to generate physically accurate sound fields.

Ambisonics: A 3-D audio reproduction method based on the spherical harmonic decomposition of a sound field.

Intensity Panning: Panning the direction of a stereophonic phantom source using only the level differences between the left and the right channels.

Transaural Audio: A 3-D audio reproduction system to play binaural signals back from loudspeakers after cross-talk cancellation.

Time-Intensity Panning: Panning the direction of a stereophonic phantom source using both the level differences and the time delay between the left and the right channels.
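
The two panning terms above can be made concrete with a minimal sketch. It assumes a constant-power sine/cosine pan law for the level differences and a delay that grows linearly with the pan position; both are common illustrative choices rather than methods prescribed in this chapter, and the names `intensity_pan`, `time_intensity_pan` and `max_delay_samples` are hypothetical.

```python
import math

def intensity_pan(pan):
    """Return (left, right) gains for pan in [-1 (left), +1 (right)].

    Assumption: constant-power sine/cosine law, so that
    left**2 + right**2 == 1 for every pan position.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1, 1]")
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)

def time_intensity_pan(mono, pan, max_delay_samples=20):
    """Pan a list of mono samples using level AND time differences.

    Assumption: the channel farther from the source is delayed by
    a number of samples proportional to |pan|.
    """
    g_left, g_right = intensity_pan(pan)
    delay = round(abs(pan) * max_delay_samples)
    pad = [0.0] * delay
    left = [g_left * s for s in mono]
    right = [g_right * s for s in mono]
    if pan > 0:
        # Source towards the right: the left channel arrives later.
        left, right = pad + left, right + pad
    else:
        # Source towards the left (or centered): delay the right channel.
        left, right = left + pad, pad + right
    return left, right
```

With `pan = 0` both channels receive the same gain and no delay, which places the phantom source midway between the loudspeakers; moving `pan` towards +1 raises the right-channel level and, in the time-intensity variant, also makes the right channel lead in time.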
