What is the real potential of computer science when applied to music? It is possible to synthesize a “real” guitar using physical modelling software, yet it is also possible to create a virtual guitar with 40 strings, each 100 metres long. The potential can thus be seen both in the simulation of that which in nature already exists, and in the creation of that which in nature cannot exist. After a brief introduction to spatial hearing and the binaural spatialization technique, passing from principles of psychoacoustics to digital signal processing, the reader will be taken on a voyage through multi-dimensional auditory worlds, first simulating what in nature already exists, starting from zero and arriving at three “soundscape dimensions”, then trying to advance the idea of a fourth “auditory dimension”, creating synthetically a four-dimensional soundscape.
What is the real potential of computer science when applied to music?
Using physical modelling synthesis techniques it is possible to simulate an acoustic guitar as accurately as possible: of course, this is useful in terms of the opportunities made available to musicians to use an instrument they cannot in reality play, and in terms of acoustical studies of the instrument itself. But is that all? Once created, a mathematical model of a guitar can be altered as far as the imagination can extend: an acoustic guitar made of gold, with 40 strings, each 100 metres in length, could virtually be created and played with a one-square-metre plectrum! Therefore, the potential can be seen both in the simulation of that which in nature already exists, and in the creation of that which in nature cannot exist.
Another kind of example will be discussed later in the chapter: instead of toying with the simulation of musical instruments, we shall try to create virtual acoustical environments through the simulation of three-dimensional (henceforth: 3D) soundscapes. The binaural spatialization technique will be used in order to achieve this goal; multi-dimensional soundscapes will be simulated not by placing real sound sources, such as loudspeakers, within the three dimensions, but by simulating the behaviour of our outer ear in terms of directional modifications brought to the sound input into the hearing system.
The mechanisms of spatial hearing will be investigated and analyzed, and three localization cues will be characterized and simulated. These are the Interaural Level Differences (ILDs), the Interaural Time Differences (ITDs), and the Direction-Dependent Filtering (DDF). Within the simulation of a real environment, these parameters would all be coherent with the position of the sound source: for example, for a sound source placed at 60° of azimuth, the sound would reach first the right ear and then the left (ITDs). Furthermore, it would be more intense at the right ear (ILDs) than at the left, and the sound would be filtered depending on the particular resonances of our outer hearing systems for that specific sound source location.
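To give a sense of the magnitudes involved, the ITD for the 60° example can be estimated with a minimal sketch. It assumes Woodworth's spherical-head approximation, a head radius of 8.75 cm and a speed of sound of 343 m/s; neither the formula nor these values come from the chapter, and the function name `woodworth_itd` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value (air at roughly 20 degrees Celsius)
HEAD_RADIUS = 0.0875    # m, assumed average head radius


def woodworth_itd(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate ITD in seconds for a distant source on the horizontal plane.

    Uses Woodworth's spherical-head approximation (an assumption, not the
    chapter's method). Positive azimuth means the source is to the right,
    so a positive result means the sound reaches the right ear first.
    """
    if abs(azimuth_deg) > 90:
        raise ValueError("this approximation is only used here for |azimuth| <= 90")
    theta = math.radians(azimuth_deg)
    return (radius / c) * (theta + math.sin(theta))


itd = woodworth_itd(60.0)
print(f"Estimated ITD at 60 degrees: {itd * 1e6:.0f} microseconds")
```

Under these assumptions the delay is a fraction of a millisecond, which is why ITDs have to be simulated with sample-level precision.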
It must, however, be asked what could happen if the three localization cues were incoherent with the real position of the sound source. Of course, this is impossible in nature, and equally so in a standard soundscape simulation, when loudspeakers are placed in a 3D space. Still, achieving such incoherence is not impossible in a system based on headphones, where the signals sent to the hearing system are much more controllable, and the whole reproduction system is therefore much more flexible.
In this case, the binaural spatialization technique is useful not only to simulate a real 3D soundscape, but also to create new soundscapes, i.e., environments that are impossible to find in the real world. This seems to be one of the amazing new options offered by computer science: while it could indeed be considered inessential to simulate a feature that already exists in nature, it is particularly interesting to create a feature that as yet has no existence in the real world.
To appreciate this new ‘digital feature’ fully, it may help to think about a voyage into multiple dimensions; the results may appear similar to Abbott’s graphic achievements (see Abbott, 1999) when he wrote Flatland (frequent reference will be made to this book later in the chapter). A monophonic diotic signal (the same at both ears) could be perceived as a point sound source located in the middle of the head: zero dimensions, or the point. By introducing intensity and content differences between the two channels (ILDs) and creating a dichotic signal (different for each of the ears), it is possible to obtain a standard headphone stereo signal, with multiple sound sources located along a line between the ears (always inside the head): one dimension, or the line. By introducing time differences between the two channels (ITDs), it is possible to obtain the sensation of the sound coming from outside the head, with multiple sound sources located in a plane: two dimensions, or the square. Then, upon introducing a simulation of the DDF, with different frequency filtering for each virtual sound source, the perception reaches the third dimension, the cube. The auditory passage between these steps could be visualized as the graphic perception of a point that becomes a line, then a square and finally a cube. What, then, about the fourth dimension?
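The first three steps of this progression (point, line, plane) can be sketched as simple operations on a pair of headphone channels. This is only an illustration under stated assumptions: the sample rate, the symmetric split of the level difference and the integer-sample delay are choices made here, and all function names are hypothetical. The DDF step is not included, since it requires direction-dependent filters.

```python
import math

SAMPLE_RATE = 44100  # Hz, assumed


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Generate a mono test tone as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def diotic(mono):
    """Zero dimensions: the same signal is sent to both ears."""
    return list(mono), list(mono)


def apply_ild(left, right, ild_db):
    """One dimension: introduce a level difference between the channels.

    A positive ild_db makes the right channel louder. The difference is
    split symmetrically, so right/left amplitude ratio = 10 ** (ild_db / 20).
    """
    g = 10 ** (ild_db / 40.0)
    return [s / g for s in left], [s * g for s in right]


def apply_itd(left, right, itd_seconds, sample_rate=SAMPLE_RATE):
    """Two dimensions: introduce a time difference between the channels.

    A positive itd_seconds delays the left channel (source on the right).
    The delay is rounded to a whole number of samples for simplicity.
    """
    delay = round(abs(itd_seconds) * sample_rate)
    pad = [0.0] * delay
    if itd_seconds >= 0:
        return pad + left, right + pad
    return left + pad, pad + right


mono = sine(440.0, 1000)
left, right = diotic(mono)                  # the point
left, right = apply_ild(left, right, 6.0)   # the line
left, right = apply_itd(left, right, 5e-4)  # the plane
```

Because each step is an independent function, the cues can also be set to values that do not agree with one another, which is the kind of incoherence that only a headphone-based system allows.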
Key Terms in this Chapter
Soundscape: An acoustical environment or an environment created by sound.
Impulse Response (IR): The recording of an impulse signal that has passed through and been processed by a given system. It unequivocally describes the behaviour of that specific system to all of the possible input signals.
Sound Localization: The judgement of the specific location of a sound source.
Head Related Transfer Function (HRTF): The transfer function of the external hearing system and of all the other elements that contribute to the directional modifications of the signals input to the hearing system (torso and shoulders).
Fourth Dimension: There is no unequivocal definition of the fourth dimension. For most, scientists included, time is considered to be the fourth dimension, yet in this chapter an attempt is made at a different definition.
Three-Dimensional Soundscape: The ensemble of the sounds heard in a particular location where the sound sources are placed in the three dimensions (x, y and z, or length, height and width).
Convolution: A fundamental mathematical operation that combines two signals by multiplying the samples of one with shifted copies of the other and summing the products. In the case of digital signal processing, convolution can be seen as a particular kind of multiplication between two vectors.
Binaural: Relating to or involving a sound stimulus presented to both ears. In a more general sense, the expression ‘binaural spatialization’ refers to the technique for the simulation of 3D soundfields over standard stereo headphones. When the binaural signals, previously modified by means of special processes called ‘cross-talk’ cancellation filters, are played through loudspeakers, the term used to define this situation is transaural.
Sound Spatialization: An ensemble of sound processing techniques oriented towards the simulation of multi-dimensional soundscapes.
Localization Cues: Specific attributes of the sound event that are used by the hearing system in order to establish the position of a sound source in a 3D soundscape.
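The relationship between the terms Convolution, Impulse Response and HRTF can be illustrated with a minimal sketch: a mono signal is convolved with one impulse response per ear. The impulse responses below are made-up toy values, not measured data, and the function names `convolve` and `binauralize` are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(mono, ir_left, ir_right):
    """Filter one mono signal with a separate impulse response per ear."""
    return convolve(mono, ir_left), convolve(mono, ir_right)


# Toy impulse responses (illustrative assumptions only): the left ear
# receives a later and weaker version of the sound than the right ear.
ir_left = [0.0, 0.0, 0.5, 0.2]
ir_right = [1.0, 0.3]

# Feeding a unit impulse through the system returns the impulse responses
# themselves (zero-padded), which is why an IR characterizes a linear system.
impulse = [1.0, 0.0, 0.0]
left, right = binauralize(impulse, ir_left, ir_right)
```

The direct form is used here for readability; with impulse responses of realistic length, an FFT-based convolution would normally be preferred for speed.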