Natural Human-Computer Interaction with Musical Instruments

George Tzanetakis (University of Victoria, Canada)
Copyright: © 2016 | Pages: 21
DOI: 10.4018/978-1-5225-0264-7.ch006


The playing of a musical instrument is one of the most skilled and complex interactions between a human and an artifact. Professional musicians spend a significant part of their lives first learning their instruments and then perfecting their skills. The production, distribution and consumption of music have been profoundly transformed by digital technology. Today music is recorded and mixed using computers, distributed through online stores and streaming services, and heard on smartphones and portable music players. Computers have also been used to synthesize new sounds, generate music, and even create sound acoustically in the field of music robotics. Despite all these advances, the way musicians interact with computers has remained relatively unchanged over the last 20-30 years. Most interaction with computers in the context of music making still occurs either through the standard mouse/keyboard/screen interaction that everyone is familiar with, or through special digital musical instruments and controllers such as keyboards, synthesizers and drum machines. The string, woodwind, and brass families of instruments do not have widely available digital counterparts, and in the few cases where they do, the digital version is nowhere near as expressive as the acoustic one. It is possible to retrofit and augment existing acoustic instruments with digital sensors in order to create what are termed hyper-instruments. These hyper-instruments allow musicians to interact naturally with their instrument as they are accustomed to, while at the same time transmitting information about what they are playing to computing systems. This approach requires significant alterations to the acoustic instrument, which is something many musicians are hesitant to make. In addition, hyper-instruments are typically one-of-a-kind research prototypes, making their wider adoption practically impossible.
In the past few years, researchers have started exploring non-invasive and minimally invasive sensing technologies that address these two limitations by allowing acoustic instruments to be used directly as digital controllers without any modifications. This enables natural human-computer interaction with all the rich and delicate control of acoustic instruments, while retaining the wide array of possibilities that digital technology can provide. This chapter provides an overview of these efforts, followed by more detailed case studies from research conducted by the author's group. This natural interaction blurs the boundaries between the virtual and physical worlds, something that will increasingly happen in other aspects of human-computer interaction beyond music. It also opens up new possibilities for computer-assisted music tutoring, cyber-physical ensembles, and assistive music technologies.
Chapter Preview


Music today is produced, distributed and consumed using digital computer technology at every stage. In a typical scenario, the process starts with musicians recording individual tracks on their respective instruments at a recording studio. These tracks are stored as digital waveforms, which are then mixed and processed in digital audio workstation (DAW) software by one or more recording engineers. The resulting music track is then distributed digitally, typically through streaming services like Spotify and Pandora or through online music stores like the Apple iTunes Store or Google Play. Finally, listeners hear the music, typically on their computers or smartphones. Despite these amazing advances in technology, which have made practically all music accessible to anyone with an internet connection, the way musicians typically interact with computers is still primitive and limited in many ways, especially when contrasted with how musicians interact with each other.

These limitations in human-computer interaction (HCI) in the context of music making can be broadly classified as being caused by two factors. The first is related to hardware: we still mostly interact with computers using a keyboard and a mouse. The situation in music is not much different, with the primary digital instruments being keyboards (the music kind) and other essentially digital controllers such as sliders and rotary knobs. The amount of control and expressivity these digital controls afford is nowhere close to that afforded by acoustic instruments. The other major factor limiting natural HCI in the context of music making is that computers process music signals as large monolithic blocks of samples without any “understanding” of the underlying content. When musicians listen to music, especially when interacting with other musicians in a live music performance, they are able to extract an enormous amount of high-level semantic information from the music signal, such as tempo, rhythmic structure, chord changes, melody, style, and vocal quality. When working with a recording engineer, it is possible to say something along the lines of “go to the 4th measure of the saxophone solo” and she will be able to locate the corresponding segment. However, this level of understanding is currently impossible to achieve, at least in commercial software systems. Natural human-computer interaction in music will only be achieved when musicians are able to use their instruments to convey performance information to computer systems, and in that way leverage their incredible abilities and long-term investment in learning their instruments. In addition, the associated computer systems should be able to “understand” and “listen” to music in ways similar to how human listeners, and especially musicians, do.

This chapter provides an overview of current efforts to create novel forms of musical human-computer interaction. These efforts have been supported by advances in two research communities that are important to this work. The first research area is Music Information Retrieval (MIR), which deals with all aspects of extracting information from musical signals in digital form. Although the primary focus of MIR was originally the processing of large collections of recorded music, in recent years several of the techniques developed in the field have started to be used in the context of live music performance. These techniques include monophonic and polyphonic pitch detection, melody extraction, chord recognition, segmentation and structure analysis, tempo and beat tracking, and instrument classification. The second research area is New Interfaces for Musical Expression (NIME) (Miranda & Wanderley, 2006), which deals with new technologies and ways of creating music enabled by computing technology.
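
To make one of the MIR techniques listed above more concrete, the following is a minimal sketch of monophonic pitch detection using plain autocorrelation. It is an illustration only and is not taken from the chapter: the function names (sine_frame, estimate_pitch_hz), the 8000 Hz sample rate, and the 80-1000 Hz search range are assumptions chosen for the example, and practical systems generally use more robust estimators.

```python
import math


def sine_frame(freq_hz, sample_rate, num_samples):
    """Generate a test tone (hypothetical helper, for demonstration only)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency of one monophonic frame.

    Searches for the lag (in samples) at which the frame best matches a
    shifted copy of itself; that lag approximates the pitch period.
    The fmin/fmax search range is an assumption for this sketch.
    Returns None when no positive correlation is found (e.g. silence).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    frame = sine_frame(200.0, 8000, 1000)
    print(estimate_pitch_hz(frame, 8000))
```

Because the lag is restricted to whole samples, the estimate is quantized to frequencies of the form sample_rate / lag; interpolating around the correlation peak is one common way to refine it.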
