Music Information Retrieval

Thomas Lidy (Vienna University of Technology, Austria) and Andreas Rauber (Vienna University of Technology, Austria)
DOI: 10.4018/978-1-59904-879-6.ch046

This chapter provides an overview of the relatively young but increasingly important domain of Music Information Retrieval, a subdomain of Information Retrieval that investigates efficient and intelligent methods to analyze, recognize, retrieve and organize music. After describing the background and the problems addressed by research in this domain, the chapter gives a brief introduction to methods for extracting semantic descriptors from music, which are fundamental to a great number of tasks in Music Information Retrieval. The subsequent sections describe music retrieval, music classification and music library visualization systems. All of these systems are developed to enhance organization, access and retrieval in potentially large digital music libraries.
Chapter Preview


The term Music Information Retrieval was first mentioned by Kassler (1966). For a long period, however, there was little research on this topic. The first beat detection systems were published in the late 1970s and 1980s. The domain of content-based music retrieval experienced a major boost in the late 1990s, when mature techniques for the automated description of the content of music became available. The 1990s also saw the first systems for classification and clustering of sound events (Feiten & Günzel, 1994) and for discrimination of speech and music (Scheirer & Slaney, 1997). The first works on music style recognition used MIDI or other symbolic music as input (Dannenberg, Thom, & Watson, 1997). Research on audio-based approaches to music classification then became increasingly important (Foote, 1997). Since around 2000, the problem of clustering and visualizing large music libraries, and of supporting better access to them, has been addressed.

The International Conference on Music Information Retrieval (ISMIR) is the most important forum for researchers and people interested in Music IR. In the annual MIREX (Music Information Retrieval Evaluation eXchange) benchmarking event, state-of-the-art approaches for music description, classification and other tasks are evaluated and compared.

Downie (2003) provides a review of nearly all aspects of Music Information Retrieval, including contributions from the pre-digital era. The review covers different classes of music descriptors, describes a range of MIR systems and also discusses the challenges in MIR. Orio (2006) explains and reviews different aspects of music and music processing, discusses the role of the users, and gives an introduction to scientific MIR evaluation campaigns. He also describes several systems for MIR.

Key Terms in this Chapter

Descriptor: Numerical measure intended to describe semantic aspects of music, such as timbre or rhythm. Sometimes, a set of multiple measures is referred to as a descriptor.

Music Information Retrieval Evaluation Exchange (MIREX): Annual benchmarking event for comparison of state-of-the-art Music IR approaches.

Feature: Numerical measure describing an aspect of music which is useful for Music IR tasks. Multiple features together are also referred to as a feature set. The term descriptor is frequently used as a synonym for the terms feature or feature set.

Symbolic Music: Music stored in a notation-based format (e.g., MIDI), which contains explicit information about note onsets and pitch on individual tracks (for different instruments) but, in contrast to Digital Audio, no sound.

Digital Audio: Representation of a music recording as a digitally sampled waveform in the form of a mixed signal. Information about the individual sources of the signal is not explicitly available and can be derived only partly, through extensive analysis of the signal.

Music Information Retrieval (MIR): Research domain that covers automatic extraction of music descriptors for similarity-based search, retrieval, classification and organization of music in (potentially large) music collections.

Feature Extraction: Method or algorithm which analyses music and computes (extracts) features from it.

Self-Organizing Map (SOM): An unsupervised neural network providing a topology-preserving mapping from a high-dimensional input space onto a two-dimensional output space. Used as an algorithm for clustering and organizing music collections, enabling intuitive views of and/or interaction with music collections.
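
To make the Self-Organizing Map entry more concrete, the following is a minimal sketch of SOM training in plain Python. It is not the implementation used by any system described in the chapter. The function names (train_som, best_matching_unit), the linear decay of learning rate and neighbourhood radius, the Gaussian neighbourhood function and the toy two-dimensional input vectors are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=200, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map on a list of equal-length vectors.

    Hypothetical parameters: lr0 is the initial learning rate, sigma0 the
    initial neighbourhood radius; both decay linearly over the epochs.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # One weight vector per map unit, initialised randomly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        sigma = sigma0 * (1 - frac) + 0.01  # keep the radius above zero
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Squared distance on the 2-D grid, not in the input space.
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy input: two well-separated groups of 2-D vectors (illustrative only).
group_a = [[0.10, 0.10], [0.15, 0.10], [0.10, 0.20]]
group_b = [[0.90, 0.90], [0.85, 0.95], [0.90, 0.80]]
som = train_som(group_a + group_b, rows=3, cols=3)
unit_a = best_matching_unit(som, group_a[0])
unit_b = best_matching_unit(som, group_b[0])
```

After training, each input vector is assigned to its best-matching unit, so dissimilar items such as group_a[0] and group_b[0] are expected to land on different cells of the two-dimensional grid. In a music library setting, the input vectors would be feature vectors computed from the pieces of music.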

Feature Vector: A feature set, when used as input for classification or clustering methods, is referred to as a feature vector. An individual scalar value in this feature vector is also referred to as an attribute or feature.
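
The terms Feature, Feature Extraction and Feature Vector can be illustrated with a small Python sketch. The two measures used here, zero-crossing rate and root-mean-square energy, are common low-level audio descriptors chosen for brevity; they are an assumption of this example and not necessarily the descriptors discussed in the chapter. The helper sine and all function names are hypothetical, and the input is a synthetic signal rather than a real recording.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples):
    """Root-mean-square amplitude of the sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def extract_features(samples):
    """Feature extraction: return the feature vector [ZCR, RMS energy]."""
    return [zero_crossing_rate(samples), rms_energy(samples)]


def sine(freq_hz, sample_rate=8000, duration_s=0.1, amplitude=1.0):
    """Hypothetical helper: synthesise a sine tone as a list of samples."""
    n = int(sample_rate * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n)
    ]


low = extract_features(sine(110))   # feature vector of a low tone
high = extract_features(sine(880))  # feature vector of a higher tone
```

Each call to extract_features returns one feature vector with two attributes. The higher tone crosses zero more often, so its first attribute is larger, while both tones have the same amplitude and therefore nearly the same RMS energy. Vectors of this kind are what a classifier or a clustering method would take as input.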
