Emotion-Based Music Retrieval


C.H.C. Leung (Department of Computer Science, Hong Kong Baptist University, Hong Kong) and J. Deng (Department of Computer Science, Hong Kong Baptist University, Hong Kong)
Copyright: © 2015 | Pages: 11
DOI: 10.4018/978-1-4666-5888-2.ch015

Chapter Preview



With the astounding growth of digital music, music retrieval has become essential for discovering music that matches listeners' tastes and preferences. Music is a complex acoustic and physical product that encompasses mind, feeling, emotion, culture and other aspects of human experience. It therefore plays a prominent role in people’s daily lives, not only in relieving stress but also in cultivating sentiment.

Numerous digital music services are currently available on the Internet. Many music service websites (e.g. Yahoo Music, MySpace) support retrieval by metadata such as title, genre, album, lyrics and biography; these services do not analyze the music content itself and so cannot retrieve music by content. A few online music providers (e.g. Pandora.com, Musipedia), however, attempt to retrieve music based on melody, rhythm, timbre, or harmony, which greatly improves retrieval results.

Many researchers hold that music can induce emotion, and some psychologists have conducted experiments using physiological measurements such as heart rate and skin conductance to support this view (Zentner et al., 2008; Scherer, 2005). In this article, we view music as an art form and the soul of language, one that can engender feelings and evoke emotions. Emotional expression is therefore the key factor in analyzing the emotional content of music. However, most current music services either ignore the influence of emotion and sentiment or simply use tags to represent a few general emotions conveyed by the music. Because the emotions induced by music are significant for deeper music analysis, this article introduces a method of emotion-based music retrieval, which offers a more natural and human-centered way to experience music.

The aim of an emotion-based music retrieval system is to retrieve music efficiently from a music database by emotional similarity. The first task is therefore to define how the emotion induced by music is expressed. There are currently many views on emotion models. Some researchers hold that emotion should be expressed as discrete basic human emotions such as joy, sadness, anger or grief, while others believe that emotion should be depicted in a psychological dimensional space, although there is no consensus on how many dimensions that space has. In this article, we review different emotion models and propose to represent emotion by combining a discrete emotion model with a dimensional emotion model. The second task is to find the relationship between acoustic features and their emotional impact. We describe music attributes such as pitch, timbre, rhythm, melody and harmony, and then point out their emotional impact within our chosen emotion model. The final task is to retrieve music based on the emotions it conveys. We suggest three query methods: query-by-music, query-by-tag, and hybrid. In addition, we apply ranking algorithms to return an optimal retrieval list.
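The three query methods can be illustrated with a minimal Python sketch. It is not the chapter's actual system: the track catalogue, the emotion vectors, the tag sets, the distance-based similarity, the Jaccard tag overlap and the `weight` parameter are all assumptions made up for illustration. Each track is assumed to already have a three-dimensional emotion vector (for example in the resonance-arousal-valence space listed under the key terms) and a set of emotion tags.

```python
import math

# Hypothetical catalogue: every value below is invented for illustration.
# "rav" is an assumed (resonance, arousal, valence) emotion vector.
TRACKS = {
    "track_a": {"rav": (0.8, 0.9, 0.7), "tags": {"joy", "excited"}},
    "track_b": {"rav": (0.7, 0.8, 0.6), "tags": {"joy"}},
    "track_c": {"rav": (0.2, 0.1, 0.2), "tags": {"sadness", "calm"}},
    "track_d": {"rav": (0.3, 0.9, 0.1), "tags": {"anger"}},
}


def emotion_similarity(a, b):
    """Map the Euclidean distance between two emotion vectors to (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def tag_similarity(query_tags, track_tags):
    """Jaccard overlap between two sets of emotion tags."""
    union = query_tags | track_tags
    return len(query_tags & track_tags) / len(union) if union else 0.0


def retrieve(query_rav=None, query_tags=None, weight=0.5, exclude=None):
    """Rank tracks for a query-by-music, query-by-tag, or hybrid query.

    query_rav only  -> query-by-music
    query_tags only -> query-by-tag
    both            -> hybrid, mixing the two scores with `weight`
    """
    if query_rav is None and not query_tags:
        raise ValueError("provide query_rav, query_tags, or both")
    scores = {}
    for name, track in TRACKS.items():
        if name == exclude:
            continue  # do not return the query track itself
        music = None if query_rav is None else emotion_similarity(query_rav, track["rav"])
        tags = None if not query_tags else tag_similarity(set(query_tags), track["tags"])
        if music is not None and tags is not None:
            scores[name] = weight * music + (1.0 - weight) * tags
        else:
            scores[name] = music if music is not None else tags
    # Highest score first; a learned ranking model could replace this sort.
    return sorted(scores, key=scores.get, reverse=True)


by_music = retrieve(query_rav=TRACKS["track_a"]["rav"], exclude="track_a")
by_tag = retrieve(query_tags={"sadness"})
hybrid = retrieve(query_rav=(0.3, 0.9, 0.1), query_tags={"joy"})
```

Here the final sort stands in for the ranking step; a learning-to-rank model trained on relevance judgments would take its place in a fuller system.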

The rest of this article is organized as follows. The next section reviews significant emotion models and approaches to emotion-based music retrieval. We then define a hybrid music emotion model that combines discrete and dimensional representations, and describe the relationship between acoustic features and their emotional impact under that model. After that, a unified framework for music retrieval by the three query methods is presented, and an effective ranking algorithm for emotion-based music retrieval systems is proposed. Finally, potential directions and trends for future research are pointed out.

Key Terms in this Chapter

RAV Emotion Space: A three-dimensional musical emotion space consisting of three aspects, resonance, arousal and valence, which correspond to the relevant acoustic feature vectors.

Time Series Analysis: The process of using statistical techniques to model and represent time-dependent series of data in order to extract meaningful characteristics.

Musical Emotion: Emotion is a complex psychological and physiological subjective experience; musical emotion refers to the emotional feelings induced by music.

Graph Embedding: A dimension reduction approach that represents each vertex of a graph as a low-dimensional vector while preserving the similarities between vertex pairs, where similarity is computed from geometric properties of the graph.

Learning to Rank: A machine learning problem, addressed by supervised or semi-supervised learning, in which a ranking model is automatically constructed from training data.

Dimension Reduction: The process of reducing the dimensionality of a dataset from a high-dimensional space to a lower-dimensional space.

Multitask Learning: A process that learns multiple tasks in parallel using a shared representation. We implement multitask learning by using acoustic features and emotional tag features to jointly learn an optimal reduced space for representing musical emotion.
