Emotional Semantic Detection from Multimedia: A Brief Overview

Shang-fei Wang (University of Science and Technology of China, China) and Xu-fa Wang (University of Science and Technology of China, China)
Copyright: © 2011 |Pages: 21
DOI: 10.4018/978-1-61692-797-4.ch007


Recent years have seen a rapid increase in the size of digital media collections. Because emotion is an important component in how people classify and retrieve digital media, emotional semantic detection from multimedia has been an active research area in recent decades. This chapter introduces and surveys advances in this area. First, the authors propose a general framework for research on affective multimedia content analysis, which comprises physical, psychological, and physiological spaces and the relationships among the three. Second, the authors summarize research on emotional semantic detection from images, videos, and music. Third, three representative systems are introduced. Last, the authors explain several critical problems concerning databases, the three spaces, and their relationships, and propose some strategies for resolving them.

1. Introduction

With the development of many kinds of multimedia, our moods are continually influenced by the media we consume. A change in mood is often expected when watching TV, seeing a movie, or playing an electronic game: multimedia can bring pleasure, tension, and even fear that audience members want to enjoy. However, inappropriate audiovisual stimuli may also cause harm. For instance, on December 16, 1997, at around 6:50 p.m., 685 Japanese children and some adults fainted suddenly while watching a popular animated TV program called Pocket Monsters. The cause was photosensitive epilepsy (Tobimatsu, Zhang, Tomoda, Mitsudome, & Kato, 1999). It is therefore important to investigate the relationships between multimedia and users’ emotional responses, which calls for an interdisciplinary study of people and multimedia embracing psychology, psychophysiology, aesthetics, and information science. This chapter focuses mainly on recent advances from the viewpoint of information science.

With the rapid growth in types of multimedia, efficient methods for organizing, browsing, searching, and retrieving items such as images, videos, and music become crucial. Emotion is an important natural component in the human classification of information, beyond the feature and cognitive levels (Hanjalic, 2001). Therefore, emotional semantic detection from multimedia, also known as affective multimedia content analysis, has become a promising area of research over the past decades. It can benefit various applications, including the following:

  • Personal recommendation. If we can identify the tense, relaxed, sad, or joyful parts of a movie, or the highlights of a soccer game, we can recommend those parts to users who are interested in them.

  • Digital entertainment. We can enhance users’ feelings when they play electronic games by using music, images, or video that carry a specific emotion.

  • Psychotherapy. A tool that would be capable of detecting emotion from multimedia could be helpful for a psychotherapist who seeks music or videos that would motivate a patient who is doing recovery exercises.

  • Green information environment. By removing unpleasant segments from videos, we can provide a healthy viewing environment and, for example, avoid incidents such as the Pocket Monsters event.

  • Multimedia retrieval. Users typically rely on concepts and semantics to retrieve information, and emotion is one such semantic: for example, a beautiful image, romantic music, or an exciting video segment. If we can efficiently identify emotion in media, we can provide emotion-based multimedia retrieval.


2. A General Framework of Research on Affective Multimedia Content Analysis

Because this research deals with multimedia and the human emotions it induces, the framework should consist of physical, psychological, and physiological spaces, together with the relationships among the three (see Figure 1). Physical space represents the multimedia itself, while psychological and physiological spaces represent subjective and physiological emotional responses, respectively. For each space, we should consider its content. For the relationships, we should focus on user modeling and on individualizing the models.

Figure 1. A framework of research on affective multimedia content analysis
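
As a rough illustration only, the three spaces and one relationship between them might be sketched in Python as follows. The chapter does not define a data model, so every class, field, and function name here is hypothetical, the emotion labels and signal names are assumptions, and the threshold rule in `general_model` is a toy placeholder rather than a method from the literature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PhysicalSample:
    """A media item in physical space, described by extracted features."""
    media_id: str
    features: List[float]  # assumed: normalized descriptors such as colour or tempo


@dataclass
class PsychologicalResponse:
    """A user's subjective (self-reported) emotional response to a media item."""
    user_id: str
    media_id: str
    label: str  # assumed label set, e.g. "tense", "calm"


@dataclass
class PhysiologicalResponse:
    """A user's measured bodily response to a media item."""
    user_id: str
    media_id: str
    signals: Dict[str, float]  # assumed signal names, e.g. {"heart_rate": 72.0}


def general_model(features: List[float]) -> str:
    """Toy user-independent mapping from physical to psychological space."""
    mean = sum(features) / len(features) if features else 0.0
    return "exciting" if mean > 0.5 else "calm"


def individualize(
    base: Callable[[List[float]], str],
    reports: List[PsychologicalResponse],
    user_id: str,
) -> Callable[[PhysicalSample], str]:
    """Build a per-user model: prefer the user's own report, else use the base model."""
    own = {r.media_id: r.label for r in reports if r.user_id == user_id}

    def model(sample: PhysicalSample) -> str:
        return own.get(sample.media_id, base(sample.features))

    return model


if __name__ == "__main__":
    clip = PhysicalSample("clip1", [0.9, 0.7])
    reports = [PsychologicalResponse("u1", "clip1", "tense")]
    print(general_model(clip.features))                      # general prediction
    print(individualize(general_model, reports, "u1")(clip)) # user-specific prediction
```

The design choice here is simply to keep the three spaces as separate record types and to treat a relationship as a function between them, so that an individualized model can wrap a general one. A real system would learn these mappings from data, and `PhysiologicalResponse` is included only to show where such measurements would sit.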
