Machine Audition: Principles, Algorithms and Systems

Wenwu Wang (University of Surrey, UK)
Copyright: © 2011 | Pages: 554
ISBN13: 9781615209194 | ISBN10: 1615209190 | ISBN13 Softcover: 9781616923693 | EISBN13: 9781615209200
DOI: 10.4018/978-1-61520-919-4
Cite Book

MLA

Wang, Wenwu. "Machine Audition: Principles, Algorithms and Systems." IGI Global, 2011. 1-554. Web. 27 Mar. 2020. doi:10.4018/978-1-61520-919-4

APA

Wang, W. (2011). Machine Audition: Principles, Algorithms and Systems (pp. 1-554). Hershey, PA: IGI Global. doi:10.4018/978-1-61520-919-4

Chicago

Wang, Wenwu. "Machine Audition: Principles, Algorithms and Systems." 1-554 (2011), accessed March 27, 2020. doi:10.4018/978-1-61520-919-4

Machine audition is the study of algorithms and systems for the automatic analysis and understanding of sound by machine. It has recently attracted increasing interest within several research communities, such as signal processing, machine learning, auditory modeling, perception and cognition, psychology, pattern recognition, and artificial intelligence. However, the developments made so far are fragmented across these disciplines, with few connections between them and potentially overlapping research activities in this subject area.

Machine Audition: Principles, Algorithms and Systems contains advances in algorithmic developments, theoretical frameworks, and experimental research findings. This book is useful for professionals who want to better understand how to design algorithms for the automatic analysis of audio signals, construct computing systems for understanding sound, and build advanced human-computer interactive systems.

Table of Contents

Front Materials
Title Page
This content has been removed at the discretion of the publisher and the editors.
Copyright Page
List of Reviewers
Preface
Acknowledgment
Chapters
Audio Scene Analysis, Recognition and Modeling
Chapter 1
Selina Chu (University of Southern California, USA), Shrikanth Narayanan (University of Southern California, USA), C.-C. Jay Kuo (University of Southern California, USA)
Recognizing environmental sounds is a basic audio signal processing problem. The authors’ work focuses on the characterization of unstructured environmental sounds for understanding and predicting the context surrounding of...
Unstructured Environmental Audio: Representation, Classification and Modeling
Chapter 2
Luís Gustavo Martins (Portuguese Catholic University, Portugal), Mathieu Lagrange (CNRS - Institut de Recherche et Coordination Acoustique Musique (IRCAM), France), George Tzanetakis (University of Victoria, Canada)
Computational Auditory Scene Analysis (CASA) is a challenging problem for which many different approaches have been proposed. These approaches can be based on statistical and signal processing methods such as Independent Component...
Modeling Grouping Cues for Auditory Scene Analysis Using a Spectral Clustering Formulation
Chapter 3
Tariqullah Jan (Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK), Wenwu Wang (Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK)
The cocktail party problem is a classical scientific problem that has been studied for decades. Humans have remarkable skills in segregating target speech from a complex auditory mixture obtained in a cocktail party environment....
Cocktail Party Problem: Source Separation Issues and Computational Methods
Chapter 4
Tjeerd C. Andringa (University of Groningen, Netherlands)
This chapter addresses the functional requirements of auditory systems, both natural and artificial, to be able to deal with the complexities of uncontrolled real-world input. The demand to function in uncontrolled environments has...
Audition: From Sound to Sounds
Audio Signal Separation, Extraction and Localization
Chapter 5
Syed Mohsen Naqvi (Loughborough University, Leicestershire, UK), Yonggang Zhang (Harbin Engineering University, Harbin, China), Miao Yu (Loughborough University, Leicestershire, UK), Jonathon A. Chambers (Loughborough University, Leicestershire, UK)
A novel multimodal solution is proposed to solve the problem of blind source separation (BSS) of moving sources. Since the mixing filters are time varying for moving sources, the unmixing filters should also be time...
A Multimodal Solution to Blind Source Separation of Moving Sources
Chapter 6
Banu Günel (University of Surrey, United Kingdom), Hüseyin Hacihabiboglu (King’s College London, United Kingdom)
Automatic sound source localization has recently gained interest due to its various applications, which range from surveillance to hearing aids and from teleconferencing to human-computer interaction. Automatic sound source localization...
Sound Source Localization: Conventional Methods and Intensity Vector Direction Exploitation
Chapter 7
Emmanuel Vincent (INRIA, France), Maria G. Jafari (Queen Mary University of London, United Kingdom), Samer A. Abdallah (Queen Mary University of London, United Kingdom), Mark D. Plumbley (Queen Mary University of London, United Kingdom), Mike E. Davies (University of Edinburgh, United Kingdom)
Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds...
Probabilistic Modeling Paradigms for Audio Source Separation
Chapter 8
Saeid Sanei (Cardiff University, UK), Bahador Makkiabadi (Cardiff University, UK)
Tensor factorization (TF) is introduced as a powerful tool for solving multi-way problems. As an effective and major application of this technique, separation of sound sources, particularly speech signals, from their corresponding...
Tensor Factorization with Application to Convolutive Blind Source Separation of Speech
Chapter 9
Nilesh Madhu (Ruhr-Universität Bochum, Germany), André Gückel (Dolby Laboratories, Nürnberg, Germany)
Machine-based multi-channel source separation in real-life situations is a challenging problem and has a wide range of applications, from medical to military. With the increase in computational power available to everyday devices...
Multi-Channel Source Separation: Overview and Comparison of Mask-based and Linear Separation Algorithms
Chapter 10
Andrew Nesbit (Queen Mary University of London, United Kingdom), Maria G. Jafari (Queen Mary University of London, United Kingdom), Emmanuel Vincent (INRIA, France), Mark D. Plumbley (Queen Mary University of London, United Kingdom)
The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse...
Audio Source Separation using Sparse Representations
Audio Transcription, Mining and Information Retrieval
Chapter 11
Cédric Févotte (CNRS LTCI, TELECOM ParisTech, France)
Nonnegative matrix factorization (NMF) is a popular linear regression technique in the fields of machine learning and signal/image processing. Much research about this topic has been driven by applications in audio. NMF has been for...
Itakura-Saito Nonnegative Factorizations of the Power Spectrogram for Music Signal Decomposition
Chapter 12
Music Onset Detection (pages 297-316)
Ruohua Zhou (Queen Mary University of London, UK), Josh D Reiss (Queen Mary University of London, UK)
Music onset detection plays an essential role in music signal processing and has a wide range of applications. This chapter provides a step by step introduction to the design of music onset detection algorithms. The general scheme...
Chapter 13
Kristoffer Jensen (Aalborg University Esbjerg, Denmark)
In this work, automatic segmentation is done using different original representations of music, corresponding to rhythm, chroma and timbre, and by calculating a shortest path through the self-similarity calculated from each...
On the Inherent Segment Length in Music
Chapter 14
Thierry Bertin-Mahieux (Columbia University, USA), Douglas Eck (University of Montreal, Canada), Michael Mandel (University of Montreal, Canada & Columbia University, USA)
Recently there has been a great deal of attention paid to the automatic prediction of tags for music and audio in general. Social tags are user-generated keywords associated with some resource on the Web. In the case of music, social...
Automatic Tagging of Audio: The State-of-the-Art
Chapter 15
Wenwu Wang (University of Surrey, UK)
Non-negative matrix factorization (NMF) is an emerging technique for data analysis and machine learning, which aims to find low-rank representations for non-negative data. Early work on NMF is mainly based on the instantaneous...
Instantaneous Versus Convolutive Non-Negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation
Audio Cognition, Modeling and Affective Computing
Chapter 16
Shlomo Dubnov (University of California, San Diego, USA)
This chapter investigates the modeling methods for musical cognition. The author explores possible relations between cognitive measures of musical structure and statistical signal properties that are revealed through information...
Musical Information Dynamics as Models of Auditory Anticipation
Chapter 17
Sanaul Haq (University of Surrey, UK), Philip J.B. Jackson (University of Surrey, UK)
Recent advances in human-computer interaction technology go beyond the successful transfer of data between human and machine by seeking to improve the naturalness and friendliness of user interactions. An important augmentation, and...
Multimodal Emotion Recognition
Chapter 18
Francis F. Li (The University of Salford, Greater Manchester, UK), Paul Kendrick (The University of Salford, Greater Manchester, UK), Trevor J. Cox (The University of Salford, Greater Manchester, UK)
Propagation of sound from a source to a receiver in an enclosure can be modeled as an acoustic transmission channel. Objective room acoustic parameters are routinely used to quantify properties of such channels in the design and...
Machine Audition of Acoustics: Acoustic Channel Modeling and Room Acoustic Parameter Estimation
Chapter 19
Pedro Gómez-Vilda (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), José Manuel Ferrández-Vicente (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), Victoria Rodellar-Biarge (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), Rafael Martínez-Olalla (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), Víctor Nieto-Lluis (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), Luis Mazaira-Fernández (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain), Cristina Miguel Muñoz-Mulas (Grupo de Informática Aplicada al Procesado de Señal e Imagen Universidad Politécnica de Madrid, Spain)
Current trends in the search for improvements in well-established technologies imitating human abilities, such as speech perception, try to find inspiration in the explanation of certain capabilities hidden in the natural system which are...
Neuromorphic Speech Processing: Objectives and Methods
Back Materials
Compilation of References
About the Contributors
Index
