Instantaneous Versus Convolutive Non-Negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation

Wenwu Wang

Source Title: Machine Audition: Principles, Algorithms and Systems

DOI: 10.4018/978-1-61520-919-4.ch015

Abstract

Non-negative matrix factorization (NMF) is an emerging technique for data analysis and machine learning, which aims to find low-rank representations for non-negative data. Early work in NMF was mainly based on the instantaneous model, i.e. using a single basis matrix to represent the data. Recent work has shown that the instantaneous model may not be satisfactory for many audio applications. The convolutive NMF model, which has the advantage of revealing the temporal structure possessed by many signals, has therefore been proposed. This chapter provides a brief overview of the models and algorithms for both the instantaneous and the convolutive NMF, with a focus on the theoretical analysis and performance evaluation of the convolutive NMF algorithms, and their applications to audio pattern separation problems.

Introduction

Since the seminal paper published by Lee and Seung in 1999, non-negative matrix factorization (NMF) has attracted tremendous research interest. The earliest work in NMF is perhaps that of (Paatero, 1997); the technique was then made popular by Lee and Seung through their elegant multiplicative algorithms (Lee & Seung, 1999, Lee & Seung, 2001). The aim of NMF is to look for latent structures or features within a dataset, through the representation of a non-negative data matrix by a product of low-rank matrices. It was found in (Lee & Seung, 1999) that NMF results in a “parts”-based representation due to the non-negativity constraint, because only additive operations are allowed in the learning process. Although later works in NMF may involve mathematical operations that can lead to negative elements within the low-rank matrices, their non-negativity can be ensured by a projection operation (Zdunek & Cichocki, 2007, Soltuz et al, 2008). Another interesting property of the NMF technique is that the decomposed low-rank matrices are usually sparse, and the degree of their sparseness can be explicitly controlled in the algorithm (Hoyer, 2004). Thanks to these promising properties, NMF has been applied to many problems in data analysis, signal processing, computer vision, and pattern recognition, see, e.g. (Lee & Seung, 1999, Pauca et al, 2006, Smaragdis & Brown, 2003, Wang & Plumbley, 2005, Parry & Essa, 2007, FitzGerald et al, 2005, Wang et al, 2006, Zou et al, 2008, Wang et al, 2009, Cichocki et al, 2006b).
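
As a rough illustration of the instantaneous model described above, the following Python sketch factorizes a non-negative matrix V into a product W H using the multiplicative update rules for the squared Euclidean distance that are commonly attributed to Lee and Seung (2001). The function names, the random initialization, the iteration count and the small constant eps are assumptions made for this example and are not taken from the chapter.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative M x N matrix V by W @ H.

    W is M x rank and H is rank x N. Both stay non-negative because
    each update only multiplies by a ratio of non-negative terms.
    """
    rng = np.random.default_rng(seed)
    M, N = V.shape
    # Assumed initialization: small positive random values.
    W = rng.random((M, rank)) + eps
    H = rng.random((rank, N)) + eps
    for _ in range(n_iter):
        # Update the activations H, then the basis W (Euclidean cost).
        # eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic rank-2 non-negative data, standing in for a spectrogram.
    V = rng.random((6, 2)) @ rng.random((2, 8))
    W, H = nmf_multiplicative(V, rank=2)
    print("residual:", frobenius_error(V, W, H))
```

No subtraction appears in the updates, which is one way to see why the factors remain non-negative and the representation is built additively from parts.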

In machine audition and audio signal processing, NMF has also found applications in, for example, music transcription (Smaragdis & Brown, 2003, Wang et al, 2006) and audio source separation (Wang & Plumbley, 2005, Parry & Essa, 2007, FitzGerald et al, 2005, FitzGerald et al, 2006, Virtanen, 2007, Wang et al, 2009). In these applications, the raw audio data are usually transformed to the frequency domain to generate the spectrogram, i.e. the non-negative data matrix, which is then used as the input to the NMF algorithm. The instantaneous NMF model given in (Lee & Seung, 1999, Lee & Seung, 2001) has been shown to be satisfactory for certain audio tasks, provided that the spectral frequencies of the analyzed signal do not change dramatically over time (Smaragdis, 2004, Smaragdis, 2007, Wang, 2007, Wang et al, 2009). However, this is not the case for many realistic audio signals, whose frequencies do vary with time. The main limitation of the instantaneous NMF model is that only a single basis function is used, which is not sufficient to capture the temporal dependency of the frequency patterns within the signal. To address this issue, the convolutive NMF model (or similar methods called shifted NMF) has been introduced (Smaragdis, 2004, Smaragdis, 2007, Virtanen, 2007, FitzGerald et al, 2005, Morup et al, 2007, Schmidt & Morup, 2006, O’Grady & Pearlmutter, 2006, Wang, 2007, Wang et al, 2009). For the convolutive NMF, the data to be analyzed are modelled as a linear combination of shifted matrices, representing the time delays of multiple bases. Several algorithms have been developed based on this model, for example, the Kullback-Leibler (KL) divergence based multiplicative algorithm proposed in (Smaragdis, 2004, Smaragdis, 2007), the squared Euclidean distance based multiplicative algorithm proposed in (Wang, 2007, Wang et al, 2009), the two-dimensional deconvolution algorithms proposed in (Schmidt & Morup, 2006), the logarithmic scaled spectrogram decomposition algorithm in (FitzGerald et al, 2005), and the algorithm based on the constraints of the temporal continuity and sparseness of the signals in (Virtanen, 2007).
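
To make the idea of a linear combination of shifted matrices more concrete, the sketch below computes only the reconstruction step of a convolutive model. It assumes the commonly used formulation in which the data matrix is approximated by the sum over t of W(t) multiplied by the activation matrix H with its columns shifted t positions to the right. The names shift_right and conv_nmf_reconstruct, and the storage of the bases as a T x M x R array, are hypothetical choices for this example. None of the update algorithms cited above is implemented here.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t positions, zero-filling on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def conv_nmf_reconstruct(W, H):
    """Reconstruct an M x N matrix from a convolutive model.

    W has shape (T, M, R): one M x R basis matrix per time lag t.
    H has shape (R, N): the activations of the R bases over time.
    """
    T = W.shape[0]
    return sum(W[t] @ shift_right(H, t) for t in range(T))

if __name__ == "__main__":
    # One basis (R = 1) spanning two time lags (T = 2) and two
    # frequency bins (M = 2): energy in bin 0 first, then in bin 1.
    W = np.array([[[1.0], [0.0]],
                  [[0.0], [1.0]]])
    H = np.array([[1.0, 0.0, 2.0]])
    print(conv_nmf_reconstruct(W, H))
```

With T = 1 the sum has a single unshifted term and the expression reduces to the instantaneous model W(0) H, so in this formulation the convolutive model can be read as a generalization of the instantaneous one.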
