Autoencoders in Deep Neural Network Architecture for Real Work Applications: Convolutional Denoising Autoencoders

Houda Abouzid (Abdelmalek Essaadi University, Morocco) and Otman Chakkor (Abdelmalek Essaadi University, Morocco)
DOI: 10.4018/978-1-7998-0117-7.ch007

Abstract

Most of the sounds we hear exist as mixtures of several audio sources. Human beings have the ability to concentrate on a single source of interest and ignore the other sources as disturbing background noise. To give a machine this powerful gift, it must pass through a source separation process. When there is not enough information about how the sources were mixed, or about their nature, the problem is known as Blind Source Separation (BSS). This chapter is dedicated to the study of BSS as a solution for human-machine interaction. The objective consists in recovering one or several source signals from a given mixture signal. Recently, scientific research has moved towards artificial intelligence and machine learning applications. The proposed approach to the separation is to apply a deep neural network method based on Keras. Extracting features from the audio with signal processing techniques, and using machine learning to learn a representation of the audio for compression tasks and noise suppression, will improve on the state of the art.
Chapter Preview

Introduction

The Blind Source Separation (BSS) problem consists in finding statistically independent signals from their mixtures (observations), without any prior knowledge of the structure of the mixtures or of the source signals.
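The setting above is often written as the instantaneous mixing model x = A s, where s holds the unknown sources and A the unknown mixing matrix, and BSS must recover s from x alone. A minimal sketch of that setup (the signal choices and matrix values here are illustrative assumptions, not from the chapter):

```python
import numpy as np

# Instantaneous mixing model x = A s: two synthetic sources are
# combined by a mixing matrix that a BSS algorithm would not know.
t = np.linspace(0.0, 1.0, 1000)
s = np.vstack([
    np.sin(2 * np.pi * 5 * t),              # source 1: sinusoid
    np.sign(np.sin(2 * np.pi * 3 * t)),     # source 2: square wave
])

A = np.array([[1.0, 0.5],
              [0.6, 1.0]])                   # unknown in a real BSS problem

x = A @ s                                    # the observed mixtures

print(x.shape)  # (2, 1000): two mixture channels, one per "microphone"
```

A separation method (ICA, or the neural approach proposed in this chapter) receives only `x` and tries to produce estimates of the rows of `s`.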

Source separation occurs in a variety of applications, such as locating and tracking targets in radar and sonar, the separation of speakers (the so-called “cocktail party” problem), detection and separation in multiple-access communication systems, Independent Component Analysis (ICA) of biomedical signals (e.g., EEG or ECG), etc. This problem has been intensively studied in the literature, and many effective solutions have already been proposed (Belouchrani, Abed-Meraim, Cardoso, & Moulines, 1997).

In addition to the separation problem, there is another challenge that should not be ignored during the separation process: removing the noise contributed by the unwanted sources, with the aim of a high-quality restitution. Different techniques are used for this purpose, among them autoencoders.

Autoencoders are an unsupervised learning technique, because no explicit labels are needed to train the model. The algorithm takes the input data, in our case audio signals, and tries to reconstruct it from a smaller number of bits in the latent space. This compression of the data happens during the training of the neural network. Just as the first goal of Principal Component Analysis (PCA) (Abouzid & Chakkor, 2018) is to reduce the dimension of the space, autoencoders play the same role. In general, the idea is to project the dataset into a smaller space while removing some unuseful parts. As most researchers in the signal processing field know, PCA uses a linear transformation, whereas the autoencoder uses a non-linear transformation; this is the big difference between them.
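The ideas above (unsupervised training, a compressed latent space, non-linear encoding, and denoising by reconstructing clean data from noisy inputs) can be sketched in Keras, which the chapter names as its framework. This is only a minimal sketch: the layer sizes, noise level, and use of dense rather than convolutional layers are illustrative assumptions, not the chapter's architecture.

```python
import numpy as np
import tensorflow as tf

# A minimal denoising autoencoder: a non-linear encoder compresses
# 128-dimensional inputs into a 32-dimensional latent space, and a
# decoder reconstructs the clean input from that code.
inp = tf.keras.Input(shape=(128,))
encoded = tf.keras.layers.Dense(32, activation="relu")(inp)     # non-linear compression
decoded = tf.keras.layers.Dense(128, activation="linear")(encoded)

autoencoder = tf.keras.Model(inp, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# No labels: the target is the clean data itself. Training maps
# noisy inputs back to their clean versions (the "denoising" part).
clean = np.random.rand(256, 128).astype("float32")
noisy = clean + 0.1 * np.random.randn(256, 128).astype("float32")
autoencoder.fit(noisy, clean, epochs=2, batch_size=32, verbose=0)

# The encoder alone gives the compressed latent representation.
encoder = tf.keras.Model(inp, encoded)
latent = encoder.predict(clean, verbose=0)
print(latent.shape)  # (256, 32): each sample compressed to 32 values
```

Replacing the `Dense` layers with `Conv1D`/`Conv1DTranspose` layers over audio frames would turn this into the convolutional denoising autoencoder the chapter's title refers to; the training loop is unchanged.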
