Recognition of Musical Instrument Using Deep Learning Techniques


Sangeetha Rajesh, Nalini N. J.
Copyright © 2021 | Pages: 20
DOI: 10.4018/IJIRR.2021100103

Abstract

The proposed work investigates the impact of Mel Frequency Cepstral Coefficients (MFCC), Chroma DCT-Reduced Log Pitch (CRP), and Chroma Energy Normalized Statistics (CENS) features on instrument recognition from monophonic instrumental music clips using three deep learning techniques: Bidirectional Recurrent Neural Networks with Long Short-Term Memory (BRNN-LSTM), stacked autoencoders (SAE), and Convolutional Neural Network - Long Short-Term Memory (CNN-LSTM). First, MFCC, CENS, and CRP features are extracted from instrumental music clips collected as a dataset from various online libraries. The deep neural network models are then trained on the extracted features. Recognition rates of 94.9%, 96.8%, and 88.6% are achieved with combined MFCC and CENS features, and 90.9%, 92.2%, and 87.5% with combined MFCC and CRP features, using the deep learning models BRNN-LSTM, CNN-LSTM, and SAE, respectively. The experimental results show that combining MFCC features with CENS and CRP features at the score level improves the performance of the proposed system.
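As a rough illustration of the feature-extraction step described above, the sketch below computes MFCC and CENS features from one music clip using librosa. The sampling rate, coefficient count, and file path are illustrative assumptions rather than values taken from the paper, and librosa provides no built-in CRP extractor, so only the MFCC and CENS streams are shown.

# Minimal sketch of the feature-extraction step, assuming librosa.
# The sampling rate, n_mfcc, and the file path are illustrative guesses.
import librosa
import numpy as np

def extract_features(path, sr=22050, n_mfcc=13):
    """Return MFCC and CENS feature matrices for one music clip."""
    y, _ = librosa.load(path, sr=sr)
    # Per-frame Mel Frequency Cepstral Coefficients (timbre information)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # 12-dimensional Chroma Energy Normalized Statistics (harmonic content)
    cens = librosa.feature.chroma_cens(y=y, sr=sr)
    # Transpose to (frames, coefficients) for sequence models
    return mfcc.T, cens.T

mfcc, cens = extract_features("clips/violin_001.wav")  # hypothetical clip
print(mfcc.shape, cens.shape)

The two feature matrices can then be fed to separate classifiers whose per-class scores are fused, as sketched in the next section.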

Theoretical Background

Musical instrument recognition aims to identify the instruments played in a music clip. Humans can easily recognize instruments from characteristics such as pitch, loudness, and playing style, but building a system that recognizes an instrument from a music clip is difficult because the essential information is hard to extract. According to Giannoulis & Klapuri (2013), timbre is the major acoustic characteristic that enables humans to recognize sound sources, and it is a vital feature for conveying the uniqueness of musical instruments. However, the timbre of a sound cannot be associated with any single physical quantity (Agostini, Longari, & Pollastri, 2001). An exhaustive study by Hall & Bahoura (2012) showed that MFCC and LPCC features make a powerful and salient contribution to timbre characterization, and they also demonstrated high instrument recognition scores using MFCC features. MFCCs have been extensively studied and proven to be reliable, robust features in speech and speaker recognition tasks (Al-Ali, Dean, Senadji, Chandran, & Naik, 2017). The efficiency of MFCC has also been demonstrated in many MIR problems, in particular genre classification, artist recognition, and music emotion recognition (Nalini & Palanivel, 2016). Chandwadkar & Sutaone (2013) studied the impact of feature selection and classification techniques on instrument identification. They combined MFCC with spectral features and autocorrelation coefficients to identify four instruments using various classifiers, such as Decision Tree, K-Nearest Neighbours, Multi-Layer Perceptron, Sequential Minimal Optimization, and a meta-classifier, and demonstrated the superiority of MFCC features over spectral features for instrument identification.
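To make the BRNN-LSTM and score-level fusion ideas concrete, the sketch below builds a bidirectional LSTM classifier over a (frames, features) sequence and averages the softmax scores of an MFCC-stream model and a CENS-stream model. This is a hedged sketch assuming Keras; the layer sizes, sequence length, number of instrument classes, and fusion weight are illustrative guesses, not the architecture or settings reported in the paper.

# Hedged sketch of a BRNN-LSTM classifier and score-level fusion, assuming
# tensorflow.keras. All hyperparameters here are illustrative, not the paper's.
import numpy as np
from tensorflow.keras import layers, models

def build_brnn_lstm(n_frames, n_features, n_instruments):
    """Bidirectional LSTM that maps a feature sequence to class scores."""
    model = models.Sequential([
        layers.Input(shape=(n_frames, n_features)),
        layers.Bidirectional(layers.LSTM(64)),   # summarize the whole clip
        layers.Dense(n_instruments, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# One model per feature stream (13 MFCCs, 12 CENS dimensions; 5 instruments
# is an assumed class count).
mfcc_model = build_brnn_lstm(n_frames=100, n_features=13, n_instruments=5)
cens_model = build_brnn_lstm(n_frames=100, n_features=12, n_instruments=5)
# ... train each model on its own feature stream ...

def fuse_scores(p_mfcc, p_cens, w=0.5):
    """Score-level fusion: weighted average of per-class softmax scores."""
    return w * p_mfcc + (1.0 - w) * p_cens

# predicted = np.argmax(fuse_scores(mfcc_model.predict(X_mfcc),
#                                   cens_model.predict(X_cens)), axis=1)

Fusing at the score level rather than concatenating features lets each stream keep a model suited to its own dimensionality, which is one plausible reading of why the combined MFCC+CENS and MFCC+CRP systems outperform single-feature ones.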
