Rough and Soft Set Approaches for Attributes Selection of Traditional Malay Musical Instrument Sounds Classification

Norhalina Senan (Universiti Tun Hussein Onn Malaysia, Malaysia), Rosziati Ibrahim (Universiti Tun Hussein Onn Malaysia, Malaysia), Nazri Mohd Nawi (Universiti Tun Hussein Onn, Malaysia), Iwan Tri Riyadi Yanto (Universitas Ahmad Dahlan, Indonesia) and Tutut Herawan (Universitas Ahmad Dahlan, Indonesia)
DOI: 10.4018/jssci.2012040102

Abstract

Feature selection, or attribute reduction, is performed mainly to avoid the ‘curse of dimensionality’ in large database problems, including musical instrument sound classification. The problem arises from irrelevant and redundant features. Rough set theory and soft set theory, proposed by Pawlak and Molodtsov respectively, are mathematical tools for dealing with uncertain and imprecise data. Rough and soft set-based dimensionality reduction can be considered machine learning approaches to feature selection. In this paper, the authors apply these approaches to data cleansing and feature selection for Traditional Malay musical instrument sound classification. The data cleansing technique is based on matrix computation of multi-soft sets, while feature selection uses maximum attribute dependency based on rough set theory. The modeling process comprises eight phases: data acquisition, sound editing, data representation, feature extraction, data discretization, data cleansing, feature selection, and feature validation via classification. The results show that the highest classification accuracy, 99.82%, was achieved with the best 17 features and a 1-NN classifier.

Introduction

One of the most common problems encountered in many data mining tasks, including music data analysis and signal processing, is the ‘curse of dimensionality.’ This problem arises with high-dimensional data that have a massive number of attributes. Using the whole set of attributes is inefficient in terms of processing time and storage requirements. In addition, the resulting model may be difficult to interpret, and classification performance may decrease. The solution is to remove irrelevant and redundant features and select the most important features, which may yield a better classifier (Liu, Jiang, & Yang, 2010). This process is known as feature selection or attribute reduction.

It has been proven that finding all possible reductions in an information system is an NP-hard problem. For that reason, data reduction ought to be properly addressed. The theories of rough set, proposed by Pawlak in the 1980s (Pawlak, 1982), and soft set, proposed by Molodtsov (1999), have emerged as powerful tools for dealing with uncertainty that arises from inexact, noisy, or incomplete information. In feature selection, rough set theory is used to find the minimal subsets of attributes that are sufficient to generate the same classification accuracy as the whole set of attributes. Such a minimal feature set is known as a reduct. Banerjee et al. (2006) stated that the concepts of reduct and core in rough set theory are relevant to feature selection for identifying the essential features among the non-redundant ones. Liu, Jiang, and Yang (2010) also noted that rough set-based reduction has been applied by many researchers to feature selection problems; their own work proposes the concept of inconsistency in attribute reduction.

Soft sets, meanwhile, are called elementary neighborhood systems. Molodtsov pointed out that one of the main advantages of soft set theory is that it is free from the inadequacy of the parameterization tools found in the theories of fuzzy sets, probability, and interval mathematics (Molodtsov, 1999). Soft set theory has also been applied as a dimensionality reduction method (Maji, Roy, & Biswas, 2002; Chen et al., 2003, 2005; Kong et al., 2008). Maji, Roy, and Biswas (2002) applied soft set theory to a decision making problem with the help of Pawlak’s rough reduct. The reduct soft set algorithm, which is defined from rough set theory, is employed as a reduction method, and the weighted choice value is then embedded in the algorithm to select the optimal decision. Chen et al. (2003, 2005) and Kong et al. (2008) studied the parameterization reduction of soft sets and its applications. Their studies highlight two major problems in Maji, Roy, and Biswas (2002): the computed reduction is incorrect, and the algorithm that computes the reduction and then selects the optimal objects is not reasonable. To address these problems, they presented a new definition of parameterization reduction of soft sets based on the concept of attribute reduction in rough set theory.
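For readers unfamiliar with the reduct notion mentioned above, the standard Pawlak formulation can be written as follows. The symbols U (universe of objects), C (condition attributes), D (decision attribute), and R (a candidate subset) are notation assumed here for illustration and do not appear in the text above, which describes a reduct informally in terms of preserved classification accuracy.

```latex
% Positive region and dependency degree of D on C
% (\underline{C}(X) is the C-lower approximation of the decision class X)
\[
\mathrm{POS}_C(D) = \bigcup_{X \in U/D} \underline{C}(X),
\qquad
\gamma_C(D) = \frac{|\mathrm{POS}_C(D)|}{|U|}
\]
% R \subseteq C is a reduct when it preserves the dependency and is minimal
\[
\gamma_R(D) = \gamma_C(D)
\quad\text{and}\quad
\gamma_{R \setminus \{a\}}(D) < \gamma_R(D)
\ \ \text{for every } a \in R
\]
```

Because the dependency degree never decreases when attributes are added, checking the removal of each single attribute is enough to establish minimality.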

In this paper, we apply soft set theory as a data cleansing technique, developed using matrix computation of multi-soft sets. Further, we propose a feature selection technique using rough set theory, based on the maximum dependency of attributes proposed by Herawan, Mustafa, and Abawajy (2010), specifically for the Traditional Malay musical instrument sound classification problem. The main contribution of our work is to delete the irrelevant features using the soft set approach and then select the most significant features by ranking the relevant features according to the highest dependency of attributes on the dataset using the rough set approach. After that, redundant features with similar dependency values are deleted.
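The ranking and redundancy-removal steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the classical Pawlak dependency of the class label on each single discretized feature as a stand-in for the maximum dependency measure of Herawan, Mustafa, and Abawajy (2010), and the names `partition`, `dependency`, `select_features`, and the toy table `ROWS` are hypothetical.

```python
from collections import defaultdict

# Toy discretized decision table (hypothetical data, not from the paper).
ROWS = [
    {"f1": 0, "f2": 0, "f3": 1, "cls": "a"},
    {"f1": 0, "f2": 0, "f3": 1, "cls": "a"},
    {"f1": 1, "f2": 1, "f3": 0, "cls": "b"},
    {"f1": 1, "f2": 1, "f3": 0, "cls": "b"},
    {"f1": 2, "f2": 1, "f3": 0, "cls": "c"},
    {"f1": 2, "f2": 1, "f3": 0, "cls": "c"},
]


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, dec_attr):
    """Pawlak dependency degree: size of the positive region divided by |U|."""
    dec_blocks = partition(rows, [dec_attr])
    positive = 0
    for block in partition(rows, cond_attrs):
        # A block is in the positive region if it lies inside one decision class.
        if any(block <= d for d in dec_blocks):
            positive += len(block)
    return positive / len(rows)


def select_features(rows, features, dec_attr):
    """Rank features by dependency, then drop those repeating an earlier value."""
    scored = [(f, dependency(rows, [f], dec_attr)) for f in features]
    # Stable sort: ties keep their original feature order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    kept, seen = [], set()
    for name, score in scored:
        key = round(score, 6)  # assumption: equal after rounding means redundant
        if key not in seen:
            seen.add(key)
            kept.append((name, score))
    return kept


if __name__ == "__main__":
    for name, score in select_features(ROWS, ["f1", "f2", "f3"], "cls"):
        print(name, score)
```

In this toy table, `f2` and `f3` induce the same partition of the objects and therefore the same dependency value, so only the first of the two survives. Treating equal dependency as redundancy is a simplification adopted for the sketch.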
