Intuitionistic Fuzzy Neighborhood Rough Set Model for Feature Selection

Shivam Shreevastava (Indian Institute of Technology, Banaras Hindu University, Varanasi, India), Anoop Kumar Tiwari (Institute of Science, Banaras Hindu University, Varanasi, India) and Tanmoy Som (Indian Institute of Technology, Banaras Hindu University, Varanasi, India)
Copyright: © 2018 | Pages: 10
DOI: 10.4018/IJFSA.2018040104

Abstract

Feature selection is one of the most widely used pre-processing techniques for dealing with large data sets. In this context, rough set theory has been successfully applied to feature selection on discrete data sets, but continuous data sets require discretization, which may cause information loss. Fuzzy rough set approaches have also been used successfully to resolve this issue, as they can handle continuous data directly. Moreover, almost all feature selection techniques are designed for homogeneous data sets. This article focuses on heterogeneous feature subset reduction. A novel intuitionistic fuzzy neighborhood rough set model is proposed by combining intuitionistic fuzzy sets with neighborhood rough set models through an appropriate pair of lower and upper approximations, and it is generalized for feature selection, supported by theory and its validation. An appropriate algorithm and an application to a data set are also presented.

1. Introduction

Every day, quintillions of bytes of data are created from various sources, such as sensor data on climate, census and agricultural information, posts to social media sites, digital pictures and videos, sale and purchase transaction records, and cell phone GPS signals, to name a few. Such huge volumes of data can make learning with various classifiers difficult because of the redundant features present in the data sets (Jain, Duin & Mao, 2000). The dimension of the data sets therefore needs to be reduced to obtain significant and informative features, which decreases cost, storage and processing time and improves classification and prediction. Various preprocessing techniques have been proposed over time to tackle big data, but many of them have proved inadequate because of their own limitations. Feature selection is one of the well-known data preprocessing techniques in data mining, machine learning, pattern recognition, bioinformatics and medical image processing (Duda, Hart & Stork, 1973; Zhu, Ong & Dash, 2007; Yu & Liu, 2004; Hall, 1999; Dash & Liu, 2003). Feature selection is the process of selecting those input features that are most predictive of a desired outcome. It has an advantage over other dimensionality reduction techniques in that it preserves the original meaning of the features after reduction.

Feature selection methods (Guyon & Elisseeff, 2003; Saeys, Inza & Larrañaga, 2007; Dash & Liu, 1997) can be categorized as filter, embedded, wrapper, unsupervised, semi-supervised and supervised (Bhatt & Gopal, 2005; Zhu, Ong & Dash, 2007; Liu & Yu, 2005), etc. Feature selection techniques can also be divided into two categories: symbolic methods and numerical methods. Symbolic methods treat all features as categorical variables, whereas numerical methods treat all features as real-valued variables. When the features are heterogeneous, symbolic methods (such as rough set (Pawlak, 1982; Pawlak, 2012) based feature selection) use a discretization approach (Ching, Wong & Chan, 1995) to convert the numerical features into symbolic ones, which introduces assumptions and may cause information loss. Discretization may damage two types of structure in real space: the neighborhood structure and the ordered structure. The latter problem was handled by the fuzzy rough set model, but that model was not able to tackle the former. Very few studies have addressed the neighborhood structure of data sets.
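The loss of neighborhood structure under discretization can be illustrated with a minimal sketch. The example below is an assumption made for illustration only: it uses simple equal-width binning (not the discretization method of Ching, Wong & Chan) and a plain distance threshold `delta` as the neighborhood, and the function names `equal_width_bin` and `neighbors` are hypothetical.

```python
def equal_width_bin(value, low, high, n_bins):
    """Return the index of the equal-width bin that contains value."""
    width = (high - low) / n_bins
    index = int((value - low) / width)
    return min(index, n_bins - 1)  # the right endpoint falls in the last bin


def neighbors(values, i, delta):
    """Indices of samples whose distance to sample i is at most delta."""
    return [j for j, v in enumerate(values) if abs(v - values[i]) <= delta]


# Hypothetical values of one real-valued feature for four samples.
values = [0.10, 0.49, 0.51, 0.90]

# Discretize into two equal-width bins over [0, 1] (illustrative choice).
bins = [equal_width_bin(v, 0.0, 1.0, 2) for v in values]

# Neighborhood of the sample with value 0.49, using delta = 0.05.
near = neighbors(values, 1, 0.05)

print(bins)
print(near)
```

In this sketch the samples with values 0.49 and 0.51 are neighbors in real space, yet binning assigns them different symbols, while 0.10 and 0.49 receive the same symbol despite being far apart. This is one way to read the claim that discretization can damage the neighborhood structure.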
