Feature Reduction with Inconsistency

Yong Liu, Yunliang Jiang, Jianhua Yang
DOI: 10.4018/jcini.2010040106

Abstract

Feature selection is a classical problem in machine learning. A central challenge is to design a method whose selected features preserve all the internal semantic correlation of the original feature set. The authors present a general approach to feature selection via rough set based reduction, which keeps the selected features with the same semantic correlation as the original feature set. A new concept, inconsistency, is proposed; it allows the positive region to be calculated easily and quickly, with only linear time complexity. Some properties of inconsistency, such as its monotonicity, are also given. The authors also propose three inconsistency based attribute reduction generation algorithms with different search policies. Finally, a “mini-saturation” bias is presented for choosing the proper reduction for further predictive designing.
Article Preview

Some related definitions and concepts are presented as follows:

Definition 1 Positive region. P and Q are two attribute sets in the information system U(C, D), jcini.2010040106.m01; the positive region of Q in P, denoted as jcini.2010040106.m02, can be calculated as:
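The equation for Definition 1 is not reproduced in this preview. As a minimal sketch, the code below assumes the standard rough-set reading, in which the positive region of Q in P is the union of all P-equivalence classes that fit entirely inside a single Q-equivalence class. The function names, the list-of-dicts data layout, and the toy table are illustrative assumptions, not taken from the article.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())

def positive_region(rows, P, Q):
    """Indices of rows whose P-class lies entirely inside one Q-class.

    Assumes the standard rough-set definition; the article's own formula
    is not visible in the preview.
    """
    q_blocks = partition(rows, Q)
    pos = set()
    for p_block in partition(rows, P):
        if any(p_block <= q_block for q_block in q_blocks):
            pos |= p_block
    return pos

# Hypothetical toy table: condition attributes a, b; decision attribute d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
]
pos_ab = positive_region(rows, ("a", "b"), ("d",))
pos_a = positive_region(rows, ("a",), ("d",))
```

In this toy table the first two rows agree on a and b but disagree on d, so they fall outside the positive region of d in {a, b}.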

Definition 2 Attribute dependency. P and Q are two attribute sets in the information system U(C, D), jcini.2010040106.m04; the attribute dependency of attribute set Q on attribute set P, denoted as jcini.2010040106.m05, can be calculated as:
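The equation for Definition 2 is likewise missing from this preview. The sketch below assumes the usual rough-set form, in which the dependency of Q on P is the size of the positive region of Q in P divided by the number of instances in U. The function name and the toy table are hypothetical, and the table is assumed to be non-empty.

```python
from collections import defaultdict

def dependency(rows, P, Q):
    """Fraction of rows whose P-class carries a single combination of Q-values.

    Assumes dependency = |positive region| / |U| (standard rough-set form,
    not confirmed by the preview). rows must be non-empty.
    """
    q_values = defaultdict(set)   # P-class key -> distinct Q-value tuples seen
    sizes = defaultdict(int)      # P-class key -> number of rows in the class
    for row in rows:
        key = tuple(row[a] for a in P)
        q_values[key].add(tuple(row[a] for a in Q))
        sizes[key] += 1
    positive = sum(n for key, n in sizes.items() if len(q_values[key]) == 1)
    return positive / len(rows)

# Hypothetical toy table: condition attributes a, b; decision attribute d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
]
gamma_ab = dependency(rows, ("a", "b"), ("d",))
gamma_a = dependency(rows, ("a",), ("d",))
```

A value of 1 would mean every instance can be classified unambiguously from P alone; here {a, b} reaches 0.5 and {a} alone only 0.25.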

The attribute dependency describes which variables are strongly related to which other variables. For example, if jcini.2010040106.m07, then jcini.2010040106.m08 can be viewed as a measure of the dependency between the decision attributes and the condition attributes, which can be used in further predictive modeling.

With the definition of attribute dependency, the attribute reduct can be defined as follows:

Definition 3 Attribute reduct. In information system U(C, D), jcini.2010040106.m09, R is a reduct of C if and only if jcini.2010040106.m10 and jcini.2010040106.m11, or equivalently jcini.2010040106.m12 and jcini.2010040106.m13.
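The conditions in Definition 3 are not visible in this preview. The sketch below assumes the common rough-set formulation: R is a reduct of C when R yields the same positive region size for D as C does, and no proper subset of R does. Because shrinking an attribute set can never enlarge the positive region, it is enough to test the subsets obtained by dropping one attribute at a time. All names and the toy table are hypothetical.

```python
from collections import defaultdict

def positive_count(rows, P, D):
    """Number of rows whose P-class carries a single decision value."""
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in P)
        decisions[key].add(tuple(row[a] for a in D))
        sizes[key] += 1
    return sum(n for key, n in sizes.items() if len(decisions[key]) == 1)

def is_reduct(rows, R, C, D):
    """Check the assumed reduct conditions for R relative to C and D."""
    target = positive_count(rows, C, D)
    if positive_count(rows, R, D) != target:
        return False  # R loses discriminability compared with C
    for a in R:
        smaller = tuple(b for b in R if b != a)
        if positive_count(rows, smaller, D) == target:
            return False  # attribute a is redundant, so R is not minimal
    return True

# Hypothetical toy table in which attribute a alone determines d.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
C, D = ("a", "b", "c"), ("d",)
a_is_reduct = is_reduct(rows, ("a",), C, D)
ab_is_reduct = is_reduct(rows, ("a", "b"), C, D)
bc_is_reduct = is_reduct(rows, ("b", "c"), C, D)
```

Here {a} qualifies, {a, b} fails because b is redundant, and {b, c} fails because it cannot separate the second and fourth rows.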

The essence of an attribute reduct is to find a subset P of the condition attribute set that maintains the same discriminability over the instance space. We can therefore judge whether a set is a reduct by its discriminability over the instance space. The positive region, which counts the instances that can be discriminated with the attribute set, can thus be used to find the reduct.

From the definition of attribute reduct, we can see that a reduct keeps the internal correlation of the attributes. Here we introduce reduction into feature selection, as the reduct maintains the same discriminability as the original data set (Jensen & Shen, 2004).

Definition 4 Inconsistent condition. In information system U(C, D), where C is the condition attribute set and D is the decision attribute set, jcini.2010040106.m14, if jcini.2010040106.m15 and jcini.2010040106.m16, then there is an inconsistent condition between instance jcini.2010040106.m17 and instance jcini.2010040106.m18.
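The two conditions in Definition 4 are not shown in this preview. A natural reading, assumed in the sketch below, is that two instances are inconsistent when they agree on every condition attribute but differ on at least one decision attribute. The function name and the sample instances are hypothetical.

```python
def is_inconsistent_pair(x, y, C, D):
    """True if x and y match on all of C but differ somewhere on D.

    This is an assumed reading of Definition 4, whose exact conditions
    are not visible in the preview.
    """
    same_condition = all(x[a] == y[a] for a in C)
    different_decision = any(x[a] != y[a] for a in D)
    return same_condition and different_decision

# Hypothetical instances: condition attributes a, b; decision attribute d.
C, D = ("a", "b"), ("d",)
x = {"a": 0, "b": 0, "d": "no"}
y = {"a": 0, "b": 0, "d": "yes"}
z = {"a": 0, "b": 1, "d": "yes"}
xy = is_inconsistent_pair(x, y, C, D)
xz = is_inconsistent_pair(x, z, C, D)
yz = is_inconsistent_pair(y, z, C, D)
```

Only x and y conflict: they share the same condition values yet carry different decisions.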

Definition 5 Inconsistent instance number. In information system U(C, D), where C is the condition attribute set and D is the decision attribute set, if jcini.2010040106.m19, jcini.2010040106.m20, jcini.2010040106.m21, the inconsistent instance number of set P is denoted as jcini.2010040106.m22 and calculated as follows:
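The formula for Definition 5 is not reproduced in this preview, so the authors' exact counting rule is unknown here. Purely as an illustration, the sketch below counts the instances that belong to a P-equivalence class containing more than one decision value, which is one plausible reading in line with Definition 4. The function name, the counting rule, and the toy table are all assumptions. With hashing, a single pass over the data suffices, giving expected linear time in the number of instances.

```python
def inconsistent_count(rows, P, D):
    """Count rows lying in a P-class that contains conflicting decisions.

    Assumed counting rule for illustration only; not the article's formula.
    Uses one hash-based pass over rows.
    """
    sizes = {}           # P-class key -> number of rows in the class
    first_decision = {}  # P-class key -> first decision tuple seen
    conflicted = set()   # P-class keys with at least two distinct decisions
    for row in rows:
        key = tuple(row[a] for a in P)
        dec = tuple(row[a] for a in D)
        sizes[key] = sizes.get(key, 0) + 1
        if first_decision.setdefault(key, dec) != dec:
            conflicted.add(key)
    return sum(sizes[key] for key in conflicted)

# Hypothetical toy table: condition attributes a, b; decision attribute d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
]
inc_ab = inconsistent_count(rows, ("a", "b"), ("d",))
inc_a = inconsistent_count(rows, ("a",), ("d",))
pos_size_ab = len(rows) - inc_ab
```

Under this counting rule, the size of the positive region of D in P is the total number of instances minus the inconsistent count, so the positive region size follows from the same single pass.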
