Feature selection is a classical problem in machine learning. A central challenge is designing a method that selects features which retain all the internal semantic correlation of the original feature set. The authors present a general approach to feature selection via rough set based reduction, which keeps the selected features with the same semantic correlation as the original feature set. A new concept named inconsistency is proposed, which can be used to calculate the positive region easily and quickly, with only linear time complexity. Some properties of inconsistency are also given, such as its monotonicity. The authors also propose three inconsistency based attribute reduction generation algorithms with different search policies. Finally, a “mini-saturation” bias is presented to choose the proper reduction for further predictive modeling.
Some related definitions and concepts are presented as follows:
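Definition 1 Positive region. Let P and Q be two attribute sets in the information system U(C, D), with $P, Q \subseteq C \cup D$. The positive region of Q in P, denoted $POS_P(Q)$, can be calculated as:
$$POS_P(Q) = \bigcup_{X \in U/Q} \underline{P}X$$
where $U/Q$ is the partition of U induced by Q, and $\underline{P}X$ is the P-lower approximation of X, that is, the union of the equivalence classes of P that are entirely contained in X.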
Definition 2 Attribute dependency. Let P and Q be two attribute sets in the information system U(C, D), with $P, Q \subseteq C \cup D$. The attribute dependency of attribute set Q on attribute set P, denoted $\gamma_P(Q)$, can be calculated as:
$$\gamma_P(Q) = \frac{|POS_P(Q)|}{|U|}$$
The attribute dependency describes which variables are strongly related to which other variables. For example, if $P = C$ and $Q = D$, then $\gamma_C(D)$ can be viewed as a measure of the relationship between the decision attributes and the condition attributes, which can be used in further predictive modeling.
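The positive region and the attribute dependency can be illustrated with a minimal Python sketch. It assumes the standard rough set formulation (the positive region is the union of the P-equivalence classes that fall entirely inside one Q-equivalence class), and it represents instances as dictionaries. The function names `partition`, `positive_region`, and `dependency`, as well as the toy table, are hypothetical and are not taken from the original work.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def positive_region(rows, p, q):
    """Indices of instances whose P-class lies inside a single Q-class."""
    # Map every instance to the index of its Q-equivalence class.
    q_class_of = {}
    for k, block in enumerate(partition(rows, q)):
        for i in block:
            q_class_of[i] = k
    pos = set()
    for block in partition(rows, p):
        # A P-class belongs to the positive region only if all of its
        # members share the same Q-class.
        if len({q_class_of[i] for i in block}) == 1:
            pos |= block
    return pos


def dependency(rows, p, q):
    """Fraction of instances that lie in the positive region of Q in P."""
    return len(positive_region(rows, p, q)) / len(rows)


# Hypothetical toy decision table: condition attributes a, b; decision d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
]

print(positive_region(rows, ["a", "b"], ["d"]))  # {0, 1}
print(dependency(rows, ["a", "b"], ["d"]))       # 0.5
```

In this toy table the last two instances agree on both condition attributes but disagree on the decision, so only the first two instances fall in the positive region and the dependency is 0.5.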
With the definition of attribute dependency, the attribute reduct can be defined as follows:
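Definition 3 Attribute reduct. In the information system U(C, D), with $R \subseteq C$, R is a reduct of C if and only if $POS_R(D) = POS_C(D)$ and $\forall r \in R,\ POS_{R - \{r\}}(D) \neq POS_C(D)$, or equivalently $\gamma_R(D) = \gamma_C(D)$ and $\forall r \in R,\ \gamma_{R - \{r\}}(D) \neq \gamma_C(D)$.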
The essence of attribute reduction is to find a subset P of the condition attribute set that maintains the same discriminability over the instance space. We can therefore judge whether a set is a reduct by its discriminability over the instance space. Because the positive region gives the number of instances that can be discriminated with a given attribute set, it can be used to find the reduct.
From the definition of attribute reduct, we can see that the reduct keeps the internal correlation of the attributes. Here we introduce reduction into feature selection, since the reduct maintains the same discriminability as the original data set (Jensen & Shen, 2004).
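The reduct condition described above can be checked directly on a small table. The following Python sketch is an illustration only: it assumes a single decision attribute, compares positive region sizes (which is sufficient here because the positive region of a subset of C is always contained in that of C), and uses hypothetical names (`positive_region_size`, `is_reduct`) and a hypothetical toy table.

```python
from collections import defaultdict


def positive_region_size(rows, attrs, decision):
    """Count instances whose attrs-class carries a single decision value."""
    labels = defaultdict(set)   # decision values seen per equivalence class
    counts = defaultdict(int)   # number of instances per equivalence class
    for row in rows:
        key = tuple(row[a] for a in attrs)
        labels[key].add(row[decision])
        counts[key] += 1
    return sum(n for key, n in counts.items() if len(labels[key]) == 1)


def is_reduct(rows, candidate, conditions, decision):
    """True if candidate keeps the positive region of the full condition
    set and no single attribute can be dropped without shrinking it."""
    target = positive_region_size(rows, conditions, decision)
    if positive_region_size(rows, candidate, decision) != target:
        return False
    return all(
        positive_region_size(
            rows, [a for a in candidate if a != r], decision
        ) != target
        for r in candidate
    )


# Hypothetical toy table in which attribute c duplicates attribute a.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
conditions = ["a", "b", "c"]

print(is_reduct(rows, ["a", "b"], conditions, "d"))       # True
print(is_reduct(rows, ["a", "b", "c"], conditions, "d"))  # False
print(is_reduct(rows, ["a"], conditions, "d"))            # False
```

The full set fails the check because c is redundant, and the single attribute a fails because it no longer discriminates the first two instances. A table can have more than one reduct: here the set of b and c also passes.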
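Definition 4 Inconsistent condition. In the information system U(C, D), where C is the condition attribute set and D is the decision attribute set, let $P \subseteq C$. If $P(x_i) = P(x_j)$ and $D(x_i) \neq D(x_j)$ for two instances $x_i, x_j \in U$, then there is an inconsistent condition between instance $x_i$ and instance $x_j$.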
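The inconsistent condition can be illustrated with a short Python sketch. It assumes a single decision attribute and instances stored as dictionaries. The pairwise test follows the definition directly, while the second function is one possible single-pass, hash-based way of listing every instance involved in at least one inconsistent condition. Both function names and the toy table are hypothetical, and the sketch is not claimed to be the procedure used in the original work.

```python
from collections import defaultdict


def has_inconsistent_condition(x, y, p, decision):
    """True if x and y agree on every attribute in p but differ on decision."""
    return all(x[a] == y[a] for a in p) and x[decision] != y[decision]


def inconsistent_instances(rows, p, decision):
    """Indices of instances that share their p-values with at least one
    instance carrying a different decision value (two linear passes)."""
    labels = defaultdict(set)
    for row in rows:
        labels[tuple(row[a] for a in p)].add(row[decision])
    return [
        i for i, row in enumerate(rows)
        if len(labels[tuple(row[a] for a in p)]) > 1
    ]


# Hypothetical toy decision table: condition attributes a, b; decision d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
]

print(has_inconsistent_condition(rows[2], rows[3], ["a", "b"], "d"))  # True
print(has_inconsistent_condition(rows[0], rows[1], ["a", "b"], "d"))  # False
print(inconsistent_instances(rows, ["a", "b"], "d"))                  # [2, 3]
print(inconsistent_instances(rows, ["a"], "d"))                       # [0, 1, 2, 3]
```

Dropping attribute b makes the first two instances indistinguishable as well, so the set of inconsistent instances grows when attributes are removed.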
Definition 5 Inconsistent instance number. In the information system U(C, D), where C is the condition attribute set and D is the decision attribute set, the inconsistent instance number of an attribute set P is calculated as follows: