Filter-Wrapper Incremental Algorithms for Finding Reduct in Incomplete Decision Systems When Adding and Deleting an Attribute Set

Nguyen Long Giang, Le Hoang Son, Nguyen Anh Tuan, Tran Thi Ngan, Nguyen Nhu Son, Nguyen Truong Thang
Copyright: © 2021 |Pages: 24
DOI: 10.4018/IJDWM.2021040103

Abstract

The tolerance rough set model is an effective tool for solving the attribute reduction problem directly on incomplete decision systems, without pre-processing missing values. In practical applications, incomplete decision systems are often changed and updated, especially when attributes are added or removed. To find reducts on dynamic incomplete decision systems, researchers have proposed many incremental algorithms that reduce execution time. However, these incremental algorithms mainly follow the filter approach, in which classification accuracy is evaluated only after the reduct has been obtained. As a result, such filter algorithms do not achieve the best results in terms of the number of attributes in the reduct and classification accuracy. This paper proposes two distance-based filter-wrapper incremental algorithms: IFWA_AA for the case of adding attributes and IFWA_DA for the case of deleting attributes. Experimental results show that the proposed filter-wrapper incremental algorithm IFWA_AA significantly decreases the number of attributes in the reduct and improves classification accuracy compared to filter incremental algorithms such as UARA and IDRA.

1. Introduction

Attribute reduction is a crucial problem in the data preprocessing step of the data mining process (Song, Li & Qu, 2018; You et al., 2020). The objective of attribute reduction is to find a reduced attribute set, known as a reduct, in order to improve the efficiency of the data mining model (Liu & Wang, 2020). Rough set theory (Pawlak, 1991) is considered an effective tool for solving the attribute reduction problem in decision systems. In practical problems, missing values often appear in the value domains of the conditional attributes of decision systems; such systems are called incomplete decision systems. To obtain a reduct directly on incomplete decision systems without preprocessing missing values, Kryszkiewicz (1998) introduced the tolerance rough set model as an extension of traditional rough set theory, and many algorithms based on this model have been proposed to compute reducts of such systems. Today, with the rapid growth of big data, incomplete decision systems with high dimension, i.e., a very large number of attributes, are common in many fields. For example, some decision systems on bioinformatics data have millions of attributes. Furthermore, such systems are constantly changed or updated on different time scales (Wang, 2020), especially through changes in attributes or dimensions (Atay & Garani, 2019). In medical diagnosis, for example, clinical symptoms serve as the initial attributes used by a physician; test indicators are then added and updated as further attributes to help doctors improve diagnostic accuracy.
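To make the tolerance rough set idea concrete, the following is a minimal sketch of Kryszkiewicz's tolerance relation: two objects are tolerant with respect to an attribute set if, on every attribute, their values agree or at least one value is missing. The toy decision table, the attribute names c1, c2, d, and the symbol "*" for missing values are illustrative assumptions, not data from the article.

```python
MISSING = "*"  # conventional placeholder for a missing value

def tolerant(x, y, attrs):
    """x and y are tolerant w.r.t. attrs if every attribute value
    matches or at least one side is missing."""
    return all(x[a] == y[a] or x[a] == MISSING or y[a] == MISSING
               for a in attrs)

def tolerance_class(obj, universe, attrs):
    """All objects tolerant with obj: the granule that replaces the
    equivalence class of classical rough sets on incomplete data."""
    return [y for y in universe if tolerant(obj, y, attrs)]

# Toy incomplete decision table: conditional attributes c1, c2
# and a decision attribute d (illustrative only).
U = [
    {"c1": 1, "c2": "*", "d": 0},
    {"c1": 1, "c2": 2,   "d": 0},
    {"c1": 2, "c2": 2,   "d": 1},
]

# U[0] is tolerant with U[1] (c2 is missing on one side),
# but not with U[2] (c1 differs and neither value is missing).
print(tolerance_class(U[0], U, ["c1", "c2"]))
```

Attribute reduction methods built on this model then keep an attribute subset whose tolerance classes preserve the discernibility of the decision attribute.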

In such decision systems, algorithms that find a reduct by traditional approaches face two major challenges. First, when the attribute set changes, these algorithms must re-compute the reduct on the entire decision system after each change, so the computation time increases significantly. Second, on large decision systems, these algorithms run into the limits of memory space and computational speed. To overcome these two challenges, this paper proposes incremental algorithms that obtain a reduct of an incomplete decision system when conditional attributes are added or deleted. Incremental algorithms update the reduct using only the changed attribute set, not the entire original incomplete decision system, and therefore greatly reduce the runtime. Moreover, incremental algorithms can handle high-dimensional incomplete decision systems by splitting the system into separate parts by attributes and computing the final reduct by adding the parts one by one.
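The filter-wrapper incremental idea described above can be sketched generically: when a set of attributes is added, the filter phase ranks only the new attributes by a significance measure, and the wrapper phase keeps the candidate reduct whose classification accuracy is best. The significance measure and accuracy function below are placeholders passed in as parameters, not the distance measure or classifiers defined in the article.

```python
def incremental_reduct(current_reduct, new_attrs, significance, accuracy):
    """Update a reduct when the attribute set new_attrs is added.

    significance(attr, reduct) -> float  # filter measure (placeholder)
    accuracy(attrs)            -> float  # wrapper evaluation (placeholder)
    """
    # Filter phase: rank only the newly added attributes, so the
    # whole decision system is never re-processed from scratch.
    ranked = sorted(new_attrs,
                    key=lambda a: significance(a, current_reduct),
                    reverse=True)

    best_attrs, best_acc = list(current_reduct), accuracy(current_reduct)
    candidate = list(current_reduct)
    # Wrapper phase: grow the candidate attribute by attribute and
    # keep the prefix with the highest classification accuracy.
    for a in ranked:
        candidate = candidate + [a]
        acc = accuracy(candidate)
        if acc > best_acc:
            best_attrs, best_acc = list(candidate), acc
    return best_attrs
```

The wrapper step is what distinguishes this scheme from pure filter algorithms: accuracy is evaluated while candidates are built, so attributes that do not help classification are never admitted into the final reduct.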

In dynamic complete decision systems, many incremental algorithms for finding reducts have been proposed based on traditional rough sets and various extended rough set models. For the case of adding or deleting objects, researchers have proposed incremental attribute reduction algorithms based on different measures, such as distance measures (Demetrovics et al., 2014; Huong & Giang, 2016), information granularity (Jing, Li, Fujita, Yu, & Wang, 2017; Jing, Li, Huang, Chen, & Hong, 2017), fuzzy attribute dependency (Liu et al., 2017), fuzzy discernibility relations (Yang et al., 2017), extended discernibility matrices (Lang et al., 2017; Yang et al., 2017; Wei et al., 2018; Ma et al., 2019), the positive domain (Das, Sengupta & Bhattacharyya, 2018; Lang et al., 2018; Hao et al., 2019), membership functions (Shua, Qian & Xie, 2019), simplified discernibility matrices (Yang et al., 2019), indiscernibility relations (Nandhini & Thangadurai, 2019), covering granularity (Cai et al., 2019), information entropy (Shu, Qian & Xie, 2020), key instance sets in fuzzy rough sets (Ni et al., 2020), fuzzy rough set based information entropy (Zhang et al., 2020), and discernibility matrices (Liu et al., 2020).

For the case of adding or deleting attributes, many incremental attribute reduction algorithms have also been proposed based on measures such as tolerance matrices in set-valued decision systems (Zhang, Li & Ruan, 2012), information entropy (Wang, Liang & Qian, 2013), fuzzy dependency functions in fuzzy rough sets (Zeng et al., 2015), distance measures (Demetrovics et al., 2016), dependency functions (Raza & Qamar, 2016), knowledge granularity (Jing, Li & Huang, 2016), approximation sets in covering decision systems (Cai, Li & Ma, 2017), knowledge granularity with a multi-granulation view (Jing et al., 2017), compressed binary discernibility matrices (Ma et al., 2019), discernibility matrices in compacted decision systems (Wei et al., 2019), indiscernibility relations (Nandhini & Thangadurai, 2019), and discernible relations (Chen, Dong & Mi, 2020).
