Efficient Algorithms for Dynamic Incomplete Decision Systems

Nguyen Truong Thang, Giang Long Nguyen, Hoang Viet Long, Nguyen Anh Tuan, Tuan Manh Tran, Ngo Duy Tan
Copyright: © 2021 |Pages: 24
DOI: 10.4018/IJDWM.2021070103

Abstract

Attribute reduction is a crucial problem in data mining and knowledge discovery on big data. In incomplete decision systems, the tolerance rough set model is the fundamental tool for solving this problem: computing a reduct reduces execution time. However, existing proposals follow the traditional filter approach, so the resulting reduct is optimal neither in the number of attributes nor in classification accuracy. The problem is critical in dynamic incomplete decision systems, which better match real-world applications. Therefore, this paper proposes two novel incremental algorithms that combine the filter and wrapper approaches, named IFWA_ADO and IFWA_DEO, for dynamic incomplete decision systems. IFWA_ADO computes the reduct incrementally when multiple objects are added, while IFWA_DEO updates the reduct when multiple objects are removed. Both algorithms are verified on six data sets. Experimental results show that the filter-wrapper algorithms achieve higher performance than the other filter incremental algorithms.

1. Introduction

Knowledge discovery, data mining, and decision making have wide applications in solving practical problems in many fields (Cao et al., 2007; Dat et al., 2019; Mohamed et al., 2020; Patro et al., 2020; Tey et al., 2019). As data volume grows, data sets become larger and more complex, and running data mining algorithms on them, typically classification and clustering algorithms, is very challenging (Kwok, Smith, Lozano & Taniar, 2002; Tan & Taniar, 2007). To improve the efficiency of data mining algorithms, it is necessary to remove redundant attributes (Liu & Wang, 2020; You et al., 2020); this is called attribute reduction. One of the most effective tools for attribute reduction on decision systems is rough set theory (Pawlak, 1991). However, rough-set-based algorithms are all designed for complete decision systems, in which every attribute value is present. In practical problems, values in the attribute domain are often absent; such entries are called missing values, and systems containing them are called incomplete decision systems. Kryszkiewicz (1998) constructed a tolerance relation as an extension of the equivalence relation of traditional rough sets and introduced the tolerance rough set model. This model supports attribute reduction and rule extraction directly on incomplete decision systems without pre-processing missing values.
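The tolerance relation above can be sketched in a few lines: two objects are tolerant with respect to an attribute set if every attribute value either matches or is missing on at least one side. This is a minimal illustration, assuming `"*"` marks a missing value; the function and variable names are illustrative, not from the paper.

```python
# Sketch of the tolerance relation of Kryszkiewicz (1998) for
# incomplete decision systems. '*' denotes a missing value.
MISSING = "*"

def tolerant(x, y, attrs):
    """x and y (dicts attr -> value) are tolerant w.r.t. attrs if every
    attribute value matches or at least one side is missing."""
    return all(
        x[a] == y[a] or x[a] == MISSING or y[a] == MISSING
        for a in attrs
    )

def tolerance_class(obj, universe, attrs):
    """All objects of the universe indistinguishable from obj under the
    tolerance relation, replacing the equivalence class of classical
    rough sets."""
    return [y for y in universe if tolerant(obj, y, attrs)]

# Tiny incomplete decision system with two conditional attributes.
U = [
    {"a1": 1, "a2": 0},
    {"a1": 1, "a2": "*"},   # missing value on a2
    {"a1": 2, "a2": 0},
]
print(tolerance_class(U[0], U, ["a1", "a2"]))  # objects 0 and 1 are tolerant
```

Because the relation is reflexive and symmetric but not transitive, tolerance classes may overlap, which is why the model works on incomplete data without imputing missing values first.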

In many applications, decision systems are very large and change over time (Atay & Garani, 2019), especially when object sets are added or deleted. For example, in online data analysis systems, data is continuously added and removed. In some forecasting systems, data has been collected over many years, yet analysis models must be rebuilt every day or week, so out-of-date data must be removed as useless. This leads to two main difficulties in solving attribute reduction problems on such dynamic decision systems. First, the decision system is so large that obtaining a reduct is hard within the limits of memory size and computing capacity. Second, the data cannot be provided all at once: whenever it changes, traditional methods must repeat every step on the entire decision system, which is time-consuming. To address these difficulties, the incremental approach was proposed to decrease the execution time of attribute reduction techniques.

Incremental algorithms only update the reduct on the changed data, without re-computing it over the whole decision table. As a result, their computation time decreases significantly compared to non-incremental algorithms, so they can be applied to large, frequently changing decision tables.

The motivation of this paper is that existing incremental algorithms based on the tolerance rough set do not achieve the best results in terms of classification accuracy and reduct cardinality. Because they are filter algorithms, the final reduct consists of the conditional attributes that preserve the original measure, and classification accuracy is computed only after the reduct is obtained. Consequently, the result is optimized neither in the number of attributes nor in classification accuracy, which degrades classifier performance. KGIRA-M and KGIRD-M (Zhang, Dai & Chen, 2020) are two incremental algorithms that compute reducts when objects are added or deleted in incomplete decision systems, but they are also filter algorithms. In (Giang et al., 2020), we proposed filter-wrapper incremental algorithms based on fuzzy partition distance to obtain reducts when adding and deleting multiple objects. However, the algorithms in (Giang et al., 2020) use fuzzy rough sets and compute reducts only for complete decision tables.
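The filter-wrapper idea contrasted above can be sketched generically: the filter phase greedily adds the most significant attribute and records each candidate prefix, and the wrapper phase then selects the candidate with the best classification accuracy. This is a hedged sketch of the general scheme, not the paper's IFWA_ADO or IFWA_DEO algorithms; `significance` and `accuracy` are hypothetical stand-ins for the filter measure and the wrapper classifier evaluation.

```python
# Generic filter-wrapper attribute selection loop (illustrative sketch,
# not the paper's algorithms).
def filter_wrapper_reduct(attrs, significance, accuracy):
    """Filter phase: greedily add the attribute that most improves the
    filter measure, keeping every candidate prefix. Wrapper phase:
    return the candidate with the best accuracy, shortest first."""
    remaining, chosen, candidates = list(attrs), [], []
    while remaining:
        best = max(remaining, key=lambda a: significance(chosen + [a]))
        if significance(chosen + [best]) <= significance(chosen):
            break  # no attribute improves the filter measure
        chosen.append(best)
        remaining.remove(best)
        candidates.append(list(chosen))
    # Pick the best-accuracy candidate; break ties by fewer attributes.
    return max(candidates, key=lambda c: (accuracy(c), -len(c))) if candidates else []

# Toy run: the filter measure saturates at {a, b}; the wrapper then
# prefers the shortest candidate with the best accuracy.
sig = lambda s: len(set(s) & {"a", "b"})
acc = lambda c: 0.9 if "a" in c else 0.5
print(filter_wrapper_reduct(["a", "b", "c"], sig, acc))  # ['a']
```

A pure filter algorithm would stop after the greedy phase and return the full prefix; evaluating accuracy inside the loop is what lets the combined approach trade off reduct size against classifier performance.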
