A Comparative Study Based on Rough Set and Classification Via Clustering Approaches to Handle Incomplete Data to Predict Learning Styles

Hemant Rana (School of Computer and Information Sciences, Indira Gandhi National Open University, New Delhi, India) and Manohar Lal (School of Computer and Information Sciences, Indira Gandhi National Open University, New Delhi, India)
Copyright: © 2017 | Pages: 20
DOI: 10.4018/IJDSST.2017040101

Abstract

Handling missing attribute values is a major challenge in data analysis. Well-known approaches to this problem include Rough Set Theory (RST) and classification via clustering. In the work reported here, RSES (Rough Set Exploration System), a tool based on the RST approach, and WEKA (Waikato Environment for Knowledge Analysis), a data mining tool used here for classification via clustering, are applied to predict learning styles from data that may contain missing values. The experimental results show that the RST approach handles missing attribute values better than the classification via clustering approach. Further, RSES yields better decision rules when missing values are simply ignored than when substitute values are assigned in their place.

1. Introduction

In many real-life applications, missing values for salient features are a common occurrence. This is the case, for example, with large student enrolments such as those at Indira Gandhi National Open University (IGNOU), New Delhi, India, which enrolls around 3 million students. The problem necessitates the study of methods for handling information with missing attribute values. Attribute values in a data set may be missing because they are lost values or because they are don’t care condition values (Grzymala-Busse, 2000; Grzymala-Busse, 2004). A value may be missing because a student did not know the answer to a query or did not understand it properly, forgot to answer it, or refused to answer it. In some cases the query was answered, but the answer was later mistakenly erased by an operator (Grzymala-Busse & Zdzislaw, 2013). Such a missing value is called a lost value. In other cases, attribute values that are irrelevant to the task are never recorded, for example the age or sex of a student when designing a timetable. Such missing attribute values are called don’t care condition values.
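
The distinction between the two kinds of missing value can be illustrated with a minimal Python sketch. It is not taken from the study: the markers "?" (lost) and "*" (don’t care), the toy student records, and the function name attribute_value_block are all assumptions made for illustration. The sketch follows one common reading in the rough-set literature, in which a lost value cannot be matched against any known value, while a don’t care value is treated as compatible with every value of the attribute.

```python
LOST = "?"       # assumed marker: the value existed but was not recorded or was erased
DONT_CARE = "*"  # assumed marker: the value was considered irrelevant and never collected


def attribute_value_block(cases, attribute, value):
    """Return the indices of cases considered compatible with attribute == value.

    A case is included if its recorded value equals `value`, or if its value
    is a don't care condition (compatible with anything). A lost value is
    never included, because the true value is unknown.
    """
    block = set()
    for index, case in enumerate(cases):
        observed = case[attribute]
        if observed == value or observed == DONT_CARE:
            block.add(index)
    return block


# Hypothetical toy records; attribute names and values are illustrative only.
students = [
    {"prefers_diagrams": "yes", "age": "20"},
    {"prefers_diagrams": LOST, "age": DONT_CARE},
    {"prefers_diagrams": DONT_CARE, "age": "22"},
    {"prefers_diagrams": "no", "age": DONT_CARE},
]

diagram_block = attribute_value_block(students, "prefers_diagrams", "yes")
age_block = attribute_value_block(students, "age", "20")
print(sorted(diagram_block), sorted(age_block))
```

Under these assumptions, the record with a lost answer to prefers_diagrams is left out of the block for "yes", whereas records with a don’t care age are kept in the block for age "20".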

Well-known approaches to this type of problem include RST and classification via clustering. In the work reported here, RSES (Rough Set Exploration System), a tool based on the RST approach, and WEKA (Waikato Environment for Knowledge Analysis), a data mining tool used here for classification via clustering, are applied to predict learning styles from data that may contain missing values.
