Withdrawal Prediction Framework in Virtual Learning Environment

Fedia Hlioui, Nadia Aloui, Faiez Gargouri
DOI: 10.4018/IJSSMET.2020070104

Abstract

Making the most of virtual learning environments captivates researchers, who aim to enhance the learning experience and reduce the withdrawal rate. In that regard, this article presents a framework for a withdrawal prediction model built on data from the Open University, one of the largest distance-learning institutions. The contributions of this work cover two main aspects: relational-to-tabular data transformation and data mining for withdrawal prediction. The main steps of the process are: (1) tackling the unbalanced-data issue using the SMOTE algorithm; (2) voting over seven different feature-selection algorithms; and (3) training different classifiers for withdrawal prediction. The experimental study demonstrates that decision trees outperform the other tested models in terms of the F-measure. Furthermore, the data balancing and feature selection processes play a crucial role in guiding the predictive model towards reliability.
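The three-step pipeline named above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the paper's seven feature-selection algorithms are not enumerated in this preview, so two scikit-learn scoring functions stand in for them, and the SMOTE interpolation step is hand-rolled for self-containment rather than taken from a library.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Toy stand-in for the tabularised learner data: 500 learners,
# 10 features, ~15% withdrawals (label 1) to mimic class imbalance.
X = rng.random((500, 10))
y = (rng.random(500) < 0.15).astype(int)

def smote(X_min, n_new, k=5):
    """Minimal SMOTE: each synthetic point is interpolated between a
    minority sample and one of its k nearest minority neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]  # skip self at position 0
        lam = rng.random()
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

# (1) balance the classes with synthetic minority samples
n_new = int((y == 0).sum() - (y == 1).sum())
X_bal = np.vstack([X, smote(X[y == 1], n_new)])
y_bal = np.concatenate([y, np.ones(n_new, dtype=int)])

# (2) majority vote over several selectors (two stand-ins here)
votes = np.zeros(X.shape[1])
for score in (f_classif, mutual_info_classif):
    votes += SelectKBest(score, k=5).fit(X_bal, y_bal).get_support()
keep = votes >= 1  # keep features chosen by at least one selector

# (3) train and evaluate a decision tree on the selected features
Xtr, Xte, ytr, yte = train_test_split(X_bal[:, keep], y_bal, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr)
print("F-measure:", round(f1_score(yte, clf.predict(Xte)), 2))
```

On real data, the voting threshold (here, "at least one selector") and the per-selector `k` would be tuned; the point of the sketch is the order of operations: balance first, then select, then classify.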

1. Introduction

By virtue of the exponential evolution of Information and Communication Technologies, web-based learning has become commonplace in higher education institutions and organisations. It ranges from Massive Open Online Courses (MOOCs) to Virtual Learning Environments (VLEs) and Learning Management Systems (LMSs). Over one hundred highly ranked academic institutions partner with VLE providers to deliver open, and in many cases free, education (Baporikar & Sauti, 2019). Despite the fact that VLEs are breaking down the barriers to a high-quality education, learners’ dropout rates remain significantly high. Studies on distance learning assert that the completion rate is usually less than 7% (Jiang & Kotzias, 2016). For instance, the dropout rate ranges between 91% and 93% for Coursera (Li et al., 2016) and reaches 78% at the Open University of the United Kingdom (Tan & Shao, 2015). Such dropout rates are even higher than the corresponding rates in traditional learning (Jiang & Kotzias, 2016). These percentages cast serious doubt on the reliability of VLEs and put into question the efficacy of this online learning technology (Kloft et al., 2014). Consequently, rigorous efforts are devoted to designing advanced methods that augment learners’ commitment to the course. Many solutions offer a user-friendly, sometimes interactive, tool for monitoring learners' progress. These tools often use log files to provide informative statistics. Yet their output is not reliable for predicting learners’ course withdrawal. Accordingly, data mining methods constitute more efficient alternatives. These techniques are applied in many domains such as economics (Jmaii, 2017), social networks (Truta et al., 2018), organizational communication (Baporikar, 2017), and biology (Njah et al., 2016).
They are often used for descriptive and predictive analyses, which go beyond information visualization towards knowledge establishment. Data mining methods mainly focus on gathering the learners’ data and on applying computational methods to process it.

In the context of processing VLE learners’ data, many platforms put considerable effort into gathering data. The available datasets usually cover various aspects such as demographic data, forum/discussion data, the number of clicks, etc. Nevertheless, what data are available depends on the VLE used, the access policy (protected or open source), and the application’s context and aim. Each proposed dataset emphasizes one side of the learners’ features and ignores the others. For example, the KDD Cup dataset extracted from the XuetangX MOOC platform (Cao & Zhang, 2015) does not include any demographic or historical data from past courses. Some datasets do not cover the behavioral aspect of learners, such as the Academic Performance dataset (Bharara et al., 2018). On the Coursera platform, some researchers investigated only the discussion forums to analyse the cognitive process (Wang et al., 2015), while others dealt with learners’ clickstreams in videos to predict learners’ future behavior (Shridharan et al., 2018). Although such data are very rich, they remain closed to scientists due to privacy issues; VLEs are often reluctant to publish their data because of confidentiality and privacy concerns (Dalipi et al., 2018). May et al. (2016) showed that it is not always straightforward to promise absolute privacy, confidentiality and anonymity when using an open VLE. We therefore aim to use a dataset whose privacy levels are clearly identified and whose protection measures allow us to set rules and policies for learner tracking. Accordingly, we chose to work with the Open University Learning Analytics Dataset (OULAD) because it combines useful characteristics: it is free, open and anonymized (Kuzilek et al., 2017). Additionally, this dataset covers the learners’ individual differences: demographic and behavioral data.
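The paper's first contribution, relational-to-tabular transformation, amounts to aggregating OULAD's one-to-many tables into one row per learner. A minimal pandas sketch of that idea follows; the toy rows here only mimic two OULAD tables (the real dataset ships them as CSV files, e.g. studentInfo and studentVle, with these column names), and the aggregated features chosen are illustrative, not the authors' exact feature set.

```python
import pandas as pd

# Toy rows mimicking two OULAD tables: one row per learner
# (demographics, outcome) and many rows per learner (clickstream).
student_info = pd.DataFrame({
    "id_student": [11, 22, 33],
    "code_module": ["AAA", "AAA", "AAA"],
    "highest_education": ["A Level", "HE Qualification", "Lower Than A Level"],
    "final_result": ["Pass", "Withdrawn", "Fail"],
})
student_vle = pd.DataFrame({
    "id_student": [11, 11, 22, 33, 33, 33],
    "code_module": ["AAA"] * 6,
    "sum_click": [4, 7, 2, 1, 3, 5],
})

# Collapse the one-to-many clickstream into per-learner features,
# then join onto the demographic table: relational -> tabular.
clicks = (student_vle
          .groupby(["id_student", "code_module"], as_index=False)
          .agg(total_clicks=("sum_click", "sum"),
               n_interactions=("sum_click", "size")))
flat = student_info.merge(clicks, on=["id_student", "code_module"], how="left")

# Binary withdrawal label for the prediction task.
flat["withdrawn"] = (flat["final_result"] == "Withdrawn").astype(int)
print(flat[["id_student", "total_clicks", "withdrawn"]])
```

The left join keeps learners with no recorded interactions (their click features would come out as NaN and need imputation), which matters precisely for withdrawal prediction, since inactive learners are the likeliest withdrawers.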
