Educational Data Mining Applied to a Massive Course

Luis Naito Mendes Bezerra, Márcia Terra Silva
Copyright: © 2020 | Pages: 14
DOI: 10.4018/IJDET.2020100102

Abstract

In the current context of distance learning, learning management systems (LMSs) make it possible to store large volumes of data on web browsing and completed assignments. To understand student behavior patterns in this type of environment, educators and managers must rethink conventional approaches to the analysis of these data and use appropriate computational solutions, such as educational data mining (EDM). Previous studies have tested the application of EDM on small datasets. The main contribution of the present study is the application of EDM algorithms, and the analysis of their results, in a massive course delivered by a Brazilian university to 181,677 undergraduate students enrolled in different fields. The use of key algorithms in educational contexts, such as decision trees and clustering, can reveal relevant knowledge, including the attribute type that contributes most significantly to passing a course and the behavior patterns of groups of students who fail.

Introduction

In higher education, the number of students enrolled in distance learning has grown significantly in recent years (Allen & Seaman, 2015). Since the emergence of massive open online courses (MOOCs), the average number of students enrolled per course has increased. MOOCs are defined as a model for delivering learning content fully online to any person, with no restrictions and no limit on the number of participants. They have no prerequisites and require no registration fees, thereby attracting a high volume of learners from different geographical locations (Lee, Watson, & Watson, 2019). The massive aspect of the acronym can be observed, for example, in the course Introduction to Computer Science I, offered by Harvard University in partnership with the provider edX. This MOOC reached 150,349 enrolled students. Although courses with more than 100,000 students are not common, a typical MOOC has, on average, 25,000 enrolled students, which hinders the use of traditional teaching tools and demands more autonomy from the e-learners (Jordan, 2015; Lee et al., 2019).

In MOOCs, commercial and open-source learning management systems (LMSs), as well as virtual environments used by large providers, such as Coursera and edX, are the central element of any project. Those courses are taught “automatically” because they are based on prerecorded video lectures, self-graded assignments, peer-reviewed projects, reading assignments, and forums. Message boards are important for supporting peer collaboration, allowing students to obtain information and to interact socially with other students (Lee et al., 2019; You, 2016). Despite the existence of a previously defined learning path, students can manage their own learning. For that reason, measuring the growth of the students’ proficiency is essential to teachers, so that they can decide when to intervene and help learners who are having difficulties (Abbakumov, Desmet, & Van den Noortgate, 2019).

In these massive courses, a vast amount of data is recorded and collected on web browsing, whether it concerns completed assignments or interaction with teaching materials and with other students, which enables the analysis of student behavior patterns in the environment. Currently, LMSs include modules that automatically record every event in the environment. These analyses enable investigators to describe, understand, and predict the behavior of the learners, to better target the student’s relationship with the course, and to provide reinforcement when necessary (Lee et al., 2019; Luna, Fardoun, Padillo, Romero, & Ventura, 2019; You, 2016).
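As a rough illustration of how recorded events can become per-student behavior data, the following minimal Python sketch counts events of each type for each student. The event names, student identifiers, and the function `summarize_events` are hypothetical and are not taken from the study or from any particular LMS.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (student_id, event_type) pairs, standing in for
# the records an LMS logging module might produce. All values are invented.
events = [
    ("s1", "page_view"),
    ("s1", "assignment_submitted"),
    ("s2", "page_view"),
    ("s1", "forum_post"),
    ("s2", "page_view"),
    ("s3", "assignment_submitted"),
]


def summarize_events(log):
    """Return a dict mapping each student to a count of each event type."""
    per_student = defaultdict(Counter)
    for student_id, event_type in log:
        per_student[student_id][event_type] += 1
    # Convert the nested Counters to plain dicts for easier inspection.
    return {sid: dict(counts) for sid, counts in per_student.items()}


features = summarize_events(events)
```

A table of this kind, with one row per student and one column per event type, is a common starting point for the mining algorithms mentioned in the abstract.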

Typically, the data generated by LMSs cannot be adequately analyzed using basic software applications, such as a spreadsheet, or using traditional statistical analysis mechanisms or tools for accessing transactional databases due to a number of factors, including the vast number of records, the high number of attributes, missing values and the presence of qualitative rather than quantitative data. Data collected from massive courses enable educators and managers to rethink traditional analytical approaches, and computational solutions have been increasingly used as the most appropriate path (Dutt, Ismail, & Herawan, 2017; Luna et al., 2019; Romero & Ventura, 2013).

The development and use of computational tools for data analysis, such as data mining and learning analytics, came rather late to the field of education in comparison with sciences such as biology and physics, as well as marketing, manufacturing, and finance. The application of such techniques has enormous transformational potential, for example, in predicting student performance and in understanding student behavior in the teaching and learning process. This is the domain of a research field known as educational data mining (EDM) (Baker, 2014; Burgos et al., 2018; Campagni, Merlini, Sprugnoli, & Verri, 2015).
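To make the idea of finding the attribute that contributes most to passing more concrete, the sketch below computes information gain, a criterion that some decision tree algorithms use to choose a split. It is only an assumption-laden illustration: the attributes `submitted_all` and `forum_active`, the four synthetic student records, and the choice of information gain itself are invented here, since this excerpt does not state which algorithm variant or attributes the study used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting rows on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Synthetic records; attribute names and values are hypothetical.
students = [
    {"submitted_all": "yes", "forum_active": "yes", "passed": "yes"},
    {"submitted_all": "yes", "forum_active": "no", "passed": "yes"},
    {"submitted_all": "no", "forum_active": "yes", "passed": "no"},
    {"submitted_all": "no", "forum_active": "no", "passed": "no"},
]

gains = {attr: information_gain(students, attr, "passed")
         for attr in ("submitted_all", "forum_active")}
best = max(gains, key=gains.get)  # attribute with the highest gain
```

In this toy data, `submitted_all` separates passing from failing students perfectly, so it has the highest gain, while `forum_active` carries no information about the outcome.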
