Best Practices in Dropout Prediction: Experience-Based Recommendations for Institutional Implementation

Juan J. Alcolea (DIMETRICAL, The Analytics Lab, Spain), Alvaro Ortigosa (Universidad Autonoma de Madrid, Spain), Rosa M. Carro (Universidad Autonoma de Madrid, Spain) and Oscar J. Blanco (DIMETRICAL, The Analytics Lab, Spain)
DOI: 10.4018/978-1-7998-5074-8.ch015

Abstract

This chapter focuses on the key practical aspects to consider when developing predictive models for student learning outcomes. It is based on the authors' experience building and delivering dropout prediction models within higher education contexts. The chapter presents the information used to generate the predictive models, how this information is treated, how the models are fed, which types of algorithms have been used and why, and how the results obtained have been evaluated. It recommends best practices for building, training, and evaluating predictive models. It is hoped that readers will find these recommendations useful for the design, development, deployment, and use of early warning systems.

Background

With digitalization and the rise of e‐learning, a range of computational tools and approaches have emerged, which allow educators to better support the learner’s experience in schools, colleges, and universities (Freitas et al., 2015). One of the key benefits of digital tools is that a large amount of student data can help course administrators gain insight into student online-learning behavior, thus enabling administrators to answer important questions about students’ learning habits, effective and poor teaching practices, etc. (Nunn, Avella, Kanai, & Kebritchi, 2016).

Key Terms in this Chapter

Return on Investment (ROI): The profit from an activity for a particular period compared with the amount invested in it.

Receiver Operating Characteristic (ROC) Curve: A graphical plot that illustrates the capacity of a binary classifier to distinguish between classes at various threshold settings.
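
As an illustration of this definition, the following minimal sketch computes the points of a ROC curve by sweeping a decision threshold over predicted scores, and then approximates the area under the curve with the trapezoidal rule. The function names, the toy labels and scores, and the convention that 1 marks a dropout are assumptions made for this example; they are not taken from the chapter.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Assumes binary labels where 1 is the positive class (here: dropout).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Use each distinct score as a threshold: predict positive when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Approximate the area under the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = student dropped out, score = predicted dropout risk.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(y_true, scores)
auc = area_under_curve(points)
```

Each point corresponds to one threshold, so lowering the threshold moves along the curve toward the upper right corner, flagging more students at the cost of more false alarms.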

Sensitivity: The performance metric for binary classifiers indicating the percentage of actual positive cases correctly labelled as positive by the system.

Specificity: The performance metric for binary classifiers indicating the percentage of actual negative cases correctly labelled as negative by the system.

Accuracy: The performance metric for classifiers indicating the percentage of correctly classified cases regardless of the class to which they belong.
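
To make the three metrics above concrete, the sketch below derives sensitivity, specificity, and accuracy from the counts of a confusion matrix. The function name, the sample labels, and the choice of 1 as the positive (dropout) class are assumptions for illustration only.

```python
def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # dropouts correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # dropouts missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-dropouts correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-dropouts wrongly flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical cohort of 10 students, 3 of whom actually dropped out.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
```

In this toy cohort the accuracy is 0.8 even though one of the three dropouts is missed, which shows why accuracy alone can be misleading when one class is much rarer than the other.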

Holdout Method: The simplest kind of cross-validation, in which the data set is separated into two sets, the training set and the testing set, which are not swapped.
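
A holdout split as defined above can be sketched in a few lines. The function name, the 30% test fraction, and the fixed seed are assumptions chosen for this example rather than values recommended by the chapter.

```python
import random


def holdout_split(records, test_fraction=0.3, seed=42):
    """Shuffle the records once and split them into a training and a testing set."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical student identifiers.
students = list(range(10))
train, test = holdout_split(students)
```

The two sets never exchange roles, so the performance estimate depends on a single random partition of the data.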

Over/Under Sampling Techniques: Techniques used to adjust the class distribution of a data set (i.e., the ratio between the different classes/categories represented), in which data points are added (oversampling) or removed (undersampling).
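
The sketch below shows the simplest random variants of both techniques: duplicating minority-class rows until classes are balanced, and discarding majority-class rows until classes are balanced. The function names and the toy data are assumptions for illustration; the chapter may rely on other resampling methods.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so all classes match the smallest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical imbalanced data: 2 dropouts (1) among 10 students.
rows = list(range(10))
labels = [1, 1] + [0] * 8

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
```

Resampling is normally applied to the training set only, so that the testing set keeps the class distribution the model will meet in practice.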

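K-Fold Cross-Validation: A statistical method for cross-validation, used to estimate the skill of machine learning models, in which the data set is divided into k subsets (folds) and each fold is used once as the testing set while the remaining k-1 folds form the training set.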
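
As a minimal sketch of k-fold cross-validation, the generator below partitions sample indices into k folds and yields one (train, test) pair per fold, so that every sample is used for testing exactly once. The function name and the choice of 10 samples and 3 folds are assumptions for this example; in practice the indices are usually shuffled before being split.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
```

A model would be trained and scored once per fold, and the k scores averaged to obtain the overall performance estimate.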
