Introduction
The number of students in both traditional and online teaching has grown considerably over the last decades. At the college level, the percentage of people who hold at least one degree rose from 8.5% in 1970 to 24.7% in 2006 (Gao, 2010). Online learning in particular has seen a huge boost from the emergence of Massive Open Online Courses (MOOCs), in which people from all over the world can enroll and participate. As a result, Course Management Systems (CMS) and Learning Management Systems (LMS) have become very popular and now have a large impact on distance learning (Kay, 2013).
To understand the effectiveness of a learning program or a particular course, a number of indicators must be considered, not the least of which is student success. Student success itself encompasses several factors, such as drop-out rates and final grades, but it is nevertheless a simple indicator of how well things are going and can provide early warning signs of systemic problems that must be corrected. Also important are the inter-relations between courses within the same degree. Understanding them gives insights into critical paths and bottlenecks in a curriculum. All of this can be used to fine-tune curricula, leading to higher teaching quality, better learning outcomes and greater student satisfaction. Small student/course samples are not enough to elicit any meaningful, generalizable information. However, given the aforementioned increase in numbers, coupled with the development of increasingly sophisticated LMS and schools' own information systems, a wide and encompassing range of data is now within reach. Its timely analysis can be crucial for the improvement of the teaching-learning process.
Alas, the available data is seldom in a format that is amenable to analysis and exploration. Decision makers would be better served by meaningful patterns elicited from that data than by having to go through thousands of individual records. Only in that way can the overall picture be known and understood. The application of data mining techniques in this context is an emerging research field. It provides the means to analyze educational data, from student behaviors to teaching strategies and course coordination. In this domain, it goes by the name of Educational Data Mining (EDM) (Romero, 2010). EDM provides relevant patterns based on the available data and is a tool both for studying what happened in the past and for making informed predictions about the future. However, the results of an EDM process usually consist of extensive sets of behaviors, described in the form of technical patterns that are represented textually or symbolically. While the patterns are much more meaningful than the original data, their number can still be daunting and not conducive to an encompassing analysis. Furthermore, understanding the patterns themselves often requires a reasonable knowledge of the underlying data mining algorithms and statistical models, which many analysts probably do not possess.
Globally, it is important to provide an overall view of all patterns as a coherent whole, allowing the identification of commonalities and differences. It is also necessary to interpret individual patterns and establish relations between them. Furthermore, a tool that allows this should involve the users and let them take advantage of their creativity, flexibility and domain-specific knowledge (Keim, 2002). One possible approach to this problem is to use Information Visualization. One of its aims is to help understand large amounts of data by leveraging the capacity of the human visual system to discover trends, patterns and outliers (Heer, 2010). Furthermore, a well-designed visualization can effectively represent large amounts of data and alleviate the cognitive load associated with interpreting it (Ware, 2004). A visualization of EDM results therefore has the potential to provide the needed insights.