A Dynamic and Scalable Decision Tree Based Mining of Educational Data

Dineshkumar B. Vaghela (Parul University, India), Priyanka Sharma (Raksha Shakti University, India) and Kalpdrum Passi (Laurentian University, Canada)
Copyright: © 2017 | Pages: 26
DOI: 10.4018/978-1-5225-0613-3.ch010


The explosive growth of data in fields such as biology, education, environmental research, sensor networks, the stock market, and weather forecasting, driven by widespread use of the internet in distributed environments, has created an urgent need for new techniques and tools that can intelligently and automatically transform processed data into useful information and knowledge. Data mining has therefore become a research area of increasing importance. As data continues to be collected at this scale, formalizing the process of big data analysis will become paramount. Because these vast amounts of data are geographically spread across the globe, a very large number of models is generated, which raises the problem of how to generalize knowledge in order to obtain a global view of the phenomena across the organization. This also applies to web-based educational data. This chapter discusses a new dynamic and scalable data mining approach applied to educational data.
Chapter Preview


Web usage mining refers to the non-trivial extraction of potentially useful patterns and trends from large web access logs. In the specific context of web-based learning environments, the proliferation of web-based educational systems and the huge amount of information they make available have generated considerable scientific activity in this field. As an increasingly powerful, interactive, and dynamic medium for delivering information, the World Wide Web, in combination with information technology, has found many applications. One popular application is educational use, as in web-based, distance, or distributed learning. The use of the Web as an educational tool has given learners and educators a wider range of new and interesting learning experiences and teaching environments that were not possible in traditional education. These platforms contain a considerable amount of e-learning material and provide some degree of logging to monitor learning progress, keeping track of learners' activities, including the content viewed, the time spent on a particular subject, and the activities completed. This monitoring trail provides data suited to many different contexts in universities, such as providing assistance to a student at the appropriate level, aiding the student's learning process, allocating relevant resources, and identifying exceptional students for scholarships as well as weak students who are likely to fail. This becomes possible by processing and analyzing the data with various classification techniques. Decision trees are simple yet effective classification algorithms. One of their main advantages is that they provide human-readable classification rules. The decision tree induction algorithm (Quinlan, 1986) performs classification by constructing a decision tree. The algorithm constructs the decision tree recursively using a depth-first, divide-and-conquer approach.
At any given node, to further split up the dataset towards identification of a class, the algorithm chooses the most suitable attribute based on the information gain value of the attributes. The information gain of an attribute is a measure of the ability of an attribute to minimize the information needed to classify the given entity in the resulting sub-trees.
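The recursive, gain-driven splitting described above can be illustrated with a minimal sketch. It assumes categorical attributes and an ID3-style procedure; the function names (`entropy`, `information_gain`, `build_tree`), the attribute names, and the four toy records are hypothetical and are not taken from the chapter. It does not represent the chapter's own dynamic and scalable approach, only the basic induction step it builds on.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on `attribute`."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attributes):
    """Depth-first, divide-and-conquer induction: returns a leaf label or a nested dict."""
    if len(set(labels)) == 1:  # pure node: every record has the same class
        return labels[0]
    if not attributes:  # nothing left to split on: fall back to the majority class
        return Counter(labels).most_common(1)[0][0]
    # Choose the attribute with the highest information gain at this node.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        tree[best][value] = build_tree(sub_rows, sub_labels, remaining)
    return tree


# Hypothetical toy records in the spirit of e-learning logs (not real data).
rows = [
    {"time_spent": "high", "quizzes_done": "yes"},
    {"time_spent": "high", "quizzes_done": "no"},
    {"time_spent": "low", "quizzes_done": "yes"},
    {"time_spent": "low", "quizzes_done": "no"},
]
labels = ["pass", "pass", "fail", "fail"]

tree = build_tree(rows, labels, ["time_spent", "quizzes_done"])
print(tree)
```

On this toy data, `time_spent` separates the classes perfectly (gain of 1 bit) while `quizzes_done` has zero gain, so the root splits on `time_spent` and both branches become leaves. The nested dictionary can be read directly as rules of the form "if time_spent is high then pass", which is the human-readable property mentioned above.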
