1. Introduction
Data mining is the process of making predictions about the future from existing data (Arroyo et al., 2000). It can also be described as the process of analyzing large amounts of data stored in a data warehouse and deriving knowledge from them (Romero & Ventura, 2007). Fundamentally, data mining identifies recurring patterns and statistical information and presents them as knowledge on which a person can base judgments or decisions (Chen, Han, & Yu, 1996). Educational Data Mining (EDM) (Romero & Ventura, 2007; Baker & Yacef, 2009) is a developing field that offers improved methods for accessing different types of data (Tolias & Panas, 1998). It helps to provide educational guidance in organizations with large numbers of students. Different organizations follow different course structures: some institutions use a flexible credit system, whereas others use a fixed educational scheme. This heterogeneity of information explains why data analysis, model building, and discovery processes are iterative. Data mining is widely used to analyze how data items can be associated, related, and clustered, and to retrieve these relationships as results (Chen, Han, & Yu, 1996).
The key techniques of data mining include the following (Romero & Ventura, 2007):
1.1. Association
Association correlates data items of the same type within the data being mined (Agrawal & Srikant, 1994). For example, when tracking students' course details, if a student always takes a university elective together with a program elective, the system can suggest a program elective and a university elective together for the next semester.
The Apriori algorithm is widely used to implement association in data mining. Association rule mining with the Apriori algorithm is discussed in Shah (2016), Patil, Shubhangi, Ratnadeep, Deshmukh, & Kirange (2016), and Le et al. (2017).
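The elective example can be illustrated with a minimal sketch of Apriori-style frequent itemset mining. The enrollment records, the item names, the `min_support` value, and the helper names `apriori` and `confidence` are assumptions made for illustration; they do not come from the cited works.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must already be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical course selections, one set per student and semester.
enrollments = [
    {"university_elective", "program_elective"},
    {"university_elective", "program_elective", "core"},
    {"university_elective", "program_elective"},
    {"core"},
    {"core", "program_elective"},
]

frequent = apriori(enrollments, min_support=0.5)
rule_conf = confidence(frequent, {"university_elective"}, {"program_elective"})
print(rule_conf)
```

In this toy data every student who took a university elective also took a program elective, so the rule has full confidence and would justify suggesting the two together.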
1.2. Classification
Classification is used to identify the type of an object and the class to which it belongs. For example, students can be classified into different types based on attributes such as name, age, register number, and department (Agrawal & Srikant, 1994). Classification is essentially a machine learning technique that assigns data objects to classes (Tolias & Panas, 1998). It relies on mathematical techniques such as linear programming, induction-based decision trees, and statistics. Using classification algorithms, similar data can be grouped into classes in various domains, such as cancer survivability (Delen, Walker, & Kadam, 2005), wireless sensor networks (Stanković et al., 2012), and learning (De Fortuny & Martens, 2015).
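As a rough illustration of an induction-based decision tree, the sketch below fits a one-level tree (a decision stump) that assigns students to a class from numeric attributes. The attributes (age, credits completed), the class labels, and the function names `fit_stump` and `predict` are hypothetical and chosen only for this example.

```python
from collections import Counter


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # a split must put objects on both sides
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_lab, r_lab)
    if best is None:
        raise ValueError("no valid split: all rows are identical")
    return best[1:]


def predict(stump, row):
    """Assign a row to the class on its side of the learned threshold."""
    f, threshold, l_lab, r_lab = stump
    return l_lab if row[f] <= threshold else r_lab


# Hypothetical training data: (age, credits completed) -> student type.
rows = [(18, 20), (19, 45), (20, 80), (23, 130), (24, 150), (26, 170)]
labels = ["undergraduate"] * 3 + ["postgraduate"] * 3

stump = fit_stump(rows, labels)
print(stump, predict(stump, (21, 100)))
```

A full decision tree applies the same split search recursively to each side. The stump is kept to one level here to stay short.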
1.3. Clustering
In clustering, all attributes are examined and correlated, and objects with similar attributes are grouped together to form a structure. An automatic technique creates meaningful clusters of objects that share the same data type or features. Clustering itself defines the classes and the objects each class contains, whereas in classification objects are assigned to predefined classes. Clustering algorithms are used in research areas such as image processing (Tolias & Panas, 1998) and networking (Carlsson et al., 2017).
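The source does not name a particular clustering algorithm. As one common example, the sketch below uses k-means to show how groups emerge from the data without predefined classes. The sample points, the choice of k, and the simple "first k points" initialization are assumptions made for illustration.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly moving centroids to cluster means."""
    # Assumption: deterministic initialization from the first k points.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-attribute objects forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

Unlike the classification case, no labels are supplied. The two groups are defined entirely by the similarity of the objects' attributes.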
1.4. Prediction
Prediction is a broad topic that covers predicting failures, identifying fraud, and forecasting profits. It involves classification, pattern matching, and trend analysis (Agrawal & Srikant, 1994). Information gathered by analyzing past events is used to make predictions about future events. Prediction discovers the relationships among independent variables and between dependent and independent variables (Tolias & Panas, 1998). For example, to predict future profit, sales can be treated as the independent variable and profit as the dependent variable (Romero & Ventura, 2007). Prediction algorithms and their applications have been studied in smart homes (Wu et al., 2017), big data environments (Chen et al., 2016), and wireless sensor networks (Kosunalp, 2016).
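The sales and profit example can be sketched with simple least-squares linear regression, one of several possible prediction techniques. The figures below are invented for illustration, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales (independent) and profit (dependent).
sales = [100, 200, 300, 400]
profit = [15, 35, 55, 75]

slope, intercept = fit_line(sales, profit)
predicted_profit = slope * 500 + intercept  # forecast for a future sales level
print(round(predicted_profit, 2))
```

Because the invented figures lie exactly on a line, the fit recovers it exactly. Real sales data would scatter around the fitted line, and the forecast would carry corresponding uncertainty.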