1. Introduction
Data mining is the process of discovering previously unknown, hidden information in large data sets. Data mining techniques use mathematical computation and analysis to extract the relevant hidden patterns and trends present in the data (Tomar & Agarwal, 2013). Discovering such trends and patterns is not an easy task, because the associations among the data patterns are complex. The problem becomes more severe when the volume of data grows dynamically over time (Balamurugan & Kumar, 2014). Such data patterns, their relationships, and their analysis can be computed and defined as a data mining model. Building a mining model is part of a larger process that includes several important activities: analyzing the data, learning the relationships in the data, creating a learning model to answer specific queries, deploying the model into a working environment, and testing the model on new test data. Learning models of this kind include classification, clustering, and association rule mining (Han & Kamber, 2001).
Classification is a predominant learning model among data mining methodologies. It predicts the class to which a data item belongs, using predetermined data with known class labels, often called the training dataset. For this reason, classification is often called supervised learning: the class of a data item is estimated from a predefined, known data set. Researchers have proposed several classification learning models. Well-known models in the literature include the naïve Bayes classifier, decision trees, neural networks, SVM-based classifiers, and fuzzy classifiers. Such models have been applied to a variety of datasets, including breast cancer (Abdelaal & Sena & Farouq & Salem, 20104), liver disorders, lung cancer, diabetes, heart surgery, loan processing queries, and educational forums (Aneeshkumar & Venkateswaran, 2015; Dangare & Apte, 2012).
Classification, also called supervised learning, is a data analysis technique used to predict categorical data. A classification model is generally a two-phase learning model, consisting of a training phase and a testing phase. The training phase contributes most to the development of the learning model. In the training phase, the predetermined data, also called the training data, is associated with the appropriate class labels and is used to build the classifier. The tuples used in the training phase are often called training tuples. The general working of classification algorithms is described in data mining tutorials and standard textbooks. In the testing phase, a new data item that does not contain a class label, often called a test data tuple, is given to the classification learning model. The algorithm runs and assigns a class label to the test data tuple. The performance of learning algorithms varies with several factors, such as the nature and scalability of the dataset, and is assessed using measures such as training time, test time, prediction accuracy, and error rate (Vanaja & Rameshkumar, 2015; Rajeswara & Vidyullata & SathishTallam & Ramya, 2015).
The remainder of this paper is organized as follows. Section 2 presents the literature survey. Section 3 explains the proposed working module for the learning models and the comparison of their performance. Section 4 presents the experimental results and the subsequent discussion. The final section gives the concluding remarks, followed by the references cited in this paper.
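As an illustration of the two-phase workflow described in this section, the following is a minimal sketch of a categorical naïve Bayes classifier, one of the model families named above. It is not the implementation evaluated in this paper. The function names (`train`, `predict`, `evaluate`), the use of Laplace smoothing, and the toy weather-style tuples are all assumptions introduced here purely for illustration.

```python
from collections import Counter, defaultdict
import math


def train(training_tuples):
    """Training phase: learn class counts and per-attribute value counts
    from (attributes, class_label) pairs."""
    class_counts = Counter(label for _, label in training_tuples)
    # value_counts[(label, attribute_index)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # Distinct values seen per attribute, needed for Laplace smoothing
    attribute_values = defaultdict(set)
    for attributes, label in training_tuples:
        for i, value in enumerate(attributes):
            value_counts[(label, i)][value] += 1
            attribute_values[i].add(value)
    return class_counts, value_counts, attribute_values


def predict(model, attributes):
    """Testing phase: assign the most probable class label to an unlabeled tuple."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Log prior plus sum of log likelihoods (Laplace-smoothed)
        score = math.log(count / total)
        for i, value in enumerate(attributes):
            numerator = value_counts[(label, i)][value] + 1
            denominator = count + len(attribute_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(model, labeled_test_tuples):
    """Return (prediction accuracy, error rate) on held-out labeled tuples."""
    correct = sum(predict(model, a) == y for a, y in labeled_test_tuples)
    accuracy = correct / len(labeled_test_tuples)
    return accuracy, 1.0 - accuracy


# Hypothetical toy data: (outlook, temperature) -> class label
training_data = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("overcast", "hot"), "yes"),
    (("overcast", "cool"), "yes"),
]
test_data = [
    (("sunny", "hot"), "no"),
    (("rainy", "cool"), "yes"),
    (("overcast", "mild"), "yes"),
    (("sunny", "cool"), "no"),
]

model = train(training_data)
accuracy, error_rate = evaluate(model, test_data)
print(f"accuracy={accuracy:.2f}, error rate={error_rate:.2f}")
```

In this sketch the class labels in `test_data` are used only to score the predictions; `predict` itself sees the attribute values alone, mirroring the unlabeled test tuple described above. Training time and test time could be measured by timing the calls to `train` and `predict` separately.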