Comparative Performance Analysis of Various Classifiers for Cloud E-Health Users

T. MuthamilSelvan (School of Information Technology & Engineering, VIT University, Vellore, India) and B. Balamurugan (School of Information Technology & Engineering, VIT University, Vellore, India)
Copyright: © 2019 | Pages: 16
DOI: 10.4018/IJEHMC.2019040105


Numerous classifiers drive almost all supervised machine learning applications. Although their objectives look similar, these classifiers vary drastically in performance. Important factors behind this variation include the scalability of the dataset, the nature of the dataset, training time, classification time on test data, prediction accuracy, and error rate. This article analyzes the performance of four widely used classifiers: the IF-THEN rule classifier, C4.5 decision trees, naïve Bayes, and the SVM classifier. The objective is to provide authenticated cloud users with complete statistical performance estimates of the four classifiers. These users can retrieve the essential statistical information about the studied classifiers from the cloud server; such information may help them choose the best classifier for personal or organizational use. The classifiers follow their traditional underlying algorithms, with classification performed in the cloud server, and are tested on three datasets (PIMA, breast-cancer, and liver-disorders) for performance analysis. The performance indicators used in this article to summarize the behavior of the classifiers are training time, testing time, prediction accuracy, and error rate. The proposed comparative analysis framework can be used to analyze classifier performance on any input dataset.

1. Introduction

Data mining is the process of discovering hidden, previously unknown information in large sets of data. Data mining techniques use mathematical computation and analysis to extract relevant hidden patterns and trends from the data (Tomar & Agarwal, 2013). Discovering such trends and patterns is not an easy task, since the associations among data patterns are complex, and the problem becomes more severe when the scale of the data grows dynamically over time (Balamurugan & Kumar, 2014). Such data patterns, their relationships, and their analysis can be computed and captured in a data mining model. Building a mining model is part of a larger process that includes several important activities: analyzing the data, learning the relationships within it, creating a learning model to answer queries, deploying the model into a working environment, and testing the learning model on new test data. There are several such learning models, including classification, clustering, and association rule mining (Han & Kamber, 2001).

Classification is the predominant learning model among data mining methodologies; it predicts the class of data by learning from predetermined, labeled data for each specific class, often called the training dataset. The classification learning model is also known as a supervised learning model, since the class to which a data item belongs is estimated using a predefined, known dataset (the training set). Many classification learning models have been proposed by researchers. Well-known models in the literature include the naïve Bayes classifier, decision trees, neural networks, SVM-based classifiers, and fuzzy classifiers. Such classification models have been applied to various datasets, including breast cancer (Abdelaal, Sena, Farouq, & Salem, 2010), liver disorders, lung cancer, diabetes, heart surgery, loan processing queries, and educational forums (Aneeshkumar & Venkateswaran, 2015; Dangare & Apte, 2012).
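As a concrete illustration of the simplest of the four classifiers compared in this study, an IF-THEN rule classifier encodes human-readable threshold rules, firing the class of the first rule that matches and falling back to a default class. This sketch is not the authors' implementation; the feature names and thresholds are purely illustrative, loosely styled after PIMA-diabetes-type features:

```python
# Toy IF-THEN rule classifier: the first matching rule assigns the class,
# and a default class is returned when no rule fires.
# Feature names and thresholds are illustrative, not taken from the article.
def classify(glucose, bmi):
    if glucose > 140 and bmi > 30:
        return "diabetic"
    if glucose > 160:
        return "diabetic"
    return "non-diabetic"

print(classify(150, 32))  # -> diabetic
print(classify(170, 20))  # -> diabetic (second rule fires)
print(classify(120, 25))  # -> non-diabetic (default class)
```

Rule classifiers of this kind are fast to evaluate at test time, but their accuracy depends entirely on how well the hand-crafted or induced rules cover the data.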

Classification, also called supervised learning, is a data analysis technique used to predict categorical data. A classification model generally involves two phases: a training phase and a testing phase. The training phase contributes most to the development of the learning model: the predetermined data, called the training data, is associated with the appropriate class labels and used to build the classifier. The tuples used in this phase are called training tuples. The generic working of classification algorithms is covered in standard data mining textbooks and tutorials. In the testing phase, new data, often called test tuples, which carry no class labels, is given to the trained classification model, and the algorithm assigns each test tuple the appropriate class label. The accuracy of learning algorithms varies with several factors, such as the nature of the dataset, its scalability, and performance measures like training time, testing time, prediction accuracy, and error rate (Vanaja & Rameshkumar, 2015; Rajeswara, Vidyullata, SathishTallam, & Ramya, 2015). The remainder of this paper is organized as follows. Section 2 provides the literature survey. Section 3 explains the proposed working module for the learning models and the comparison of their performance. Section 4 presents the experimental results and the subsequent discussion. The final section offers concluding remarks.
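The two-phase workflow and the four performance indicators described above can be sketched end to end. The following is a minimal, self-contained example (not the article's cloud-based implementation): a pure-Python Gaussian naïve Bayes classifier is trained and tested on synthetic two-class data standing in for a dataset such as PIMA, and training time, testing time, prediction accuracy, and error rate are reported.

```python
import math
import random
import time

def train_gaussian_nb(rows, labels):
    # Training phase: estimate per-class log-priors and per-feature
    # means/variances from the labeled training tuples.
    by_class = {}
    for x, y in zip(rows, labels):
        by_class.setdefault(y, []).append(x)
    model, n = {}, len(rows)
    for y, xs in by_class.items():
        cols = list(zip(*xs))
        means = [sum(c) / len(c) for c in cols]
        vars_ = [max(sum((v - m) ** 2 for v in c) / len(c), 1e-9)
                 for c, m in zip(cols, means)]
        model[y] = (math.log(len(xs) / n), means, vars_)
    return model

def predict(model, x):
    # Testing phase: assign the class with the highest log-posterior.
    best, best_score = None, float("-inf")
    for y, (log_prior, means, vars_) in model.items():
        score = log_prior
        for v, m, s2 in zip(x, means, vars_):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        if score > best_score:
            best, best_score = y, score
    return best

# Synthetic, well-separated two-class data (illustrative stand-in dataset).
random.seed(0)
rows = ([[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
        + [[random.gauss(3, 1), random.gauss(3, 1)] for _ in range(200)])
labels = [0] * 200 + [1] * 200
pairs = list(zip(rows, labels))
random.shuffle(pairs)
train, test = pairs[:300], pairs[300:]

t0 = time.perf_counter()
model = train_gaussian_nb([x for x, _ in train], [y for _, y in train])
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
correct = sum(predict(model, x) == y for x, y in test)
test_time = time.perf_counter() - t0

accuracy = correct / len(test)
error_rate = 1 - accuracy
print(f"train {train_time:.4f}s  test {test_time:.4f}s  "
      f"accuracy {accuracy:.2f}  error rate {error_rate:.2f}")
```

The same harness (timed `train`/`predict` calls plus accuracy and error rate on held-out tuples) generalizes to any of the four classifiers in the study, which is the essence of the comparative framework.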
