Improving Classification Accuracy on Imbalanced Data by Ensembling Technique

Divya Agrawal (Shri Shankaracharya College of Engineering and Technology, Bhilai, India) and Padma Bonde (Shri Shankaracharya College of Engineering and Technology, Bhilai, India)
Copyright: © 2017 | Pages: 49
EISBN13: 9781522533566 | DOI: 10.4018/jcit.2017010104


Prediction using classification techniques is a fundamental capability widely applied in many fields. Achieving high classification accuracy remains a major challenge because of the data imbalance problem. The growing volume of data also poses a challenge for data handling and prediction, particularly when technology serves as the interface between customers and the company. As data imbalance increases, it directly degrades the classification accuracy of the entire system. AUC (area under the curve) and lift have proved to be good evaluation metrics. Classification techniques help to improve classification accuracy, but on an imbalanced dataset classification does not perform well, and other techniques, such as oversampling, must be resorted to. This paper presents a voting-based ensembling technique to improve classification accuracy on imbalanced data. The ensemble takes the votes on the best class obtained by three classification techniques, namely logistic regression, classification trees, and discriminant analysis. The observed results show that the voting ensemble improves classification accuracy.
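
The voting step described in the abstract can be illustrated with a minimal Python sketch. It assumes plain majority (hard) voting over the class labels predicted by three already trained models. The abstract does not specify the exact voting rule or implementation, so the function name majority_vote and the example predictions are hypothetical and used for illustration only.

```python
from collections import Counter


def majority_vote(*predictions):
    """Combine per-model label sequences by hard (majority) voting.

    Each argument is a sequence of predicted class labels, one per
    sample, produced by one classifier. This is an illustrative
    sketch, not the authors' implementation.
    """
    if not predictions:
        raise ValueError("at least one prediction sequence is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all prediction sequences must have the same length")

    combined = []
    for votes in zip(*predictions):
        # Pick the label that received the most votes for this sample.
        # With three voters and two classes a tie cannot occur.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions for five samples (1 = minority class).
logistic_preds = [1, 0, 0, 1, 0]  # logistic regression
tree_preds = [1, 1, 0, 0, 0]      # classification tree
lda_preds = [0, 1, 0, 1, 0]       # discriminant analysis

ensemble_preds = majority_vote(logistic_preds, tree_preds, lda_preds)
print(ensemble_preds)
```

Using an odd number of base classifiers on a two-class problem guarantees that every sample has a strict majority, which is one reason a three-model ensemble is a convenient choice for this kind of voting.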