Comparative Study of Classification Models with Genetic Search Based Feature Selection Technique

Sanat Kumar Sahu, A. K. Shrivas
DOI: 10.4018/978-1-7998-2460-2.ch040

Abstract

Feature selection plays an important role in retrieving the relevant features from a dataset and computationally improves the performance of a model. The objective of this study is to identify the most important features of a chronic kidney disease (CKD) dataset and to diagnose CKD. In this work, the authors use a genetic search with the Wrapper Subset Evaluator method for feature selection to increase the overall performance of the classification models. They use the Bayes Network, Classification and Regression Tree (CART), Radial Basis Function Network (RBFN), and J48 classifiers to classify CKD and non-CKD data. The proposed genetic search based feature selection technique (GSBFST) selects the best features from the CKD dataset, and the performance of the classifiers is compared under the proposed and existing genetic search feature selection techniques (FSTs). All classification models give better results with the proposed GSBFST than without an FST or with the existing genetic search FSTs.
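
The idea of a genetic search wrapped around a classifier can be sketched as follows. This is a minimal illustration, not the authors' GSBFST: the data are synthetic rather than the CKD dataset, a nearest-centroid classifier stands in for the classifiers named above, and the function names (`make_data`, `centroid_accuracy`, `genetic_search`) and all parameter values are assumptions chosen for the example.

```python
import random

def make_data(rng, n=120, n_features=8, informative=(0, 3)):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 3.0 * y  # shift informative features for class 1
        rows.append(x)
        labels.append(y)
    return rows, labels

def centroid_accuracy(rows, labels, mask, folds=5):
    """Wrapper fitness: k-fold accuracy of a nearest-centroid classifier
    that only sees the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(rows)) if i % folds != f]
        test = [i for i in range(len(rows)) if i % folds == f]
        cents = {}
        for c in (0, 1):
            members = [rows[i] for i in train if labels[i] == c]
            cents[c] = [sum(r[j] for r in members) / len(members) for j in cols]
        for i in test:
            dist = {c: sum((rows[i][j] - m) ** 2 for j, m in zip(cols, cents[c]))
                    for c in cents}
            correct += int(min(dist, key=dist.get) == labels[i])
    return correct / len(rows)

def genetic_search(rows, labels, rng, pop_size=20, generations=15,
                   crossover_p=0.6, mutation_p=0.05):
    """Evolve 0/1 feature masks; returns (best_mask, its k-fold accuracy)."""
    n = len(rows[0])

    def fitness(mask):
        # Small per-feature penalty so that ties favour smaller subsets.
        return centroid_accuracy(rows, labels, mask) - 0.001 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = [(fitness(m), m) for m in pop]

        def pick():
            a, b = rng.sample(scored, 2)  # binary tournament selection
            return (a if a[0] >= b[0] else b)[1]

        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            child = p1[:]
            if rng.random() < crossover_p:
                cut = rng.randint(1, n - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_p else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best, centroid_accuracy(rows, labels, best)

rng = random.Random(0)
rows, labels = make_data(rng)
mask, acc = genetic_search(rows, labels, rng)
print("selected features:", [j for j, b in enumerate(mask) if b])
print("cross-validated accuracy:", round(acc, 3))
```

The point of the wrapper design is that each candidate subset is judged by the accuracy of the classifier that will actually use it, which is more expensive than a filter score but tailored to that classifier.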

Literature Survey

This section reviews technical and related articles on machine learning (ML) techniques applied to predict kidney disease.

Polat et al. (2017) used two types of feature selection methods, the wrapper approach and the filter approach, to diagnose CKD. In the wrapper approach, a classifier subset evaluator with a greedy stepwise search engine and a wrapper subset evaluator with the Best First Search (BFS) engine were used. In the filter approach, a correlation feature selection subset evaluator with a greedy stepwise search engine and a filtered subset evaluator with the BFS engine were used. The results showed that the Support Vector Machine (SVM) classifier, combined with the filtered subset evaluator and the BFS engine, gave a higher accuracy rate (98.5%) in the diagnosis of CKD.
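
To make the filter side concrete, the sketch below scores feature subsets with the merit heuristic commonly associated with correlation feature selection (CFS): a subset scores well when its features correlate with the class but not with each other. It is an illustration under assumptions, not the setup used by Polat et al.: Pearson correlation is assumed as the correlation measure, the toy columns are made up, and the names `pearson` and `cfs_merit` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0

def cfs_merit(columns, target, subset):
    """Merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean
    feature-class correlation and r_ff the mean feature-feature correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

# Made-up toy data: column 0 is relevant, column 1 duplicates it (redundant),
# column 2 is unrelated to the class.
target = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 1, 5, 6, 5],
    [2, 4, 2, 10, 12, 10],
    [3, 1, 2, 2, 3, 1],
]

for subset in ([0], [0, 1], [0, 2], [0, 1, 2]):
    print(subset, round(cfs_merit(columns, target, subset), 3))
```

Unlike the wrapper approach, no classifier is trained here: the score depends only on the data, so adding a redundant copy of a feature does not raise the merit and adding an irrelevant one lowers it.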

Subas et al. (2017) experimentally validated a number of ML classifiers, namely Artificial Neural Network (ANN), SVM, k-Nearest Neighbor, C4.5, and Random Forest (RF), on a real dataset taken from the UCI Machine Learning Repository. The results reveal that the RF classifier achieves the best performance on the classification of CKD.

Charleonnan et al. (2016) introduced ML techniques to predict CKD. Four ML methods were used: k-nearest neighbors (KNN), SVM, logistic regression (LR), and decision tree. These models were built on CKD data, and their performance was compared to select the best classifier for predicting CKD.
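
Comparing classifiers in this way usually comes down to computing the same metrics for each model on the same labelled data and ranking them. The sketch below shows that step only; the labels and predictions are invented toy values, not results from any of the cited studies, and the names `model_a`, `model_b`, `confusion`, and `metrics` are hypothetical.

```python
def confusion(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) counts, treating `positive` as the CKD class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on CKD), and specificity (recall on non-CKD)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Invented toy labels (1 = CKD, 0 = non-CKD) and predictions from two models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 1],
    "model_b": [1, 1, 1, 1, 0, 0, 0, 1],
}

scores = {name: metrics(y_true, pred) for name, pred in predictions.items()}
best = max(scores, key=lambda name: scores[name]["accuracy"])
for name, s in scores.items():
    print(name, s)
print("best by accuracy:", best)
```

For a diagnostic task, sensitivity and specificity are worth reporting alongside accuracy, since a missed CKD case and a false alarm carry different costs.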
