A Hybrid GSA-K-Mean Classifier Algorithm to Predict Diabetes Mellitus

Rojalina Priyadarshini (School of Computer Science & Engineering, KIIT University, Bhubaneswar, India), Rabindra Kumar Barik (School of Computer Application, KIIT University, Bhubaneswar, India), Nilamadhab Dash (Department of Information Technology, C. V. Raman College of Engineering, Bhubaneswar, India), Brojo Kishore Mishra (Department of Information Technology, C. V. Raman College of Engineering, Bhubaneswar, India) and Rachita Misra (Department of Information Technology, C. V. Raman College of Engineering, Bhubaneswar, India)
DOI: 10.4018/978-1-7998-2460-2.ch030


Considerable research has been carried out globally to design a machine classifier that can predict diabetes mellitus from physical and biomedical parameters. In this work, a hybrid machine learning classifier is proposed as an artificial predictor that classifies people as diabetic or non-diabetic. The classifier combines the widely used K-means algorithm with the Gravitational Search Algorithm (GSA). GSA is used as an optimization tool to compute the best centroids for the two classes of training data: the positive class (diabetic) and the negative class (non-diabetic). Instead of using random samples as the initial cluster heads, K-means uses the optimized centroids from GSA as its cluster centers. The inherent problem with K-means is the initial placement of cluster centers, which may delay convergence and thereby degrade overall performance. The combined GSA and K-means approach is an attempt to overcome this problem.
1. Introduction

Diabetes mellitus is a chronic disease whose root cause is insufficient production of insulin in the patient’s body. It is a group of metabolic diseases characterized by high blood sugar (glucose) levels that result from defects in insulin secretion, insulin action, or both. Three types of diabetes mellitus are recognized. A study by the Public Health Foundation of India reports that nearly 44 lakh (4.4 million) Indians aged 20 to 79 are unaware that they have diabetes. Statistics compiled by the International Diabetes Federation indicate that about 50 million diabetic patients live in India (Alice & Balachandran, 2015). Diabetes is a serious disease that reduces the level or effectiveness of insulin, the hormone that helps move glucose from the blood into the body’s cells. As a result, serious complications may arise, including stroke, heart disease, kidney failure (nephropathy), paralysis, and retinopathy, which affects the patient’s vision. Common symptoms of diabetes include weight loss, blurred vision, infections, and frequent urination.

Experimental methods have proved to be complex, expensive, and time consuming for this task, so soft computing approaches are now commonly used instead (Nurhayati et al.). Many heuristic optimization algorithms (Das et al., 2011; Geem et al., 2001; Yang et al., 2009) and machine learning approaches have been widely used for diabetes detection (Sudharsan et al., 2015). Learning in machine learning techniques can be broadly classified into two basic types: supervised and unsupervised. In supervised learning, the desired output is known a priori to the network, whereas in unsupervised learning the output is not known beforehand. Both supervised and unsupervised algorithms have been extensively tried for this task. The Artificial Neural Network (ANN) (Scott et al., 2008), Support Vector Machine (SVM) (Vijayan and Anjali, 2015), and Extreme Learning Machine (ELM) have been used by many researchers for the same problem. However, all these methods have inherent disadvantages: high time complexity, slow convergence, and a tendency to get stuck in local optima are some of the problems associated with the classical techniques (H. Navarro et al., 2014). Hybrid algorithms are therefore used to improve on the classical methods by avoiding the limitations of individual algorithms used in isolation.

The Gravitational Search Algorithm (GSA) (Rashedi et al., 2009) is a relatively recent optimization algorithm inspired by Newton’s laws of gravity and motion. GSA has already been explored in many areas and found to be efficient in various applications (Mohd Sabri et al., 2013; Eldos & Al Qasim, 2009). Several variants of GSA (Precup, 2012; Rashedi et al., 2010; Purcaru, 2013) have since been developed to enhance the original version.
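To make the gravitational analogy concrete, the following is a minimal sketch of a basic GSA loop for minimization, written from the general description in Rashedi et al. (2009) and not taken from this chapter. The function name `gsa_minimize`, the parameter defaults (`g0`, `alpha`, population size), the linear shrinking of the attracting set, and the clipping of positions to the search bounds are all assumptions made for illustration.

```python
import math
import random


def gsa_minimize(objective, dim, bounds, n_agents=20, n_iter=100,
                 g0=100.0, alpha=20.0, seed=0):
    """Illustrative basic GSA; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_pos, best_fit = None, float("inf")
    eps = 1e-12  # avoids division by zero when two agents coincide

    def remember_best(fitness):
        nonlocal best_pos, best_fit
        for p, f in zip(pos, fitness):
            if f < best_fit:
                best_pos, best_fit = list(p), f

    for t in range(n_iter):
        fit = [objective(p) for p in pos]
        remember_best(fit)

        # Masses: better (lower) fitness gives a heavier agent.
        best, worst = min(fit), max(fit)
        if worst == best:
            mass = [1.0 / n_agents] * n_agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
            total = sum(raw)
            mass = [m / total for m in raw]

        # Gravitational "constant" decays over time.
        g = g0 * math.exp(-alpha * t / n_iter)

        # Only the kbest heaviest agents attract the others (assumed schedule).
        kbest = max(1, round(n_agents * (1 - t / n_iter)))
        heavy = sorted(range(n_agents), key=lambda i: mass[i],
                       reverse=True)[:kbest]

        # Acceleration of agent i: randomly weighted pull toward heavy agents.
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in heavy:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                w = rng.random() * g * mass[j] / (dist + eps)
                for d in range(dim):
                    acc[i][d] += w * (pos[j][d] - pos[i][d])

        # Velocity and position update; clipping to bounds is an assumption.
        for i in range(n_agents):
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    remember_best([objective(p) for p in pos])
    return best_pos, best_fit


if __name__ == "__main__":
    # Toy objective (sphere function), used only to exercise the loop.
    print(gsa_minimize(lambda p: sum(x * x for x in p), dim=2,
                       bounds=(-5.0, 5.0)))
```

In a centroid-search setting, each agent's position would encode candidate centroid coordinates and the objective would measure how well those centroids fit the training data; the exact encoding and fitness function used by the authors are not given in this preview.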

In this paper, basic GSA is used along with the k-means algorithm, a simple and powerful clustering algorithm, to determine whether a person belongs to the diabetic or the non-diabetic class. The proposed work uses an unsupervised learning model to accomplish this. Many optimization algorithms are currently used for classification and prediction tasks in different domains. Examples include ant colony optimization (ACO) for classifying gene expression microarray data (Schaefer et al., 2016) and for automating blood vessel segmentation in a retinal diagnostic system (Asad et al., 2014); Gaussian kernels for texture detection (Hemalatha et al., 2016); the firefly algorithm for designing a low noise amplifier (R. Kumar et al., 2016) and for obtaining an optimized scaling factor to be embedded for medical data exchange (Dey et al., 2014); artificial immune systems for anomaly detection in ambient assisted living (Bersch, 2013); fuzzy logic based ACO for calculating the trust value of different entities (Sarkar et al., 2015); cuckoo search for finding the best qualitative solution from initial random partitioned images by using a correlation function (Samantaa et al., 2013); and the Artificial Bee Colony (ABC) algorithm for enhancing traffic surveillance images (Aparna et al., 2016).
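The idea of replacing random initial cluster heads with pre-computed centroids can be sketched as follows. This is a minimal illustration, not the authors' implementation: `kmeans` is a plain Lloyd-style loop that accepts its starting centroids as an argument, the six two-feature points are made-up toy values (not patient data), and the two starting centroids are hypothetical stand-ins for whatever an optimizer such as GSA would return.

```python
import math


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd-style k-means that starts from caller-supplied centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy, made-up points with two scaled features (illustration only).
points = [(0.2, 0.3), (0.25, 0.2), (0.3, 0.35),
          (0.8, 0.7), (0.75, 0.9), (0.9, 0.8)]

# Hypothetical optimizer output: one starting centroid per class.
seeded = [(0.3, 0.3), (0.7, 0.7)]

centroids, labels = kmeans(points, seeded)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because the starting centroids already lie near the two groups, the assignments stabilize after a single update here; with poorly placed random starts, more iterations (or a worse partition) would be expected, which is the convergence issue the hybrid approach targets.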
