An Efficient Diagnosis System for Detection of Liver Disease Using a Novel Integrated Method Based on Principal Component Analysis and K-Nearest Neighbor (PCA-KNN)

Aman Singh (Lovely Professional University, India) and Babita Pandey (Lovely Professional University, India)
DOI: 10.4018/978-1-5225-5643-5.ch042


Talk about organ failure and people immediately recall kidney disease. By contrast, there is little awareness of liver disease and liver failure, even though liver disease is one of the leading causes of mortality worldwide. Effective diagnosis and timely treatment of patients are therefore paramount. This study accordingly aims to construct an intelligent diagnosis system that integrates principal component analysis (PCA) and k-nearest neighbor (KNN) methods to examine the liver patient dataset. The model combines feature extraction, performed by PCA, with classification, performed by KNN. Prediction results of the proposed system are compared using statistical parameters that include accuracy, sensitivity, specificity, positive predictive value and negative predictive value. In addition to higher accuracy rates, the model also attained remarkable sensitivity and specificity, which was challenging given the uneven variance among attribute values in the dataset.
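
The five statistical parameters named in the abstract can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch using the standard textbook definitions, not the authors' own evaluation code; the function name `diagnostic_metrics`, the label encoding (1 for a liver patient, 0 otherwise) and the toy labels are assumptions made purely for illustration.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute five common diagnostic measures from true and predicted labels.

    Hypothetical helper for illustration; `positive` marks the diseased class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Toy labels (invented, not taken from the liver patient dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = diagnostic_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are unbalanced, because a classifier can reach high accuracy while still missing most of the minority class.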
Chapter Preview


Over the last few decades, liver disease has become highly prevalent worldwide. The liver is vitally important to the human body because it performs numerous key functions, including chemical detoxification, protein production, drug metabolism, blood clotting, glucose storage, cholesterol production and bilirubin clearance (Bucak and Baki, 2010). Impairment of any of these functions leads to liver disease. Nausea, fatigue, energy loss, weight loss, poor appetite, and upper right quadrant abdominal pain are some of the early symptoms of the disease. These symptoms may develop slowly but can worsen over time, depending on an individual's lifestyle. Severe symptoms may include memory confusion, abnormal bleeding, easy bruising, redness on the palms of the hands, jaundice, edema and ascites. Common causes of the disease are hepatitis A, B, C, D and E, inherited abnormal genes, Epstein-Barr virus, iron overload and alcohol abuse (Chuang, 2011; Lin and Chuang, 2010; Lin, 2009). More than a hundred types of liver disease exist, of which the most prevalent are autoimmune hepatitis, neonatal hepatitis, primary biliary cholangitis, liver fibrosis, liver cirrhosis, liver cancer, alcoholic liver disease and nonalcoholic fatty liver disease (Singh and Pandey, 2014).

Individual and integrated computer-aided models have been widely used to evaluate liver disease and its types. The literature shows considerable use of artificial neural network (ANN), fuzzy logic (FL), decision trees, ANN-CBR (case-based reasoning), ANN-FL, AIS (artificial immune system)-FL, ANN-GA (genetic algorithm), FL-GA, AIS-ANN-FL and ANN-GA-RBR (rule-based reasoning) to build liver diagnostic systems. ANN-based frameworks showed high reliability, robustness and accuracy. These systems generally require less learning time, even for large and growing problems (Autio et al., 2007; Azaid et al., 2006; Bucak and Baki, 2010; Elizondo et al., 2012; Hashem et al., 2010; Içer et al., 2006; Lee et al., 2005; Ozyilmaz and Yildirim, 2003). ANN-based models were developed to make timely predictions for hepatectomised patients (Hamamoto et al., 1995), to classify hepatobiliary disorders (Hayashi et al., 2000), to detect hepatitis disease (Ozyilmaz and Yildirim, 2003; Sartakhti et al., 2015) and to diagnose liver disease (Revett et al., 2006). FL-based methodologies were used for performing semi-automatic liver tumour segmentation (Li et al., 2012), for identifying hepatitis disease (Obot and Udoh, 2011) and for classifying hepatobiliary disorders (Ming et al., 2011). The C5.0 decision tree with boosting was employed to categorize liver viruses as chronic hepatitis C or B (Floares, 2009), while the C4.5 decision tree was applied to examine liver cirrhosis (Yan et al., 2008). Similarly, among integrated approaches, ANN-CBR was built to study the presence of liver disease and to detect its types (Chuang, 2011; Lin and Chuang, 2010). ANN-FL hybridization was used to identify liver disorders (Celikyilmaz et al., 2009; Çomak et al., 2007; Kulluk et al., 2013; Li and Liu, 2010; Neshat and Zadeh, 2010) and to enhance classification accuracy rates for liver disease (Li et al., 2010).
AIS-FL integration was used to categorize liver disorders and to evaluate the prediction accuracy of hepatitis disease (Mezyk and Unold, 2011; Polat et al., 2007). ANN-GA was used to detect liver disorders and to grade liver fibrosis stabilization in chronic hepatitis C (Dehuri and Cho, 2010; Gorunescu et al., 2012). FL-GA was used to discover liver disorders (Luukka, 2009; Torun and Tohumoğlu, 2011). AIS-ANN-FL was used to classify hepatitis disease (Kahramanli and Allahverdi, 2009), and ANN-GA-RBR was used to support decisions on liver transplantation (Aldape-Perez et al., 2012).
