A Novel Approach for Computer-Aided Diagnosis for Distinction Between Benign and Malignant of Lung Nodules Based on Machine Learning Techniques

Shashidhara Bola (Dayananda Sagar College of Engineering, India)
DOI: 10.4018/978-1-5225-5152-2.ch014


A new method is proposed to classify lung nodules as benign or malignant. The method analyzes lung nodule shape, contour, and texture for better classification. The data set consists of 39 lung nodules from 39 patients: 19 benign and 20 malignant. Lung regions are segmented using morphological operators, and lung nodules are detected based on shape and area features. The proposed algorithm was tested on LIDC (Lung Image Database Consortium) datasets and the results were found to be satisfactory. The performance of the method in distinguishing benign from malignant nodules was evaluated using receiver operating characteristic (ROC) analysis. The method achieved an area under the ROC curve of 0.903 and reduced the false positive rate.

1. Introduction

Automated segmentation and classification of lung tumors as harmless or harmful is a challenging task of vital interest for medical applications such as diagnosis and surgical planning. It improves accuracy and assists radiologists in making a better diagnosis.

The difficult task in a Computer-Aided Diagnosis (CAD) system is to improve the accuracy of classifying lung nodules as benign (harmless tumor) or malignant (harmful tumor). Classification is useful for the early diagnosis of lung cancer, which in turn increases the patient survival rate. The proposed method classifies lung nodules as harmless or harmful with fewer false positives by extracting shape, margin, and textural features and classifying them with an Artificial Neural Network (ANN). ANNs have several advantages, such as generalization and the ability to learn from training data without knowing the rules a priori.

Al-Kadi and Watson (2008) discussed fractal analysis of time-sequenced contrast-enhanced CT images to discriminate between harmful and harmless tumors. Their system achieved an accuracy of up to 83% in differentiating benign from malignant tumors. Lu, Wei, Qian, and Jain (2001) used a Support Vector Machine (SVM) to classify the type of lung cancer.

Shen et al. (2011) used the Tumor Disappearance Ratio (TDR) and density features to design a computer-aided diagnosis system to assist radiologists. The accuracy of their classification model was 70.97%. Iwano et al. (2005) proposed a system to automatically classify nodules identified on High Resolution CT (HRCT) and compared its accuracy with that of radiologists.

Lee et al. (2010) proposed a two-step supervised learning scheme based on image-based gray level, texture, and shape features with the random space method and a genetic algorithm. El-Baz et al. (2012; 2010) used a 2D approach for early evaluation of harmful tumors based on the intensity in HU, with a 2D rotationally invariant second-order Markov Gibbs Random Field (MGRF). Anand (2010) used features such as area, solidity, eccentricity, energy, contrast, and homogeneity with an Artificial Neural Network.

Suzuki et al. (2005) proposed multiple Massive-Training Artificial Neural Networks (MTANNs) for the classification of nodules as harmless or harmful. Henschke et al. (2005) used an ANN to differentiate harmless from harmful tumors.

According to this survey, the accuracy of classifying nodules as harmless or harmful is approximately 83% to 87%. This work attempts to improve that classification accuracy.

The rest of the paper is organized as follows. Section 2 discusses the theory, Section 3 describes the methods, Section 4 presents the results and discussion, and Section 5 gives the conclusion.


2. Background

The datasets were taken from the LIDC database, which contains benign and malignant lung nodules. 39 CT images were used to classify the lung nodules. The CT images are in Digital Imaging and Communications in Medicine (DICOM) format, and pixel intensities are measured in Hounsfield Units (HU). To convert HU to gray level, a value of 1024 is added to the HU value. Of the 39 images, 17 were used for training the Artificial Neural Network and 22 for testing it.
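
The HU-to-gray-level conversion and the 17/22 split described above can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the function names `hu_to_gray` and `split_cases` are hypothetical, the sample pixel values are made up, and taking the first 17 cases for training is an assumption, since the text does not say how the training images were chosen.

```python
import numpy as np

HU_OFFSET = 1024  # offset stated in the text: gray level = HU + 1024


def hu_to_gray(hu_pixels):
    """Shift Hounsfield Units to gray levels by adding the 1024 offset."""
    hu = np.asarray(hu_pixels, dtype=np.int32)
    return hu + HU_OFFSET


def split_cases(n_total=39, n_train=17):
    """Return (train, test) index lists.

    Assumption: the first n_train cases are used for training. The text
    gives only the counts (17 training, 22 testing), not the selection rule.
    """
    indices = list(range(n_total))
    return indices[:n_train], indices[n_train:]


# Illustrative 2x2 patch in HU (made-up values, not from the LIDC data).
sample_hu = np.array([[-1024, -1000], [0, 400]])
sample_gray = hu_to_gray(sample_hu)

train_idx, test_idx = split_cases()
```

With this offset, a pixel at -1024 HU maps to gray level 0, so typical CT values become non-negative and easier to handle in later image-processing steps.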

The lung nodules are classified based on features such as shape, margin, and texture (calcification pattern), as shown in Figures 1, 2, and 3.
