Prediction of Chronic Obstructive Pulmonary Disease Stages Using Machine Learning Algorithms

Israa Mohamed
Copyright: © 2022 | Pages: 13
DOI: 10.4018/IJDSST.286693

Abstract

Identifying the severity stages of chronic obstructive pulmonary disease (COPD) is of great importance for controlling the related mortality rates and reducing the associated costs. This study aims to build prediction models for COPD stages and to compare the relative performance of five machine learning algorithms in order to determine the optimal prediction algorithm. The research is based on data collected from a private hospital in Egypt for the two calendar years 2018 and 2019, and the analysis included 211 patient records. The F1 score, specificity, sensitivity, accuracy, positive predictive value, and negative predictive value were the performance measures used to compare the algorithms. The results show that the best performing algorithm in most of the disease stages is the PNN, which achieved the optimal prediction accuracy; hence, it can be considered a powerful tool for decision makers in predicting the severity stages of COPD.

1. Introduction

Chronic Obstructive Pulmonary Disease (COPD) may be defined as a group of progressive lung diseases characterized by emphysema, chronic bronchitis and airflow limitation (Singh et al., 2019). An estimated 30 million people in the US have COPD, about half of whom are unaware of having it. Undiagnosed and untreated COPD may lead to faster disease progression, heart problems, and worsening respiratory infections. Globally, COPD is a leading cause of death: it was reported to have caused 3.17 million deaths in 2015 (i.e., 5% of all deaths in that year) (Rodriguez-Roisin et al., 2017). The total cost of lung diseases in the European Union (EU) has been estimated at about 6% of total healthcare costs, with COPD accounting for the largest share (56%) of these costs (Singh et al., 2019). Thus, early diagnosis, control and prediction of COPD are of utmost importance for reducing its associated mortality rates and easing its financial consequences. Estimating the current disease stage and predicting disease progression are among the most crucial tasks performed by clinicians during a patient's treatment journey. With accurate and timely prediction of disease stages, proper interventions and treatment plans may then be applied to prevent deterioration.

Clinicians use the GOLD staging (or grading) system to decide a patient's severity stage, and the grade affects the treatment the patient receives. GOLD stands for the Global Initiative for Chronic Obstructive Lung Disease, which was started in 1997 by the National Heart, Lung, and Blood Institute, the National Institutes of Health, and the World Health Organization. The GOLD system considers several factors, for example symptoms, the number of times the COPD has worsened, any hospital stays caused by COPD deterioration, and results from spirometry (i.e., a test that checks the amount and speed of air that patients can exhale). Spirometry is based on two measurements: 1) forced vital capacity (FVC), the largest amount of air patients can breathe out after breathing in as deeply as they can; and 2) forced expiratory volume (FEV-1), which shows how much air patients can exhale from their lungs in 1 second. The GOLD system defines four grades (stages) of COPD severity: grade 1, grade 2, grade 3 and grade 4.
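As an illustration of the spirometric part of GOLD grading, the following minimal Python sketch maps the two spirometry measurements to a grade. The cut-offs are an assumption: this article does not state them, and the values used here are those commonly published for the GOLD spirometric classification (post-bronchodilator FEV-1/FVC below 0.70, then FEV-1 as a percentage of its predicted value). The function name `gold_spirometric_grade` and its parameters are hypothetical, and the sketch ignores the symptom and hospitalization factors that the GOLD system also considers.

```python
from typing import Optional


def gold_spirometric_grade(fev1_fvc_ratio: float,
                           fev1_percent_predicted: float) -> Optional[int]:
    """Return a GOLD grade (1-4) from spirometry alone.

    Assumed cut-offs (not taken from this article):
      - airflow obstruction requires FEV-1/FVC < 0.70
      - grade 1: FEV-1 >= 80% of predicted
      - grade 2: 50% <= FEV-1 < 80%
      - grade 3: 30% <= FEV-1 < 50%
      - grade 4: FEV-1 < 30%
    Returns None when the ratio does not indicate obstruction.
    """
    if fev1_fvc_ratio >= 0.70:
        return None  # no spirometric evidence of obstruction
    if fev1_percent_predicted >= 80:
        return 1
    if fev1_percent_predicted >= 50:
        return 2
    if fev1_percent_predicted >= 30:
        return 3
    return 4


# Hypothetical patient: FEV-1/FVC = 0.62, FEV-1 = 45% of predicted
print(gold_spirometric_grade(0.62, 45.0))
```

A rule of this kind only reproduces the spirometric label; it is not a substitute for the full clinical assessment described above.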

Data mining and machine learning have been widely used in the healthcare sector as efficient tools for extracting hidden knowledge from available datasets. For example, Yu et al. (2010) classified and predicted diabetes patients using SVM. Magnin et al. (2009) employed SVM to classify Alzheimer’s disease using anatomical brain MRI. PRNNs, DTs and NB were used by Dessai et al. (2013) to predict heart diseases. Cao et al. (2013) predicted HBV-induced liver cirrhosis using an MLP algorithm. Concerning COPD-related studies, Guillamet et al. (2018) applied clustering algorithms to EMRs to determine relevant phenotypes of COPD. Many studies have also compared predictive models based on their predicted output (Demir, 2014; Futoma et al., 2015; Austin, 2007). However, most of these studies suffer from poor prediction quality, as the Area Under the Curve (AUC) ranged from 0.57 to 0.74, with the sole exception of Coleman et al. (2004), who reported an AUC of 0.83. Amarala et al. (2012) evaluated the performance of different ML algorithms in developing a COPD classifier using forced oscillation measurements. Their results favoured KNN, SVM and ANNs, while their later study (Amarala et al., 2015) suggested KNN and RF classifiers for accurate diagnosis of early respiratory obstruction. Wang et al. (2020) were the first to use classification models to identify AECOPD on a large scale. However, to the best of our knowledge, prediction of COPD severity stages has not yet been investigated.

In this work, we aim to develop prediction models for the different COPD severity stages and to analyse and compare the performance of different ML algorithms in order to identify the optimal prediction algorithm. Five ML algorithms have been evaluated, namely: Support Vector Machine (SVM), Naïve Bayes (NB), Boosted Decision Tree (BDT), Probabilistic Neural Networks (PRNN), and Logistic Regression (LR). These algorithms were chosen for the diversity of their characteristics (Kuncheva, 2014) and their popularity in research (Wu et al., 2019; Nijeweme-d’Hollosy et al., 2018; Prashanth et al., 2016; Cui et al., 2018). We hypothesize that the mentioned algorithms can be used to predict COPD severity stages and hence add value to the management of COPD.
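To make the comparison criteria concrete, the following minimal Python sketch computes the six performance measures named in the abstract (sensitivity, specificity, positive and negative predictive value, accuracy, and F1 score) for a single severity stage. It rests on an assumption: the article preview does not say how the measures were derived for a four-class problem, so the sketch treats each stage one-vs-rest. The function name `stage_metrics` and the example labels are hypothetical and are not data from the study.

```python
from typing import Dict, Sequence


def stage_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                  stage: int) -> Dict[str, float]:
    """One-vs-rest performance measures for one COPD stage."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == stage and p == stage)
    fp = sum(1 for t, p in pairs if t != stage and p == stage)
    fn = sum(1 for t, p in pairs if t == stage and p != stage)
    tn = len(pairs) - tp - fp - fn

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Illustrative labels only (GOLD grades 1-4), not taken from the study.
example_true = [1, 1, 2, 2, 3, 4, 4, 2]
example_pred = [1, 2, 2, 2, 3, 4, 3, 2]

for grade in (1, 2, 3, 4):
    print(grade, stage_metrics(example_true, example_pred, grade))
```

Calling the function once per stage for each algorithm's predictions yields a table of measures per stage, which is one way a "best algorithm in most of the disease stages" comparison could be organised.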
