Introduction
Coronary Artery Disease (CAD) develops when plaques form inside the walls of the coronary arteries, narrowing their lumens (Pal & Chakraborty, 2011). CAD is a serious health problem. Today, a variety of medical technologies available in the healthcare industry provide improved diagnosis of heart and coronary diseases, but in some cases such diagnosis remains beyond the reach of ordinary patients. The healthcare industry now produces a huge amount of complex, hard-to-interpret data related to patients, hospital resources, disease diagnosis, electronic patient records, medical devices, etc. This data is the main resource to be processed and analysed for knowledge extraction, and it can guide decision-making and cost savings (El-bialy, Salamay, Karam, & Khalifa, 2015).
In this study, we have created a model based on Data Mining and Machine Learning (ML) techniques to handle the problem of CAD classification. Data mining is the process of finding previously unknown patterns and trends in databases and building analytical models from that information. In healthcare, data mining is an area of high significance and has become increasingly effective and essential (Han, Kamber, & Pei, 2012; Pujari, 2013). Many techniques have been introduced for classification in data mining and ML, but it is difficult to improve the performance of an individual classifier significantly. Researchers are therefore increasingly interested in combining several classifiers to achieve better performance. One such method is the ensemble, also called the amalgamation method. Earlier research has shown, both theoretically and empirically, that ensembles can perform more accurately than any individual classifier (Chen, Wong, & Li, 2014). To build an ensemble that reaches the expected results, two elements have to be considered carefully. The first is to introduce sufficient variety among the members of the ensemble. The second is to choose an appropriate technique for combining the various outputs into a single output (Polikar, 2006). Variety is the foundation of an ensemble. In this study, we have used a voting-based ensemble technique with two combination rules (Average of Probabilities and Majority Voting). The classifiers Naive Bayes (NB), Random Forest (RF) and NBTree were used to create eight ensemble models (Ensemble1 to Ensemble8), and a Deep Neural Network (DNN) was additionally used for comparison with the other classifiers. We also propose a new feature optimization technique, the PSO-Ensemble1 model, which uses Particle Swarm Optimization (PSO) to optimize the features of the CAD dataset.
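The two combination rules named above can be illustrated with a minimal sketch. It assumes each base classifier outputs a probability vector over the same ordered classes; the function names, the tie-breaking behaviour and the example probabilities are hypothetical and are not taken from this study.

```python
from collections import Counter


def average_of_probabilities(prob_rows):
    """Average the class-probability vectors of all base classifiers.

    prob_rows: list of probability vectors, one per base classifier,
    all defined over the same ordered set of classes.
    Returns (index of the class with the highest mean probability,
             list of mean probabilities per class).
    """
    n_classifiers = len(prob_rows)
    n_classes = len(prob_rows[0])
    averaged = [
        sum(row[c] for row in prob_rows) / n_classifiers
        for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=lambda c: averaged[c])
    return winner, averaged


def majority_voting(prob_rows):
    """Let each base classifier vote for its most probable class.

    Returns (index of the class with the most votes, list of votes).
    Assumption: ties go to the class that received its first vote
    earliest, which is how Counter.most_common orders equal counts.
    """
    votes = [max(range(len(row)), key=lambda c: row[c]) for row in prob_rows]
    winner = Counter(votes).most_common(1)[0][0]
    return winner, votes


# Hypothetical outputs of three base classifiers for one patient,
# over the classes [no CAD, CAD].
example = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
avg_winner, avg_probs = average_of_probabilities(example)
vote_winner, votes = majority_voting(example)
```

In this example the two rules disagree: averaging favours class 0 because one classifier is very confident, while majority voting favours class 1 because two of the three classifiers lean that way. This is one reason for evaluating both rules.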
The main objective of this research is to develop a computationally efficient and robust model using an ensemble classifier together with the proposed PSO-Ensemble1 model, which gives better accuracy than other existing classifiers. The novelty of this work lies in optimizing the feature subset using PSO with a base ensemble classifier, that is, the PSO-Ensemble1 model. This paper contributes in the following three ways:
1) Examines the performance of the individual and ensemble classifiers in terms of accuracy, sensitivity, specificity and F1-score.
2) Uses PSO in different configurations and proposes the PSO-Ensemble1 model for feature optimization.
3) Compares and analyses the performance of the individual and ensemble classifiers with optimized features in terms of accuracy, sensitivity, specificity and F1-score.
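The four evaluation measures named in these contributions can be computed from the counts of a binary confusion matrix. The sketch below uses the standard definitions and assumes CAD is treated as the positive class; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity and F1-score.

    tp, fp, tn, fn: counts of true positives, false positives,
    true negatives and false negatives for a binary classifier.
    """

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is zero
        # (for example, when no positive cases are present).
        return numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = ratio(tp, tp + fn)   # recall on the positive class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical confusion-matrix counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting sensitivity and specificity alongside accuracy matters for a diagnostic task, because accuracy alone can hide a model that misses many positive cases when the classes are imbalanced.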