Developing an Explainable Machine Learning-Based Thyroid Disease Prediction Model

Siddhartha Kumar Arjaria, Abhishek Singh Rathore, Gyanendra Chaubey
Copyright © 2022 | Pages: 18
DOI: 10.4018/IJBAN.292058
Open access article; the PDF is freely available for download.

Abstract

Healthcare and medicine are key areas where machine learning algorithms are widely used. The medical decision support systems thus created are sufficiently accurate; however, they lack transparency in decision making and exhibit black-box behavior. Because transparency and trust are significant in the field of health and medicine, a black-box system is suboptimal in terms of widespread applicability and reach. Explainability therefore makes the system reliable and understandable, thereby enhancing its social acceptability. The presented work explores a thyroid disease diagnosis system. SHAP, a popular method based on coalition game theory, is used for the interpretability of results. The work explains system behavior both locally and globally and shows how machine learning can be used to ascertain the causality of the disease and support doctors in suggesting the most effective treatment. The work not only demonstrates the results of machine learning algorithms but also explains the related feature importance and model insights.

Introduction

Accurate decision-making in a given situation serves as a benchmark for human intelligence and, combined with critical reasoning, catalyzes social change. In the current era of human-machine interaction, machine learning algorithms are used for decision-making by computational machines (Piano, 2020). These decision support systems make accurate choices in many domains. Although such systems are very accurate, they suffer from a lack of interpretability: important information about the structure of the data and the relationships captured by the model is hidden, effectively turning these systems into black boxes. Black-box systems fail to answer important questions, such as the effect of each feature on the final decision and how the model behaves. Since the process of result generation is opaque, it remains unclear whether the optimization was local or global. Thus, the outcomes generated by these systems stand alone, and the systems are unable to give the correct basis for them.

To deal with these issues, it is necessary to make such systems explainable. Explainable systems are white boxes because they transparently expose various aspects of decision making, such as the importance of features, the model itself, and the predictions of decision support systems (Felzmann, Villaronga, Lutz, & Larrieux, 2020). This is especially useful in medicine, where machine learning algorithms are used for the effective diagnosis of disease. Artificial intelligence plays a supportive role in this scenario by determining the signed quantitative impact of each feature on the result (Rong, Mendez, Assi, Zhao, & Sawan, 2020). Transparency can provide additional information that may assist doctors in choosing the best treatment for a disease. Making AI explainable would enhance its understanding and acceptability, as in scenarios where AI must make practical decisions, such as autonomous driving (Samek & Muller, 2019). Explainability is relevant to the medical, bioinformatics, banking, insurance, cognitive, space, education, psychology, and chemical domains, among many others.

In machine learning, linear systems are simple enough to be transparent: the linear relationship between cause (features) and effect (output) can be understood easily. The scenario becomes complicated, however, in complex and highly nonlinear settings. The complexity of the problem increases with the number of features and parameters taken into consideration (Schmidt, Marques, Botti, & Marques, 2019).
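The transparency of a linear model can be illustrated concretely: because the prediction is a weighted sum, each feature's contribution is simply its weight times its value, and the decision decomposes exactly. The following minimal sketch uses hypothetical weights and a hypothetical patient reading (TSH, T3, T4 values chosen for illustration only, not taken from the paper's dataset):

```python
# A linear model is transparent: each feature's contribution to the
# prediction is weight * value, so the score decomposes exactly and
# each term can be read off directly.
weights = {"TSH": 0.8, "T3": -0.5, "T4": -0.3}   # illustrative weights
patient = {"TSH": 9.0, "T3": 1.1, "T4": 60.0}    # hypothetical reading

# Per-feature contributions and the overall score.
contributions = {f: weights[f] * patient[f] for f in weights}
score = sum(contributions.values())

for feature, c in contributions.items():
    print(f"{feature}: {c:+.2f}")
print(f"score: {score:+.2f}")
```

In a nonlinear model (e.g., a tree ensemble or neural network), no such exact per-feature decomposition falls out of the model's algebra, which is what motivates post hoc explanation methods such as SHAP.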

Thyroid disease is one of the most prominent diseases in the world. More than 12 percent of the U.S. population is affected by thyroid problems at some point in their lives (World Thyroid Day is Heralded by International Thyroid Societies, 2015). The main cause of the disease is a lack of iodine. Thyroid disease disproportionately affects women between the ages of 17 and 54. In extreme cases, its complications can also cause cardiovascular problems, high blood pressure, high cholesterol, depression, anxiety, and decreased fertility.

The thyroid gland produces two active hormones: total serum thyroxine (T4) and total serum triiodothyronine (T3). These are important for maintaining thyroid metabolism in the body, and any imbalance in these hormones causes thyroid disease. The thyroid gland is responsible for controlling the metabolism of the body (Teixeira, Santos, & Pazos-Moura, 2020). Thyroid activity is divided into three categories: euthyroidism, hyperthyroidism, and hypothyroidism. Euthyroidism represents the normal case, while hyper- and hypothyroidism represent abnormal hormonal situations. Hyperthyroidism occurs when the body produces more thyroid hormone than needed, while hypothyroidism arises from a deficiency. Machine learning algorithms differentiate between these cases very well, but present algorithms are not able to explain their findings (Temurtas, 2009).
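The SHAP method referenced in the abstract assigns each feature a signed contribution by treating features as players in a coalition game and averaging each feature's marginal effect over all coalitions. A minimal stdlib-only sketch of that underlying Shapley-value computation is shown below, applied to a hypothetical linear risk score over TSH, T3, and T4 (the weights, patient values, and baseline are illustrative assumptions, not the paper's model; the SHAP library itself uses far more efficient approximations):

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for prediction f(x).

    A feature is 'absent' from a coalition by replacing it with its
    baseline value; each feature's marginal effect is averaged over
    all coalitions with the classic Shapley weighting.
    """
    n = len(x)

    def value(S):
        z = [x[j] if j in S else baseline[j] for j in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi

# Hypothetical linear risk score over (TSH, T3, T4) -- illustrative only.
weights = [0.8, -0.5, -0.3]
model = lambda z: sum(w * v for w, v in zip(weights, z))

patient  = [9.0, 1.1, 60.0]    # elevated TSH, low T4: hypothyroid pattern
baseline = [2.0, 1.8, 100.0]   # assumed population-average values

phi = shapley_values(model, patient, baseline)
```

By the efficiency property, the contributions `phi` sum exactly to `model(patient) - model(baseline)`, which is what lets a clinician see how each hormone reading pushed the prediction away from the population average.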
