A Surrogate Data-Based Approach for Validating Deep Learning Model Used in Healthcare

Meenakshi Srivastava (Amity University Uttar Pradesh, Lucknow, India)
DOI: 10.4018/978-1-7998-2101-4.ch009

Abstract

IoT-based communication between medical devices has encouraged the healthcare industry to use automated systems that provide effective insight from the massive amount of gathered data. AI and machine learning have played a major role in the design of such systems. Accuracy and validation are key concerns, since a neural network (NN)-based deep learning model requires copious training data. This is hardly feasible in medical research, where the size of data sets is constrained by complex and costly experiments. With only limited sample data available, the validation of NNs remains a concern: predictions from an NN trained on a small data set cannot guarantee performance and may exhibit unstable behavior. Surrogate data-based validation of NNs can be viewed as a solution. In the current chapter, the classification of breast tissue data by an NN model is detailed. In the absence of a huge data set, a surrogate data-based validation approach has been applied. The discussed study can be applied to predictive modelling for applications described by small data sets.
Introduction

Pathological reports are one of the most significant sources for the diagnosis of diseases, but their interpretation depends on human expertise. With the growth of digital media generating diverse electronic reports and data, the need for specialists who can interpret these reports has increased drastically, especially in rural areas. Automated diagnosis from available reports through machine learning, specifically deep learning, may well address this problem. IoT-based communication between medical devices has encouraged the healthcare industry to use automated systems that provide effective insight from the massive amount of gathered data. AI and deep learning have played a major role in the design of such systems. Deep learning enables the analysis of information to improve and optimize decisions and performance. Deep learning-based models have also transformed computer vision and significantly improved machine translation. They are now being used to guide and enhance all sorts of key processes in medicine, finance, marketing and beyond.

In the past decade, deep learning has been used by many researchers to design computational systems that can analyze this massive amount of data in a time-efficient manner. These deep learning-based computational systems were not only able to diagnose diseases; their performance also matched or surpassed that of human diagnosticians. This success has produced significant excitement and enthusiasm in the research community. Such systems provide a foundation for evidence-based clinical diagnosis. With the help of evidence-based diagnostic systems, physicians can easily monitor and forecast patient diagnoses, and these systems can also improve outcomes by optimizing pricing, streamlining the claims process, controlling costs and improving operational efficiency.

One of the major reasons deep learning is gaining popularity over other machine learning algorithms is its ability to perform feature engineering on its own. Unlike other supervised and unsupervised machine learning algorithms, a deep learning algorithm scans the data set by itself for correlated features and, once they are identified, combines them to enable fast learning. Valuable insights suitable for training a deep learning model can be drawn from the dataset even if the dataset is in a different format.
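As an illustration, the short sketch below shows this self-feature engineering at work: the network receives only raw features, and its hidden layers must discover a feature interaction on their own, with no hand-crafted inputs. Scikit-learn and the synthetic data are assumptions made here purely for demonstration; the chapter itself does not prescribe a library or dataset.

```python
# Minimal sketch of learned feature combinations (scikit-learn assumed).
# The class label depends on an *interaction* between two raw features;
# the hidden layers must discover this combination during training.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 9))            # 400 samples, 9 raw features
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # class driven by the x0*x1 interaction

net = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
net.fit(X, y)                            # no manual feature engineering needed
print(f"training accuracy: {net.score(X, y):.2f}")
```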

Despite being robust and efficient at delivering high-quality results, the performance of a neural network is constrained by a few factors. The most frequently stated challenges in the literature for neural network-based deep learning may be summarized as:

  • Firstly, the requirement of a high volume of data to train the model

  • Second, the sequence in which training data is presented may result in different speeds of learning (see the sketch following this list)

  • Third, no clarity about how the neural network has arrived at a prediction

  • Finally, a neural network cannot provide a solution to a problem for which it was not trained
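The second challenge can be demonstrated with a small experiment. The sketch below, again assuming scikit-learn and synthetic data for illustration, trains the same network on the same data presented in two different orders, with batch shuffling disabled, and reports the loss reached after a fixed number of iterations:

```python
# Minimal sketch: the order in which training samples are presented can
# change how quickly the network learns. Shuffling is disabled so the
# presentation order actually matters; the convergence warning raised by
# the deliberately short training run can be ignored for this demo.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X.sum(axis=1) > 0).astype(int)

for order in ("original", "permuted"):
    idx = np.arange(len(X)) if order == "original" else rng.permutation(len(X))
    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=50,
                        shuffle=False, random_state=0)
    net.fit(X[idx], y[idx])              # same data, different sequence
    print(f"{order} order: loss after 50 iterations = {net.loss_:.4f}")
```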

The present study addresses the first two of these issues, i.e. the requirement of a huge volume of data to train the network and the effect of a change in the sequence of training inputs on the speed of learning.
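The surrogate data-based validation idea outlined in the abstract can be sketched as follows. Label-permutation surrogates, scikit-learn, and the placeholder data are assumptions made here for illustration only; the chapter's exact surrogate-generation scheme may differ. The same network is evaluated on many surrogate versions of the data, and its accuracy on the real data is trusted only if it clearly exceeds the null distribution built from the surrogates:

```python
# Minimal sketch of surrogate data-based validation on a small data set.
# Label-permutation surrogates are assumed here; in practice many more
# surrogates than 20 would be generated.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(106, 9))       # placeholder for a small data set
y = rng.integers(0, 2, size=106)    # placeholder class labels

def cv_accuracy(X, y):
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
    return cross_val_score(net, X, y, cv=5).mean()

real_acc = cv_accuracy(X, y)

# Train the same model on surrogate data sets (shuffled labels) to build
# a null distribution of accuracies achievable by chance alone.
null_accs = np.array([cv_accuracy(X, rng.permutation(y)) for _ in range(20)])

# The model is considered validated only if its accuracy on the real data
# clearly exceeds what is attainable on the surrogates.
p_value = (np.sum(null_accs >= real_acc) + 1) / (len(null_accs) + 1)
print(f"real accuracy = {real_acc:.3f}, permutation p-value = {p_value:.3f}")
```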

Background of Study

The objective of a classification algorithm is to predict the target class by analyzing the training dataset. This is done by finding proper boundaries for each target class. Broadly speaking, the training dataset is used to uncover correlations between the fields of the data set and to derive the boundary conditions that determine each target class (Jiawei & Micheline, 2016). Once the boundary conditions are determined, the next step is to predict the target class; this entire procedure is known as classification. A classification algorithm presents a clear picture of the boundary levels by which a data point falls under a specific class/label. Classification algorithms are used in many areas such as art and science, medicine, education, information retrieval and bioinformatics. In bioinformatics, the analysis of gene expression data from DNA microarrays is of great interest, since it enables us to analyze expression levels across a large number of genes in a living organism sample (Rong et al., 2009). This makes gene expression analysis a key research instrument for human health (Bazi & Melgani, 2006), as it allows the identification of new genes playing a key role in the onset of a disease.
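A minimal end-to-end sketch of this classification workflow is given below: boundary conditions are learned from a training set, then used to predict the target class of unseen samples. The synthetic data stands in for a real gene-expression or tissue data set, and scikit-learn is again assumed only for illustration.

```python
# Minimal sketch of the classification workflow: learn class boundaries
# from a training set, then predict the target class of new samples.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))               # 4 measured attributes per sample
y = (X[:, 0] + X[:, 2] > 0).astype(int)     # two target classes

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)                   # learn the boundary conditions

print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
print("predicted class of a new sample:",
      clf.predict([[0.5, -0.1, 0.8, 0.0]])[0])
```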
