Diabetic Retinopathy Severity Prediction Using Deep Learning Techniques

Diabetic retinopathy is one of the leading causes of vision loss, and it can be prevented with timely diagnosis. This research proposes a transfer learning-based model trained on retinal fundus images whose severity has been graded by trained ophthalmologists into five classes. The model builds on a pre-trained ResNet 50, which makes it possible to train with a limited amount of labeled data. The trained model is evaluated using the accuracy score, the loss graph and the confusion matrix. Such deep learning models need to be transparent to be approved by regulatory authorities for clinical use. Clinical practitioners also need information about how the classification method works so that they can understand the model's decision-making process.


INTRODUCTION
Diabetes is becoming more common worldwide, and diabetic patients have a higher risk of developing Diabetic Retinopathy. As a result, Diabetic Retinopathy has become one of the leading causes of blindness in the world. Moreover, India has the second highest number of patients suffering from diabetes in the world. There is therefore a need to develop a computer-based system to assist with the clinical diagnosis of Diabetic Retinopathy. Such a system can support accurate diagnosis of Diabetic Retinopathy in remote regions of the world where access to healthcare is not readily available. The proposed model uses a publicly available dataset containing 5,590 retinal fundus images obtained from Aravind Eye Hospital in India (Krishna Adithya et al., 2021). Blurred and duplicated images are removed, and the remaining images are preprocessed using label-preserving transformations to obtain enhanced retinal images. Different models are trained on the enhanced images to study and improve the accuracy of Diabetic Retinopathy severity grading in retinal fundus images. The research thus concerns predicting the severity of Diabetic Retinopathy by training Deep Learning models on retinal images captured using a technique called Fundus Photography. Diabetic retinopathy is an eye disease that can cause people with diabetes to lose their eyesight and become blind. It affects the blood vessels in the retina (the light-sensitive layer of tissue at the back of the eye). According to the National Institutes of Health (NIH), this condition has no early symptoms and, if left untreated, can cause blurry vision, floating spots, and possibly blindness.
Diabetic retinopathy develops as a result of high blood sugar levels. Too much sugar in the blood can damage the retina, which detects light and sends signals to the brain through a nerve at the back of the eye (the optic nerve). Fundus photography is the process of photographing the fundus, or rear of the eye. It is done with fundus cameras, which combine an intricate microscope with a flash-enabled camera. The primary structures seen in a fundus photograph are the central and peripheral retina, the optic disc, and the macula. Figure 1 shows the fundus camera used to perform fundus photography, which produces the retinal fundus images shown in Figure 2. The ophthalmologist can keep track of the eyes' health in the early stages of diabetic retinopathy using comprehensive eye exams such as fundus photography analysis. Treatment should be started as soon as possible if the condition has progressed to the point where it is affecting the patient's vision. To avoid surgery, the patient must take steps to control their diabetes, blood pressure, and cholesterol.
To grade the severity of Diabetic Retinopathy in the retinal images, expert ophthalmologists have clinically examined the images and classified them into 5 categories. The categories of severity are Proliferative Diabetic Retinopathy, Acute Diabetic Retinopathy, Moderate Diabetic Retinopathy, Mild Diabetic Retinopathy, and no Diabetic Retinopathy. This problem, which requires accurate classification of retinal fundus photographs by the severity of Diabetic Retinopathy, can be tackled using Deep Learning techniques. Deep Learning allows the machine to learn characteristics and trends from the enhanced retinal fundus photographs through repeated training. This technique can also identify characteristics that are not visible to the naked eye. The trained Deep Learning models can then be used to classify retinal images by the severity of Diabetic Retinopathy. Datasets for this task may contain retinal images captured using techniques such as optical coherence tomography and retinal fundus photography; the dataset used in this research contains images captured using retinal fundus photography. Each record in the dataset contains the retinal image of one eye and its corresponding severity grading. It must be noted that the degree of severity of Diabetic Retinopathy may vary between the two eyes of the same person.

RELATED WORK
Deep convolutional neural network-based early automated detection of diabetic retinopathy using fundus image uses data augmentation to enlarge the dataset. Here the detection of diabetic retinopathy is achieved by training a convolutional neural network using the retinal images from the enlarged dataset. This model contains a pooling layer after every convolutional layer and uses different filters. It is followed by a fully connected layer that is activated by softmax. This network is optimized using backpropagation and stochastic gradient descent (Xu et al., 2017). Simple methods for the lesion detection and severity grading of diabetic retinopathy by image processing and transfer learning use image preprocessing to enhance the retinal images (Sugeno et al., 2021). The image transformations used in image preprocessing are resizing and grayscale conversion. The model uses transfer learning based on EfficientNet-B3 to achieve better efficiency in Diabetic Retinopathy severity classification. The model is also able to identify simple lesions in the retina, which are tumours in the retina that can be treated with radiation therapy.
Ensemble Deep Learning for Diabetic Retinopathy Detection Using Optical Coherence Tomography Angiography uses OCT retinal images to train a Deep Neural Network that performs blood vessel segmentation. This is done to improve feature extraction. Three different Convolutional Neural Network architectures are trained using Transfer Learning based on VGG19, ResNet50, and DenseNet. These are followed by a dense layer that provides the Diabetic Retinopathy classification (Heisler et al., 2020). Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes uses a Convolutional Neural Network trained using 112,648 retinal images for Diabetic Retinopathy detection and 71,896 images for Glaucoma detection (Ting et al., 2017). The dataset is divided by population groups from different parts of the world, and the model is trained using the different sets of retinal images. The performance of the model, when trained using retinal images of different population groups, is analyzed.
Automated Identification of Diabetic Retinopathy Using Deep Learning uses extensive image preprocessing to enhance the image (Gargeya & Leng, 2017). The image preprocessing is done using resizing and other label-preserving image transformations, to improve feature extraction. The model uses two convolutional blocks followed by a visualization layer. The visualization layer outputs a heatmap that highlights the relevant image features that impact Diabetic Retinopathy classification. The final softmax layer outputs the Diabetic Retinopathy classification. A lightweight CNN for Diabetic Retinopathy classification from fundus images uses a convolutional neural network for feature extraction. This Convolutional neural network contains six convolutional layers and two fully connected layers. This network is used to reduce the input retinal images into features that are then used to train machine learning classifiers like Support Vector Machine, Random Forest, etc., to obtain a classification for Diabetic Retinopathy (Gayathri et al., 2020).
Diabetic retinopathy detection using red lesion localization and convolutional neural networks applies grayscale transformation and other image transformations during preprocessing to make the retinal lesions more visible (Zago et al., 2020). A convolutional neural network model built using Transfer Learning based on VGG16 is then trained using the enhanced images. This trained model can detect and localize retinal lesions and perform Diabetic Retinopathy detection. Deep learning for diabetic retinopathy detection and classification based on fundus images: A review covers image preprocessing for image enhancement. The techniques used for image preprocessing are resizing, denoising, and grayscale transformation. The same dataset is used to train different machine learning models. These models use Transfer Learning based on InceptionV3, VGGNet, GoogLeNet and other ensemble-based approaches, and their accuracies are compared (Tsiknakis et al., 2021). Identification of suitable fundus images using automated quality assessment methods looks into various image preprocessing techniques that make important anatomical structures in the retinal images more visible. Machine learning based algorithms then grade the preprocessed images based on the visibility of vital features of the retinal image (Sevik et al., 2014).
Application of random forest methods to diabetic retinopathy classification analysis uses a random forest classifier and a logistic regression classifier. The impact of sample size on classifier performance was studied, and it was established that random forest-based methods have better accuracy than logistic regression based methods (Casanova et al., 2014). Tear fluid proteomics multi markers for diabetic retinopathy screening proposes a novel method for diabetic retinopathy detection based on biomarker changes in tear fluid. The amount of patient data available for training the model is limited due to the nature of the data. Various Machine Learning algorithms, namely Support Vector Machine, Recursive Partitioning, Random Forest, Naive Bayes, Logistic Regression, and K-Nearest Neighbor, were trained using the data, and the Recursive Partitioning based model turned out to be the most accurate at 65% (Torok et al., 2013).
Automated detection of diabetic retinopathy on digital fundus images implements an automated screening system to detect features that point to Diabetic Retinopathy. The colour retinal images are preprocessed to make biological features like the optic disc, blood vessels and fovea more visible. The computer-based algorithms were able to identify features of the retina like hard exudates and haemorrhages that lead to Diabetic Retinopathy with an accuracy of 77.5% (Sinthanayothin et al., 2002). Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs implements deep learning based methods to detect diabetic retinopathy and macular edema in retinal images. A Convolutional Neural Network was trained using the preprocessed data and gave an accuracy of 80.5% (Gulshan et al., 2016).
Convolutional Neural Networks for Diabetic Retinopathy implements the diagnosis of diabetic retinopathy using colour fundus images by training a convolutional neural network. Data Augmentation is used to increase the size of the training data and image preprocessing is done using label preserving transformations like colour normalization and resizing all images to a standard size. The model was able to implement diabetic retinopathy detection with an accuracy of 75% (Pratt et al., 2016). Multi-categorical deep learning neural network to classify retinal images: A pilot study employing a small database, implements deep learning based models to detect multiple retinal diseases including glaucoma, diabetic retinopathy and macular degeneration. After Data augmentation and image preprocessing, transfer learning based Convolutional neural networks are trained for multi-disease detection. The transfer learning is implemented based on VGG 19, the pre-trained model, followed by max-pooling and fully connected layers. An accuracy of 72.8% was achieved for multi-disease classification by the trained model (Choi et al., 2017).
Automated analysis of retinal images for detection of referable diabetic retinopathy implements detection of referable diabetic retinopathy. The model examines each pixel of the image and the surrounding pixels to detect multiple lesions or retinal structures including the size, shape, location and type of such structures. Based on the number and type of these retinal structures, the final output of the Diabetic Retinopathy index is obtained, a dimensionless quantity between 0 and 1, which expresses the likelihood of Diabetic Retinopathy (Abràmoff et al., 2013). According to the literature review, substantial image transformations are performed as part of image preprocessing to get improved feature extraction. Contrast enhancement, scaling, denoising, and grayscale transformation are some of the methods used to achieve this. To shorten training time and improve accuracy and efficiency, Convolution Neural Networks that use Transfer Learning based on ResNet50, VGGNet, InceptionV3, and EfficientNet are employed. Data augmentation is done to enlarge the dataset due to a shortage of substantial publicly available retinal image datasets. Label-preserving image transformations of retinal images, using methods like rotation, flipping, shearing, and translation, are used to perform data augmentation.
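As a rough illustration of what label-preserving augmentation means, the sketch below applies flipping, rotation, and translation to a tiny image stored as a list of pixel rows. It is not taken from any of the reviewed papers; the function names, the toy image, and the label value are assumptions made for illustration, and a real pipeline would typically rely on an image library and also cover shearing.

```python
# Label-preserving augmentation on a toy image (a list of pixel rows).
# Each transform rearranges pixels but leaves the severity label untouched.

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def translate_right(image, shift, fill=0):
    """Shift every row `shift` pixels to the right, padding with `fill`."""
    width = len(image[0])
    return [([fill] * shift + row)[:width] for row in image]

def augment(image, label):
    """Return the original sample plus transformed copies, all with the same label."""
    variants = [
        image,
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
        translate_right(image, 1),
    ]
    return [(variant, label) for variant in variants]

# Hypothetical 2 x 3 image with an assumed severity label of 2.
sample = [[1, 2, 3],
          [4, 5, 6]]
augmented = augment(sample, label=2)
print(len(augmented), "samples, labels:", {label for _, label in augmented})
```

Every variant keeps the label of the original image, which is what allows the augmented copies to enlarge the training set without new expert grading.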
A computer-based system can considerably reduce the arduous manual work needed in diagnosing large volumes of retinal pictures in remote areas where there are a limited number of medical personnel. The models that are trained with a dataset that has not undergone data augmentation perform worse than models trained with data augmentation (Xu et al., 2017). The reason for this is that data augmentation involves label-preserving image transformations like resizing, rotation, flipping, etc. It is to be noted that the retinal images in the dataset have different lighting conditions and are captured from different angles. Thus data augmentation helps the network deal with small rotation or translation differences between the images of the dataset. The training process as well as the model's performance can be harmed by the poor image quality of the dataset. The images taken using fundus retinal photography vary in lighting conditions and image quality. At an early stage, there are subtle symptoms of retinopathy. On a low contrast or hazy image, the characteristics showing the early stage of Diabetic Retinopathy can be easily concealed (Tsiknakis et al., 2021). This problem can be tackled by rejecting poor quality and hazy images from the dataset.
Many datasets are made up of high-quality retinal fundus photographs that were taken in controlled conditions with very expensive equipment. The models trained using these high-quality datasets perform poorly in practical conditions where images may have different lighting conditions and may be captured using less sophisticated equipment (Tsiknakis et al., 2021). This is tackled by using datasets that contain retinal fundus images taken in different lighting conditions and with varied camera hardware. The dataset used in this research, which is captured from patients of Aravind Eye Hospital, has images captured in varied lighting conditions and contains noise and variations that reflect practical conditions. Deep Learning models trained using such datasets will perform better when deployed in clinical settings where the retinal images may not be of the highest quality. Deep Learning-based algorithms have the potential to improve and speed up the grading of Diabetic Retinopathy severity. However, to allow Deep Learning model deployment in clinical settings, several fundamental restrictions must be addressed. Apart from improving the accuracy and efficiency of the models, there are major restrictions barring the acceptance of Deep Learning models through the regulatory processes (Tsiknakis et al., 2021). Transparency in how such models work is a critical factor in their acceptability and integration into clinical practice. The clinical operator must comprehend the model's decision-making process, which should ideally include explanations for its Diabetic Retinopathy severity prediction. Figure 3 represents the proposed architecture of the deep learning model that is trained using retinal fundus images to perform diabetic retinopathy severity grading. The input for the model is 5,590 retinal fundus images that are obtained using various fundus cameras.
The input dataset contains images of various lighting conditions and brightness as established during the exploratory and statistical data analysis. The images are thus transformed using label preserving transformations to improve the visibility of the various retinal features during data preprocessing.

PROPOSED ARCHITECTURE
A pre-trained model, namely ResNet 50, is used to implement Transfer Learning, so that the model can be trained with comparatively less labelled data and still yield better accuracy and efficiency. The ResNet 50 base is followed by a Global Average Pooling Layer that averages all the feature map values. A Dropout Layer follows the global average pooling layer and randomly drops or omits some neurons in hidden or visible layers. This layer regularizes the neural network, thus preventing overfitting. The Dense Layers, activated by ReLU and Softmax, produce probabilities between 0 and 1 corresponding to the 5 different severity gradings for Diabetic Retinopathy. This model is trained using the preprocessed dataset, and its performance is evaluated using various evaluation metrics, namely Accuracy, Precision, Recall, Loss Curve, Confusion Matrix and F1 score.

Input
The dataset contains retinal images taken with Fundus Photography, each of which was analyzed by an expert ophthalmologist and rated according to the severity of Diabetic Retinopathy. After expert examination, a degree of severity was assigned to each of the 5,590 retinal images in the dataset. These images have varied lighting conditions and brightness, which need to be improved using image processing techniques. Diabetic retinopathy is caused by elevated blood sugar. The retina, which detects light and delivers information to the brain via a nerve in the back of the eye (the optic nerve), can be damaged by too much sugar in the blood. The severity of diabetic retinopathy is classified into 5 categories, namely Proliferative Diabetic Retinopathy, Acute Diabetic Retinopathy, Moderate Diabetic Retinopathy, Mild Diabetic Retinopathy, and the Normal Eye. The severity of Diabetic Retinopathy depends on the obstruction of the macular region of the fundus image, due to elevated blood sugar levels. Figures 4, 5

Exploratory and Statistical Data Analysis
The dataset contains images of different sizes, which will need resizing during image preprocessing. The differences in image size are visible in the sample images of various degrees of severity. The number of retinal fundus images at each level of severity is also visualized: Figure 9 shows the count plot for the distribution of fundus images in the dataset across the different classes of Diabetic Retinopathy severity. The chart shows that the dataset is unbalanced, with normal retinal images outnumbering those of the varying degrees of diabetic retinopathy severity. To compensate for the dataset's unbalanced nature, data augmentation and evaluation metrics other than accuracy can be used.
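One way to reason about how much augmentation each class would need is to compare its count against the largest class. The sketch below is a minimal illustration under assumed data: the label list is invented and does not reflect the real class distribution, the encoding of severity as integers 0 to 4 is an assumption, and `augmentation_factors` is a hypothetical helper rather than part of this research.

```python
from collections import Counter
import math

def augmentation_factors(labels):
    """For each class, the number of copies per image (original included)
    needed to roughly match the size of the largest class."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {cls: math.ceil(largest / n) for cls, n in counts.items()}

# Invented labels for illustration only: 0 = no DR ... 4 = proliferative DR.
labels = [0] * 60 + [1] * 10 + [2] * 20 + [3] * 6 + [4] * 4

counts = Counter(labels)
factors = augmentation_factors(labels)
for cls in sorted(counts):
    print(f"class {cls}: {counts[cls]} images, about x{factors[cls]} with augmentation")
```

A factor of 1 means the class is already the largest; larger factors indicate the minority classes that would benefit most from augmented copies.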

Data Preprocessing
The difference in image sizes in the dataset is handled during image preprocessing. The preprocessing is implemented using label-preserving transformations on the images. The steps taken as part of preprocessing include loading the image, conversion to grayscale, resizing the image and adding weights.
The images are loaded and the differences in brightness and contrast between the images are noted. Figure 10 represents the loaded retinal images. The loaded images are converted to grayscale, as the important features of the images are more visible without all three colour channels. The difference in the size of the images is noted and further preprocessing is done to resolve it. Figure 11 represents the retinal images converted to grayscale. The grayscale images are resized to a standard size of 512 pixels by 512 pixels so that all images of the dataset are of a uniform size. The grayscale images need to be further processed to make the features around the macular region more visible. Figure 12 represents the retinal images standardized to the uniform size of 512 pixels by 512 pixels.
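The grayscale conversion and resizing steps can be sketched in a few lines. This is an illustration only, not the code used in this research: the luminance weights are the common ITU-R BT.601 values, nearest-neighbour sampling is an assumed interpolation choice, and the 2 by 2 input and 4 by 4 target are toy sizes standing in for real fundus images and the standard size.

```python
def to_grayscale(rgb_image):
    """Collapse (R, G, B) pixels to one luminance value (ITU-R BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def resize_nearest(image, new_height, new_width):
    """Resize by nearest-neighbour sampling (an assumed interpolation choice)."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]

# Toy 2 x 2 colour image: red, green, blue and white pixels.
rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]

gray = to_grayscale(rgb)
resized = resize_nearest(gray, 4, 4)
for row in resized:
    print(row)
```

Both steps leave the severity label unchanged, which is why they count as label-preserving transformations.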
A Gaussian filter is used to blur, or smooth, the input fundus images. The Gaussian filter functions similarly to the average filter, except that it uses a different kernel: neighbouring pixels are weighted by a Gaussian function of their distance from the centre instead of equally. This preprocessing results in images where the macular region is much more visible, with a clear distinction between the macular region and other structures that lead to Diabetic Retinopathy. Figure 13 represents the preprocessed retinal images which had the Gaussian filter applied to them to improve feature visibility.
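To make the comparison with the average filter concrete, the sketch below builds both kernels and applies them to a toy image. The kernel size, the value of sigma, and the helper names are assumptions chosen for illustration; the source does not state the parameters used in this research.

```python
import math

def mean_kernel(size):
    """Average filter: every neighbour gets the same weight."""
    weight = 1.0 / (size * size)
    return [[weight] * size for _ in range(size)]

def gaussian_kernel(size, sigma):
    """Gaussian filter: weights fall off with distance from the centre
    and are normalised so that they sum to 1."""
    centre = size // 2
    raw = [
        [math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]

def apply_kernel(image, kernel):
    """Filter interior pixels; border pixels are left unchanged for brevity."""
    size = len(kernel)
    half = size // 2
    output = [row[:] for row in image]
    for y in range(half, len(image) - half):
        for x in range(half, len(image[0]) - half):
            output[y][x] = sum(
                kernel[j][i] * image[y + j - half][x + i - half]
                for j in range(size)
                for i in range(size)
            )
    return output

# Toy image: a single bright pixel on a dark background.
image = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
mean_result = apply_kernel(image, mean_kernel(3))
gauss_result = apply_kernel(image, gaussian_kernel(3, sigma=1.0))
print("mean filter centre:", mean_result[1][1])
print("gaussian filter centre:", gauss_result[1][1])
```

Because the Gaussian kernel concentrates weight at the centre, it smooths the image while retaining more of the bright pixel than the average filter does.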

Model Building and Training
This research aims to enhance the images in the dataset, using label-preserving image transformations, so that image features become more visible. The preprocessed dataset is used to train a Deep Learning model whose accuracy and efficiency are then analyzed. Transfer learning is used to reduce training time and improve accuracy. In Transfer Learning, a pre-trained model is reused on a new problem. This approach has the advantage of being less computationally expensive. It is also useful when there is not enough labelled data to train a Convolutional Neural Network from the ground up: with comparatively little labelled data, transfer learning can still be employed to train an accurate model. The model used to implement Transfer Learning in this research is ResNet 50. Figure 14 represents the different layers of the model, which include ResNet 50, the Global Average Pooling layer, the Dense Layer, and the final output. The ResNet 50 model is made up of five stages, each comprising a convolutional and identity block. There are three convolution layers in each convolution block, and three convolution layers in each identity block. It can learn complex functions and identify features in the input. The architecture also has over 25 million parameters, which gives it substantial capacity. As more layers are added to a Deep Convolutional Neural Network to extract features from the images, the vanishing gradient problem arises. The residual connections in ResNet 50 mitigate this problem and enable the training of very deep convolutional models. They also improve the training speed greatly. The model used in this research consists of ResNet50 followed by a Global Average Pooling layer, a dropout layer, a Dense Layer activated by ReLU, and another Dense Layer activated by softmax that gives the final output, the severity grading for Diabetic Retinopathy. Global Average Pooling is a pooling technique used to substitute fully connected layers in Convolutional Neural Networks. Like a Fully Connected Layer, it performs a linear transformation of the vectorized feature maps; the two differ in the transformation matrix. In global average pooling, all values of each feature map are averaged. As there is no parameter to optimize, this layer helps to reduce overfitting in the model.
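The averaging performed by global average pooling can be shown on a toy example. The sketch below assumes a channels-first layout and two small feature maps purely for readability; in the actual model the input would be the output of the ResNet 50 base, whose final stage produces 2048 feature maps.

```python
def global_average_pooling(feature_maps):
    """Reduce each H x W feature map to its mean, giving one value per channel."""
    pooled = []
    for feature_map in feature_maps:
        values = [value for row in feature_map for value in row]
        pooled.append(sum(values) / len(values))
    return pooled

# Two toy 2 x 2 feature maps (channels-first layout, chosen for readability).
feature_maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
pooled = global_average_pooling(feature_maps)
print(pooled)  # [4.0, 1.0]
```

The function has no weights at all, which is the sense in which the layer adds no parameters that could overfit.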
The dropout layer is used as a regularization technique to prevent overfitting in the model. The dropout technique involves randomly dropping or omitting some neurons in hidden or visible layers. Experiments show that this dropout strategy regularizes the neural network model, resulting in a robust model that is resistant to overfitting. Each neuron in the Dense layer receives input from all neurons of its previous layer. The Dense Layer performs matrix-vector multiplication, with the values in the matrix being parameters that can be trained and updated using backpropagation. The first Dense layer is activated by ReLU, followed by another dense layer that is activated by softmax. The final layer produces probabilities between 0 and 1 corresponding to the 5 different severity gradings for Diabetic Retinopathy. The model is trained in 2 phases: the top layers are first trained for 2 epochs at a higher learning rate, and the model is then fine-tuned for Diabetic Retinopathy severity prediction by training for 20 epochs at a reduced learning rate. Diabetic retinopathy is a dangerous complication that causes gradual retinal degeneration and, in severe cases, blindness. It is critical to recognize and assign a severity grading for the retinal fundus image, so that further treatment can be done to prevent negative health outcomes.
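The layers that follow the pooled features can be sketched as a plain forward pass. This is a minimal illustration, not the trained model: the weights are random, the feature vector and layer widths are tiny stand-ins, and the dropout rate of 0.5 is an assumption because the source does not state the value used.

```python
import math
import random

def dropout(values, rate, training, rng):
    """Inverted dropout: during training, zero each value with probability
    `rate` and rescale the survivors; at inference, pass values through."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def dense(values, weights, biases):
    """Matrix-vector product plus bias; `weights` has one row per output neuron."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(logits):
    """Numerically stable softmax: outputs lie in [0, 1] and sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, hidden, output, rate=0.5, training=False, rng=None):
    """Dropout -> Dense (ReLU) -> Dense (softmax) over pooled features."""
    rng = rng or random.Random(0)
    x = dropout(features, rate, training, rng)
    x = relu(dense(x, *hidden))
    return softmax(dense(x, *output))

# Random toy parameters: 4 input features, 3 hidden neurons, 5 severity classes.
rng = random.Random(42)
features = [0.5, 1.2, 0.0, 2.0]
hidden = ([[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)], [0.0] * 3)
output = ([[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)], [0.0] * 5)

probabilities = classify(features, hidden, output)
predicted_grade = probabilities.index(max(probabilities))
print([round(p, 3) for p in probabilities], "-> class", predicted_grade)
```

The predicted severity grade is simply the index of the largest softmax probability; dropout is only active when `training=True`, so inference is deterministic.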

Evaluation Metrics
The model trained for Diabetic Retinopathy severity grading is evaluated using the accuracy metric. Accuracy is defined as the percentage of predictions that are correct. It is calculated as the number of correct predictions (true positives and true negatives) divided by the total number of predictions (true positives, true negatives, false positives and false negatives). Equation 1 shows the Accuracy,

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

The F1 score is the harmonic mean of a class's precision and recall, and is an overall assessment of the quality of a classifier's predictions. Because it captures both precision and recall, it is frequently the metric of choice. Equation 4 shows the F1 score,

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$
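The definitions above translate directly into code. The sketch below uses the standard formulas for precision and recall, which the F1 score depends on; the counts in the example are invented for illustration and are not results from this research.

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Share of positive predictions that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Share of actual positives that are found."""
    return tp / (tp + fn) if tp + fn else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0

# Invented counts for one class, for illustration only.
tp, tn, fp, fn = 40, 45, 5, 10
print("accuracy :", accuracy(tp, tn, fp, fn))
print("precision:", round(precision(tp, fp), 4))
print("recall   :", recall(tp, fn))
print("F1       :", round(f1_score(tp, fp, fn), 4))
```

On an unbalanced dataset, accuracy can stay high while recall for a rare class is poor, which is why the F1 score is reported alongside it.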

Experimental Setup
Google Colab and Jupyter Notebook are the software used in this research. Jupyter runs locally and is used for light computations, while Colab is used to run most of the heavier computations. By default, the Colab platform offers a Linux operating system with 12GB RAM and an Nvidia Tesla T4 GPU with 16GB

Results and Discussion
The dataset of 5,590 retinal fundus images was preprocessed to make the image features more visible. The images were converted to grayscale, as the features vital for diabetic retinopathy severity grading were visible in the grayscale versions of the images. These images were then resized to a standard size of 512 pixels by 512 pixels and further preprocessed using Gaussian filters that make the image features very distinct. The model consisted of the ResNet50 base model followed by a Global Average Pooling layer, a dropout layer, a Dense Layer activated by ReLU and another Dense Layer activated by softmax, and it was trained using the preprocessed dataset of retinal images. The training of the model was visualized using the loss graph and accuracy graph. Figures 15 and 16 represent the loss curve and accuracy curve for the trained model. The consistent downward trend of the loss function indicates that the model was trained correctly. Furthermore, the model obtained an accuracy score of 76% for Diabetic Retinopathy severity grading. The accuracy score of the model is the percentage of diabetic retinopathy classification predictions that are correct. The confusion matrix was also visualized to give a visual representation of the performance of the trained Deep Neural Network; Figure 17 shows the confusion matrix for the model. The confusion matrix gives a summary of the performance of the model used for Diabetic Retinopathy severity prediction. It details all the possible combinations of actual and predicted values, and suggests that the model is viable for diabetic retinopathy severity prediction. The classification report is also visualized to display precision, recall and F1 scores for the model. Table 4.1 represents the classification report for the trained model.
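For readers unfamiliar with how a multi-class confusion matrix is read, the sketch below builds one from label lists and derives per-class precision and recall from its rows and columns. The labels are invented toy data, not the predictions of the model in this research, and the integer encoding of the five severity classes is an assumption.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[actual][predicted] counts how often each combination occurs."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

def per_class_report(matrix):
    """Precision (column-wise) and recall (row-wise) for every class."""
    report = {}
    for cls in range(len(matrix)):
        tp = matrix[cls][cls]
        predicted_as_cls = sum(row[cls] for row in matrix)
        actually_cls = sum(matrix[cls])
        report[cls] = {
            "precision": tp / predicted_as_cls if predicted_as_cls else 0.0,
            "recall": tp / actually_cls if actually_cls else 0.0,
        }
    return report

def overall_accuracy(matrix):
    """Diagonal entries are the correct predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Invented toy labels for five severity classes (0 to 4).
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 3, 4]
y_pred = [0, 0, 1, 1, 2, 2, 2, 2, 3, 3]

matrix = confusion_matrix(y_true, y_pred, num_classes=5)
report = per_class_report(matrix)
for row in matrix:
    print(row)
print("accuracy:", overall_accuracy(matrix))
```

Off-diagonal entries show which severity grades are confused with each other, information that a single accuracy figure hides.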

CONCLUSION AND FUTURE WORK
In this research, a Deep Neural Network was trained using Transfer Learning based on ResNet50 to implement Diabetic Retinopathy Severity Grading. The model was trained using 5,590 retinal fundus images, which needed to be preprocessed before they could be used for training. This was done using label-preserving image transformations like resizing, grayscale conversion, and Gaussian filtering, which make the features in the images more visible. The preprocessed dataset was used to train the model, and the performance of the model was analyzed using the accuracy score, loss graph and confusion matrix.
Future work for this research could use transfer learning based on other pre-trained models like EfficientNet, InceptionV3, etc. These models can be trained using larger datasets containing more images so that better accuracy and efficiency can be attained. The model in this research achieved an accuracy score of 76% for Diabetic Retinopathy severity grading. Deep Learning algorithms have the potential to improve and speed up Diabetic Retinopathy severity grading. However, several fundamental constraints must be overcome before Deep Learning models can be used in clinical settings. Apart from enhancing model accuracy and efficiency, there are significant barriers to Deep Learning models being accepted through regulatory processes (Tsiknakis et al., 2021). Transparency in how these models work is crucial to their acceptance and integration into clinical practice. The clinical operator must understand the model's decision-making process, which should ideally offer reasons for its Diabetic Retinopathy severity prediction.