COVID-CLNet: COVID-19 Detection With Compressive Deep Learning Approach

The COVID-19 pandemic is one of the most serious global health threats, and increasing diagnostic capability helps significantly to stop its spread. To assist radiologists and other medical professionals in detecting and identifying COVID-19 cases in the shortest possible time, the authors propose a computer-aided detection (CADe) system that uses computed tomography (CT) scan images. The proposed boosted deep learning network (CLNet) implements deep learning (DL) networks as a complement to compressive learning (CL). The authors apply their inception feature extraction technique in the measurement domain, using CL to represent the data features in a new space of lower dimensionality before they reach the convolutional neural network. All original features contribute equally to the new space through a sensing matrix. Experiments performed with different compression methods show promising results for COVID-19 detection.


INTRODUCTION
Coronavirus Disease 2019 (COVID-19) was first identified in Wuhan, Hubei Province, China, in December 2019. It is a contagious respiratory illness caused by infection with a new coronavirus (called SARS-CoV-2), and it affects different people in different ways. The Centers for Disease Control and Prevention (CDC) is closely monitoring the spread of cases caused by this disease. At the time of writing, and according to the World Health Organization (WHO), there have been more than 60 million confirmed cases globally and more than 1 million deaths. Current tests are mostly based on reverse transcription polymerase chain reaction (RT-PCR), which looks for bits of the virus's genetic material in the patient's blood or sputum sample. This testing may not be sensitive enough to detect COVID-19 in every person with the infection. In addition, during the peak of the COVID-19 outbreak, RT-PCR test kits were in short supply (Yang et al., 2020). To overcome the limitations of RT-PCR, imaging techniques such as chest X-ray (CXR) and CT scans can be used to examine patients with COVID-19 (Rubin et al., 2020; Cohen et al., 2020). In this study, the examination used to identify COVID-19 is chest CT, which is recommended as the primary screening or diagnostic method. Chest CTs are fast and relatively easy to perform and undergo. They have also been shown to be more sensitive to COVID-19 infection and to detect positive cases better than CXR (Benmalek et al., 2021; Borakati et al., 2020). Therefore, CAD systems are recommended for detecting the earliest signs of ground-glass nodules in thoracic CT that are caused by this disease, which medical professionals may miss at early stages. In Fig. 1, image A shows that COVID-19 causes multiple peripheral ground-glass opacities in the lung that do not spare the subpleural regions, while image B shows progressive pulmonary opacities after 3 days (J. Lei et al., 2020).
The main motivation of this research is to help accelerate the diagnostic process and stop this widespread pandemic. Therefore, we introduce a CAD system that applies advanced deep learning-based radiology image analysis methods as a complement to compressive learning (CL), which is based on a weighting strategy that uses different sensing matrices. This CAD system could outperform many state-of-the-art methods.

Deep Learning
Many deep learning (DL) methods have been applied to diagnose COVID-19 from radiological images. Many of those approaches have explored the use of CT images extensively and have shown promising COVID-19 detection accuracy (X. Xu et al., 2020; O. Gozes et al., 2020; S. Wang et al., 2020; H. Ko et al., 2020). Like machine learning approaches in general, DL approaches can be categorized as supervised, semi-supervised, or unsupervised learning. One more category, called reinforcement learning or deep reinforcement learning, is often considered a special case of semi-supervised or sometimes unsupervised learning (M. Z. Alom et al., 2018).

Deep Unsupervised Learning
Unsupervised learning is a type of learning in which the model works on its own to discover hidden patterns within the input data. Clustering, dimensionality reduction, and generative techniques are often considered unsupervised learning approaches; examples include Autoencoders (AE), Restricted Boltzmann Machines (RBM), and the more recently developed Generative Adversarial Networks (GAN).

Deep Semi-Supervised Learning
Semi-supervised learning provides a powerful framework for leveraging unlabeled data when labels are limited: it combines the limited number of labels with a large amount of unlabeled data (a partially labeled dataset) to construct a model or classifier. It sits between supervised and unsupervised learning. In some cases, Deep Reinforcement Learning (DRL) and GAN are used as semi-supervised learning techniques.

Deep Supervised Learning
Supervised learning is a learning technique that uses labeled data to infer the relationship between the observed data and a predetermined dependent variable. In supervised DL approaches, the training data consist of a set of inputs and corresponding outputs, and the goal is to learn a general rule that maps inputs to outputs. There are different supervised learning approaches for deep learning, including Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), the latter including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU).

Deep Reinforcement Learning (DRL)
DRL is a learning technique for unknown environments, in which a system interacts with a dynamic environment to achieve a certain goal. In this case, there is a straightforward loss function that controls the convergence to the optimal action-value function. The system receives feedback in the form of rewards and punishments as it navigates its problem space. DRL is an appropriate choice if the problem has many parameters to optimize.

Compressive Sensing (CS)
Compressive Sensing (CS) is a powerful signal processing technique that has provided fast and efficient data acquisition in many applications. It is based on the assumption that the data have a sparse representation in some basis (E. J. Candes et al., 2006; D. L. Donoho et al., 2006; K. Awedat et al., 2017; K. Awedat et al., 2017). Sparse signals can be recovered with high accuracy after projecting (sensing) the data into the measurement domain. The sensing is carried out with a sensing matrix, which should satisfy incoherence and the restricted isometry property (RIP) (E. J. Candes and T. Tao, 2005). Most CS work has focused on providing theories for reconstructing the original sparse data (A. Draganic et al., 2017; G. Pope, 2009). Mathematically, a signal $x \in \mathbb{R}^N$ is called sparse if it contains only a small number of non-zero elements compared with its dimension, that is, $s = \|x\|_0 \ll N$ (where $\|x\|_0$ is the number of non-zero entries). Owing to this sparsity, $x$ can be represented in a new domain $y \in \mathbb{R}^M$, where $M < N$, by the linear transformation

$$y = \Phi x \qquad (1)$$

where $\Phi \in \mathbb{R}^{M \times N}$ is the sensing (measurement) matrix. The decoding process focuses on recovering the sparse signal $x$ from a given measurement $y$. For this purpose, the following optimization problem must be solved:

$$\min_x \|x\|_0 \quad \text{subject to} \quad \Phi x = y \qquad (2)$$

However, this problem is NP-hard (Knuth, Donald, 1974). Instead, the reconstruction can be done using L1-minimization:

$$\min_x \|x\|_1 \quad \text{subject to} \quad \Phi x = y \qquad (3)$$

Generally, CS can be optimized in the coding procedure by implementing different coding matrices (Y. Arjoune et al., 2018), or by using different optimization methods to reconstruct the original signals (Knuth, Donald, 1974). While reconstructing the sparse signal is the main objective of CS, the compressed signal can be very useful in applications that detect certain patterns or features for classification (R. Calderbank and S. Jafarpour, 2012; J. Wright et al., 2008). Moreover, in some scenarios related to information privacy, reconstruction is undesirable.
Therefore, (R. Calderbank and S. Jafarpour, 2012; M. A. Davenport et al., 2007; M. A. Davenport et al., 2010) have proposed Compressive Learning (CL), in which the system is built on the compressed measurements without the reconstruction step. Because CL combines all features of the data to reduce the dimensionality, the compressed measurements can still be used for learning tasks. In other words, since all the original features are involved in the projected domain, the new low-dimensional projected features can be used to distinguish the original pattern or class (K. Awedat et al., 2020).
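The measurement step and a sparse recovery can be illustrated with a short sketch. It is not the authors' code: the sizes N = 100, M = 40, and s = 3, the random seed, and the helper name `omp` are assumptions chosen for illustration. Recovery here uses greedy orthogonal matching pursuit (OMP) rather than the L1-minimization program described above, because OMP needs only NumPy.

```python
import numpy as np

def omp(phi, y, sparsity):
    """Greedy orthogonal matching pursuit: estimate a sparse x with phi @ x ~ y."""
    n = phi.shape[1]
    support = []
    residual = y.copy()
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(phi.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of y on the selected columns only.
        coeffs, *_ = np.linalg.lstsq(phi[:, support], y, rcond=None)
        residual = y - phi[:, support] @ coeffs
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat

rng = np.random.default_rng(0)   # assumed seed
N, M, s = 100, 40, 3             # assumed sizes, with M < N

# Build an s-sparse signal with random support and random non-zero values.
x = np.zeros(N)
x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)

phi = rng.standard_normal((M, N)) / np.sqrt(M)  # Gaussian sensing matrix
y = phi @ x                                     # measurement y = phi x

x_hat = omp(phi, y, s)
print("non-zeros in x:", np.count_nonzero(x))
print("measurement length:", y.shape[0])
print("residual norm:", np.linalg.norm(phi @ x_hat - y))
```

In compressive learning, the measurement `y` would be passed to the classifier directly and the recovery step would be skipped.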
This study is inspired by (K. Awedat et al., 2020) and is based on the observation that the new low-dimensional projected features obtained by CL are a rich source of information to pass through advanced deep learning methods. We use CL to apply our inception feature extraction technique in the measurement domain, representing the data features in a new space of lower dimensionality before they reach the deep learning network. The novel scientific contributions of this paper are summarized as follows:
• It introduces a computer-aided detection (CAD) system based on the boosted deep learning network (CLNet), which uses a compressive learning based deep learning approach.
• The proposed CLNet technique has been evaluated on raw CT images without any preprocessing and has shown signs of high performance.
The rest of this paper is organized as follows. Related work is covered in Section 2. The proposed approach is presented in Section 3. Section 4 describes the experimental setup and the data. The results, a detailed discussion, and the limitations of the work are presented in Section 5. Section 6 concludes the work and suggests future work.

RELATED WORK
Efforts to develop deep learning techniques to diagnose COVID-19 have increased gradually since the outbreak. The importance of early detection and management of COVID-19 patients is illustrated by the detailed studies in (W. Yang and F. Yan, 2020; M. Z. Alom et al., 2020). Some literature reviews demonstrate that the multiple peripheral ground-glass opacities in the lung caused by COVID-19 appear clearly on CT images, while they sometimes do not appear on chest X-ray (CXR) at all (Yang et al., 2020; M.-Y. Ng et al., 2020). Owing to the superior ability of deep learning in image classification, several Artificial Intelligence (AI) systems have been proposed for COVID-19 detection based on medical imaging. J. Civit-Masot et al. (2020) and A. Narin et al. (2020) proposed convolutional neural networks (CNN) for detecting coronavirus pneumonia in patients using chest X-ray radiographs. The output of the learning method is a classification among Pneumonia, COVID-19, or Healthy. The data need to be preprocessed and calibrated to reduce the variation in the image histograms. CT and CXR images were compared with deep learning models for COVID-19 diagnosis in (Benmalek et al., 2021; Borakati et al., 2020). The results showed that imaging techniques were faster than the RT-PCR method. In addition, CT scanning demonstrated excellent sensitivity and should strongly be considered in the initial assessment of COVID-19. While some literature reviews show that DL methods using CT images have achieved promising COVID-19 detection accuracy, DL-based approaches have also been used extensively on CXR images and have provided high performance (Ozturk et al., 2020). Song et al. (2021) proposed a method for accurate identification of COVID-19 in human samples. However, it involved many preprocessing steps, such as lung segmentation, extraction of the main lung regions, filling the blank areas of the segmentation with the lung itself, and aggregation of the image predictions. On the other hand, C. Butt et al. (2019) applied 2D and 3D deep learning to classify CT samples with COVID-19. X. Yang et al. (2020) used influenza viral pneumonia cases and no-infection cases to build a database of COVID-19 based on CT images, where the diagnosis method was based on concatenating both lung masks and lesion masks. A. Jaiswal et al. (2020) proposed DenseNet201-based deep transfer learning (DTL) to classify patients as COVID-infected. Many other studies have proposed diagnosing coronavirus using deep learning (X. Xu et al., 2020; S. Wang et al., 2020; H. Ko et al., 2020; T. Ozturk et al., 2020). Across the studies presented in the literature, the learning framework has been applied to medical images with different preprocessing approaches. In most of the proposed methods, the images need some preprocessing, and the preprocessing steps depend on the image features required for classification. In many cases, the proposed method was designed for a particular dataset. Our study takes advantage of CL, in which images can be represented in new domains with a small number of features; combining these features should help to improve the diagnostic accuracy.

PROPOSED METHOD
To investigate the appearance of coronavirus (COVID-19) on CT images, we use our inception feature extraction network based on compressive learning (CL) to represent the data features in a new space of lower dimensionality before they reach the advanced deep learning network. The end-to-end training pipeline of the proposed CLNet is shown in Fig. 2.

Feature extraction
In this stage, the features required to distinguish accurately between infected and non-infected images are extracted. Our method is based on the principle of CL, in which all features of the input image are preserved in a low-dimensional representation. Since the compression is done with a sensing matrix, we claim that different sensing matrices hold the original features with different weights. Fig. 3 shows an example of applying different measurement matrices to an image.
It is evident that each matrix represents the image differently in the new domain. The authors of (K. Awedat et al., 2020) addressed this issue and showed that classifier performance varies with the sensing matrix and the compression ratio. In our technique, we go further and compose the features selected for classification from three different sensing matrices. The input to the classifier contains three channels, and each channel is a representation of the image under one sensing matrix Φ. Each compressed image occupies one channel, and the three channels are concatenated to form the compressed image under Φ1, Φ2, and Φ3. These matrices could be any combination of sensing matrices that have been proposed or applied for compressive sensing. In effect, the raw data are compressed directly and forwarded as features for classification. In this study, we applied a Gaussian matrix, a Circulant matrix, and a Toeplitz matrix. This selection is not unique; it serves only to confirm the effectiveness of our technique.
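The three-channel feature construction can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not say how a sensing matrix is applied to a 2-D image, so the two-sided projection Y = Φ X Φᵀ with Φ of size 64 × 120 is an assumption, as are the random entries, the scaling, the seed, and all function names.

```python
import numpy as np

def gaussian_matrix(m, n, rng):
    """m x n matrix with independent Gaussian entries."""
    return rng.standard_normal((m, n)) / np.sqrt(m)

def circulant_matrix(m, n, rng):
    """First m rows of an n x n circulant matrix: row i is row 0 shifted right by i."""
    first_row = rng.standard_normal(n) / np.sqrt(m)
    i, j = np.indices((m, n))
    return first_row[(j - i) % n]

def toeplitz_matrix(m, n, rng):
    """m x n Toeplitz matrix: entry (i, j) depends only on i - j."""
    diagonal_values = rng.standard_normal(m + n - 1) / np.sqrt(m)
    i, j = np.indices((m, n))
    return diagonal_values[i - j + n - 1]

def compress(image, phi):
    """Assumed two-sided projection of an n x n image down to m x m."""
    return phi @ image @ phi.T

def three_channel_features(image, matrices):
    """Stack one compressed image per sensing matrix along the channel axis."""
    return np.stack([compress(image, phi) for phi in matrices], axis=-1)

rng = np.random.default_rng(0)   # assumed seed
image = rng.random((120, 120))   # stand-in for a resized grayscale CT slice
matrices = [make(64, 120, rng)
            for make in (gaussian_matrix, circulant_matrix, toeplitz_matrix)]
features = three_channel_features(image, matrices)
print(features.shape)  # (64, 64, 3)
```

Passing the same constructor three times would produce the same-matrix variants, so the choice of matrices per channel stays a free parameter of the sketch.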

Data Classification
Once the features of the input CT images are extracted, the next step in our method is classification, which is carried out by adjusting the CNN structure of the well-known deep learning network LeNet (Y. LeCun et al., 1998). The LeNet (or LeNet-5) architecture is made up of 7 layers: 3 convolutional layers, 2 subsampling layers, and 2 fully connected layers. Generally, it uses two significant types of layer block, a convolutional encoder block and a dense block. The basic units in the convolutional block are a convolution layer, an activation function, and a subsequent average pooling operation. The convolutional layers identify the spatial patterns in the image, while the pooling layers reduce the dimensionality. Several versions of LeNet have been developed successfully for different applications; in this work, LeNet has been modified for two-class (binary) classification, as can be seen in Fig. 2. The input images are processed to 64 × 64 × 3 to maintain the prominent image features. The convolutional kernel size is set to 3 × 3 with valid padding, and max pooling with a kernel size of 2 × 2 is used to reduce the size of the convolved features. The output is then passed to the dense block, which contains two fully connected layers that reduce the representation from 128 to 64 neurons. The number of images fed to the input layer at a time is an adjustable parameter called the batch size. In our experiment, the batch size is set to 40, meaning that 40 images are used at each training step; we observed that this helps to save machine memory while still detecting the prominent image features. Additionally, there is no need to resize the classifier input images, since they are already compressed to the required size by the sensing matrix.
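To make the effect of 3 × 3 valid convolutions and 2 × 2 max pooling on a 64 × 64 input concrete, the following sketch traces the spatial size through a stack of conv + pool blocks. The number of blocks (two) and a stride of 1 for the convolutions are assumptions for illustration; the actual layer settings are those of Table 1, and the function names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Output side length of a stride-1 convolution with valid padding."""
    return size - kernel + 1

def max_pool(size, window=2):
    """Output side length of non-overlapping pooling (stride equal to the window)."""
    return size // window

def trace_shapes(size, n_blocks):
    """Side length of the feature map after each conv + pool block."""
    sizes = [size]
    for _ in range(n_blocks):
        size = max_pool(conv_valid(size))
        sizes.append(size)
    return sizes

# Two blocks is an assumed depth, used only to show the arithmetic.
print(trace_shapes(64, n_blocks=2))  # [64, 31, 14]
```

Each block first trims two pixels from every side length (64 to 62, 31 to 29) and then halves it, which is why the feature maps shrink quickly before the dense block.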

EXPERIMENTAL SETUP
The proposed CADe system is developed using deep learning based on compressive learning models to classify the raw data without any kind of preprocessing. We design the network based on a modified LeNet. Table 1 shows the main parameters of the network layers. The hyperparameters are epochs = 40 and patch = 10, with the Adam optimizer. The implementation was carried out in the Python programming language on a machine with 24 Intel(R) Xeon(R) CPU E5-4607 0 @ 2.20GHz, 377G memory, and two Quadro P2000.

Dataset (CT Images)
The COVID-CT dataset used in this study is publicly available (X. Yang et al., 2020). It contains 349 COVID-19 images collected from 216 patients and 397 non-COVID-19 samples. The images were collected from four sources:
• MedPix website: A free online Medical Image Database with over 59,000 indexed and curated images, from over 12,000 patients. Some papers contain CT images.
Fig. 4 shows some positive and negative samples of the CT images. The collected CT images have different sizes: the heights range from 153 to 1853 and the widths from 124 to 1485. Because the resolution varies so widely, the first step was to resize all images to one scale. To make sure that all images are included, the minimum size (153 × 124) needs to be considered. For simplicity, all input images have been resized to 120 × 120, which should be an efficient and accurate size for compressing images with a single sensing matrix. The main advantage of our proposed technique is that it does not require much image preprocessing. Even though the dataset contains very low-resolution images, the combination of compressed features is enough for classification. Because the proposed approach uses more than one sensing matrix, all images are represented in grayscale. The three sensing matrices are then set to the same size, which produces compressed output images with a final size of 64 × 64 (all images are scaled with a compression ratio of CR = 54%). In addition, our technique is flexible: the image size and compression ratio can be selected arbitrarily.

Data Augmentation (DA)
Since the dataset is considerably small, we applied data augmentation (DA) to add more samples. The augmented data represent a more comprehensive set of possible data points. The DA approach is built on the assumption that more information can be extracted from the original dataset through augmentations (Elgendi et al., 2021). In this work, we consider geometric augmentation, which uses simple image transformations such as rotation, flipping, zooming, and padding (Shorten et al., 2019). We applied DA to the training-set features after compressing the images and concatenating the three channels.
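A minimal sketch of geometric augmentation applied to one 64 × 64 × 3 feature array is given below. The specific transformations, the 90-degree rotation angle, the 2x zoom factor, the 4-pixel padding, and the function name `augment` are all assumptions for illustration; the text names only the general families (rotation, flipping, zooming, padding).

```python
import numpy as np

def augment(features):
    """Return geometric variants of one H x W x C array (H == W, divisible by 4)."""
    h, w, _ = features.shape
    variants = {
        "flip_lr": features[:, ::-1, :],                # horizontal flip
        "flip_ud": features[::-1, :, :],                # vertical flip
        "rot90": np.rot90(features, k=1, axes=(0, 1)),  # 90-degree rotation
    }
    # Zoom: crop the central half and upsample 2x by pixel repetition.
    crop = features[h // 4: h // 4 + h // 2, w // 4: w // 4 + w // 2, :]
    variants["zoom"] = crop.repeat(2, axis=0).repeat(2, axis=1)
    # Padding: add a zero border, then crop back to H x W from the top-left
    # corner, which shifts the content down and to the right.
    pad = 4
    padded = np.pad(features, ((pad, pad), (pad, pad), (0, 0)))
    variants["pad_shift"] = padded[:h, :w, :]
    return variants

rng = np.random.default_rng(0)     # assumed seed
sample = rng.random((64, 64, 3))   # stand-in for one three-channel CL feature array
augmented = augment(sample)
print({name: arr.shape for name, arr in augmented.items()})
```

Every variant keeps the 64 × 64 × 3 shape, so augmented samples can be fed to the classifier without further resizing.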

RESULTS AND DISCUSSION
After the model was built, and to avoid any bias, the dataset was randomly split into two independent parts for training and testing. K-fold cross validation was then applied to obtain several results from the raw dataset. In this way, each sample can appear in both the training set and the testing set across folds. We divided the testing images equally between the two categories at each fold, and the positive and negative COVID images were randomly mixed. Fig. 6 displays how the testing and training sets are selected.
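The fold construction described above, with each test fold holding the same number of images from each class, can be sketched as follows. The fold count, the per-class test size, the seed, and the function name `balanced_folds` are assumptions for illustration; only the class sizes (349 and 397) come from the dataset description.

```python
import random

def balanced_folds(pos_ids, neg_ids, k, per_class, seed=0):
    """Yield (train, test) id lists; each test fold holds per_class ids of each class."""
    rng = random.Random(seed)
    pos, neg = list(pos_ids), list(neg_ids)
    rng.shuffle(pos)
    rng.shuffle(neg)
    if k * per_class > min(len(pos), len(neg)):
        raise ValueError("not enough samples for disjoint balanced test folds")
    for fold in range(k):
        lo, hi = fold * per_class, (fold + 1) * per_class
        test = pos[lo:hi] + neg[lo:hi]
        held_out = set(test)
        # Everything outside the current test fold is used for training.
        train = [i for i in pos + neg if i not in held_out]
        rng.shuffle(test)  # mix positive and negative test images
        yield train, test

positives = range(0, 349)           # 349 COVID-19 images
negatives = range(349, 349 + 397)   # 397 non-COVID-19 images
folds = list(balanced_folds(positives, negatives, k=5, per_class=20))
print(len(folds), len(folds[0][1]), len(folds[0][0]))  # 5 40 706
```

Because the test folds are disjoint, an image that is tested in one fold is part of the training set in every other fold.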

Results
All trained models are evaluated using accuracy and validation loss (val-loss). We first test the CL technique for classification. We applied three sensing matrices (Gaussian, Circulant, and Toeplitz) to compress the images to a size of 64 × 64, and then made a quick comparison with the original images, to which no compressive sensing is applied (No CS). For the classifier, the original images were resized to the same size as the compressed images. The CT image dataset was randomly split into two independent parts, with 80% for training and 20% for testing. The quantitative results, based on k-fold cross validation with 5 different k values (k = 1 − 5), show around 86.08% testing accuracy on completely separate testing samples. Table 2 compares the experimental results: applying CL improves the classification accuracy for all three matrices compared with the case in which no compressive sensing is applied.
The experimental results also show that classifier performance improves by a margin of at least about 15%, which means that the evaluation metrics are affected by the compressive sensing method. The Circulant matrix outperforms the other matrices by a small margin of around 1.5%. Note that the input features to the classifier differ across the three compression sources.
In addition, having confirmed that CL can improve classifier performance, we investigated extensively the combination of these three methods for classification. As shown in Fig. 2, the three compressed features from every image are concatenated into one three-channel input, and every channel has a size of 64 × 64. First, we investigate all concatenation options to identify which one provides the best performance. The confusion matrix is used to assess classifier performance: the sensitivity (Sen), specificity (Speci), positive predictive value (PPV), and negative predictive value (NPV) have been calculated for all data combinations. Table 3 shows the average accuracy and val_loss as performance measures for the classifier. Then, in the second part, we compare our results with other methods. Tables 3 and 4 and Fig. 7 show the experimental results for two testing sets. In the first set, we held out 6.5% of the data samples for testing and used the rest for training; in the second set, we used 10% for testing and the rest for training.
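For reference, the four confusion-matrix measures named above, together with accuracy, follow directly from the counts of true and false positives and negatives. The sketch below uses the standard definitions; the function name and the example counts are made up for illustration and are not results from this study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives that are detected
        "specificity": tn / (tn + fp),  # share of actual negatives that are rejected
        "ppv": tp / (tp + fp),          # share of positive predictions that are correct
        "npv": tn / (tn + fn),          # share of negative predictions that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Made-up counts, for illustration only.
metrics = confusion_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A high PPV means that few of the scans flagged as COVID-19 are false alarms, which is the property emphasized when the channel combinations are compared.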
In general, the quantitative analysis over all concatenation options shows promising results, with higher performance than the single-channel method even when the three channels come from the same sensing matrix (TTT, GGG, and CCC for Toeplitz, Gaussian, and Circulant respectively). All possible combinations of these three channels can be seen in Tables 3 and 4. Overall, the best performance is achieved when the three channels are all different: the average accuracies are 91.98% ± 2.77 and 91.96% ± 2.09 for the 6.5% and 10% testing sets respectively. Importantly, PPV is significantly higher when the three channels are different, which is of particular clinical importance for the diagnosis of COVID-19. The main reason is that the features selected after representing the images in low dimensions are reinforced by the combination of the three different channels. We also observed that the accuracy and validation loss improve as the number of selected features increases.
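The set of channel combinations can be enumerated with a few lines of code. The sketch assumes that the order of the channels does not matter, so that, for example, GCT and TCG count as the same option; whether Tables 3 and 4 list the options this way is not stated in the text, and the names used here are hypothetical.

```python
from itertools import combinations_with_replacement

MATRICES = "GCT"  # Gaussian, Circulant, Toeplitz

def channel_combinations():
    """All unordered choices of three channels, repeats allowed."""
    return ["".join(c) for c in combinations_with_replacement(MATRICES, 3)]

combos = channel_combinations()
all_different = [c for c in combos if len(set(c)) == 3]
print(len(combos), combos)
print(all_different)
```

Under this assumption there are ten options, of which three use a single matrix and one uses all three matrices.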

Discussion
To evaluate our proposed method, we compared it with DenseNet-169 (G. Huang et al., 2017), trained under the different pretraining methods described by X. Yang et al. (2020), namely random initialization, transfer learning (TL), and TL with contrastive self-supervised learning (CSSL). More details can be found in (X. Yang et al., 2020). To avoid any bias, we selected the same number of CT images for testing and applied 16-fold cross validation. To confirm that our method is flexible and effective, we applied two different compression ratios (r/n), 54% and 83% (sensing matrix sizes of 64 × 64 and 100 × 100). The proposed method shows average testing accuracies of 91.98% and 94.48%, whereas the best of the comparison methods, TL-CSSL, shows 89.1% testing accuracy. Thus, our COVID-CLNet based detection model shows around 2.88% higher testing accuracy than the comparison methods. Moreover, while all these methods need some preprocessing steps such as lesion segmentation and lung masks, our proposed approach does not need any kind of preprocessing. The main observations from these results are shown in Table 5.
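As a worked check of the figures quoted above, assume that n in the ratio r/n is the 120-pixel side length of the resized input images and that r is the side length of the compressed image (this reading of r/n is an assumption):

```latex
\mathrm{CR}_{64} = \frac{64}{120} \approx 0.533, \qquad
\mathrm{CR}_{100} = \frac{100}{120} \approx 0.833, \qquad
91.98\% - 89.1\% = 2.88\%
```

Under this reading the two ratios are close to the quoted 54% and 83%, and the last difference is the margin over TL-CSSL for the 64 × 64 setting.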

Limitations
Even though our proposed CLNet has shown signs of high performance on raw CT images without any preprocessing, the main challenge of this work is access to large datasets. Currently, most COVID-19 datasets are limited because of the nature of the disease, patient privacy, and the need for radiologists or other medical professionals to label the data. In our view, data augmentation could be an option to improve system performance and avoid overfitting. Our technique offers more than one way to expand the dataset, since augmentation can be performed either on the original dataset images or at the CL feature level.

CONCLUSION
The proposed CAD system for COVID-19 detection could be a useful and inexpensive tool to assist radiologists and other medical professionals in detecting and identifying COVID-19 cases at early infection stages and in a very short time: about 0.288 seconds for a 64 × 64 image, excluding any preprocessing time. Our improved deep learning network model based on compressive learning (COVID-CLNet) is applied directly to computed tomography (CT) images without any kind of preprocessing. The observed results show very promising detection performance, with 91.98% testing accuracy. Combining CL with different deep learning networks is one suggestion for future work.

DISCLOSURES
The authors declare that there are no conflicts of interest, financial or otherwise.

Figure 1. Unenhanced CT images. According to (J. Lei et al., 2020), image A shows multiple ground-glass opacities in bilateral lungs, while image B, obtained at follow-up 3 days later, shows progressive ground-glass opacities in the posterior segment of the right upper lobe and the apical posterior segment of the left superior lobe.

Figure 2. Block diagram of the proposed COVID-CLNet method for COVID-19 detection

Figure 3. Feature extraction of an input image using three different measurement matrices, where Φ_G is the Gaussian matrix, Φ_C is the Circulant matrix, and Φ_T is the Toeplitz matrix. The compression ratio is 30%.

• LUNA website: The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI). The database contains 888 CT scans.
• PubMed Central (PMC) website: A free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM).
• Radiopaedia website: A free full-text archive of biomedical and life sciences journal literature.
Fig. 5 shows a simple example of sensing feature combinations and five transformations.

Figure 4. CT positive and negative samples from the dataset for COVID-19 diagnosis

Figure 6. K-fold cross validation for the input CT images. Every fold contains the same number of images from each class.

Figure 7. The validation loss comparison for different combination methods