Novel Hybrid Genetic Arithmetic Optimization for Feature Selection and Classification of Pulmonary Disease Images

The difficulty in predicting early cancer is due to the lack of early illness indicators. Metaheuristic approaches are a family of algorithms that seek optimal values for uncertain problems, with many applications in optimization and classification. An automated system for recognizing illnesses can respond with accuracy, efficiency, and speed, helping medical professionals spot abnormalities and lowering death rates. This study proposes the Novel Hybrid GAO (Genetic Arithmetic Optimization algorithm-based feature selection) method as a way to choose features for several machine learning algorithms that classify readily available data on COVID-19 and lung cancer. By choosing only important features, feature selection approaches can improve performance. The proposed approach combines a Genetic Algorithm with Arithmetic Optimization to enhance the outcomes of the optimization process.

Keywords: Arithmetic Optimization, CNN Features, Computed Tomography, Genetic Algorithm, Metaheuristic Approaches, Novel Hybrid GAO


INTRODUCTION
The most common cancer that claims lives in both men and women is lung cancer. According to American Cancer Society statistics, there are 220,000 new cases each year, 160,000 people die from the disease, and 15% of people across all stages of the disease survive for 5 years. However, the localized stage has a 5-year survival rate of roughly 50%. In the localized stage, the cancer has not spread beyond the lung, for example to the lymph nodes (Aboamer et al., 2019; Ajeil et al., 2020a; Habibifar et al., 2019). The specific kind of tumor, as well as additional factors such as prognosis and general health, all have an impact on the 5-year survival rate. The main determinant of lung cancer survival is early recognition. Symptoms do not manifest before lung cancer spreads to other parts of the body. Lung cancer is detected using various techniques, including microarray data analysis, sputum analysis, Computed Tomography (CT) scans, and chest radiography. Widespread chest CT screening is a promising technique for lung cancer identification.
A recent infectious disease called COVID-19 has been circulating all over the world (Abbas et al., 2022; Abdelmalek et al., 2018). The pandemic has had many impacts, including economic loss, disruption of communication and information systems, social distancing, and quarantine procedures. Quarantine is an example of a confinement policy used to maintain a safe distance between people. The isolation step involves the treatment of people with suspected symptoms so that they can return to normal conditions; governments maintain medical facilities during this phase.
Alongside the extensive use of Machine Learning and Deep Learning approaches, various feature selection methods have been employed in statistics and pattern recognition for many years (Waleed et al., 2022; Wen et al., 2022; Aboamer et al., 2014b; Acharyulu et al., 2021; Ahmadian et al., 2021; Ajeil et al., 2020b; Hamida et al., 2022a; Azar, 2020a,b). Feature selection techniques became necessary whenever an excessive amount of data needed to be processed quickly (Azar et al., 2023d). These techniques were utilized to increase classifier accuracy, decrease dimensionality, remove superfluous and unrelated data, and more. They also aided in enhancing data comprehension and reducing the time required to run learning algorithms.
Deep learning techniques make visible very small elements in images that would otherwise go unnoticed. "Convolutional Neural Networks (CNNs)" are the method of choice among researchers for classification tasks in medical imaging because of their prowess in deep feature extraction and learning. CNNs are useful for detecting the features that discriminate different objects from each other (Aboamer et al., 2014a; Ali et al., 2022b). However, CNNs are very sensitive to their hyperparameters, which complicates applications that demand high learning capacity and large amounts of data. Moreover, the amount of data must be considered, since neural networks have high complexity and require much time to process a dataset. These factors can make it challenging for practitioners to manually adjust the hyperparameters so that they are optimized effectively.
A heuristic is a method designed to solve a problem more quickly when conventional methods are inefficient (Ajeil et al., 2020a; Al-Qassar et al., 2021a; Amara et al., 2019; Elkholy et al., 2020a; Azar & Banu, 2022). A meta-heuristic algorithm is a black-box optimizer that is given a collection of problem variables, along with some restrictions in the form of constraints. The optimizer changes these variables through an updating procedure until it finds the objective function's optimal value. The result is a close-to-optimal solution at which the objective function attains its maximum or minimum value. The objective is to find the best answers in a fair amount of time with the least computational complexity.
Combinations of Genetic Algorithm and Arithmetic Optimization Algorithm-based feature selection approaches are used in this work to increase the efficiency of machine learning algorithms for lung cancer and COVID classification. The Genetic Algorithm (GA) and the Arithmetic Optimization Algorithm (AOA) are two methods that can be combined to solve optimization problems: AOA focuses on performing mathematical operations to optimize each solution, whereas GA uses population-based search and genetic operators such as mutation and crossover. In this hybrid approach, a population of individuals representing potential solutions is formed using the genetic algorithm as the fundamental framework. The hybrid GA with the AOA technique can be a potent method for resolving challenging optimization problems, using the advantages of both methods to produce superior results.
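To make the division of labor concrete, the hybrid loop can be sketched for binary feature selection. This is a minimal illustrative sketch, not the paper's exact procedure: the toy correlation-based fitness, the sigmoid re-binarization of AOA's continuous update, and all parameter values (`alpha`, `mu`, the MOA schedule) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(mask, X, y):
    """Toy fitness: mean |correlation| of the selected columns with the label,
    minus a small penalty per selected feature. In the real pipeline this
    would be a classifier's validation accuracy."""
    if mask.sum() == 0:
        return -1.0
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in np.where(mask)[0]])
    return corr.mean() - 0.01 * mask.sum()

def hybrid_gao(X, y, pop_size=20, n_iter=30, alpha=5, mu=0.5):
    """GA provides the population framework; AOA's arithmetic operators
    (x, /, +, -) refine each individual toward the current best."""
    n_feat = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat))
    for t in range(1, n_iter + 1):
        fit = np.array([fitness(ind, X, y) for ind in pop])
        best = pop[fit.argmax()].copy()
        # --- GA phase: random-pair crossover and bit-flip mutation, with elitism
        new_pop = [best]
        while len(new_pop) < pop_size:
            a, b = pop[rng.choice(pop_size, 2, replace=False)]
            child = np.where(rng.random(n_feat) < 0.5, a, b)   # uniform crossover
            child ^= (rng.random(n_feat) < 1.0 / n_feat).astype(child.dtype)
            new_pop.append(child)
        pop = np.array(new_pop)
        # --- AOA phase: arithmetic moves toward `best`, re-binarized by sigmoid
        moa = 0.2 + t * (0.9 - 0.2) / n_iter          # Math Optimizer Accelerated
        mop = 1.0 - (t / n_iter) ** (1.0 / alpha)     # Math Optimizer Probability
        cont = pop.astype(float)
        for i in range(pop_size):
            r1, r2 = rng.random(2)
            if r1 > moa:   # exploration: division / multiplication operators
                step = best / (mop + 1e-9) if r2 < 0.5 else best * mop
            else:          # exploitation: subtraction / addition operators
                step = best - mop if r2 < 0.5 else best + mop
            cont[i] = 0.5 * cont[i] + 0.5 * mu * step
        pop = (1.0 / (1.0 + np.exp(-cont)) > rng.random(cont.shape)).astype(int)
        pop[0] = best                                  # keep the elite bit-mask
    fit = np.array([fitness(ind, X, y) for ind in pop])
    return pop[fit.argmax()]

# Demo on synthetic data where the label depends on the first two features
X_demo = rng.normal(size=(80, 8))
y_demo = (X_demo[:, 0] - X_demo[:, 1] > 0).astype(float)
mask = hybrid_gao(X_demo, y_demo, pop_size=12, n_iter=15)
print(mask)  # binary mask over the 8 candidate features
```

The design point the sketch illustrates is the one made above: GA's crossover and mutation keep the population diverse, while the AOA step pulls every individual toward the incumbent best with operators whose aggressiveness is scheduled by MOA and MOP.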

Problem Formulation and Motivation
Image processing approaches that include Deep Learning aid in the detection of cancer (Chowdhuri et al., 2014a; Inbarani et al., 2015a; Mjahed et al., 2020; Kumar et al., 2018). The hybrid technique combining the Genetic Algorithm (GA) and the Arithmetic Optimization Algorithm (AOA) seeks to address the optimization of complicated, high-dimensional problems. Finding the ideal set of characteristics or variables to maximize or minimize a specific objective function is a common task in these problems. However, it can be difficult to effectively examine their enormous search spaces.
1. Accuracy and diagnostic errors in medical applications: Increase classification accuracy and lower diagnostic errors.
2. Avoiding local minima and overfitting: In integrated optimization designs, avoid overfitting, escape local optima, and strike a balance between exploitation and exploration.
3. Optimize parameters: Obtain global optimality and steer clear of suboptimal solutions when solving optimization problems.
4. Scalability for complex issues: Develop scalable optimization strategies to efficiently address high-dimensional and large-scale problems.
5. Time complexity and convergence speed: Improve convergence speed while maintaining the quality of the solutions.

Motivation
1. Develop a hybrid framework: Propose a hybrid framework that combines the Genetic Algorithm (GA) and the Arithmetic Optimization Algorithm (AOA) to enhance optimization performance.
2. Feature selection: Incorporate an approach for choosing the best subset of features, leveraging metaheuristic approaches to efficiently handle the NP-hard task of feature selection.
3. Enhance performance: Improve solution quality, accelerate convergence, and enable effective exploration of the solution space through the integration of GA and AOA.

Novel Hybrid GAO Framework
1. Hybrid Algorithm: The research proposes a hybrid algorithm that combines the "Genetic Algorithm (GA)" and the "Arithmetic Optimization Algorithm (AOA)" to improve the optimization process.
2. Feature Subset Selection: The algorithm incorporates feature subset selection to identify the most relevant features for improving solution quality.
3. Enhanced Solution Quality: The hybrid approach enhances solution quality by leveraging the fine-tuning capabilities of AOA and the exploration capabilities of GA.
4. Avoiding Local Optima: The hybrid algorithm overcomes the issue of getting trapped in local optima by combining GA and AOA, allowing for the exploration of global solutions.
5. Improved Convergence Speed: The hybrid algorithm achieves faster convergence by leveraging AOA's convergence properties and GA's population-based search.
6. Strong Empirical Results: The proposed technique yields the best results on both the COVID dataset and the lung cancer dataset.
The remainder of this paper is organized as follows. Section 2 examines the existing methodologies. Section 3 explains the suggested technique and details its conceptual foundation and design. Section 4 fully explains feature extraction using CNN, followed by the essential methods and important components of the proposed approach. Section 5 explains the subtleties of hyperparameter tuning, shedding light on the fine-tuning process that improves the performance of the proposed model. Section 6 highlights the practical implications of the research, describing the experimental results of applying the recommended methodology. Section 7 summarizes the proposed work and concludes.

RELATED WORK
Artificial Intelligence (AI) stands as the overarching field that encompasses computational intelligence, metaheuristic algorithms, control and robotics, forming a dynamic and interconnected landscape of technological advancement (Ahmed et al., 2023a,b,c, 2022a; Sayed et al., 2023; Sergiyenko et al., 2023; Vaidyanathan et al., 2023, 2019, 2018a,b, 2017a,b, 2015; Kengne et al., 2023a,b; Azar et al., 2023d, 2022a,b, 2020a; Hashim et al., 2023; Hameed et al., 2023; Fekik et al., 2023a,b,c, 2022a,b,c, 2021a; Zhang et al., 2023; Wang et al., 2023; Dendani et al., 2023; Bousbaine et al., 2023; Hasan et al., 2023; Hamida et al., 2023, 2022b; Naoui et al., 2023). Artificial intelligence (AI) is rapidly transforming the field of robotics, enabling robots to perform more complex tasks and operate in more challenging environments. AI techniques such as machine learning, computer vision, and natural language processing are being used to develop robots that can learn, adapt, and make decisions on their own. This is making robots more versatile and reliable, and opening up new possibilities for their use in a wide range of applications, such as manufacturing, healthcare, and customer service.
For example, AI-powered robots are being used in factories to automate tasks such as welding, assembly, and painting. These robots can learn to perform these tasks more efficiently than human workers, and they can operate 24/7 without getting tired. AI-powered robots are also being used in healthcare to perform tasks such as surgery and rehabilitation. These robots can be more precise and gentler than human surgeons, and they can provide personalized care to patients. AI is also playing a major role in the development of new control systems for robots. Traditional control systems are based on pre-programmed instructions, which can be limiting in complex and dynamic environments (Vaidyanathan et al., 2021a,b,c,d,e,f; Sambas et al., 2021a,b,c; Drhorhi et al., 2021; Alimi et al., 2021; Kumar et al., 2021; Bansal et al., 2021; Singh et al., 2021a, 2018, 2017; Gorripotu et al., 2021; Ouannas et al., 2021, 2020a,b,c, 2019, 2017a; Khennaoui et al., 2020). AI-based control systems can learn and adapt to changes in the environment, making them more robust and reliable (Pham et al., 2018; Shukla et al., 2018; Vaidyanathan & Azar, 2016a,b,c,d,e,f,g, 2015a; Azar & Serrano, 2015). This is making it possible to develop robots that can operate in more challenging environments, such as those that are hazardous or unpredictable.
For example, AI-based control systems are being used to develop robots that can navigate through cluttered or unstructured environments (Najm et al., 2021a,b). These robots can learn to avoid obstacles and plan their own paths, making them ideal for applications such as search and rescue. AI-based control systems are also being used to develop robots that can interact with humans in a safe and effective way. These robots can learn to recognize human gestures and respond appropriately, making them suitable for applications such as customer service and education.
In recent years, computer vision research has grown to be a significant area of study in the biomedical field (Anter et al., 2020; Azar et al., 2019b; Chowdhuri et al., 2014b; Inbarani et al., 2014a; Kumar et al., 2014a). X-ray images and Computed Tomography (CT) scan images are the two kinds of images employed in COVID-19 detection studies, and they serve different purposes: X-ray images capture dense tissues in addition to soft tissue and bones, while CT scans capture many details of the body in one image. Some studies use only one of these two image types, while others use both.
A dataset's behavior is determined by its collection of characteristics, and in the case of image datasets, some features are very important (Ananth et al., 2021; Anter et al., 2013; Azar et al., 2018a). To correctly classify an image, it is essential to consider its size, color values, intensity, and presence of distinct shapes. Images have thousands of features because of the enormous number of pixels they contain, which increases computing complexity. It is therefore crucial to reduce the number of features in image data before submitting it to a classification system. The core traits must be kept; hence it is equally crucial to preserve the fundamental patterns of each class throughout this feature reduction procedure. Additionally, the quantity of pixels in an image directly correlates with its quality. Algorithms are essential in the fields of machine learning and deep learning for locating specific patterns or recurring structures within data points, particularly in image datasets (El-Shorbagy et al., 2023; Anter et al., 2014, 2015; Ashfaq et al., 2022a; Aslam et al., 2021; Azar et al., 2019c; Boulmaiz et al., 2022; Cheema et al., 2020). However, the raw data needs to be thoroughly cleaned to remove any extraneous information before beginning the discovery process. Here, the feature extraction stage comes in handy since it draws out key details from the data, including edges, corners, or distinct visual areas. This stage is consistently placed first after preprocessing in the literature. The Convolutional Neural Network (CNN) stands out as the most extensively used approach among the numerous feature extraction techniques because it effectively extracts useful characteristics from images. Using CNN, the algorithm can quickly recognize important patterns and information necessary for accurate classification tasks (Ashfaq et al., 2022b; Azar et al., 2019a; Inbarani et al., 2014c; Soliman et al., 2020).
Chen and Chiang (2023) proposed a method for optimizing the hyperparameters of a CNN model for COVID-19 diagnosis. The method uses a genetic algorithm to search for the best combination of hyperparameters. The model was trained on a dataset of 5,000 CXR images, and it achieved an accuracy of 97.56% in classifying COVID-19, normal, and pneumonia patients.
In Nitha et al. (2023), Transfer learning is used to create the ExtRanFS framework for automated lung cancer malignancy diagnosis. The dataset's CT images have a slice thickness of 1mm and are made up of 80 to 200 slices that were collected from various perspectives and sides. The "DICOM" format is used to store all images. The suggested solution used a pre-trained "VGG16" model based on convolution as the feature extractor and an "Extremely Randomised Tree Classifier" as the feature selector. The "Multi-Layer Perceptron (MLP)" Classifier uses the chosen features to determine if lung cancer is "benign, malignant, or normal". The proposed framework has an "accuracy, sensitivity, and F1-Score of 99.09%, 98.33%, and 98.33%", respectively.
In Prasad et al. (2023), the study's solution to feature selection issues used the Hybrid Spotted Hyena Optimization with the Seagull Algorithm, which successfully produced the ideal subset with the greatest number of pertinent features. For data augmentation, the study used DCGAN, a generative modeling technique. The selected lung characteristics were assessed using a "hybrid CNN-LSTM" that discovered both standard and aberrant structures in biological lung data. The structure's effectiveness is evaluated by its "accuracy, precision, recall, specificity, and sensitivity". The suggested classifier achieved 99.8% sensitivity, 99.3% specificity, 99.14% precision, and 99.6% accuracy on the "LIDC/IDRI" database, and a "sensitivity" of 99.62%, "specificity" of 97.8%, "precision" of 97.5%, and "accuracy" of 99.7% on the chest X-ray dataset.

Deepa and Shakila (2022) proposed a CNN-based model for classifying COVID-19 X-ray images. The model is optimized using a hybrid optimization algorithm that combines the Firefly Algorithm and Particle Swarm Optimization, and achieves an accuracy of 93.33% on a test dataset of 100 images.

Canayaz (2021) used images from three kinds of patients: pneumonia patients, COVID patients, and normal patients. Deep learning models such as "AlexNet", "VGG19", "GoogleNet", and "ResNet" were used to extract features from this data set. Two metaheuristic algorithms, "Binary Particle Swarm Optimization" and "Binary Grey Wolf Optimization", were applied to choose the best possible features in the classification pipeline. These chosen features were categorized using SVM, achieving 99.38% accuracy.

Goel et al. (2021) proposed a CNN model that is optimized using Grey Wolf Optimization (GWO). The model was trained on a dataset of 3,000 CXR images, and it achieved an accuracy of 97.78% in classifying COVID-19, normal, and pneumonia patients.
In Singh et al. (2021b), feature selection for a "binary class classifier" was carried out using "HSGO (Hybrid Social Group Optimization)", and the classifier was trained on "chest X-rays". An SVM (Support Vector Machine) surpassed all other classifiers in tests using these chosen features, obtaining a remarkable accuracy of 99.65%. Multiple classifiers used the relevant features found through feature extraction from "CXR images" to classify the images. Notably, this proposed pipeline outperformed other cutting-edge deep learning methods for both "binary and multi-class classification", with the "Support Vector Classifier" reaching a classification accuracy of 99.65%.
ResNet18 (a CNN) was employed by Chattopadhyay et al. (2021) for feature extraction, and the most pertinent characteristics were then chosen from the extracted features using CGRO. A subset of the crucial features was chosen, and SVM was then applied to carry out the classification. This model was tested on both CT and X-ray images. The studies used the "SARS-COV-2 dataset", the "Chest X-Ray dataset", and the "CT dataset" to produce outcomes with accuracy levels of 98.65%, 99.44%, and 99.31%, respectively.
A "Convolutional Neural Network (CNN)" approach for identifying COVID-19 patients based on chest X-ray images was introduced by Shukla et al. (2021). They used "GoogLeNet", a pre-trained model with some of its final CNN layers altered, to implement transfer learning, and 20-fold cross-validation was proposed to address overfitting. The hyperparameters of the suggested COVID-19 detection model for chest X-ray images were also fine-tuned using a "multi-objective genetic algorithm". The model reached strong testing and training accuracies of 98.3827% and 94.9383%, respectively.

Yousri et al. (2021) used "discrete and Gabor wavelet transformations", which resulted in the computation of the "Grey Level Co-occurrence Matrix (GLCM)". An enhanced "Cuckoo Search optimization method (CS)" replaces the "Levy flight with four separate heavy-tailed distributions" to enhance performance on the COVID-19 multiclass classification optimization job. "18 UCI data sets" were used as the initial series of tests to validate the suggested FO-CS variants, and two COVID-19 X-ray image data sets were considered for the second series of tests. The findings of the suggested approach were contrasted with those of reputable optimization algorithms, reaching 84.67% on dataset 1 and 98.95% on dataset 2.
In the Iraji et al. (2021) study, a hybrid method based on "Deep Convolutional Neural Networks", powerful tools for image classification, is presented. Deep CNNs were utilized to extract feature vectors from the images, and the "binary differential metaheuristic algorithm" was then applied to choose the most advantageous features. These refined features were then fed to the SVM classifier. A repository of 1092 X-ray samples from three categories ("COVID-19", "pneumonia", and a "healthy category") was used in the investigation. With "accuracy, sensitivity, and specificity" reaching 99.43%, 99.16%, and 99.57%, respectively, the suggested technique performed quite well. Notably, the results showed that the suggested strategy outperformed existing studies on "COVID-19 recognition using X-ray imaging".

Lakshmanaprabu et al. (2019) proposed "Linear Discriminate Analysis (LDA)" and an "Optimal Deep Neural Network (ODNN)" to analyze CT scan lung images. LDA is used to lower the dimensionality of the deep features retrieved from "CT lung images" before classifying lung nodules as either benign or malignant. To classify lung cancer, the ODNN is applied to CT images and then optimized using the "Modified Gravitational Search Algorithm (MGSA)". According to comparison data, the proposed classification has a "sensitivity of 96.2%", a "specificity of 94.2%", and an "accuracy of 94.56%".

Pradhan et al. (2023) proposed a Convolutional Neural Network (CNN) to determine whether a chest X-ray (CXR) image exhibits pneumonia (Normal) or COVID-19 disease. Furthermore, to improve the CNN classifier's performance, a nature-inspired optimization approach known as the Hill-Climbing Algorithm based CNN (CNN-HCA) model has been presented to improve the CNN model's parameters. Shan & Rezaei (2021) proposed an automatic and optimized computer-aided detection system for lung cancer.
The preprocessing step of the procedure involves normalizing and denoising the input images. After that, lung region segmentation is carried out using mathematical morphology and Kapur entropy maximization. The segmented images are then used to obtain 19 GLCM features for the final analyses.
To reduce system complexity, higher-priority images are then chosen. This feature selection is centered on a novel optimization method called "Improved Thermal Exchange Optimization (ITEO)" and aims to increase accuracy and convergence. The images are then categorized into cancerous or healthy instances using an optimized "Artificial Neural Network".
An enhanced method for early lung cancer diagnosis utilizing image processing, deep learning, and metaheuristics was suggested in recent work by Lu et al. (2021). They used the marine predators algorithm to improve classification and network accuracy. The approach was assessed on the "RIDER dataset" and contrasted with several pre-trained deep networks, such as "CNN ResNet-18", "GoogLeNet", "AlexNet", and "VGG-19". The outcomes clearly showed that the proposed strategy performed better than the compared approaches, highlighting its superiority in lung cancer diagnosis.

PROPOSED METHODOLOGY
In this study, a COVID and Lung Cancer classification model is proposed that encompasses four essential phases: "Preprocessing", "Feature Extraction", "Feature Selection", and "Classification". The approach is designed to leverage the capabilities of Computed Tomography (CT) imaging and Deep Learning (DL) features for accurate and effective classification. The initial phase involves preprocessing the lung CT images to enhance their quality and reduce noise, preparing them for subsequent analysis. In the next step, feature extraction is performed to capture relevant information from the images, aiming to extract discriminative features that are indicative of lung cancer and COVID. To address the challenge of feature selection, a novel hybrid algorithm called GAO is introduced. By combining the strengths of the "Genetic Algorithm (GA)" and the "Arithmetic Optimization Algorithm (AOA)", the selection of informative features is enhanced, leading to improved classification performance. Experiments show that the suggested methodology accurately diagnoses COVID and lung cancer. By automatically extracting relevant attributes and harnessing the power of deep learning techniques and meta-heuristics, the proposed approach offers a promising solution for the precise and efficient classification of lung cancer, contributing to improved cancer evaluation and treatment decision-making.

Preprocessing
Preprocessing is a critical stage in preparing medical images, including those related to lung cancer and COVID-19, for analysis and classification. Among the various preprocessing techniques available, the median filter is commonly employed to diminish distortion and improve image quality (Ben Abdallah et al., 2018; Elfouly et al., 2021; Elshazly et al., 2013a; Kumar et al., 2015a). The median filter is a nonlinear filtering method that replaces each pixel value with the median of its neighboring pixels. It is particularly effective in mitigating salt-and-pepper noise, a common occurrence in medical images. Lung cancer and COVID-19 images often suffer from different types of noise, such as random noise and artifact noise. The median filter considerably decreases noise by substituting noisy pixel values with the neighborhood's median value, producing smoother images with improved visual clarity (Bouakrif et al., 2019; ElBedwehy et al., 2014; Inbarani et al., 2018; Kumar et al., 2017; Sundaram et al., 2021; Zhu & Azar, 2015). This technique effectively removes noise while preserving crucial image structures, ultimately improving the overall image quality and facilitating subsequent analysis and interpretation. Table 1 depicts the "Peak Signal to Noise Ratio (PSNR)", "Structural Similarity Index Method (SSIM)", "Mean Square Error (MSE)", and "Feature Similarity Index Matrix (FSIM)" for the lung cancer and COVID datasets.
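As a concrete illustration, the median filter can be written in a few lines of NumPy. This is a minimal sketch (the 3 x 3 window and reflect padding are illustrative choices); in practice a library routine such as `scipy.ndimage.median_filter` would normally be used:

```python
import numpy as np

def median_filter(img, k=3):
    """k x k median filter for a 2-D grayscale image; reflect-padding keeps
    the output the same shape as the input."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# Salt-and-pepper demo: one impulse pixel in an otherwise flat region
img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                 # "salt" noise
clean = median_filter(img)
print(int(clean[2, 2]))         # 100: the impulse is replaced by the median
```

Note how the flat 100-valued neighborhood is untouched while the isolated 255 impulse is removed entirely, which is exactly the behavior that makes the median filter preferable to mean filtering for salt-and-pepper noise.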

Dataset 1: Lung Cancer
For the analysis of the presented work, a set of CT scan images from an Iranian hospital is employed, with a particular emphasis on patients with lung cancer. This dataset includes images not only of lung cancer cases but also of individuals with COVID-19 and non-cancerous lung conditions. The images are divided into two classes: those with cancerous conditions, specifically lung cancer, and those with non-cancerous conditions.

Dataset 2: COVID-19
The "COVID-CT" dataset is a valuable collection of Computed Tomography (CT) scan images specifically associated with COVID-19. This dataset was created by researchers at the "University of California, San Diego (UCSD)" to support advancements in COVID-19 research, particularly in the fields of "Deep Learning" and medical image analysis. It contains a total of 746 CT scan images, with 349 cases representing patients diagnosed with COVID-19 and 397 cases representing individuals without COVID-19 (https://github.com/UCSD-AI4H/COVID-CT) (Nivetha et al., 2021).

FEATURE EXTRACTION USING CONVOLUTIONAL NEURAL NETWORK
CNN, a modification of the "Multi-Layer Perceptron (MLP)", offers significant advantages in pattern recognition due to its ability to reduce data dimensionality, sequentially extract features, and perform classification (Azar et al., 2020h; Aziz et al., 2013a; Barakat et al., 2020; Eid et al., 2013; Hassanien et al., 2014a). The inspiration for the basic architecture of CNN can be traced back to the visual cortex model proposed by Hubel and Wiesel in 1962. In 1980, Fukushima introduced the Neocognitron, which was the first implementation of CNN. Building upon Fukushima's work, LeCun et al. achieved state-of-the-art performance in pattern recognition tasks using the error gradient method in 1989. The classical CNN architecture developed by LeCun et al. extends the traditional MLP and incorporates three key ideas: "local receptive fields", "weight sharing", and "spatial/temporal subsampling" (Azar, 2013a,b; Aziz et al., 2012; Babajani et al., 2019; Banu et al., 2014; Dudekula et al., 2023; Elshazly et al., 2013b; Inbarani et al., 2022; Inbarani & Nivetha, 2021; Kumar et al., 2019). These concepts are organized into two types of layers: "convolution layers" and "subsampling layers". The processing layers consist of "Convolution Layers (C1, C3, and C5)" interleaved with "subsampling layers (S2 and S4)", followed by the "output layer (F6)". These convolution and subsampling layers form feature maps and are organized in planes.
"Convolutional Neural Networks (CNNs)" have transformed the field of "Computer Vision" by automatically learning and extracting meaningful features from images. CNNs excel at capturing high-level visual representations, making them ideal for tasks like image classification, object detection, and medical image analysis. CNNs leverage the concept of local receptive fields, where small filters scan the input image to capture local patterns and structures (Aziz et al., 2013b; Banu et al., 2017; Ding et al., 2015; Elshazly et al., 2013c; Kumar et al., 2015b; Sayed et al., 2020; Samanta et al., 2018). As the information propagates through the network, deeper layers extract increasingly abstract and task-specific features. This hierarchical process allows CNNs to identify complex patterns, shapes, and textures that are crucial for distinguishing between different image classes or detecting specific objects.
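The local-receptive-field idea can be illustrated with a minimal NumPy convolution. The 1 x 2 edge-detecting kernel below is a hand-picked toy filter, not a learned one; in a trained CNN such kernels emerge from backpropagation:

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' 2-D convolution in the cross-correlation form used by CNN
    layers: the kernel acts as a local receptive field slid over the image."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A hand-built vertical-edge detector: responds where intensity changes
img = np.zeros((6, 6))
img[:, 3:] = 1.0                       # left half dark, right half bright
edge_kernel = np.array([[-1.0, 1.0]])  # 1 x 2 local receptive field
fmap = conv2d(img, edge_kernel)
print(fmap.max())                      # 1.0, exactly at the edge columns
```

Because the same kernel is applied at every position, the weights are shared across the image, which is the second of the three key ideas named above.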

Mathematical Model for CNN
CNNs undergo two main training phases: "feedforward" and "backpropagation". In the feedforward stage, the input image is processed by multiplying the input with the neuron weights and applying a convolution operation in each part of the network. The resulting output is then assessed (Emary et al., 2014a; Hassanien et al., 2023, 2020, 2019a,b, 2014b; Inbarani et al., 2020; Sain et al., 2022; Santoro et al., 2013). During network learning, the objective is to minimize the error between the network output and the exact result, which is quantified by a loss function.
In the backpropagation phase, the backpropagation algorithm is applied based on the error value. This algorithm uses the chain rule to calculate the derivatives of the variables and updates them based on their impact on the network's error estimate. This iterative process involves repeating the feedforward pass multiple times to improve the network's training (Hashemi et al., 2013; Emary et al., 2014b; Humaidi et al., 2020a; Sallam et al., 2020; Sayed et al., 2019). The goal is to learn kernel matrices that capture meaningful features for image classification. The backpropagation algorithm optimizes the network's weights to find their optimal values. To perform the layer convolution, a sliding window is introduced, which applies the dot product operation with the weights. The "activation function" commonly used in CNNs is the "Rectified Linear Unit (ReLU)", defined as:

f(x) = max(0, x)

Max pooling is utilized in CNNs to reduce the output scale and extract the most salient features. To optimize the neuron weights for better performance, the training pair error is computed (Firouz et al., 2015; Fati et al., 2022; Malek et al., 2015a; Salam et al., 2022). The backpropagation technique minimizes the cross-entropy loss, which can be articulated as:

E = -(1/l) Σ_{m=1..l} Σ_j d_j^(m) log y_j^(m)

Here, d_j characterizes the desired output vector for the m-th sample, y_j the corresponding network output, and l signifies the number of samples. The final loss includes an additional weight penalty term to regulate the magnitude of the weights (Azar et al., 2020h; Fekik et al., 2018b; Humaidi et al., 2021; Pilla et al., 2021a):

E_total = E + ρ Σ_{l=1..L} Σ_{k=1..K} (W_k^l)^2

Here, ρ is the "weight penalty coefficient", W_k^l denotes connection weight k in layer l, and L and K stand for the total numbers of layers and per-layer connections, respectively. Table 2 depicts the extraction of features in the CNN architecture (Humaidi et al., 2020b; Kumar et al., 2014b; Pintea et al., 2018).
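The three operations named above can be sketched directly in NumPy. This is an illustrative toy (the 4 x 4 feature map and 2 x 2 pooling window are arbitrary values chosen for the example), not the paper's network:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: f(x) = max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def max_pool2d(x, k=2):
    """Non-overlapping k x k max pooling: keeps the most salient activation
    in each window and shrinks the feature map by a factor of k."""
    h, w = x.shape
    return x[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).max(axis=(1, 3))

def cross_entropy(d, y, eps=1e-12):
    """Cross-entropy loss summed over classes and averaged over samples:
    d is the one-hot desired output, y the predicted class distribution."""
    return -np.sum(d * np.log(y + eps)) / d.shape[0]

fmap = np.array([[-1.0, 2.0, 0.5, -0.5],
                 [ 3.0, 0.0, -2.0, 1.0],
                 [ 0.5, 0.5, 4.0, -1.0],
                 [-3.0, 1.0, 0.0, 2.0]])
a = relu(fmap)       # negatives clipped to 0
p = max_pool2d(a)    # 4x4 -> 2x2; top-left window max is 3.0, bottom-right 4.0
print(p)
```

A perfect one-hot prediction gives `cross_entropy` a value of essentially zero, and the loss grows as probability mass moves away from the true class, which is what backpropagation then drives down.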
The model is compiled with the Adam optimizer, the categorical cross-entropy loss function, and accuracy as the evaluation metric. Once the model is trained, the features are extracted using the model's predict function, which takes the pre-processed input images as input and generates the corresponding feature vectors. The total number of features extracted using the CNN is 100.

Arithmetic Optimization Algorithm
Along with geometry, algebra, and analysis, arithmetic is one of the crucial elements of modern mathematics and a major part of number theory. The conventional computation methods typically used to examine numbers are known as arithmetic operators: "multiplication", "division", "addition", and "subtraction" (Abualigah et al., 2021; Fouad et al., 2021; Ganesan et al., 2022; Humaidi et al., 2023; Khettab et al., 2018; Pintea et al., 2021b). Mathematical optimization employs these straightforward operators to choose the best element from a set of candidate alternatives. Optimization problems arise in all quantitative fields, including operations research, engineering, economics, and computer science, and the development of solution approaches has fascinated mathematicians for decades. The behavior of arithmetic operators in resolving arithmetic problems is the primary source of inspiration for the AOA; this research uses the "multiplication", "division", "addition", and "subtraction" operators (Fredj et al., 2016; Jothi et al., 2013; Mukherjee et al., 2014; Malek et al., 2015b; Pilla et al., 2019, 2020). The initial population in AOA is formed at random using the following equation:

x = L + ∂ × (U − L)    (1)

where x stands for a candidate solution in the population, U and L stand for the upper and lower bounds of the search space of the objective function, and ∂ is a random variable drawn uniformly from [0, 1] (Gharbia et al., 2014; Mustafa et al., 2020; Panda & Azar, 2021; Pilla et al., 2021b). The choice between exploration and exploitation is made at each step of AOA based on the value of the "Math Optimizer Accelerated (MOA)" function, calculated using Eq. (2):

MOA(C_Iter) = Min + C_Iter × ((Max − Min) / M_Iter)    (2)

where MOA(C_Iter) is the function value at the t-th iteration, and C_Iter denotes the current iteration, running from 1 to the maximum number of iterations (M_Iter).
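A minimal NumPy sketch of Eqs. (1) and (2); the population size, bounds, and the Min/Max defaults below are illustrative values of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_population(pop_size, dim, L, U):
    # Eq. (1): x = L + r * (U - L), with r drawn uniformly from [0, 1]
    return L + rng.random((pop_size, dim)) * (U - L)

def moa(c_iter, m_iter, moa_min=0.2, moa_max=0.9):
    # Eq. (2): Math Optimizer Accelerated rises linearly from Min to Max
    return moa_min + c_iter * (moa_max - moa_min) / m_iter

X = init_population(10, 5, L=-1.0, U=1.0)
```

At each iteration, a random draw compared against `moa(t, m_iter)` decides whether a solution is updated by the exploration or the exploitation operators.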
The notations Min and Max indicate the minimum and maximum values of the MOA, respectively. In AOA, "exploration" (the global search) is carried out using search strategies based on the "Division (D)" and "Multiplication (M)" operators, expressed as Eq. (3):

x_{i,j}(t + 1) = best(x_j) ÷ (MoPr + ε) × ((U_j − L_j) × µ + L_j),  if r2 < 0.5
x_{i,j}(t + 1) = best(x_j) × MoPr × ((U_j − L_j) × µ + L_j),  otherwise    (3)

where x_{i,j}(t) denotes the j-th position of the i-th individual in the current generation, x_i(t + 1) denotes the i-th solution of the (t + 1)-th iteration, and best(x_j) denotes the j-th position of the current best solution. ε is a very small positive number, U_j and L_j stand for the upper and lower limits of the j-th position, respectively, and µ is a controlling parameter. The "Math Optimizer Probability (MoPr)" has been calculated using the following formula:

MoPr(t) = 1 − (t^(1/α) / M_Iter^(1/α))    (4)
where MoPr(t) represents the MoPr function's value at the t-th iteration and M_Iter is the maximum number of iterations. α is a crucial parameter that governs the accuracy of the exploitation process across all iterations (Jothi et al., 2022; Humaidi et al., 2021; Nasser et al., 2021; Mathiyazhagan et al., 2022; Mohanty et al., 2021). The "exploitation strategy" within AOA has been devised using the "Subtraction (S)" and "Addition (A)" operators, as expressed in Eq. (5):

x_{i,j}(t + 1) = best(x_j) − MoPr × ((U_j − L_j) × µ + L_j),  if r3 < 0.5
x_{i,j}(t + 1) = best(x_j) + MoPr × ((U_j − L_j) × µ + L_j),  otherwise    (5)

The process of AOA, combining this approach to "exploration" and "exploitation", is delineated in Algorithm 1, and a visual representation of the AOA procedure can be found in Figure 1 (Abualigah et al., 2021; Azar et al., 2013b; Ibraheem et al., 2020b; Jothi et al., 2019b; Kamal et al., 2020; Lavanya et al., 2022; Giove et al., 2013).
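The exploration and exploitation operators above can be sketched as a single position update. This follows the standard AOA operator equations (Abualigah et al., 2021); the defaults for µ, α, and the MOA threshold below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def mop(t, m_iter, alpha=5.0):
    # Eq. (4): Math Optimizer Probability decays from 1 toward 0
    return 1.0 - (t ** (1.0 / alpha)) / (m_iter ** (1.0 / alpha))

def aoa_update(best, t, m_iter, L, U, mu=0.499, moa_t=0.5, eps=1e-12):
    # One AOA position update for a single solution vector:
    # Division/Multiplication for exploration, Subtraction/Addition
    # for exploitation (Eqs. (3) and (5))
    dim = best.shape[0]
    new = np.empty(dim)
    scale = (U - L) * mu + L                      # ((U_j - L_j) * mu + L_j)
    p = mop(t, m_iter)
    for j in range(dim):
        r1, r2, r3 = rng.random(3)
        if r1 > moa_t:                            # exploration: D or M
            if r2 < 0.5:
                new[j] = best[j] / (p + eps) * scale
            else:
                new[j] = best[j] * p * scale
        else:                                     # exploitation: S or A
            if r3 < 0.5:
                new[j] = best[j] - p * scale
            else:
                new[j] = best[j] + p * scale
        new[j] = np.clip(new[j], L, U)            # keep inside the bounds
    return new
```

Each candidate in the population is updated this way per iteration, and the best solution is tracked across iterations.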

Genetic Algorithm
The "Genetic Algorithm" (GA) is a meta-heuristic that draws inspiration from the process of natural evolution and belongs to the broad class of "Evolutionary Algorithms" used in computing and informatics. By relying on bio-inspired operators such as "Selection", "Crossover", and "Mutation", these procedures are widely employed to produce high-quality solutions to optimization and search problems (Najm et al., 2020; Gorripotu et al., 2019). John Holland developed GAs based on "Darwin's evolutionary theory" in the 1970s and subsequently extended them in 1992. Evolutionary algorithms allow the solving of problems for which no efficient exact solution is yet known; the method is used in modeling and simulation where randomness is involved, as well as for optimization problems ("Scheduling", "Shortest Path", etc.). A GA maintains a population of candidate solutions to the optimization problem (known as individuals, creatures, or genotypes) that is evolved toward better solutions (Holland, 1992). Each candidate solution carries a set of properties (its genes, or genotype) that can be altered and evolved; solutions are commonly represented as strings of 0s and 1s, though alternative encodings are permitted (Ibrahim et al., 2020; Khamis et al., 2021; Khan et al., 2021; Malek & Azar, 2016a). Evolution usually begins with a population of randomly generated individuals and proceeds generation by generation. In each generation, the fitness of every individual in the population is evaluated; fitness is typically the value of the objective function being solved. The fitter individuals are probabilistically selected from the current population, and each selected individual's genome is modified (recombined and possibly mutated at random) to form a new generation.
The new generation of candidate solutions is then used in the next iteration of the algorithm. The algorithm commonly terminates after a predetermined number of generations or once a satisfactory fitness level has been reached (Inbarani et al., 2014b; Khamis et al., 2022; Liu et al., 2022). As a result, each new generation is better adapted to its environment than the previous one. Figure 2 depicts the flowchart of the GA.
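The selection, crossover, and mutation loop described above can be sketched as follows; the tournament selection, one-point crossover, and the OneMax toy fitness are illustrative choices of ours, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(2)

def genetic_algorithm(fitness, dim, pop_size=20, generations=50, p_mut=0.02):
    # Binary-string GA: tournament selection, one-point crossover,
    # bit-flip mutation; returns the fittest individual found
    pop = rng.integers(0, 2, (pop_size, dim))
    for _ in range(generations):
        fit = np.array([fitness(ind) for ind in pop])
        children = []
        while len(children) < pop_size:
            # Tournament selection: fitter of two random individuals wins
            a, b = rng.integers(0, pop_size, 2)
            p1 = pop[a] if fit[a] >= fit[b] else pop[b]
            a, b = rng.integers(0, pop_size, 2)
            p2 = pop[a] if fit[a] >= fit[b] else pop[b]
            cut = rng.integers(1, dim)            # one-point crossover
            child = np.concatenate([p1[:cut], p2[cut:]])
            flip = rng.random(dim) < p_mut        # bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.array(children)
    fit = np.array([fitness(ind) for ind in pop])
    return pop[fit.argmax()]

# OneMax toy problem: fitness is the number of 1-bits in the string
best = genetic_algorithm(lambda ind: ind.sum(), dim=12)
```

For feature selection, each bit of the string marks whether the corresponding feature is kept.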

Proposed Novel Hybrid Genetic Arithmetic Optimization (NHGAO)
The Hybrid Genetic Algorithm with Arithmetic Optimization is a combination of two powerful optimization techniques: the Genetic Algorithm (GA) and Arithmetic Optimization (AO). By leveraging the strengths of both approaches, this hybrid algorithm aims to overcome their individual limitations and achieve improved optimization performance (Azar & Vaidyanathan, 2015a, b; Kham et al., 2021; Liu et al., 2020; Meghni et al., 2017b). The "Genetic Algorithm" is a "population-based search algorithm" inspired by the principles of "natural selection" and "genetics". It utilizes genetic operators, such as "selection", "crossover" (Holland, 1992), and "mutation", to explore the solution space and evolve toward better solutions. GA is effective at exploring a wide range of solutions and maintaining population diversity, allowing it to handle complex optimization problems with large solution spaces. However, it may struggle with fine-tuning individual solutions and can get trapped in local optima.
Arithmetic Optimization is an arithmetic-based optimization technique that focuses on refining individual solutions through arithmetic operations (Azar et al., 2020e;Malek & Azar, 2016b). It operates on a single solution and makes small adjustments using operations like addition, subtraction, multiplication, and division. AO excels in local search and fine-tuning solutions, enabling it to converge faster and achieve higher solution quality. However, it may suffer from limited exploration capability and difficulties in escaping local optima.
The hybridization of GA and AO addresses these limitations by combining their strengths synergistically. The algorithm starts with an initial population generated by GA, which explores the solution space and maintains diversity. The population then undergoes AO to refine individual solutions using arithmetic operations. This hybrid approach allows for efficient exploration of the solution space by GA, while AO provides fine-grained adjustments and exploitation of promising solutions. The advantages of the Hybrid Genetic Algorithm with Arithmetic Optimization include enhanced exploration and exploitation, improved solution quality, efficient local search, and flexible adaptation. By leveraging the exploration capabilities of GA and the refinement abilities of AO, the hybrid procedure strikes a balance between "exploration and exploitation", leading to better convergence and solution quality. Additionally, the balance between GA and AO can be tailored to the problem characteristics, making it a versatile optimization approach.
Overall, the hybridization of GA and AO offers a powerful optimization framework that combines the strengths of both algorithms, allowing for efficient exploration, fine-tuning of solutions, and improved optimization performance. It is particularly well suited to solving complex optimization problems where a balance between "exploration and exploitation" is crucial. Table 3 shows the parameter settings for the proposed NHGAO. Figures 3, 4, and 5 depict the pseudocode for the AOA, GA, and the proposed NHGAO, respectively.
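A hedged sketch of the hybrid loop, under simplifying assumptions of our own (a continuous population thresholded to a binary mask, a simplified MoPr-style decay, and a OneMax toy objective standing in for classifier accuracy):

```python
import numpy as np

rng = np.random.default_rng(3)

def nhgao(fitness, dim, pop_size=20, iters=40, omega=0.5, p_mut=0.05):
    # GA operators explore (crossover + mutation); an AOA-style arithmetic
    # move then refines each child toward the best solution found so far.
    # omega weights the addition vs. subtraction form of the refinement.
    pop = rng.random((pop_size, dim))             # continuous in [0, 1]
    fit = np.array([fitness(x > 0.5) for x in pop])
    for t in range(1, iters + 1):
        best = pop[fit.argmax()]
        p = 1.0 - (t / iters) ** 0.2              # MoPr-style decay
        new_pop = []
        for _ in range(pop_size):
            # GA phase: crossover between two random parents + mutation
            p1, p2 = pop[rng.integers(0, pop_size, 2)]
            cut = rng.integers(1, dim)
            child = np.concatenate([p1[:cut], p2[cut:]])
            child += (rng.random(dim) < p_mut) * rng.normal(0, 0.3, dim)
            # AO phase: arithmetic (addition/subtraction) pull toward best
            if rng.random() < omega:
                child = child + p * (best - child)
            else:
                child = best - p * (best - child)
            new_pop.append(np.clip(child, 0.0, 1.0))
        pop = np.array(new_pop)
        fit = np.array([fitness(x > 0.5) for x in pop])
    return (pop[fit.argmax()] > 0.5).astype(int)

mask = nhgao(lambda m: m.sum(), dim=10)           # toy OneMax objective
```

In the paper's setting, the objective would instead be the classification accuracy of the selected feature subset.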

HYPERPARAMETER TUNING FOR GA, AOA, AND NHGAO
For the NHGAO (Novel Hybrid Genetic Arithmetic Optimization) algorithm to maximize the performance of the hybrid approach and produce superior results, hyperparameter adjustment is a crucial step. Hyperparameters are parameters that are set before training and are not learned during training (Azar, 2013d; Azar et al., 2017). They have a substantial influence on the algorithm's performance and behavior, as well as on the outcome. Hyperparameter tuning involves finding the best combination of hyperparameter values for both the arithmetic optimization and genetic algorithm components. These hyperparameters include the population size, the maximum number of iterations, and the omega value, which controls the balance between exploration and exploitation.
Proper hyperparameter tuning ensures that the algorithm efficiently explores the solution space, converges to optimal or near-optimal solutions, and achieves improved accuracy and classification results. The outcomes of the "Genetic Algorithm", "Arithmetic Optimization Algorithm", and "Novel Hybrid Genetic Arithmetic Optimization Algorithm" are displayed in Tables 4, 5, and 6. It is evident that the presented method produces the best outcomes.

Case 1
The Novel Hybrid Genetic Arithmetic Optimization algorithm has been proposed to improve the performance of the optimization process. Statistical techniques are used to analyze and compare the performance of the three algorithms, which are tested with 50, 500, and 1000 iterations, respectively. In the study, each algorithm is run independently up to 50, 100, and 1000 times. Except for the proposed algorithm, the control parameters of the search algorithms are taken from (Arabali et al., 2012; Guo et al., 2008). This ensures a fair comparison when carrying out assessments on related functions.

Case 2
It is well known that in metaheuristic optimization algorithms, a rise in the "population size" and "the number of iterations" increases the algorithms' runtime. As a result, in the current study, the algorithm's maximum iterations are 50, 100, and 1000.

Problem Formulation
The optimization problem being solved in the given code is feature selection for a classification task.

Initialization
Set the parameters for both GA and AO, such as the "population size", "mutation rate", and "maximum number of iterations".

Initial Population
Generate an initial population of individuals using GA techniques. Each individual represents a candidate solution to the optimization problem. The population is usually created randomly or based on specific initialization strategies.

GA Phase
Apply GA operators, including "Selection", "Crossover", and "Mutation", to evolve the population and explore the solution space. Selection ensures that fitter individuals have a higher chance of being chosen for reproduction, while crossover combines the genetic material of selected individuals to generate offspring. Mutation introduces small random changes to maintain diversity and avoid premature convergence.

AO Phase
Apply AO techniques to refine the selected individuals from the GA phase. AO employs arithmetic operations, such as "Addition", "Subtraction", "Multiplication", and "Division", to modify the solutions and improve their fitness values. This phase aims to fine-tune the solutions obtained from the GA phase and enhance their quality.

Hybridization of GA and AO Phase
Integrate the GA and AO phases by combining the selected individuals from GA with the mutated individuals from AO. This integration can be performed through various strategies, such as incorporating AO as a local search operator within the GA framework.

Fitness Function
The fitness metric used is the accuracy score, a commonly used evaluation metric for classification problems; it quantifies the ratio of correctly classified instances to the total number of instances in the dataset.

Optimization Function
To maximize the accuracy of a Random Forest Classifier on a given dataset.

Constraint
The constraint limits the maximum number of selected features. This constraint ensures that the model does not select an excessively large number of features, which can lead to overfitting or increased computational complexity.
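The fitness function, optimization objective, and feature-count constraint described above can be sketched as follows. Since the paper's Random Forest classifier is not reproduced here, a simple nearest-centroid model stands in for it, and the toy data, the 70/30 split, and the `max_features` limit are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

def fitness(mask, X_train, y_train, X_test, y_test, max_features=20):
    # Accuracy of a classifier trained on the selected columns only.
    # A nearest-centroid model stands in for the paper's Random Forest;
    # the constraint rejects empty or oversized feature subsets.
    if mask.sum() == 0 or mask.sum() > max_features:
        return 0.0                                 # constraint violated
    Xtr, Xte = X_train[:, mask], X_test[:, mask]
    centroids = np.array([Xtr[y_train == c].mean(axis=0) for c in (0, 1)])
    pred = np.argmin(((Xte[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    return float((pred == y_test).mean())          # accuracy score

# Toy data: class 1 is shifted by +2 in every feature
X = rng.normal(size=(100, 8))
y = np.arange(100) % 2
X[y == 1] += 2.0
mask = np.ones(8, dtype=bool)
acc = fitness(mask, X[:70], y[:70], X[70:], y[70:])
```

The optimizer then searches over binary masks to maximize this fitness value.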

Selection and Replacement
Select the best individuals based on their "fitness values". The fittest individuals are preserved, while the less fit individuals are replaced with the offspring generated through GA and AO operations. This selection process improves the population's overall fitness and drives the optimization toward better solutions.

Termination Criterion
Determine the stopping condition for the hybrid algorithm. This could be reaching a maximum number of iterations, convergence of the objective function, or achieving a preset fitness threshold. The termination criterion ensures that the algorithm stops when it has achieved satisfactory results or when further iterations are unlikely to yield significant improvements.

Case 3
The performance of all algorithms is quantitatively evaluated, and the proposed method is contrasted with the obtained results. Computation times are influenced by the computer hardware configuration, the design of the algorithm, and the metaheuristic optimization technique. The analyses in this study were run in ANACONDA on a computer with an Intel Core i5-10210U CPU running at 2.11 GHz and 8 GB of RAM.

Case 5
The maximum iterations parameter determines the number of iterations, or generations, the algorithm runs. The "genetic algorithm" used a maximum of 50 iterations, while both the arithmetic optimization and the hybrid algorithm used 500 iterations. Notably, the genetic algorithm achieved an accuracy of 0.94, indicating its effectiveness in producing accurate results within a smaller number of iterations. On the other hand, the longer runs of the arithmetic optimization and hybrid algorithms allowed a more extensive search of the solution space, potentially leading to improved accuracy.

Case 6
The omega value is a crucial parameter that controls the balance between "exploration and exploitation" in the algorithms. The genetic algorithm had an omega value of 0.9, emphasizing exploration to discover diverse solutions. The arithmetic optimization employed an omega value of 0.5, striking a balance between "exploration and exploitation". In contrast, the hybrid algorithm utilized an omega value of 0.1, prioritizing exploitation of the promising regions already discovered. These distinct omega values influenced each algorithm's search behavior, enabling the genetic algorithm to explore a varied range of solutions, the arithmetic optimization to sustain a trade-off between "exploration and exploitation", and the hybrid algorithm to exploit promising areas for optimal solutions.

Case 7
By carefully tuning these hyperparameters, the hybrid algorithm achieved the highest accuracy of 0.98, surpassing the genetic algorithm's accuracy of 0.94 and the arithmetic optimization's accuracy of 0.93. The hybrid algorithm's success can be attributed to its ability to intelligently balance the exploration and exploitation trade-off, effectively utilizing a smaller population size to find optimal solutions. This demonstrates the algorithm's capability to adaptively adjust its search strategy based on the problem requirements and improve accuracy through a combination of exploration and exploitation.

Effectiveness of the Proposed Feature Selection Technique
The presented approach is used to reduce the dimensionality of two datasets, the lung cancer and COVID-19 datasets, using feature selection methods. K-fold cross-validation was used for better evaluation of the test results and to avoid overfitting during the training and testing processes. To assess the effectiveness of the presented supervised feature selection strategy, three different meta-heuristic optimization techniques are used: the "Genetic Algorithm", the "Arithmetic Optimization Algorithm", and the proposed "Novel Hybrid Genetic Arithmetic Optimization Algorithm". The efficiency of the feature selection techniques, in percentages, is shown in Figure 6, which contrasts the "Genetic Algorithm" and "Arithmetic Optimization Algorithm" with the proposed Novel Hybrid Genetic Arithmetic Optimization approach. The light blue color depicts the features selected by the Genetic Algorithm, 52.6%. Similarly, the Arithmetic Optimization-based algorithm reduced the features to 35.1%, and the light pink color depicts the proposed method, which reduced the features to 12.3%. This indicates that irrelevant features are removed by the proposed Novel Hybrid Genetic Arithmetic Optimization algorithm.

Assessment of the Classification Performance
To determine the usefulness of the classifiers, a variety of performance metrics were used to evaluate them. The datasets were divided into training (70% of samples) and testing (30% of samples) sets, and the selected features were fed into the classifiers. Ten-fold cross-validation was used to verify the classification results. Although accuracy is a frequently used evaluation metric in traditional applications, it might not be appropriate for evaluating a dataset of skewed images (Nivetha & Inbarani, 2022a). When class distributions are extremely skewed, it is common for no classification rules to be developed for the minority class. Additional evaluation metrics were used in this work to address this limitation. The efficacy of the classifiers is evaluated using a combination of performance criteria, including "Precision", "Recall", "F1-score", "Accuracy", "Sensitivity", "Specificity", "G-Mean", "Matthews Correlation Coefficient (MCC)", "Lift", "Youden's Index", "Balanced Classification Rate (BCR)", and "Computation Time" (Nivetha & Inbarani, 2022b). Table 7 shows the number of features acquired from the lung dataset and the COVID dataset.

Result and Discussion
The experiments were conducted on an Intel Core i5 processor with a maximum memory capacity of 2 GB. The feature selection algorithms were implemented in ANACONDA. The measurement equations can be described as follows (Nivetha et al., 2022c):

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Balanced Error Rate = 1 − BCR

Tables 8, 9, 10, 11, and 12 report the results of the "Decision Tree", "Random Forest", "Naïve Bayes", "KNN", and "SVM" classifiers, respectively (Table 8: classification of the reduct set using the Decision Tree classifier; Table 9: Random Forest classifier; Table 10: Naïve Bayes classifier; Table 11: KNN classifier). These tables show that the NHGAO, AOA, and GA approaches increase classification accuracy in comparison with the unreduced data.
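The evaluation measures used in Tables 8-12 can be computed directly from the confusion-matrix counts. In this illustrative sketch the function name and example counts are our own, and BCR is taken as the mean of sensitivity and specificity:

```python
import numpy as np

def metrics(tp, tn, fp, fn):
    # Confusion-matrix metrics: tp/tn/fp/fn are the true-positive,
    # true-negative, false-positive, and false-negative counts
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                        # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = np.sqrt(recall * specificity)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    youden = recall + specificity - 1              # Youden's index
    bcr = 0.5 * (recall + specificity)             # balanced classification rate
    ber = 1 - bcr                                  # balanced error rate
    return dict(precision=precision, recall=recall, specificity=specificity,
                accuracy=accuracy, f1=f1, g_mean=g_mean, mcc=mcc,
                youden=youden, bcr=bcr, ber=ber)

m = metrics(tp=45, tn=40, fp=5, fn=10)             # illustrative counts
```

Note that Youden's index and BCR are linked: Youden = 2 × BCR − 1.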
Table 8 depicts the results of the Genetic Algorithm, Arithmetic Optimization, and other approaches against the unreduced data. The evaluation results demonstrate the effectiveness of the proposed Novel Hybrid Genetic Arithmetic Optimization algorithm. The hybrid method demonstrates its superiority in effectively capturing both positive and negative examples, ultimately resulting in a significantly lower error rate, with consistently higher precision, recall, specificity, G-mean, MCC, Youden's Index, BCR, ROC area, and accuracy. This performance highlights the method's potential to provide more trustworthy and balanced predictions, improving classification outcomes. Table 9 shows the results of the Genetic Algorithm, Arithmetic Optimization, and Novel Hybrid Genetic Arithmetic Optimization techniques against the unreduced data. Notably, the Novel Hybrid Genetic Arithmetic Optimization algorithm demonstrates outstanding precision, reaching 0.95, confirming its accuracy in predicting positive cases. This precision is balanced, demonstrating the ability to make accurate positive and negative predictions with strong recall, F1-score, sensitivity, and specificity. The MCC score of 0.94 is impressive and indicates high overall classification quality. Notably, this strategy only slightly outperforms random guessing in terms of Lift, with a score significantly lower than that of the other methodologies. The Novel Hybrid Genetic Arithmetic Optimization approach, however, consistently displays strong performance across a variety of measures, contributing to its noteworthy ROC area of 0.92 and a high accuracy of 0.95, positioning it as a robust choice for accurate and balanced classification.
The performance of the approaches can be determined by comparing the evaluation metrics of the Genetic Algorithm, Arithmetic Optimization, and Novel Hybrid Genetic Arithmetic Optimization for the unreduced data in Table 10. With balanced precision, recall, F1-score, sensitivity, and specificity, the Novel Hybrid Genetic Arithmetic Optimization stands out as a top performer, demonstrating its effectiveness in both positive and negative predictions. That it earns the highest G-mean (0.95) and BCR (0.96) scores stands as evidence of its thorough class capture, and it is exceptional in terms of overall classification quality, with a strong MCC score of 0.91. Additionally, this method exhibits reliability and potent discriminative power by maintaining a competitive error rate and a strong ROC area (0.40 and 0.90, respectively). Table 11 describes the balanced performance, with good precision, recall, F1-score, and specificity, when the Genetic Algorithm, Arithmetic Optimization, and Novel Hybrid Genetic Arithmetic Optimization approaches are compared for the unreduced data. The proposed method maintains competitive G-mean and MCC scores, demonstrating its predictive capability, and provides a thorough classification by performing particularly well on Youden's Index, BCR, and BER. Its reliability is further supported by its precision and a strong ROC area, making it an appealing option for precise, comprehensive predictions. Table 12 presents the classification of the reduct set using the SVM classifier.
With a high score of 0.94, the Novel Hybrid Genetic Arithmetic Optimization algorithm excels in terms of accuracy, indicating that it makes correct predictions across all classes. It has the lowest error rate, 0.06, demonstrating successful reduction of misclassification. This high accuracy is consistent with its strong performance on numerous metrics. The unreduced-data approach, on the other hand, has the highest error rate (0.60), denoting a higher percentage of inaccurate predictions. The Novel Hybrid Genetic Arithmetic Optimization stands out as a reliable option for accurate predictions with low error rates in terms of both accuracy and error rate, as shown in Table 12.
Compared to existing supervised feature selection techniques, the presented approach produces the lowest misclassification and error values. When compared with alternative FS techniques, the NHGAO-based relative reduct algorithm requires the least execution time. Figures 7, 8, 9, 10, and 11 show the classification comparison of "accuracy", "precision", "recall", and "F1-Score" for the five classifiers: "Decision Tree", "Random Forest", "Naïve Bayes", "KNN", and "SVM". The ROC plot is a valuable tool for organizing, visualizing, and selecting classifiers based on performance (Nivetha & Inbarani, 2023a; Azar et al., 2016a). The idea of a "separator" variable is the foundation of the ROC curve: if the "criterion" or "cut-off" for positivity on the decision axis is altered, the frequencies of positive and negative diagnostic test findings change. The decision scale is only "implicit" when a diagnostic system's results are evaluated based on subjective assessment (Nivetha & Inbarani, 2023b; Azar et al., 2007); such a variable is frequently referred to as a "latent" or unobservable variable. A ROC curve, a curve in the unit square widely used in medical diagnosis, is produced by plotting TPF (sensitivity) against FPF (1 − specificity) across the various cut-offs. Figure 12 displays the ROC plots for each classifier on the three datasets, utilizing the proposed NHGAO algorithm. The classification accuracy rates of the five classifiers differ from each other (Nivetha et al., 2023a, b, c). In particular, the "Random Forest classifier" outperforms the others, achieving the highest classification accuracy of 0.95, as indicated by its position above the diagonal line in the graph. This result highlights the superior performance of the "Random Forest classifier" compared with the others in the classification task.

CONCLUSION AND FUTURE DIRECTIONS
This research work addresses the complex challenge of feature selection in "Medical Image Processing" by introducing the Novel Hybrid Genetic Arithmetic Optimization algorithm. This approach, designed for feature selection and classification tasks on COVID-19 and lung cancer imaging datasets, merges the genetic algorithm and arithmetic optimization to optimize feature subsets efficiently. The assessment of feature-subset quality, guided by accuracy and employing the Random Forest classifier as the fitness function, underscores NHGAO's superiority over traditional genetic algorithms and arithmetic optimization. The Novel Hybrid GAO excels on multiple performance metrics, encompassing "Accuracy", "Precision", "Recall", "F1-Score", "MCC", "Sensitivity", "Specificity", "Geometric Mean", "Lift", and "Youden's Index". Notably, the hybrid algorithm achieves remarkable accuracy in classifying COVID-19 and lung images, underscoring its effectiveness in both feature selection and classification tasks. The findings indicate that each applied metaheuristic optimization strategy substantially enhances classification accuracy while concurrently reducing feature size. Comparative analyses against genetic algorithms and arithmetic optimization underscore the hybrid algorithm's advantages, achieving superior accuracy with a reduced population size and fewer iterations. This study's outcomes pave the way for potential extensions using other evolutionary feature selection methods such as Particle Swarm Optimization and Ant Colony Optimization. Moreover, the framework's versatility positions it for application across various medical disciplines beyond the current scope of this research.