Intrusion Detection System: A Comparative Study of Machine Learning-Based IDS

The use of encrypted data, the diversity of new protocols, and the surge in malicious activities worldwide have posed new challenges for intrusion detection systems (IDS). In this scenario, existing signature-based IDS do not perform well. Various researchers have proposed machine learning-based IDS to detect unknown malicious activities based on behaviour patterns, and results have shown that machine learning-based IDS perform better than signature-based IDS (SIDS) in identifying new malicious activities in a communication network. In this paper, the authors analyze an IDS dataset that contains the most current common attacks and evaluate the performance of network intrusion detection systems using two data re-sampling techniques and ten machine learning classifiers. It has been observed that the top three IDS models, KNeighbors, XGBoost, and AdaBoost, perform best in binary-class classification with 99.49%, 99.14%, and 98.75% accuracy, while XGBoost, KNeighbors, and GaussianNB perform best in multi-class classification with 99.30%, 98.88%, and 96.66% accuracy.


INTRODUCTION
Because of the Covid-19 pandemic, individuals stayed at home and avoided physical gatherings, and social separation became the new normal. The usage of new paradigms in corporate transactions, work-from-home culture, and online educational delivery has increased people's reliance on mobile and electronic devices. The use of communication networks and cloud-based processing systems has increased manifold. This change in the pandemic era promotes new threats and lures intruders to exploit vulnerabilities in the data communication network. Organizations usually use diversified protocols to encrypt their data and maintain confidentiality. Volume, heterogeneity of protocols, and encryption have posed several new challenges for IDS in detecting malicious activities (Resende & Drummond, 2018; Senthilkumar et al., 2021). An intruder attempts to gain unauthorized access to a system or network with mala fide intentions and disrupt its normal execution (Butun et al., 2014; Liao et al., 2013; Low, 2005; Mitchell & Chen, 2014). Intruders often aim to steal or corrupt sensitive data. In 2020, Emsisoft reported that local governments, universities, and private organizations had spent $144 million in response to the worst ransomware attack (Novinson, 2020). The WHO reported that cyber-attacks increased five-fold during the Covid-19 pandemic (WHO, 2020). According to the McAfee quarterly threat report 2020, fraudsters are taking advantage of the pandemic by using Covid-19-themed malicious apps, phishing campaigns, and malware (McAfee, 2020). The report also highlights that in quarter one (Q1), new malware targeting mobile devices surged by 71%, with overall malware increasing by roughly 12% over the previous four quarters (McAfee, 2020).
IDS provides security solutions against malicious attacks or security breaches. It can be a software or hardware device that detects harmful activity to maintain system security (Babu et al., 2023; Liao et al., 2013). It identifies forms of suspicious network traffic and malicious computer activity that a firewall might miss. Signature-based Intrusion Detection Systems (SIDS) and Anomaly-based Intrusion Detection Systems (AIDS) are two popular categories of IDS that have been widely used to provide security solutions (Axelsson, 2000; Baskerville & Portougal, 2003; Hodo et al., 2017). SIDS relies on previously known signatures and faces challenges in identifying unknown and obfuscated malicious attacks (Amouri et al., 2020; Atli, 2017; Khraisat et al., 2019; Lin et al., 2015; Low, 2005; Vinayakumar et al., 2019; Wu & Banzhaf, 2010). Therefore, SIDS cannot stop every intruder based on previously learned indicators of compromise; it can, however, detect and prevent similar attacks from happening in the future. As the number of cyber-attacks has increased exponentially and attackers use evolved techniques to conceal attack patterns, it has become almost infeasible to identify intruders using SIDS (Amouri et al., 2020; Khraisat et al., 2019; Vimala et al., 2019; Warsi & Dubey, 2019; Wu & Banzhaf, 2010).
Many scholars use AIDS because of its ability to overcome the limitations of SIDS. An AIDS models the typical behaviour of a computer system using statistical methods, machine learning algorithms, or knowledge-based methods. These methods are designed to detect abnormal behaviour in computer systems: the typical usage pattern is base-lined, and alarms are generated when usage deviates from the expected behaviour. The key benefit of AIDS is its ability to detect zero-day attacks, because it does not rely on a signature database to detect abnormal user behaviour (Alazab et al., 2012; Laughlin et al., 2020). AIDS is further categorized into three main groups: statistics-based, knowledge-based, and machine learning-based. Researchers have investigated many approaches to improve intrusion detection in the last few decades, from data mining and machine learning to time-series modelling. A machine learning-based IDS can learn the behaviour and patterns of attacks, and future attacks can be predicted using the trained models.
Machine Learning is a technique for extracting knowledge from massive amounts of data. It comprises a set of rules, methods, or complex "transfer functions" that can be used to discover intriguing patterns or estimate behaviour in a wide range of applications (Abu Al-Haija et al., 2022; Choudhury et al., 2023; Dua & Du, 2016; Mangal et al., 2023; Prasad Yadav et al., 2023; Sinha & Sharma, 2021). Machine learning techniques use training data to acquire complex pattern-matching capabilities. Researchers (Hamzah & Othman, 2021; Hasan et al., 2016; Mehmood et al., 2021; Niyaz et al., 2015; Shams & Rizaner, 2018) widely use the Support Vector Machine (SVM) for Network Intrusion Detection Systems (NIDS) and clustering algorithms such as K-means and Expectation Maximization (EM) for both NIDS and anomaly detection (Bennett & Demiriz, 1999; Laughlin et al., 2020; Maseer et al., 2021; Syarif et al., 2012; Wazid & Das, 2016). These works are mainly concerned with detection effectiveness and neglect practical issues such as detection efficiency and data management. In this paper, the authors address some of these problems and highlight the performance of different machine learning models for IDS. The contributions of this paper are as follows:

1. A new and still under-analyzed IDS dataset containing the most recent common attacks has been used for the analysis. This dataset is more representative of the current threat landscape than older datasets, which can help improve the accuracy of intrusion detection systems.
2. Two data re-sampling techniques are adopted to balance the dataset, and pre-processing steps are performed to fix problems that may exist in the data. This is important because imbalanced datasets can lead to biased results, and this approach helps ensure that the study's results are more accurate.
3. Ten widely used machine learning classifiers are evaluated on the intrusion detection task to find the best model. This allows for a more comprehensive evaluation of different machine learning approaches, and the study's results can help inform the development of more effective intrusion detection systems in the future.
Overall, the contributions made by this paper provide valuable insights into the performance of machine learning-based IDS and contribute to the advancement of intrusion detection methodologies. The findings have practical implications for organizations seeking to strengthen their security measures in the face of evolving cyber threats, while also contributing to the theoretical foundation of network security research.
The rest of the paper is structured as follows. Section 2 briefly overviews the work related to intrusion detection systems. The machine learning-based intrusion detection approach, data pre-processing, and balancing techniques are explained in Section 3. The experimental analysis and results are discussed in Section 4. Section 5 concludes the paper with future scope.

RELATED WORK
Recently, many research works and practical ideas based on artificial intelligence and machine learning have been published to overcome the challenges in intrusion detection systems. The authors (Sharafaldin et al., 2018) used the CICIDS2017 dataset and examined the performance of the selected features with Naive Bayes, KNN, ID3, RF, AdaBoost, MLP, and QDA. Feature selection is an essential process in building IDS. Varghese and Muniyal (Varghese & Muniyal, 2017) studied the efficacy of seven different algorithms with two different feature selection strategies on the NSL-KDD dataset. The authors used Principal Component Analysis (PCA) and Correlation-based Feature Selection (CFS) for selecting features. Then, the performance of J48, NBTree, Random Forest, LibSVM, Bagging with REPTree, PART, and Multilayer Perceptron (MLP) classifiers was evaluated using ten-fold cross-validation. Effendy et al. (Effendy et al., 2017) also used the NSL-KDD dataset and the Information Gain Ratio (IGR) for selecting features. They assessed the Naive Bayes classifier with accuracy as the key performance indicator. The authors (Acharya & Singh, 2018) used the intelligent water drops (IWD) nature-inspired algorithm to select features and a support vector machine as a classifier to evaluate the selected features. Alazzam et al. (Alazzam et al., 2020) used the pigeon-inspired optimizer technique, and Tawil et al. (Tawil & Sabri, 2021) used the Moth Flame Optimization technique to choose the relevant features in designing the IDS. The authors (Naseri & Gharehchopogh, 2022) presented a binary version of the Farmland Fertility Algorithm (FFA), called BFFA, to select the features used in IDS classification. The authors (Biswas, 2018) considered the amalgamation of feature selection techniques and classifiers to design an accurate network intrusion detection system. They used the NSL-KDD dataset and applied four feature selection methods to evaluate the performance of five classifiers using a five-fold cross-validation strategy. The authors (Imrana et al., 2021) proposed a bidirectional Long Short-Term Memory (BiDLSTM)-based intrusion detection system to handle User-to-Root (U2R) and Remote-to-Local (R2L) attacks in particular. Their proposed model improves the detection accuracy of U2R and R2L attacks more than a conventional LSTM.
Ammar and Faisal (Aldallal & Alisa, 2021) proposed a hybrid Support Vector Machine (SVM) and Genetic Algorithm (GA) intrusion detection system with innovative fitness functions to evaluate the system's accuracy in the cloud computing environment. The proposed approach was evaluated on the CICIDS2017 dataset and benchmarked with KDD CUP 99 and NSL-KDD. The results showed that the proposed model outperformed the benchmarks by 5.74%. The authors (Imran et al., 2021) proposed an ensemble of automated machine learning and Kalman filter prediction approaches to improve anomaly detection accuracy in a network intrusion environment. The proposed model was evaluated on the UNSW-NB15 and CICIDS2017 datasets, achieving an intrusion detection accuracy of 98.80% on UNSW-NB15 and 97.02% on CICIDS2017.
The authors (Al-Omari et al., 2021; Sarker et al., 2020) presented a machine learning-based security model called Intrusion Detection Tree (IntruDTree) that considers the importance of security features and then builds a tree-based generalized intrusion detection model on the selected essential features. A survey on machine learning approaches for cyber security intrusion detection was published in 2016 using the KDD 1999 and DARPA 1998 datasets (Buczak & Guven, 2016). Similar work was also published by (Sultana et al., 2019) and (da Costa et al., 2019), focusing only on reviewing the current literature. All these works correlate with ours, but our work uses different machine learning-based IDS models and executes them on a recently available dataset. The results are then compared to existing work to assess and analyze the performance.
The authors (Abdulhammed et al., 2019) used two machine learning methods, Auto Encoder (AE) and Principal Component Analysis (PCA), for dimensionality reduction, and RF, Bayesian Network, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA) classifiers for designing an IDS. The proposed methodology reduced the CICIDS2017 dataset's feature dimensions from 81 to 10 while maintaining an accuracy of 99.6% for multi-class and binary classification. The literature discussed above mostly considered outdated datasets for developing IDS, focusing more on prediction accuracy and less on prediction latency. The authors (Seth et al., 2021) used the latest CICIDS2018 dataset, covering modern-day attacks, to build an IDS. They proposed hybrid feature selection methods and used the Light Gradient Boosting Machine (LightGBM) classifier to design the IDS. The proposed model gives 97.73% accuracy, 1.5% higher than the existing models.

MACHINE LEARNING-BASED IDS MODELS
Many researchers and organizations use a variety of algorithms and techniques, including Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Logistic Regression (LR), K-Nearest-Neighbor (KNN), clustering, and various ensemble methods, to extract knowledge from intrusion datasets. In supervised learning for IDS, each record has a network or host data source and an associated labelled output value, such as Malicious or Benign. To discover the intrinsic link between the input data and the labelled output value for the specified features, a model is developed using supervised learning techniques. In the testing rounds, the trained model categorizes unknown input as Malicious or Benign. Each classifier has its strengths and weaknesses. A natural way to create a robust classifier is to combine many weak classifiers. Multiple classifiers are trained using ensemble techniques, and the classifiers then vote to determine the final result. Boosting, Bagging, and Stacking are a few of the ensemble approaches proposed to improve performance. The term "boosting" refers to a group of algorithms that can improve the performance of weak learners. Training the same classifier on different subsets of the same dataset is called bagging. Stacking combines various classifiers via a meta-classifier (Aburomman & Ibne Reaz, 2016). According to Jabbar et al., a combination of Random Forests and the Averaged One-Dependence Estimator (AODE) may be used to overcome the issue of attribute dependence in Naive Bayes; Random Forest enhances precision and reduces false alarms (Jabbar et al., 2017). Hybrid models are designed in multiple stages, combining different classification models. Ensemble and hybrid classifiers tend to outperform single classifiers. The key points lie in selecting which classifiers to combine and how they are connected. The present work analyzes ten popular machine learning classifiers (AdaBoost, Decision Tree (DT), GaussianNB, KNeighbors, Logistic Regression, Multinomial NB, Random Forest (RF), Stochastic Gradient Descent (SGD) Classifier, Support Vector Machine (SVM), and XGBoost) on intrusion detection systems to find the best model. The process flow of creating a machine learning-based IDS is shown in Figure 1.
This paper uses an IDS dataset containing the most recent common attacks. The dataset is highly imbalanced, so two data re-sampling techniques are used to balance it. Afterwards, some pre-processing steps are performed to fix problems that may exist in the data. These data pre-processing steps are discussed in the subsequent sections.

The Data Pre-Processing Steps
In machine learning, data pre-processing transforms or encodes data into suitable formats so that machines can quickly parse it. A dataset may require treating missing or inconsistent values, feature scaling, feature selection, and data imbalance problems.

Missing or Inconsistent Values
The presence of missing values in a dataset is quite common. Missing values must be evaluated for rectification, whether they occurred during data collection or validation. The problem can be solved by eliminating rows with missing data or by filling them with estimated values.
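Both options can be sketched with pandas as follows; the flow records and the "Flow Bytes/s" column here are hypothetical illustrations, not rows from the actual dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical flow records; "Flow Bytes/s" has missing values.
df = pd.DataFrame({
    "Flow Duration": [120, 450, 300, 80],
    "Flow Bytes/s": [1500.0, np.nan, 980.0, np.nan],
})

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill missing values with an estimate (here, the column median).
filled = df.fillna({"Flow Bytes/s": df["Flow Bytes/s"].median()})
```

Dropping rows is simpler but discards the other feature values in those rows; filling preserves the rows at the cost of introducing estimated values.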

Feature Scaling
Feature scaling is a part of data pre-processing. It normalizes the independent features to a defined range to handle highly fluctuating magnitudes or values. There are different strategies for performing feature scaling.

Min-Max Normalization:
This approach re-scales a feature or observation value into the range between zero and one. Its formula is:

x' = (x - min(x)) / (max(x) - min(x))
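A minimal sketch of min-max normalization using scikit-learn's MinMaxScaler; the flow-duration values are invented for illustration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One hypothetical feature column (e.g. flow durations).
X = np.array([[120.0], [450.0], [300.0], [80.0]])

scaler = MinMaxScaler()          # defaults to the [0, 1] range
X_scaled = scaler.fit_transform(X)
# Each value becomes (x - min) / (max - min): the minimum maps to 0,
# the maximum maps to 1, and everything else falls in between.
```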

Feature Selection
"Feature selection", also called "feature learning" or "feature engineering", is the most crucial stage of pre-processing. It simplifies the data, eliminates redundancy, reduces computational difficulty, improves the detection rate, and reduces the false alarms of machine learning models. Only essential features are selected, based on their correlation scores with the outcome variable. Feature selection plays a critical role in building any IDS, as the chosen features highly affect the accuracy and reduce false alerts. Each feature has specific characteristics addressing different areas of threat detection. Features containing basic information about the software or network are considered naive, and features representing deeper details are considered rich. Three approaches, Filter, Wrapper, and Embedded, are used for feature selection, as shown in Table 2.

Imbalanced Learning
Most machine learning predictive models assume an equal number of samples in each class. When the class distribution is imbalanced, for example when the minority class contains a hundred samples and the majority class contains hundreds of thousands, machine learning models perform poorly, specifically for the minority class, and the performance figures for the majority class can be misleading. Imbalanced-learn is an open-source Python toolbox with various techniques for handling imbalanced data classification. Several categories of techniques for handling imbalanced data are discussed, with their advantages and limitations, in the following sub-sections: Random Under-Sampling (RUS), which reduces the samples of the majority class; Random Over-Sampling (ROS), which creates duplicate copies of samples of the minority class; the Synthetic Minority Oversampling Technique (SMOTE), which creates synthetic samples; and Tomek Links, which removes noise from the data. These strategies are used to fine-tune the class distribution of a dataset.
Let the imbalanced dataset be represented by x, the minority class samples by x_min, and the majority class samples by x_maj. The balancing ratio of dataset x is defined as:

r_x = |x_min| / |x_maj|

The balancing process is equivalent to re-sampling x into a new dataset x_res such that r_x < r_x_res, i.e., the class distribution of x_res is closer to balance.
i) Random Under-Sampling (RUS): In RUS, the number of samples of the majority class (x_maj) is reduced, i.e., some observations are removed from the majority class until the majority and minority classes balance out. The drawback of under-sampling is that potentially valuable data is removed.

ii) Random Over-Sampling (ROS): Contrary to under-sampling, copies of data are added to the minority class, generating new samples in x_min until the balancing ratio r_x_res is reached. It is a worthy choice when not much data is available, but it can also cause over-fitting and poor generalization on the minority samples.

iii) Synthetic Minority Oversampling Technique (SMOTE): Plain over-sampling creates duplicate samples in the minority class, which adds no new information to the existing dataset. SMOTE solves this by creating synthetic samples. It chooses a random sample from the minority class and finds its k nearest minority-class neighbours; a synthetic sample is then created at a random point between two samples in feature space. This technique can create as many synthetic examples for the minority class (x_min) as needed to reach the balancing ratio r_x_res. The strategy may produce noisy samples by inserting new points between marginal outliers and inliers.

iv) Tomek's Links: This is a cleaning method that eliminates the noise generated in the majority class while new samples are created in the minority class. It is an under-sampling strategy for removing unwanted samples from the majority class.
This paper uses the SMOTE oversampling strategy to balance the CICIDS2018 dataset and Tomek's links to clean up the unwanted samples.

The CICIDS2018 Dataset
Sharafaldin et al. (Sharafaldin et al., 2018) analyzed the properties of eleven IDS datasets published since 1998 and showed that most are outdated and unreliable. Among the issues: i) existing datasets suffer from a lack of traffic diversity and volume, and ii) the datasets do not cover the diversity of known attacks.
The CICIDS2018 dataset is publicly available for network security and intrusion detection research from the Canadian Institute for Cybersecurity. More than 80 network flow features are extracted from the traffic data generated over five days. The network flow dataset is also delivered as CSV files with 85 features and class labels. Seven different attack scenarios, namely Brute-force, Heartbleed, Web attacks, DoS, DDoS, Botnet, and infiltration of the network from inside, are included in the final dataset. There are 50 machines in the attacker's infrastructure, while 420 devices and 30 servers in the victim organization's infrastructure are spread across five departments. The CICIDS2018 dataset consists of corresponding profiles and labelled network flows, including full packet payloads in PCAP format and CSV files for machine and deep learning purposes, as shown in Table 3.
The timestamp, source and destination ports, source and destination IPs, protocols, and attacks are all labelled in this dataset. The dataset also includes a complete network architecture with a modem, a firewall, routers, switches, and nodes running different operating systems: the open-source Linux, Apple's macOS, Microsoft Windows 10, Windows 8, Windows 7, and Windows XP. The dataset is captured daily from the network traffic into PCAP files, which are then converted into CSV files. The five days of CSV files were analyzed, containing 3,119,345 rows and 85 columns. Some column names are shown in Figure 2. The dataset contains NULL values, as shown in Figure 3. The data is then pre-processed, and NULL values are removed from the CICIDS2018 dataset by dropping the affected rows, as shown in Figure 4.
A correlation map has been created using Pearson's correlation coefficient (r) between each feature and the target variable, as shown in Figure 5. A correlation map represents the relationships of the variables with each other or with the target variable. When an increase in a feature's value increases the target variable's value, the correlation is positive; when an increase in a feature's value decreases the target variable's value, the correlation is negative. The feature scores are computed using the univariate selection method, as shown in Figure 6, and a subset of features is selected based on their scores.
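Such a correlation map can be computed directly with pandas; the features (pkt_count, fwd_bytes) and the binary label below are invented for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical flow features and a binary label (1 = Malicious).
df = pd.DataFrame({"pkt_count": rng.normal(size=200)})
df["fwd_bytes"] = df["pkt_count"] * 2 + rng.normal(scale=0.1, size=200)
df["label"] = (df["pkt_count"] > 0).astype(int)

# Pearson's r between every pair of columns, including the label.
corr = df.corr(method="pearson")
```

The resulting matrix is what a heatmap such as Figure 5 visualizes; strongly correlated feature pairs (here pkt_count and fwd_bytes) show up as values near 1.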
CICIDS2018 is an imbalanced dataset, as it has far more Benign samples than Malware samples. Figure 7 shows that the count of Malware samples is much lower than the count of Benign samples. The SMOTE Tomek method is applied to the imbalanced CICIDS2018 data to convert it into balanced data for the binary classifiers, as shown in Figure 8.
Similarly, CICIDS2018 is also imbalanced in the multi-class setting, as shown in Figure 9. The over-sampling technique is applied to the imbalanced CICIDS2018 data to convert it into balanced data for the multi-class classifiers, as shown in Figure 10.

Performance Measure
This section discusses the classification metrics for IDS. Table 4 shows the confusion matrix for a two-class classifier, which can be used to evaluate an IDS's performance. Each column of the confusion matrix indicates the samples in a predicted class, while each row shows the samples in an actual class. This paper uses popular performance measures, including overall accuracy, detection rates, precision, recall, and F1-score (Aminanto et al., 2017; Atli, 2017; Hodo et al., 2017), which are briefly discussed below.
Accuracy: Accuracy is the most intuitive performance measure of a classification model. It is the ratio of the total correctly predicted samples to the total number of samples in the dataset, as shown in Equation 1. High accuracy means the model is performing well; however, accuracy is a valuable measurement only when the dataset is well balanced.

Precision: Precision measures the proportion of correctly classified data points among all data points predicted positive by the classification model, as shown in Equation 2. A higher precision value indicates better performance. Precision is also known as the positive predictive value (PPV). It is an excellent measure when the cost of false positives is high.

Accuracy = (TP + TN) / (TP + FP + TN + FN)    (1)
Recall: Recall measures the sensitivity of the model. It is a performance measure of correctly retrieving the data points; in other words, recall is the ratio of the total correctly predicted positives to the actual positive data points in the dataset, as shown in Equation 3. Recall is also known as the true positive rate (TPR). A higher recall value indicates better performance. It is a good metric when there is a high cost associated with false negatives.
F1-Score: The F1-score is an instrumental performance measure widely used when the model produces high recall and low precision, or low recall and high precision, i.e., with an uneven class distribution (a large number of actual negative classes). The F1-score uses the harmonic mean instead of the arithmetic mean to punish extreme values, as shown in Equation 4.

F1-score = 2 × (precision × recall) / (precision + recall)    (4)
AUC-ROC Curves: The Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve is an approach for measuring the performance of a classification model at different threshold settings. The curve is plotted with TPR on the y-axis and FPR on the x-axis. A higher AUC means the classifier has higher accuracy. It is used to assess the capability of a classification model to separate the classes.
The metrics mentioned above can be used to measure the performance of both binary and multi-class IDS, in which incidents are classified as Benign or Malicious, or as a family of Malicious.
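All of these metrics are available in scikit-learn; a short sketch with hypothetical labels and scores (1 = Malicious, 0 = Benign):

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Hypothetical ground truth, hard predictions, and model probabilities.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.95, 0.3]

acc  = accuracy_score(y_true, y_pred)    # (TP+TN)/(TP+FP+TN+FN), Eq. 1
prec = precision_score(y_true, y_pred)   # TP/(TP+FP)
rec  = recall_score(y_true, y_pred)      # TP/(TP+FN), i.e. TPR
f1   = f1_score(y_true, y_pred)          # 2*prec*rec/(prec+rec), Eq. 4
auc  = roc_auc_score(y_true, y_score)    # area under the ROC curve
```

Here TP=3, TN=3, FP=1, FN=1, so accuracy, precision, recall, and F1 all equal 0.75, while the AUC is computed from the continuous scores rather than the hard predictions.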

Results Analysis
The final count for each class label after noise clean-up is shown in Table 5. It can be observed from the table that the dataset is highly unbalanced. The NULL values are removed from the dataset, and the missing values are treated carefully and filled with valid data. Feature scaling/transformation is then performed using the MinMaxScaler technique, because the dataset contains varying magnitudes, values, and units; this provides a fixed range of values in each column of the dataset.
The univariate selection method is used to compute the score of each feature on the whole dataset, and the top 50 features are selected based on the scores shown in Table 6. The scikit-learn library provides the SelectKBest class to extract the best features of a given dataset. SelectKBest performs statistical tests to select the features with the strongest relationship with the output or target variable. Here, the Chi-Square test is applied to groups of categorical features to evaluate the likelihood of correlation or association between them using their frequency distribution. Table 6 lists the selected features. Multi-class classification refers to datasets that contain more than two targets or labels. The IDS dataset's target contains Benign and several families of malware, so the dataset supports multi-class classification, and these classes are imbalanced. To balance the dataset, the RandomOverSampler method from the imblearn.over_sampling library is used.
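A minimal sketch of SelectKBest with the Chi-Square test; the synthetic non-negative features and k=3 are illustrative, not the paper's actual 85-feature data or top-50 selection:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(7)

# 200 samples, 6 non-negative features (chi2 requires non-negative input).
X = rng.integers(0, 100, size=(200, 6)).astype(float)
y = (X[:, 0] > 50).astype(int)   # label depends only on feature 0

# Keep the k features with the highest chi-square score w.r.t. the target.
selector = SelectKBest(score_func=chi2, k=3)
X_top = selector.fit_transform(X, y)
```

selector.get_support() reports which columns survived; since the label was built from feature 0, that feature receives by far the highest score and is retained.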
Hyper-parameter tuning techniques, i.e., GridSearchCV and RandomizedSearchCV, are employed to search for the best parameters for all classifiers on this dataset. The target in the dataset is classified using binary-class and multi-class classifiers, so the ten popular machine learning classification models are evaluated in both settings. The results of these models are evaluated on various factors such as score, precision, recall, F1-score, accuracy, and the total time (in seconds) taken by each algorithm.
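A sketch of such a hyper-parameter search with GridSearchCV, here tuning a KNeighbors classifier on synthetic data; the grid values are illustrative, not the paper's actual search space:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the (balanced, scaled) training data.
X, y = make_classification(n_samples=300, random_state=0)

# Exhaustively evaluate every parameter combination with 5-fold CV.
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)
best_params = grid.best_params_   # parameters of the best CV score
```

RandomizedSearchCV follows the same interface but samples a fixed number of candidates from the grid, which is cheaper when the parameter space is large.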

Binary Classifier
The binary classifier classifies the target samples in the CICIDS2018 dataset into two classes. All classifiers and their accuracy, precision, recall, F1-score, and time are shown in Table 7. It can be observed from the table that the top three classifiers are KNeighbors (99.49%), XGBoost (99.14%), and AdaBoost (98.75%).
Box plotting is an excellent tool for identifying outliers and comparing distributions. The box plot chart is shown in Figure 11. It helps us better understand and visualize how values are spread out in different data sets.
The ROC curve of the binary classifiers is shown in Figure 12. The accuracy of a model is evaluated based on how well it distinguishes between Malware and Benign. The ROC curve is plotted using the sensitivity (TPR) against the FPR. The colour denotes the threshold value for each TPR and FPR pair; the darker the colour, the higher the threshold, and a threshold near one means only instances with a high affinity for the class are predicted positive.
The AUC (Area Under the Curve) measures the proportion of correctly classified test data. An AUC value of one represents a perfect test, whereas 0.5 represents a barely accurate test. In Figure 12, KNeighbors, XGBoost, and AdaBoost are close to 1 and have a larger area under the curve than all other classifiers. It can be observed from Figure 12 that KNeighbors, XGBoost, and AdaBoost classified most of the samples correctly and have a higher accuracy than the other classifiers. In the multi-class setting, XGBoost, KNeighbors, and GaussianNB perform better than the other classifiers, with 99.30%, 98.88%, and 96.66% accuracy, respectively.

CONCLUSION AND FUTURE WORK
Work on IDS is important because it is essential for protecting computer networks from malicious attacks. As the amount of data processed and transferred over networks grows, so does the number of potential attack vectors. In this environment, signature-based IDS are not always effective, as they can only detect known attacks. Machine learning-based IDS, on the other hand, can detect unknown attacks by learning from normal and malicious behaviour patterns. The work presented in this paper demonstrates the effectiveness of machine learning-based IDS in detecting both known and unknown attacks. The authors evaluated the performance of ten machine learning classifiers on a dataset of common attacks and found that the top three models (KNeighbors, XGBoost, and AdaBoost) achieved the best binary-class accuracy, while XGBoost, KNeighbors, and GaussianNB performed best in multi-class classification.

PRACTICAL IMPORTANCE
The practical importance of this work is that it provides a valuable tool for network security professionals. Organizations can improve their ability to detect and respond to malicious attacks by using machine learning-based IDS. These findings enable the development and implementation of IDS to safeguard sensitive data, protect against ransomware attacks, and mitigate the impact of cyber-attacks during the Covid-19 era and beyond. The identified top-performing models, such as KNeighbors, XGBoost, and AdaBoost, offer practical guidance for organizations seeking adequate security against unknown and obfuscated malicious activities. The authors found the XGBoost algorithm to be among the most effective for detecting malicious attacks in their dataset, particularly in the multi-class setting, which suggests that XGBoost may be a good choice for machine learning-based IDS in other settings.

THEORETICAL IMPORTANCE
The theoretical importance of this work is that it contributes to the body of knowledge on machine learning-based IDS. The findings provide insights into the effectiveness of different machine learning algorithms for detecting malicious attacks, information that can be used to develop more effective IDS in the future. By combining statistical-based, knowledge-based, and machine learning-based methods, researchers can enhance the accuracy and effectiveness of intrusion detection systems. Analyzing different machine learning classifiers, as presented in this research, expands the understanding of their performance in IDS applications and thereby contributes to the theoretical foundation of network security research. Moreover, the data re-sampling techniques and pre-processing steps used here improve the robustness and reliability of the IDS dataset, facilitating the development of more accurate and efficient detection models. In addition to its practical and theoretical importance, the work on IDS has the following benefits:
• It can help to identify new malicious activities that are not yet known to signature-based IDS.
• It can provide insights into the behaviour of malicious actors, which can be used to develop better defensive strategies.
• It can help to improve the overall security posture of an organization.
In summary, the work undertaken on IDS is of critical importance, both in practical terms, by addressing the immediate security challenges of the Covid-19 era, and in theoretical terms, by advancing intrusion detection methodologies and exploring the potential of machine learning techniques. By leveraging the insights gained from this research, organizations and researchers can make informed decisions and develop effective strategies to protect against evolving cyber threats, secure sensitive data, and ensure the integrity of communication networks in an increasingly interconnected digital landscape.
In future work, adversarial examples, inputs intentionally crafted by an attacker to cause a model to make a mistake, can be fed to the different machine learning models to assess the vulnerability of the machine learning classifiers.
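A crude version of such a robustness check, using bounded random perturbations rather than true gradient-based adversarial examples such as FGSM, can be sketched as follows (the perturbation budget `epsilon` and all data here are illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the IDS dataset
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier().fit(X_tr, y_tr)
clean_acc = accuracy_score(y_te, clf.predict(X_te))

# Perturb every test sample by +/- epsilon in each feature dimension
rng = np.random.default_rng(0)
epsilon = 0.5
X_adv = X_te + epsilon * rng.choice([-1.0, 1.0], size=X_te.shape)
noisy_acc = accuracy_score(y_te, clf.predict(X_adv))
```

Comparing `clean_acc` with `noisy_acc` gives a first, non-adversarial estimate of how sensitive a classifier is to input perturbation; a targeted attack would degrade accuracy further.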

DECLARATIONS
The authors of this publication declare that there is no conflict of interest. This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Standardization is a very effective re-scaling strategy in which each feature is transformed to have zero mean and unit variance. Without feature scaling, a machine learning model will weigh features with larger values more heavily and those with smaller values less, regardless of the unit of measurement.
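This standardization step corresponds to scikit-learn's `StandardScaler`; a minimal sketch with made-up feature values:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (e.g. packet length vs. a flag count)
X = np.array([[1000.0, 0.0],
              [2000.0, 1.0],
              [3000.0, 0.0],
              [4000.0, 1.0]])

X_std = StandardScaler().fit_transform(X)
# After scaling, each column has zero mean and unit variance
print(X_std.mean(axis=0), X_std.std(axis=0))
```

Fitting the scaler on the training split only, then applying it to the test split, avoids leaking test-set statistics into the model.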

Figure 1. The life cycle of machine learning-based IDS

Figure 3. CICIDS2018 with NULL values

Figure 6. CICIDS2018 with feature score

Figure 10. Balanced data, multi-class

The final considered dataset has the 50 selected feature columns and one column with class labels. Imbalanced Learning: Imbalanced classification is a classification problem in which the classes in the training dataset are unequally represented. The degree of imbalance may vary, but modelling severely imbalanced data may require more specialized techniques. The dataset is split into two tasks, binary classification and multi-class classification, and both are imbalanced. a) Binary Classification: Binary or binomial classification uses classification rules to classify elements of a given set into two groups. The IDS dataset contains a target labelled Benign or Malware for binary classification, and this target is imbalanced. The SMOTETomek method, available in the imblearn.combine library, is used to balance the dataset. b) Multi-class Classification: In machine learning, multi-class or multinomial classification is the problem of classifying instances into one of three or more classes.
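In practice the resampling above is a one-liner, `imblearn.combine.SMOTETomek().fit_resample(X, y)`. To illustrate the core idea without the library, here is a simplified, NumPy-only sketch of SMOTE's oversampling step (synthetic samples interpolated between a minority point and one of its nearest neighbours); it omits the Tomek-link cleaning that SMOTETomek adds, and all data and parameters are illustrative:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples (simplified SMOTE)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # indices of the k nearest neighbours of sample i (excluding itself)
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]
        j = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)

# A minority (e.g. Malware) class with only 5 samples; create 20 synthetic ones
X_minority = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.1]])
X_synth = smote_oversample(X_minority, n_new=20)
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay within the minority class's region of feature space rather than being arbitrary noise.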

Figure 12. ROC curve of binary classifiers

Figure 13. Test score of binary classifiers

Table 6. Top-50 selected features, arranged in descending order of their score:
Total Length of Bwd Packets, SubflowBwd Bytes, Fwd PSH Flags, SYN Flag Count, URG Flag Count, Timestamp, Init_Win_bytes_backward, Average Packet Size, Fwd IAT Total, Packet Length Mean, Flow Duration, Bwd Packet Length Mean, AvgBwd Segment Size, Bwd Packet Length Std, Destination Port, Idle Max, Packet Length Std, Bwd IAT Max, Fwd IAT Max, Flow IAT Max, Bwd IAT Total, Bwd Packet Length Max, Bwd IAT Mean, Fwd Header Length, Fwd Header Length.1, Idle Mean, Bwd IAT Min, ACK Flag Count, Flow IAT Std, Flow IAT Mean, Idle Min, Max Packet Length, Bwd IAT Std, Total Fwd Packets, SubFlowFwd Packets, Packet Length Variance, Bwd Header Length, Bwd Packet Length Min, Down/Up Ratio, Fwd IAT Std, Fwd IAT Mean, Active Min, Fwd IAT Min, Total Backward Packets, SubflowBwd Packets, Init_Win_bytes_forward, Idle Std, Active Mean, PSH Flag Count, Total Length of Fwd Packets
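Score-based feature selection like that behind Table 6 can be sketched with scikit-learn's `SelectKBest` (here `k=5` on synthetic data instead of `k=50` on the real dataset; the scoring function `f_classif` is an illustrative choice):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in: 20 features, of which 5 are informative
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

selector = SelectKBest(score_func=f_classif, k=5).fit(X, y)
X_top = selector.transform(X)               # keep only the 5 best-scoring features
ranked = selector.scores_.argsort()[::-1]   # feature indices, best score first
```

The `ranked` indices, mapped back to column names, yield exactly the kind of descending-score feature list shown in Table 6.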