i-2NIDS Novel Intelligent Intrusion Detection Approach for a Strong Network Security

Sabrine Ennaji, Nabil El Akkad, Khalid Haddouch
Copyright: © 2023 |Pages: 17
DOI: 10.4018/IJISP.317113

Abstract

Machine learning has played a key role in improving intrusion detection. However, other factors such as data quality, overfitting, and class imbalance, among others, may greatly affect the performance of an intelligent intrusion detection system (IDS). To tackle these issues, this paper proposes a novel machine learning-based IDS called i-2NIDS. The novelty of this approach lies in the application of the nested cross-validation method, which uses two loops: the inner loop selects the hyper-parameters that yield the lowest error on a small portion of the training set, and the outer loop estimates the error on the test set. The experiments showed significant improvements on the NSL-KDD dataset, with test accuracy rates of 99.97%, 99.79%, 99.72%, 99.96%, and 99.98% in detecting normal activities, DDoS/DoS, Probing, R2L, and U2R attacks, respectively. The obtained results confirm the efficiency and superiority of the approach over other recent experiments.

Introduction

Due to the widespread use of the internet and the recent revolution in technologies, we are drowning in a rapidly growing mass of data (Behera & Bhaskari, 2017). Furthermore, people need to disclose their personal information and exchange sensitive data in order to stay connected, communicate with each other, and benefit from other advantages of cyberspace such as e-commerce, online work, and cloud storage. As a result, the safety and confidentiality of internet users’ information have become more vulnerable to intrusions and attacks. Many research studies have focused on Intrusion Detection Systems (IDSs), software systems that detect intrusive activities by examining traffic flows across different environments and internet technologies (Ramdane & Chikhi, 2014; Shukla & Singh, 2019). However, their performance still needs to be improved, since an IDS requires additional maintenance effort and human intervention (Ennaji et al., 2021). Additionally, an IDS frequently raises more false positives than alerts about real intrusions (Patel et al., 2012).

Figure 1. Architecture design of IDS based on machine learning

To fill this gap, a large number of researchers have opted for machine learning algorithms. These are widely applied to address the limitations of intrusion detection systems, since they have high potential for identifying and predicting security threats without any intervention from the user (Stone, 1974). However, an intelligent IDS cannot make good predictions when its parameters are incorrectly selected, or when it suffers from classification issues such as underfitting, overfitting, and imbalanced data.

For this reason, a useful technique is cross-validation. It is a resampling procedure for selecting the parameters that yield the lowest test error. It is also a well-known evaluation method for machine learning models that shows how well a model will perform on independent test data that was not used during training (Stone, 1974). The approach proceeds by splitting the cleaned dataset into k chunks of equal size. The first partition serves as the validation set, and the model is fitted on the remaining k-1 partitions, which form the training set. The procedure is then repeated so that each fold serves once as the validation set. Finally, the scores of all partitions are averaged to give the overall error estimate. Hence, the cross-validation technique makes better use of the data, and it comes in different types. The most commonly used are:

  • Holdout cross-validation: The simplest type of cross-validation. It randomly separates the data into a training set and a test set. In general, the more data is used to train the model, the better its performance will be.

  • K-Fold cross-validation: The dataset is split equally into k folds, then the holdout approach is repeated k times so that each fold serves once as the test set while the other k-1 folds form the training set.

  • Stratified K-Fold cross-validation: The dataset is divided into k partitions so that each partition keeps approximately the same proportion of each class label as the full dataset, which makes it a good solution for imbalanced datasets. The final score is the mean of the scores of all partitions.

  • Leave-P-Out cross-validation: It uses p observations as the validation set and the remaining n-p observations as the training set, where n is the number of observations in the dataset. This process is repeated for all possible combinations of p observations. The accuracies from all iterations are then averaged to obtain the final accuracy.

  • Leave-one-out cross-validation: A less exhaustive method, since it is a simple variation of the previous type in which the value of p is set to 1.

  • Repeated random sub-sampling validation: Also called the Monte Carlo method; it repeatedly divides the dataset at random into training and validation sets. Here k is the number of times the model is trained. The final score is the mean over all repeats.

  • Nested cross-validation: A technique that tunes the parameters of an algorithm in addition to estimating its performance, unlike the other cross-validation methods, which only aim to estimate the performance of an algorithm.
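
To make the difference between plain k-fold and nested cross-validation concrete, the following is a minimal sketch in standard-library Python. It is not the i-2NIDS implementation and does not use NSL-KDD: the synthetic one-dimensional data, the small k-nearest-neighbour classifier, and every function name (`k_fold_indices`, `cross_val_score`, `nested_cv`) are assumptions chosen only for illustration. It follows the usual formulation of nested cross-validation, in which the inner loop selects a hyper-parameter using the training portion only and the outer loop scores the tuned model on data that played no part in that selection.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, n_neighbors):
    """Fraction of test points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in test)
    return hits / len(test)

def split(data, held_out):
    """Return (training points, held-out points) for one fold."""
    held = set(held_out)
    train = [p for i, p in enumerate(data) if i not in held]
    test = [data[i] for i in held_out]
    return train, test

def cross_val_score(data, k, n_neighbors, seed=0):
    """Plain k-fold CV: each fold is the validation set exactly once."""
    scores = []
    for held_out in k_fold_indices(len(data), k, seed):
        train, test = split(data, held_out)
        scores.append(accuracy(train, test, n_neighbors))
    return mean(scores)

def nested_cv(data, outer_k, inner_k, grid):
    """Inner loop picks n_neighbors; outer loop scores the tuned model."""
    outer_scores = []
    for held_out in k_fold_indices(len(data), outer_k, seed=1):
        outer_train, outer_test = split(data, held_out)
        # Inner loop: the outer test fold is never seen during tuning.
        best = max(grid, key=lambda g: cross_val_score(outer_train, inner_k, g, seed=2))
        outer_scores.append(accuracy(outer_train, outer_test, best))
    return mean(outer_scores)

# Hypothetical two-class data: class 0 centred at 0, class 1 centred at 3.
rng = random.Random(42)
data = [(rng.gauss(0, 1), 0) for _ in range(60)]
data += [(rng.gauss(3, 1), 1) for _ in range(60)]

estimate = nested_cv(data, outer_k=5, inner_k=3, grid=[1, 3, 5, 7])
print(f"nested CV accuracy estimate: {estimate:.3f}")
```

Because the hyper-parameter is chosen afresh inside every outer fold, the outer score is not biased by the selection step, which is the main reason for paying the extra cost of the second loop. A stratified variant would additionally balance the class labels within each fold; this sketch omits that step for brevity.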
