Introduction
Intrusion detection systems (IDS) have proved to be an effective method of network security defense (Teng et al., 2020). Many researchers have applied machine learning techniques to IDS research, such as deep learning, support vector machines (SVM), fuzzy sets, outlier detection, random forests, and genetic algorithms (Alyaseen et al., 2017; Kumar et al., 2019; Li et al., 2019), and have made many breakthroughs.
On the one hand, an IDS must inspect a large volume of network logs, so an effective algorithm is needed to remove redundant features and improve detection speed. Several feature selection approaches, such as rough sets and fuzzy sets, have been used to reduce redundant features.
A feature selection algorithm (FSA) is introduced as a preprocessing step for anomaly detection to optimize existing classifiers. An FSA can eliminate irrelevant and redundant features, reduce computational complexity, and improve the accuracy of the learning algorithms (Chunhui & Wenjuan, 2021; Ying-Wu et al., 2010).
Estévez et al. (2009) designed a Mutual Information Feature Selection (MIFS) method. However, in the MIFS algorithm, increasing the number of input features can easily lead to the selection of irrelevant features (Lashkia et al., 2004). Peng et al. (2014) proposed the minimal-Redundancy-Maximal-Relevance (mRMR) criterion, which reduces the impact of the parameter β by averaging the redundancy values. This criterion selects features at very low computational cost, but the entropy may vary considerably. Panigrahi (2021) proposed an improved infinite feature selection for multiclass classification (IIFS-MC) to eliminate superfluous attributes.
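For reference, the two criteria discussed above are commonly written as follows. This is the usual textbook form, not notation taken from this article; the symbols $C$ (class variable), $S$ (set of already selected features), $f_i$ (candidate feature), and $I(\cdot;\cdot)$ (mutual information) are assumptions introduced here for illustration.

```latex
% MIFS: redundancy with the selected set is weighted by a user-chosen \beta
J_{\mathrm{MIFS}}(f_i) = I(C; f_i) - \beta \sum_{f_s \in S} I(f_i; f_s)

% mRMR: \beta is replaced by the average over the selected set
J_{\mathrm{mRMR}}(f_i) = I(C; f_i) - \frac{1}{|S|} \sum_{f_s \in S} I(f_i; f_s)
```

At each step the candidate with the largest score is added to $S$. Replacing $\beta$ with $1/|S|$ removes the need to tune $\beta$ by hand, which is the sense in which the influence of that parameter is reduced.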
To improve speed and limit the deviation of mutual information among multi-valued attributes, the values of the features are normalized to [0, 1]. The authors proposed a normalized MIFS (NMIFS) to reduce the algorithm complexity and obtain the optimal features. The experimental results showed that the NMIFS method performs better at feature selection on several benchmark problems.
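A minimal sketch of greedy selection with a normalized redundancy term may help make this concrete. It assumes discrete features and assumes that normalization means dividing the mutual information by the smaller of the two entropies, which keeps the value in [0, 1]; the text above does not spell out the exact normalization. The function names and the toy data are hypothetical, and this is not the implementation used in the cited work.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def normalized_mi(x, y):
    """MI divided by min(H(X), H(Y)); assumed normalization, lies in [0, 1]."""
    denom = min(entropy(x), entropy(y))
    return mutual_information(x, y) / denom if denom > 0 else 0.0


def greedy_select(features, labels, k):
    """Pick up to k feature names, one at a time.

    Score = relevance I(f; labels) minus the mean normalized MI between
    the candidate and the features already selected.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                normalized_mi(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): "a_copy" duplicates "a", "c" is independent of "a".
features = {
    "a":      [0, 0, 1, 1, 0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "c":      [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 0, 1, 1, 1]

print(greedy_select(features, labels, 2))  # ['a', 'c']
```

In this toy run the duplicate feature is skipped even though it is as relevant as the first pick, because its normalized redundancy with the selected feature is 1, while the weaker but independent feature has redundancy 0.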
On the other hand, the classifier directly affects the accuracy of anomaly detection (Yilei et al., 2021). JooHwa and KeeHyun (2019) designed an IDS that combines an autoencoder, conditional generative adversarial networks, and a random forest (AE-CGAN-RF); the autoencoder-conditional method was adopted to reduce the dimensionality of high-dimensional data and to obtain a higher detection rate. Jiadong et al. (2019) proposed a hybrid multilevel intrusion detection model in which an outlier detection algorithm effectively removes some redundant attributes and improves detection speed. Alyaseen et al. (2017) used the K-means algorithm to construct the training data set in a multilevel hybrid intrusion detection model, which improved classifier performance. Yang et al. (2019) proposed an effective IDS using the Modified Density Peak Clustering Algorithm and Deep Belief Networks (MDPCA-DBN). These techniques were used to reduce the size of the training set, address sample imbalance, and improve detection efficiency. Song et al. (2018) proposed an anti-adversarial hidden Markov model for network-based intrusion detection (AA-HMM). However, those algorithms had lower self-adaptability, lower detection rates, and higher false alarm rates on small sample sets.