An SVM-Based Ensemble Approach for Intrusion Detection

Santosh Kumar Sahu (Advanced Information Security Lab, National Institute of Technology, Rourkela, India), Akanksha Katiyar (Gurukul Kangri Vishwavidyalaya, Dehradun, India), Kanchan Mala Kumari (Department of Computer Science, Central University of Haryana, Mahendragarh, India), Govind Kumar (School of Computer Science and Informatics, Central University of Haryana, Mahendragarh, India) and Durga Prasad Mohapatra (Department of Computer Science and Engineering, National Institute of Technology, Rourkela, India)
DOI: 10.4018/IJITWE.2019010104

Abstract

The objective of this article is to develop an intrusion detection model that distinguishes attacks in the network. Building an intrusion detection system (IDS) depends on preprocessing the intrusion data, choosing the most relevant features, and designing an efficient learning algorithm that correctly separates normal and malicious examples. In this experiment, the detection model uses an ensemble of a supervised method (SVM) and an unsupervised method (K-Means) to detect the patterns. The technique first divides the data into two clusters using K-Means and then labels the clusters using the Support Vector Machine (SVM). The parameters of K-Means and SVM are tuned and optimized using an intrusion dataset. Individually, the SVM provides up to 88% accuracy and K-Means up to 83%. However, the ensemble of K-Means and SVM provides more than 99% accuracy on three benchmark datasets in less time. The SVM classifies only three randomly chosen instances from each cluster, and the cluster is labeled by majority vote. The proposed approach outperforms earlier ensemble approaches on intrusion datasets.
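
The cluster-then-label idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional data, the tiny pure-Python K-Means, and the `classify` callable are all assumptions made for illustration. In particular, `classify` is a hypothetical stand-in for the `predict` step of an already trained SVM; here it is a simple threshold rule so that the example runs without external libraries.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_two_clusters(points, iterations=20, seed=0):
    """Toy K-Means with k=2; returns a cluster index (0 or 1) for each point."""
    rng = random.Random(seed)
    # Start from two distinct records picked at random.
    centroids = [list(p) for p in rng.sample(points, 2)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min((0, 1), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its current members.
        for c in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def label_clusters(points, assignment, classify, votes=3, seed=0):
    """Label each cluster by majority vote over a few randomly sampled members.

    `classify` is a hypothetical stand-in for a trained SVM's prediction
    function; only `votes` members per cluster are passed to it.
    """
    rng = random.Random(seed)
    cluster_label = {}
    for c in set(assignment):
        members = [p for p, a in zip(points, assignment) if a == c]
        sample = rng.sample(members, min(votes, len(members)))
        ballot = Counter(classify(p) for p in sample)
        cluster_label[c] = ballot.most_common(1)[0][0]
    # Every record inherits the label of its cluster.
    return [cluster_label[a] for a in assignment]


# Toy data (assumed): four "normal" records near the origin, four "attack" records far away.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]


def toy_classifier(record):
    """Assumed threshold rule standing in for SVM prediction."""
    return "attack" if record[0] > 5 else "normal"


assignment = kmeans_two_clusters(points)
labels = label_clusters(points, assignment, toy_classifier)
print(labels)
```

Because only three records per cluster reach the classifier, the classification cost stays constant per cluster regardless of cluster size, which is consistent with the abstract's claim that the ensemble runs in less time.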

1. Introduction

In predictive analysis, the output is of higher quality, clearer, and more stable if the input data is consistent in every respect, that is, free from noise, duplicate records, and anomalies (Dietterich, 2002). A dataset consists of feature sets, where every feature set is associated with an output known as a class label. In intrusion detection, KDDCup99 is a popular benchmark dataset. The details of the datasets are discussed in Section 2. The term intrusion refers to any unauthorized attempt to compromise the Confidentiality, Integrity, and Availability (CIA) of the security system. Intruders try to discover weaknesses in the protection framework and prepare an attack. Nowadays, a variety of penetration testing frameworks are available for vulnerability analysis as well as for exploiting the target system. According to the Gartner forecast given in Table 1, worldwide security spending will reach $96 billion in 2018, an increase of 8% from 2017 (Gartner, 2017). Organizations are spending more on security infrastructure, state-of-the-art detection approaches, and awareness of emerging threats and their countermeasures. Hence, a lot of research is carried out to protect information at the individual as well as the enterprise level.

In the digital age, where the Internet and online services play a vital role, providing security over the Internet has become an unavoidable requirement. It is clear that firewalls and antivirus software are not enough to secure a network completely. Intrusion detection is used to stop attacks, recover from them with minimum loss, or analyze the security problems so that the attacks are not repeated. Nowadays, artificial intelligence, data mining, and machine learning algorithms are employed to extend research on intrusion detection (ID), with emphasis on improving detection accuracy and building a robust model for intrusion detection systems (IDS) that can deal with zero-day or new attacks.

Multidisciplinary approaches such as data mining, machine learning, artificial intelligence, big data analytics, and deep learning are applied to learn the nature and behavior of threats and to build a resilient model that predicts them in the future. Intrusion detection (ID) is the process of quickly detecting undesirable deviations from the system's normal behavior. Detecting an intrusion is a challenging task because attackers change their sequence of attempts, i.e., their patterns or signatures. Hence, a single detection approach is not sufficient to detect this kind of threat. Ensemble approaches therefore come into the picture: they combine multiple detection approaches and can detect novel attacks more easily.

Table 1.
Worldwide security spending by segment, 2016-2018 (millions of current dollars)
Segment                        2016      2017      2018
Identity Access Management     3,911     4,279     4,695
Infrastructure Protection      15,156    16,217    17,467
Network Security Equipment     9,789     10,934    11,66
Security Services              48,796    53,065    57,719
Consumer Security Software     4,573     4,637     4,746
Total                          82,225    89,133    96,296

Source: Gartner (2017)
