Network Intrusion Detection Using Multi-Objective Ensemble Classifiers

Arif Jamal Malik (Foundation University Islamabad, Pakistan) and Muhammad Haneef (Foundation University Islamabad, Pakistan)
Copyright: © 2016 |Pages: 15
DOI: 10.4018/978-1-5225-0448-1.ch009

Abstract

Over the past few years, the Internet has become a public platform for communication and the online exchange of information. The growth in network usage has increased the likelihood of network attacks. To detect malicious activities and threats, several kinds of Intrusion Detection Systems (IDSs) have been designed. The goal of an IDS is to intelligently monitor events occurring in a computer system or network, analyze them for any sign of a security policy violation, and preserve the availability, integrity, and confidentiality of the network information system. An IDS may be categorized as an anomaly detection system or a misuse detection system. Anomaly detection systems usually apply statistical or Artificial Intelligence (AI) techniques to detect attacks; therefore, they are able to detect novel or unknown attacks. Misuse detection systems use signature-based detection; therefore, they are good at identifying known attacks but cannot detect unknown ones.

Introduction

In many real-world problems, more than one goal must be addressed simultaneously. In intrusion detection, for example, a single objective such as the Intrusion Detection Rate (IDR), False Discovery Rate (FDR), or False Positive Rate (FPR) is not enough to guide a classifier’s learning process, because optimizing one objective alone comes at the expense of the remaining objectives. When several objectives are optimized simultaneously, not all goals can be fully achieved at the same time, and a compromise has to be made among them.
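To make the trade-off concrete, the following minimal sketch computes the three objectives from confusion-matrix counts. The preview does not define the metrics, so the formulas are the standard ones and are an assumption here: IDR is taken to be the true positive rate. The class and function names, and the counts in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attacks correctly flagged
    fp: int  # normal records wrongly flagged
    tn: int  # normal records correctly passed
    fn: int  # attacks missed


def _ratio(numerator, denominator):
    # Guard against empty classes instead of raising ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def detection_objectives(c):
    """Return the three objectives, using standard (assumed) definitions."""
    return {
        "IDR": _ratio(c.tp, c.tp + c.fn),  # share of attacks detected
        "FPR": _ratio(c.fp, c.fp + c.tn),  # share of normal traffic flagged
        "FDR": _ratio(c.fp, c.fp + c.tp),  # share of alarms that are false
    }


# Two hypothetical classifiers evaluated on 100 attacks and 100 normal records.
aggressive = detection_objectives(ConfusionCounts(tp=95, fp=30, tn=70, fn=5))
cautious = detection_objectives(ConfusionCounts(tp=80, fp=5, tn=95, fn=20))
print(aggressive)
print(cautious)
```

In this made-up example the aggressive classifier detects more attacks but raises more false alarms, while the cautious one does the opposite. Neither is better on every objective, which is the situation the paragraph above describes.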

The genetic algorithm (GA) has been used successfully to solve global optimization problems and has proved efficient at doing so (Breiman, 2000). Its main advantages over many other optimization techniques are its simple implementation and its ability to converge quickly to an optimal solution. GA has also been applied successfully in many research and application areas over the past several years (Coello, 2002), and it has been reported to obtain better results faster and at lower cost than other evolutionary algorithms.

Evolutionary Algorithms (EAs) can find several solutions to a multi-objective (MO) problem in a single run, as discussed by Veldhuizen (2003), Zitzler (1999), Zitzler (2000), and Deb (1999). These solutions are called Pareto optimal solutions. EAs are well suited to such problems because they can search for solutions to multiple objectives simultaneously (Schaffer, 1984; Kennedy, 2001). EAs are also easy to parallelize, which reduces the computational load and the required execution time. The solution to a multi-objective problem is the set of non-dominated solutions on the Pareto front, as discussed by Veldhuizen (2000), where every solution on the front is Pareto optimal: improving it in one or more objectives degrades it in at least one other objective (Kennedy, 2001).
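The notion of a non-dominated solution can be illustrated with a short sketch. It assumes that every objective is to be minimized, so a detection rate would be entered as a miss rate (1 - IDR). The function names and the candidate values are hypothetical and are not taken from the chapter.

```python
def dominates(a, b):
    """True when a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates as (miss rate, false positive rate) pairs.
candidates = [
    (0.10, 0.20),
    (0.20, 0.10),
    (0.15, 0.15),
    (0.20, 0.20),  # dominated by (0.10, 0.20): same FPR, more misses
    (0.30, 0.05),
]
front = pareto_front(candidates)
print(front)
```

Four of the five candidates remain on the front. Moving from one of them to another lowers one rate only by raising the other.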

In this chapter, we use a multi-objective GA to select features from the DARPA KDD99 benchmark dataset (Kayacık, 2005). For classification, we use the Random Forests (RF) algorithm to uncover rules for classifying records with unknown class membership in the intrusion detection problem. RF was chosen for the following significant features:

  • It is unsurpassed in accuracy among current data mining algorithms.

  • It runs efficiently on large data sets with many features.

  • It can estimate which features are important.

  • It handles nominal data without difficulty and does not over-fit.

  • It can handle unbalanced data sets.
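The feature-selection step introduced before this list can be sketched as a small multi-objective GA over bit masks, where bit i indicates whether feature i is kept. This is only an illustration under stated assumptions and not the authors' algorithm: it uses a simple elitist scheme rather than a named method, and `toy_evaluate` is a hypothetical placeholder for training a classifier (such as a Random Forest) on the selected columns and measuring its error. The figure of 41 features matches the KDD Cup 99 record layout; every other parameter is arbitrary.

```python
import random

N_FEATURES = 41  # KDD Cup 99 records have 41 features


def toy_evaluate(mask):
    """Hypothetical stand-in for training and scoring a classifier.

    Pretends only the first five features are informative and returns two
    objectives to minimize: (error, fraction of features kept).
    """
    error = 1.0 - sum(mask[:5]) / 5.0
    size = sum(mask) / len(mask)
    return (error, size)


def dominates(a, b):
    """True when a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(population, evaluate):
    """Keep one individual per distinct non-dominated objective vector."""
    scored = {}
    for individual in population:
        scored.setdefault(evaluate(individual), individual)
    return [ind for score, ind in scored.items()
            if not any(dominates(other, score) for other in scored)]


def crossover(a, b, rng):
    """Single-point crossover of two equal-length masks."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(mask, rng, rate=0.02):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def select_features(evaluate, n_features=N_FEATURES, pop_size=30,
                    generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current non-dominated masks forward unchanged.
        elite = non_dominated(population, evaluate)[:pop_size]
        children = []
        while len(elite) + len(children) < pop_size:
            child = crossover(rng.choice(elite), rng.choice(elite), rng)
            children.append(mutate(child, rng))
        population = elite + children
    return non_dominated(population, evaluate)


front = select_features(toy_evaluate)
for mask in front:
    print(toy_evaluate(mask), [i for i, bit in enumerate(mask) if bit])
```

Each mask returned is one trade-off between error and subset size. In a real pipeline, `toy_evaluate` would be replaced by a function that trains and validates the chosen classifier on the selected features, which is far more expensive than the placeholder used here.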
