Classification Based on Supervised Learning

Yu Wang (Yale University, USA)
DOI: 10.4018/978-1-59904-708-9.ch009

Abstract

Classification plays an important role in network security. It assigns network traffic to different categories based on the characteristics of the traffic, and it aims to prevent network attacks by detecting intrusions as early as possible. If a labeled response variable is available, the classification falls under the theme of statistical supervised learning. The term "supervised learning" comes from the Artificial Intelligence field, where research is focused on machine learning (Nilsson, 1996). In general, a supervised learning task can be described as follows: given a training sample with known patterns, $f$, represented by predictors, $X$, and a labeled response variable, $Y$, select $y = (y_1, y_2, \ldots, y_g) \in Y$ values for new $x = (x_1, x_2, \ldots, x_k) \in X$ values. $Y(y_1, y_2, \ldots, y_g)$ may be either a binary class or multilevel classes ($g > 2$). As we discussed previously, these classes cannot be determined absolutely; they are based on the degree of our belief, which is expressed in terms of probability (Woodworth, 2004). In this chapter, we will focus mainly on the binary classification task, and we will discuss several modeling approaches, including both parametric and nonparametric methods. Readers who are interested in fundamental information on supervised learning and machine learning algorithms should refer to Lane & Brodley (1997), Vapnik (1998, 1999), Hosmer & Lemeshow (2000), Duda, Hart & Stork (2001), Hastie, Tibshirani & Friedman (2001), Müller, Mika, Rätsch, Tsuda & Schölkopf (2001), Herbrich (2002), Alpaydin (2004), Vittinghoff, Glidden, Shiboski & McCulloch (2005), Maloof (2006), Neuhaus & Bunke (2007), and Diederich (2008).
Chapter Preview

Whatever you are by nature, keep to it; never desert your line of talent. Be what nature intended you for and you will succeed.

- Sydney Smith

Introduction

Classification plays an important role in network security. It assigns network traffic to different categories based on the characteristics of the traffic, and it aims to prevent network attacks by detecting intrusions as early as possible. If a labeled response variable is available, the classification falls under the theme of statistical supervised learning. The term "supervised learning" comes from the Artificial Intelligence field, where research is focused on machine learning (Nilsson, 1996). In general, a supervised learning task can be described as follows: given a training sample with known patterns, $f$, represented by predictors, $X$, and a labeled response variable, $Y$, select $y = (y_1, y_2, \ldots, y_g) \in Y$ values for new $x = (x_1, x_2, \ldots, x_k) \in X$ values. $Y(y_1, y_2, \ldots, y_g)$ may be either a binary class or multilevel classes ($g > 2$). As we discussed previously, these classes cannot be determined absolutely; they are based on the degree of our belief, which is expressed in terms of probability (Woodworth, 2004). In this chapter, we will focus mainly on the binary classification task, and we will discuss several modeling approaches, including both parametric and nonparametric methods. Readers who are interested in fundamental information on supervised learning and machine learning algorithms should refer to Lane & Brodley (1997), Vapnik (1998, 1999), Hosmer & Lemeshow (2000), Duda, Hart & Stork (2001), Hastie, Tibshirani & Friedman (2001), Müller, Mika, Rätsch, Tsuda & Schölkopf (2001), Herbrich (2002), Alpaydin (2004), Vittinghoff, Glidden, Shiboski & McCulloch (2005), Maloof (2006), Neuhaus & Bunke (2007), and Diederich (2008).
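
To make the binary task concrete, the following is a minimal sketch, not the chapter's own method: a logistic regression (one common parametric approach) fitted by plain gradient descent, which returns a probability of attack for a new observation rather than an absolute label. The two predictors, the toy training sample, and the function names fit_logistic and predict_proba are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Estimated probability that x belongs to class 1; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi  # derivative of log-loss w.r.t. the score
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Hypothetical training sample: two predictors scaled to [0, 1]
# (for example, a connection rate and a failed-login rate); y = 1 marks an attack.
X_train = [[0.10, 0.00], [0.20, 0.10], [0.15, 0.05], [0.30, 0.10],
           [0.80, 0.90], [0.90, 0.70], [0.70, 0.80], [0.95, 0.95]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

w = fit_logistic(X_train, y_train)

# Classify new observations by thresholding the estimated probability at 0.5.
p_attack_like = predict_proba(w, [0.85, 0.80])
p_normal_like = predict_proba(w, [0.10, 0.05])
label_attack_like = 1 if p_attack_like >= 0.5 else 0
label_normal_like = 1 if p_normal_like >= 0.5 else 0
```

The 0.5 threshold is itself an assumption; in an intrusion-detection setting it would normally be tuned to the relative cost of missed attacks and false alarms.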
