Models Network Data for Association and Prediction

Yun Wang

Source Title: Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection

ISBN13: 9781599047089 | ISBN10: 159904708X | ISBN13 Softcover: 9781616925048 | EISBN13: 9781599047102

DOI: 10.4018/978-1-59904-708-9.ch007

MLA

Yun Wang. "Models Network Data for Association and Prediction." Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection, IGI Global, 2009, pp. 220-260. https://doi.org/10.4018/978-1-59904-708-9.ch007

APA

Y. Wang (2009). Models Network Data for Association and Prediction. IGI Global. https://doi.org/10.4018/978-1-59904-708-9.ch007

Chicago

Yun Wang. "Models Network Data for Association and Prediction." In Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection. Hershey, PA: IGI Global, 2009. https://doi.org/10.4018/978-1-59904-708-9.ch007

Abstract

Exploratory data analysis discovers structures and patterns in the data by considering all variables as a whole, but it does not focus specifically on associations between response variables and predictor variables. In this chapter, we discuss how to identify and measure this response-predictor relationship, which is an essential element in intrusion detection and prevention. Although models for association and prediction can take many forms, in general the goals of modeling for association and prediction in network security are twofold: (1) to identify variables that are significantly associated with the response variable and (2) to assess the robustness of these variables, if any, in predicting the response. Although the term "model" may be confusing to many people, a model is just a simplified representation of some aspect of the real world, whether an object or observation, or a situation or process. Models are of particular importance for network security because of the size of the data and the complex relationships among the variables and the desired outcomes. The statistical modeling procedures available for analyzing the response-predictor relationship mainly include bivariate analysis and multiple regression-based analysis. Bivariate analysis focuses on the relationship between two variables (e.g., a response and a predictor) without taking into account any impact of other predictor variables on the response variable. The multiple regression modeling approach, on the other hand, establishes a regression relationship between a response variable and a set of potential predictor variables, and assesses the predictive power of each predictor as adjusted for the others. Therefore, a variable that is significantly associated with the response in the bivariate analysis may no longer show such an association in the regression analysis after adjusting for other variables.
In the following sections, we review and discuss these two main approaches in detail. Readers who would like a more general knowledge of modeling associations should refer to Mandel (1964), Press & Wilson (1978), Cohen & Cohen (1983), Berry & Feldman (1985), Cox & Snell (1989), McCullagh & Nelder (1989), Agresti (1996), Ryan (1997), Long (1997), Burnham & Anderson (1998), Pampel (2000), Tabachnick & Fidell (2001), Agresti (2002), Myers, Montgomery & Vining (2002), Menard (2002), and O’Connell (2006). Comprehensive reviews of data mining and statistical learning can be found in Vapnik (1998, 1999), Hastie, Tibshirani & Friedman (2001), and Bozdogan (2003).
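
The contrast between bivariate and adjusted association described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch and is not taken from the chapter: the variables `x`, `y`, and `z`, the function name `bivariate_vs_adjusted`, and the use of ordinary least squares on synthetic data are all assumptions made for illustration. Here the response `y` depends only on a second variable `z`, while the predictor `x` is merely correlated with `z`.

```python
import numpy as np

def bivariate_vs_adjusted(n=5000, seed=0):
    """Return (unadjusted slope, adjusted slope) of y on x.

    Hypothetical setup: z drives the response; x is only correlated with z.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)              # hypothetical second predictor
    x = z + rng.normal(size=n)          # predictor of interest, correlated with z
    y = 2.0 * z + rng.normal(size=n)    # response depends on z only, not on x

    # Bivariate analysis: regress y on x alone (intercept + x).
    X1 = np.column_stack([np.ones(n), x])
    b1, *_ = np.linalg.lstsq(X1, y, rcond=None)

    # Multiple regression: regress y on x and z together,
    # so the coefficient of x is adjusted for z.
    X2 = np.column_stack([np.ones(n), x, z])
    b2, *_ = np.linalg.lstsq(X2, y, rcond=None)

    return b1[1], b2[1]

if __name__ == "__main__":
    unadjusted, adjusted = bivariate_vs_adjusted()
    print(f"slope of x alone:        {unadjusted:.3f}")
    print(f"slope of x adjusted for z: {adjusted:.3f}")
```

Under these assumptions, the slope of `x` is clearly nonzero when `x` is considered alone and shrinks toward zero once `z` enters the model, which is the pattern the abstract describes. A real analysis of network data would more likely involve a binary response and a suitable regression model rather than this linear stand-in.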
