Models Network Data for Association and Prediction

Yun Wang
ISBN13: 9781599047089|ISBN10: 159904708X|ISBN13 Softcover: 9781616925048|EISBN13: 9781599047102
DOI: 10.4018/978-1-59904-708-9.ch007
Cite Chapter

MLA

Yun Wang. "Models Network Data for Association and Prediction." Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection, IGI Global, 2009, pp. 220-260. https://doi.org/10.4018/978-1-59904-708-9.ch007

APA

Y. Wang (2009). Models Network Data for Association and Prediction. IGI Global. https://doi.org/10.4018/978-1-59904-708-9.ch007

Chicago

Yun Wang. "Models Network Data for Association and Prediction." In Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection. Hershey, PA: IGI Global, 2009. https://doi.org/10.4018/978-1-59904-708-9.ch007

Abstract

Exploratory data analysis discovers structures and patterns in the data by treating all variables as a whole, but it does not specifically seek associations between response variables and predictor variables. In this chapter, we discuss how to identify and measure this response-predictor relationship, which is an essential element of intrusion detection and prevention. Although models for association and prediction can take many forms, the goals of such modeling in network security are generally two-fold: (1) to identify variables that are significantly associated with the response variable and (2) to assess the robustness of these variables, if any, in predicting the response. Although the term "model" may be confusing to many people, a model is just a simplified representation of some aspect of the real world, whether an object or observation, or a situation or process. Models are of particular importance for network security because of the size of the data and the complex relationships among variables and the desired outcomes. Statistical modeling procedures available for analyzing the response-predictor phenomenon mainly include bivariate analysis and multiple regression-based analysis. Bivariate analysis focuses on the relationship between two variables (e.g., a response and a predictor) without taking into account any impact of other predictor variables on the response variable. The multiple regression modeling approach, on the other hand, establishes a regression relationship between a response variable and a set of potential predictor variables and estimates the predictive power of each predictor as adjusted for the others. Therefore, a variable that is significantly associated with the response in the bivariate analysis may no longer hold that association in the regression analysis after adjusting for other variables.
In the following sections, we will review and discuss these two main approaches in detail. Readers who would like a more general knowledge of modeling associations should refer to Mandel (1964), Press & Wilson (1978), Cohen & Cohen (1983), Berry & Feldman (1985), Cox & Snell (1989), McCullagh & Nelder (1989), Agresti (1996), Ryan (1997), Long (1997), Burnham & Anderson (1998), Pampel (2000), Tabachnick & Fidell (2001), Agresti (2002), Myers, Montgomery & Vining (2002), Menard (2002), and O’Connell (2006). Comprehensive reviews of data mining and statistical learning can be found in Vapnik (1998, 1999), Hastie, Tibshirani & Friedman (2001), and Bozdogan (2003).
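
The contrast between a bivariate association and a regression-adjusted one, as described in the abstract, can be illustrated with a small simulation. The following is a minimal sketch in Python and is not taken from the chapter: the variable names (x1, x2, y), the simulated data, and the shortcut of deriving standardized two-predictor regression coefficients from pairwise correlations are all assumptions made for illustration. Here y depends only on x1, while x2 is merely correlated with x1.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def standardized_betas(y, x1, x2):
    """Standardized coefficients for y regressed on x1 and x2 jointly.

    Uses the closed-form two-predictor solution expressed in terms of
    pairwise correlations, so each coefficient is adjusted for the other
    predictor.
    """
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


# Hypothetical simulated data (assumption): the response y is driven by x1
# only; x2 is a noisy copy of x1 and has no direct effect on y.
rng = random.Random(0)
n = 2000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [v + rng.gauss(0, 1) for v in x1]
y = [2 * v + rng.gauss(0, 1) for v in x1]

# Bivariate view: x2 looks clearly associated with y.
r_y_x2 = pearson(y, x2)

# Regression view: once x1 is adjusted for, x2 contributes almost nothing.
beta_x1, beta_x2 = standardized_betas(y, x1, x2)

print(f"bivariate correlation of y with x2: {r_y_x2:.3f}")
print(f"adjusted standardized coefficient of x1: {beta_x1:.3f}")
print(f"adjusted standardized coefficient of x2: {beta_x2:.3f}")
```

In this sketch the bivariate correlation between y and x2 is substantial, yet the coefficient of x2 shrinks toward zero once x1 enters the model, which is the pattern the abstract warns about. A real analysis would also need significance tests and, for a binary intrusion outcome, possibly a logistic rather than a linear model.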
