Determining the Minimum Sample Size of Audit Data Required to Profile User Behavior and Detect Anomaly Intrusion

Volume 2, Issue 3. Copyright © 2006. 15 pages.
DOI: 10.4018/jbdcn.2006070103

MLA

Wang, Yun and Sharon-Lise T. Normand. "Determining the Minimum Sample Size of Audit Data Required to Profile User Behavior and Detect Anomaly Intrusion." IJBDCN 2.3 (2006): 31-45. Web. 22 Oct. 2014. doi:10.4018/jbdcn.2006070103

APA

Wang, Y., & Normand, S. T. (2006). Determining the Minimum Sample Size of Audit Data Required to Profile User Behavior and Detect Anomaly Intrusion. International Journal of Business Data Communications and Networking (IJBDCN), 2(3), 31-45. doi:10.4018/jbdcn.2006070103

Chicago

Wang, Yun, and Sharon-Lise T. Normand. "Determining the Minimum Sample Size of Audit Data Required to Profile User Behavior and Detect Anomaly Intrusion." International Journal of Business Data Communications and Networking (IJBDCN) 2, no. 3 (2006): 31-45. Accessed October 22, 2014. doi:10.4018/jbdcn.2006070103


Abstract

Although statistical modeling techniques have been used to detect anomaly intrusions and to profile user behavior with network traffic data collected from multiple sites (IP addresses), the minimum sample size of audit data required for each site remains unclear. This study addresses that question using the off-line Intrusion Detection Evaluation data developed by the Lincoln Laboratory at the Massachusetts Institute of Technology under the Defense Advanced Research Projects Agency. Bivariate analysis was used to construct a composite score that ranks each site’s probability of being anomalous, and statistical simulations were conducted to evaluate how the ranking varies between the population-based “true” pattern of user behavior and the sample-based “observed” patterns. A sequence of hierarchical random-effects logistic regression models was then fitted to compare the performance of classifications based on the full dataset with those based on samples. The results show that a minimum sample size of 500 per site yields a sensitivity of 0.85, a specificity of 0.92, and a kappa statistic of 0.77. Compared with the full-dataset model, the minimum-sample model had a similar area under the Receiver Operating Characteristic curve (0.983 vs. 0.997) and a slightly higher misclassification rate (3.16% vs. 1.71%) in detecting abnormal patterns.
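
For readers unfamiliar with the agreement measures reported above, the following minimal Python sketch shows how sensitivity, specificity, and Cohen's kappa are conventionally computed from a 2x2 table that cross-classifies sites as anomalous or normal under a reference labeling and a sample-based labeling. The function name `agreement_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the article, and this is not the authors' code.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, kappa) for a 2x2 classification table.

    tp: anomalous in the reference labeling and flagged by the sample-based one
    fp: normal in the reference labeling but flagged by the sample-based one
    fn: anomalous in the reference labeling but not flagged
    tn: normal in the reference labeling and not flagged
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # share of true anomalies that were flagged
    specificity = tn / (tn + fp)  # share of normal sites left unflagged

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, specificity, kappa


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the article).
    sens, spec, kappa = agreement_metrics(tp=40, fp=10, fn=10, tn=40)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

Kappa discounts the agreement expected by chance, so it can be noticeably lower than raw accuracy when one class dominates. That is one reason it is commonly reported alongside sensitivity and specificity.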