Development and Evaluation of a Dataset Generator Tool for Generating Synthetic Log Files Containing Computer Attack Signatures

Stephen O’Shaughnessy (Institute of Technology Blanchardstown, Ireland) and Geraldine Gray (Institute of Technology Blanchardstown, Ireland)
DOI: 10.4018/978-1-4666-2041-4.ch011


A key requirement for experimental analysis in the areas of network intrusion and computer forensics is the availability of suitable datasets. However, the inherent security and privacy issues surrounding these disciplines have resulted in a lack of available “test-bed” datasets for testing and evaluation purposes. Typically, the datasets required in these cases are from system log files, containing traces of computer misuse. Therefore, there is obvious potential for the use of synthetically generated log files that can accurately reproduce these traces or patterns of misuse. This paper discusses the development, testing, and evaluation of a dataset generator tool, designed to produce such datasets, particularly those containing patterns of common computer attacks.

2. Background

Anyone who uses a computer will leave a trace or “fingerprint” of their activities, whether this activity is malicious or not. These usage traces are stored in the various log files generated by components on a computer or network, and as such, a wealth of information on network and user activity can be gleaned from these log files. For this reason, the authors chose to use log files to replicate the patterns of attack signatures featured in the dataset generator tool.

2.1. Types of Log File Represented by the Tool

The tool uses two different log file types to represent the attack signatures. The first type, the firewall log, records network activity to and from a host machine. The firewall log can serve as a critical component of information security, as it can be used to identify unusual or unexpected traffic patterns on a local network (Stingley, 2009). The second type, the Common Log Format (CLF) log, is the type of log file present on Web servers such as Apache or IIS, and typically records a history of page requests to the server. Many differing types of computer attack signatures can be identified in this log.
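
To make the CLF structure concrete, the following minimal Python sketch builds one synthetic CLF entry and parses it back into its fields. It is an illustration only and is not taken from the authors' tool: the helper names `format_clf` and `parse_clf`, the address `192.0.2.10`, and the sample request are all assumptions introduced here. The field order (host, identity, user, timestamp, request line, status code, response size) follows the standard Common Log Format.

```python
import re
from datetime import datetime, timezone

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def format_clf(host, user, when, request, status, size):
    """Build one CLF line (hypothetical helper, not the authors' generator)."""
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    # The ident field is almost always "-" in practice, so it is fixed here.
    return f'{host} - {user} [{stamp}] "{request}" {status} {size}'

def parse_clf(line):
    """Return the CLF fields as a dict of strings, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None

# Illustrative values only: 192.0.2.10 is a documentation-range address.
when = datetime(2000, 10, 10, 13, 55, 36, tzinfo=timezone.utc)
line = format_clf("192.0.2.10", "-", when, "GET /index.html HTTP/1.0", 200, 2326)
fields = parse_clf(line)
```

Parsing entries into named fields in this way is one plausible starting point for checking that a generated log reproduces an intended pattern, for example by counting requests per host or per status code.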

Both of these log file types are ranked in the “Top 5 Essential Log Reports” by SANS (Brenton, Bird, & Ranum, 2006) and for this reason were chosen to represent the attack signatures replicated by the tool.

2.2. Categories of Attack Types

There are countless different types of attacks that are inflicted on computers and computer networks. Loosely, these attacks can be categorised as either passive or active. Passive attacks are those in which the attacker monitors transmissions and accesses data without modifying the data in any way. A passive attacker does not want the victim to know they are being attacked. Active attacks, on the other hand, involve the monitoring and accessing of data as in passive attacks, but the data is also modified in the process, either to cause malicious damage or to make some kind of gain for the attacker, whether financial or otherwise (Asgaut Eng., 1996).
