Crime Identification Using Traffic Analysis of HTTP Botnet

Ciza Thomas (Directorate of Technical Education, India)
Copyright: © 2020 |Pages: 12
DOI: 10.4018/978-1-5225-9715-5.ch074

Abstract

A botnet is a network of malware-infected systems called bots that are controlled by a botmaster through a command and control channel. Botmasters commit various types of crimes with the help of these bots. A botmaster can use the HTTP protocol for the C&C channel because the majority of Internet traffic is HTTP and is therefore allowed in most networks. Effectively, bots hide their communication within normal HTTP traffic, since blocking the service outright as a precautionary measure is impractical. This makes HTTP-based C&C communication stealthier. This work proposes a technique to collect and analyse HTTP botnet traffic. A framework was developed to build HTTP botnets in a controlled environment, and signatures of the bots that were set up were obtained. Further analysis was done using machine-learning-based classification as well as periodicity analysis. The results demonstrate the superior detection performance of the proposed method, with 100% accuracy and detection rate.

Introduction

A botnet is a network of malware-infected systems that are controlled by an attacker through a Command and Control (C&C) channel. The attacker, also called the botmaster, controls the infected systems, which are called bots or zombies. The botmaster commits various types of cyber-crimes with the help of these bots. A group of bots under the control of a botmaster is called a botnet. The general layout of a botnet system is shown in figure 1. Attackers use botnets to disrupt a network or a victim host, either by consuming the entire bandwidth of that network with bogus connections or by driving CPU utilisation on the victim host to 100%. They do this by commanding the compromised bots to overload the resources of the victim machine or network to the point that it stops functioning, resulting in denial of access. Such an attack is called a denial of service (DoS). Botmasters can use botnets to perform distributed denial-of-service (DDoS) attacks, steal data, send spam, and access devices and their connections. These cyber-crimes are constantly evolving, and hence the list of cyber threats can at no stage be considered exhaustive. Botnets are thus a great threat on the Internet, serving as the basic infrastructure for various distributed attacks.

Botmasters can use the HTTP protocol for the C&C channel because the majority of Internet traffic is HTTP and is therefore allowed in most networks. Effectively, bots hide their communication within normal HTTP traffic, since blocking the service outright as a precautionary measure is impractical. This makes HTTP-based C&C communication stealthier. Centralised C&C channels are prone to a single point of failure: if the C&C channel is detected and stopped, the communication channel between the compromised hosts is lost. Their advantage is that they are simple and easy to set up, as highlighted in Gu and Perdisci (2008). Botmasters have moved to peer-to-peer (P2P) C&C architectures to make their bots more powerful and stealthy.

Figure 1.

Typical botnet system

Bots run automated programs designed to execute specific scheduled activities or to respond to commands in a particular manner. Hence, botnet-generated traffic is expected to exhibit an apparent structure and regularity in its behavioural pattern. Normal user behaviour, by contrast, is unpredictable, random, and complex, owing to the innumerable online applications and resources available to users. Normal traffic therefore differs from botnet communication traffic, which is systematic and consistent in behaviour.
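The regularity described above can be measured directly from request timestamps. As a minimal illustration (not the chapter's actual method), one can compute the coefficient of variation of inter-arrival times: periodic C&C polling yields values near zero, while human browsing is far more irregular. The traffic samples below are invented for illustration.

```python
import statistics

def interarrival_cv(timestamps):
    """Coefficient of variation of inter-arrival times.

    Values near 0 indicate highly regular (bot-like) timing;
    human browsing typically yields much larger values.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return statistics.stdev(gaps) / statistics.mean(gaps)

# A bot polling its C&C server every ~30 s versus bursty human clicks
# (timestamps in seconds; hypothetical data).
bot_times = [0, 30, 60, 90, 120, 150]
human_times = [0, 2, 3, 45, 47, 300]

print(interarrival_cv(bot_times))    # 0.0 (perfectly periodic)
print(interarrival_cv(human_times))  # large value (irregular)
```

A real detector would of course tolerate jitter (bots often randomise polling intervals slightly), so a threshold on this statistic, or a frequency-domain periodicity test, would be applied per flow.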

Several detection strategies have been developed in the literature for botnet detection, such as up-to-date anti-virus software, signature-based intrusion detection systems for IRC/botnet traffic, and traffic-flow monitoring for known C&Cs. These detection techniques differ based on whether the C&C mechanism uses a centralised architecture (IRC, HTTP), a peer-to-peer (P2P) architecture, or a hybrid P2P/centralised architecture. Detection also varies depending on other factors such as the area of deployment, the data captured for the detection system, etc.
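The signature-based strategy mentioned above amounts to matching observed traffic against a rule set of known C&C patterns. The sketch below shows the idea with regular expressions over HTTP request lines; the signature names and patterns are hypothetical placeholders, whereas production systems ship curated rule sets covering URL paths, user-agents, and payload markers.

```python
import re

# Hypothetical signatures for illustration only; real IDS rule sets
# encode known C&C URL patterns, user-agent strings, and payload markers.
SIGNATURES = {
    "zeus-like":  re.compile(r"POST /gate\.php"),
    "generic-cc": re.compile(r"GET /cmd\?id=\d+"),
}

def match_signatures(http_request_line):
    """Return the names of all signatures the request line matches."""
    return [name for name, pat in SIGNATURES.items()
            if pat.search(http_request_line)]

print(match_signatures("POST /gate.php HTTP/1.1"))   # ['zeus-like']
print(match_signatures("GET /index.html HTTP/1.1"))  # []
```

As the Key Terms section notes, this approach is fast and precise for known bots but cannot detect novel patterns, which motivates the behavioural and machine-learning techniques used alongside it.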

Key Terms in this Chapter

Firewall: A firewall is a network security system that allows or denies incoming and outgoing network traffic based on predetermined security rules.

Router: A router is an inter-networking device that forwards packets from one computer network to another. Routers perform route finding for traffic on the Internet by means of a routing table.

Bots: Any system connected to the Internet becomes a bot when it runs automated tasks or scripts over the Internet. The majority of malicious web traffic originates from bots.

Periodicity: Periodicity is the recurrence of similar events at more or less regular intervals, and is a property exhibited by many processes of interest in a variety of scientific disciplines.

Botmaster: A botmaster is the controller of the bots on a network. The botmaster is responsible for keeping the bots online, sending them control commands, fixing issues with them, and setting the rules under which they operate.

Botnet: A botnet is a collection of bots. Botnets originate many types of attacks, such as distributed denial-of-service (DDoS) attacks, data theft, spamming, and intrusions into systems and networks.

IRC: Internet relay chat (IRC) is a system for chatting that involves a set of rules and conventions and client/server software.

C&C Server: A command and control server (C&C server) is a system that issues directives to other connected systems that have been infected with rootkits or other types of malware such as ransomware.

Intrusion Detection Systems: An intrusion detection system (IDS) is a hardware device or software application that monitors a network or systems for malicious activity or policy violations.

HTTP: HTTP stands for HyperText Transfer Protocol. HTTP is the underlying protocol of the World Wide Web; it defines how messages are formatted and transmitted, and what actions web servers and browsers should take in response to various commands.

Signature-Based Classifier: A signature-based classifier classifies items by looking them up in a table of signatures and their corresponding class labels. Hence, this classifier is not effective at detecting novel patterns/signatures.

Random Forest Classifier: A random forest classifier builds a set of decision trees from randomly selected subsets of the training data and then aggregates the decisions of the individual trees to arrive at the final class of the test data. This improves accuracy and reduces over-fitting, making it better than a single decision tree.
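The vote-aggregation idea behind a random forest can be sketched in a few lines. In the toy example below, each "tree" is a hand-written one-rule stump over invented flow features (`interarrival_cv`, `uniform_sizes`, `req_per_min` are illustrative names, not the chapter's feature set); a real random forest learns many such trees from bootstrap samples, but the majority-vote aggregation is the same.

```python
from collections import Counter

# Toy decision "trees": each votes on one traffic feature.
def tree_a(flow):  # highly periodic traffic looks bot-like
    return "bot" if flow["interarrival_cv"] < 0.1 else "normal"

def tree_b(flow):  # many identically sized requests look bot-like
    return "bot" if flow["uniform_sizes"] else "normal"

def tree_c(flow):  # a very high request rate looks bot-like
    return "bot" if flow["req_per_min"] > 100 else "normal"

def forest_predict(flow, trees=(tree_a, tree_b, tree_c)):
    """Aggregate the individual tree votes by simple majority."""
    votes = Counter(tree(flow) for tree in trees)
    return votes.most_common(1)[0][0]

flow = {"interarrival_cv": 0.02, "uniform_sizes": True, "req_per_min": 12}
print(forest_predict(flow))  # 'bot' (two of three trees vote bot)
```

Because each tree sees a different view of the data, individual errors tend to be outvoted, which is the source of the accuracy and over-fitting benefits noted above.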
