Deep Learning-Based Malware Detection and Classification

Mirnalinee T. T. (Sri Sivasubramaniya Nadar College of Engineering, India), Bhuvana J. (Sri Sivasubramaniya Nadar College of Engineering, India), Arul Thileeban S. (Sri Sivasubramaniya Nadar College of Engineering, India), Daniel Jeswin Nallathambi (Sri Sivasubramaniya Nadar College of Engineering, India), and Anirudh Muthukumar (Sri Sivasubramaniya Nadar College of Engineering, India)
Copyright: © 2021 | Pages: 24
DOI: 10.4018/978-1-7998-4900-1.ch006

Abstract

Malware analysis is an important aspect of cyber security and a key component in securing systems from attackers. New malware signatures are created continuously, and detection techniques need to keep pace with them. The primary objective is to propose a solution that detects malicious files in real time by evaluating each file. Further objectives are to assess the threat level of the malware and to recognize the family to which a malicious file belongs. To meet these needs, a deep neural network is well suited to detecting and classifying malware. A convolutional neural network-based system, MalNet-D, is designed to detect the presence of malware; a variation of MalNet-D, termed MalNet-C, is proposed to classify the detected malware into the family to which it belongs. Images of executable files, both malicious and benign, are used as input data to train the respective MalNet. The system achieved 93% accuracy in malware detection and 96% accuracy in malware classification.

Introduction

Malware analysis is an important aspect of cyber security and a key component in securing systems from attackers. Attacks are delivered in the form of malware, which serves as the attack vector. Malware is primarily distributed in executable formats, bound to benign exe files to fool people. A separate malware analysis tool is therefore required to detect it. Typically, antivirus software is used for malware detection. Behind the scenes, antivirus vendors perform the analysis and upload virus signatures to a database, against which detection is done. There are two major methods of analysis, static and dynamic; both are time-consuming and not feasible in real time. Static analysis evaluates the code piece by piece for viral signatures. Dynamic analysis is typically done by running the executable in a sandbox and detecting viral behavior, if any.
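
Signature-based detection of the kind described above amounts to looking up a file's fingerprint in a database of known malware signatures. The following is a minimal sketch, assuming that a signature is simply the SHA-256 digest of the whole file (real antivirus products use richer signatures, such as byte patterns). The names `KNOWN_SIGNATURES` and `is_known_malware` are hypothetical and chosen only for illustration.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known malicious files.
# The single entry here is computed from placeholder bytes, not real malware.
KNOWN_SIGNATURES = {hashlib.sha256(b"malicious-payload").hexdigest()}


def is_known_malware(data: bytes, signatures: set[str] = KNOWN_SIGNATURES) -> bool:
    """Return True if the digest of the file contents matches a stored signature."""
    return hashlib.sha256(data).hexdigest() in signatures
```

A file that differs from a known sample by even one byte produces a different digest, which is one reason a signature database has to be updated continuously to keep up with new malware.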

Malware is software that purposefully damages computer resources and takes the form of executable code. Such code is referred to as a virus, worm, Trojan, etc. Malware’s intent is to act against the computer user, but not by exploiting some deficiency in the system. An FBI survey revealed that major businesses face a loss of $12 million annually due to virus attacks (Computer crime, 2013). Antivirus software is so called because its role is to detect and eliminate viruses in computer software. A 2009 survey by Symantec stated that 80% of home users had installed antivirus software, though most of them had not used it to protect their systems (Internet Security, 2013).

In a world where the effects of malware infection are fatal, real-time analysis is needed to block malware before it spreads as “WannaCry” did. Beyond the monetary losses, malware also enables invasion of privacy, a key concern for many users. In addition, there is the loss of data that is highly critical for major operations. These issues clearly motivate real-time detection of malware. With currently available tools, however, real-time detection does not seem feasible because of the latency involved in detection by antivirus software.

New malware signatures are created continuously, and detection techniques need to keep pace with them. Malware analysis has typically been passive. In passive analysis, the analyst examines executable files by running a traceback on the operations executed in them. This is tedious and time-consuming, and such methods cannot be used to detect malware or infected files in real time. Detecting and blocking malware in real time greatly strengthens defenses against the new types of malware created each year. The PandaLabs annual report for 2017 suggests that over 15 million of the malware files seen that year were infected with newly created malware (PandaLabs, 2017). Moreover, these statistics were gathered only for clients serviced by PandaLabs. Therefore, to combat the proliferation of malware, there is a need to analyze and identify it in real time.

Our primary objective is to propose a solution that detects malicious files in real time by evaluating each file. Beyond this, we also want to identify the malware family to which a malicious file belongs, in order to assess the threat level of the malware. A deep neural network is well suited to meeting these needs. We therefore propose a solution that uses deep learning networks for real-time malware detection, specifically a Convolutional Neural Network (CNN). We use images of files, both malicious and benign, as input data to train the CNN, which is then used both to detect malware and to classify it into families.

The proposed malware detection system takes an executable file as input and performs binary classification to predict whether or not the executable is malware. The input dataset therefore consists of both malware executables and benign executables. These input executables are fed to an 8-bit encoding module, which converts the executable files into high-dimensional grayscale images.
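
The 8-bit encoding step can be sketched as follows. This is a minimal illustration rather than the chapter's actual module: the helper names `bytes_to_grayscale` and `write_pgm`, the fixed row width, and the zero padding of the last row are all assumptions made for the example. The idea is that each byte of the file, a value from 0 to 255, is treated as one grayscale pixel intensity.

```python
from pathlib import Path


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map raw bytes to rows of 8-bit grayscale pixels (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so the final row is full (an assumption of this sketch).
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def write_pgm(pixels: list[list[int]], path: str) -> None:
    """Save the pixel rows as a binary PGM (P5) grayscale image."""
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    body = b"".join(bytes(row) for row in pixels)
    Path(path).write_bytes(header + body)


# Example using a few in-memory bytes instead of a real executable.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])
image = bytes_to_grayscale(sample, width=4)
```

For a real file, the bytes could be read with `Path(some_file).read_bytes()` and the resulting rows passed to `write_pgm`. The choice of row width determines the image dimensions and is left as a free parameter here.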
