The Era of Advanced Machine Learning and Deep Learning Algorithms for Malware Detection

Kwok Tai Chui, Patricia Ordóñez de Pablos, Miltiadis D. Lytras, Ryan Wen Liu, Chien-wen Shen
Copyright: © 2022 | Pages: 15
DOI: 10.4018/978-1-7998-7789-9.ch004

Abstract

Software is an essential element of computers in today's digital era. Unfortunately, it faces threats from various types of malware designed for sabotage, criminal money-making, and information theft. To protect gadgets from malware, numerous malware detection algorithms have been proposed. Earlier work relied on shallow learning algorithms, whereas recent work has turned to deep learning algorithms. With the availability of big data for model training and of affordable, high-performance computing services, deep learning has demonstrated its superiority in many smart city applications in terms of accuracy, error rate, and other metrics. This chapter presents a systematic review of the latest developments in deep learning algorithms for malware detection. Some future research directions are suggested for further exploration.

Introduction

Computing tools and smartphones have played an important role in realizing the smart city vision in recent decades (Ficco, Esposito, Xiang, & Palmieri, 2017; Rose, Raghuram, Watson, & Wigley, 2021). According to Statista (Technology Markets: Software, 2021), as shown in Figure 1, the revenue of software development grows at a steady rate of around 7.1-7.7% from 2017 to 2025, except for a historical low of 2.6% in 2020 and a rebound to 9% in 2021 during the pandemic. The projection could change depending on the deployment of 5G and the development of 6G (Stergiou, Psannis, & Gupta, 2020).

Intuitively, the more software is linked to gadgets, the more malware attacks can be expected. Numerous types of malware have been developed, such as scareware, wipers, rogue software, adware, spyware, ransomware, Trojan horses, worms, and computer viruses (Kumar, 2020; Rendell, 2019). Surprisingly, the global yearly number of malware attacks (SonicWall, 2021) does not follow an increasing trend, as shown in Figure 2. From 2015 to 2017, the year-over-year changes in the number of malware attacks were -3.7% and 8.9%, respectively. There was a notable increase of 22.1% from 2017 to 2018 and a slight decrease of 5.7% from 2018 to 2019. In the last recorded period, from 2019 to 2020, a significant drop (43.4%) in the number of malware attacks was observed. The key explanation for this drop is the use of malware detection algorithms, which can detect malware and thus prevent damage to gadgets.
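To put the year-over-year figures in perspective, the percentage changes quoted above can be compounded into a single net change over the whole period. The following is a minimal sketch, not part of the original chapter; it assumes that -3.7% refers to 2015-2016 and 8.9% to 2016-2017, and the function name `cumulative_change_pct` is chosen here for illustration only.

```python
from math import prod

# Year-over-year changes (in percent) as quoted in the text.
# Assumption: -3.7% is 2015-2016 and 8.9% is 2016-2017.
yearly_change_pct = {
    "2015-2016": -3.7,
    "2016-2017": 8.9,
    "2017-2018": 22.1,
    "2018-2019": -5.7,
    "2019-2020": -43.4,
}


def cumulative_change_pct(changes_pct):
    """Compound a sequence of percentage changes into one net percentage change."""
    growth_factor = prod(1 + change / 100 for change in changes_pct)
    return (growth_factor - 1) * 100


net = cumulative_change_pct(yearly_change_pct.values())
print(f"Net change in malware attacks, 2015-2020: {net:.1f}%")
```

Under these assumptions, the compounded result places the 2020 attack count roughly a third below the 2015 level, which shows that the 2019-2020 drop more than offsets the increases recorded in 2016-2018.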

Many traditional machine learning algorithms have been employed for malware detection in the literature, including decision tree, Naïve Bayes, support vector machine, K-nearest neighbour, Bayesian network, multi-layer perceptron, J48, and random forest (Jerlin & Marimuthu, 2018; Li et al., 2018; Narudin, Feizollah, Anuar, & Gani, 2016). There is room for improvement in terms of accuracy. Owing to the fact that a large amount of data is available for training, attention has turned to deep learning, which can further enhance the accuracy of the detection model.

Figure 1. The worldwide statistics on the revenue of software development (by segment).

Figure 2. The global yearly number of malware attacks between 2015 and 2020.

The research contributions are two-fold: (i) a systematic review of the latest developments in deep learning-based malware detection; and (ii) future research directions for the enhancement of detection models.

This chapter is structured as follows. Section 2 introduces typical deep learning algorithms. This is followed by a systematic review of deep learning algorithms for malware detection. Finally, a conclusion is drawn.

Typical Deep Learning Algorithms

Ten typical deep learning algorithms, namely convolutional neural network (CNN), restricted Boltzmann machine (RBM), deep belief network (DBN), autoencoders (ATE), self-organizing map (SOM), multilayer perceptron (MLP), generative adversarial network (GAN), recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory network (LSTM), are briefly introduced with background information and their variants proposed in recent years. Table 1 summarizes the number of publications related to each deep learning algorithm in 2016-2021 (up to 10 August 2021), obtained using the document search (TITLE-ABS-KEY) in Scopus. The figures for 2021 are for reference only because more than four months of the year remained at the time of the search.
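The search described above could be reproduced with one Scopus advanced-search string per algorithm. The sketch below is illustrative only: the chapter does not give the exact query strings, so the quoted search phrases, the use of `PUBYEAR` bounds for the 2016-2021 window, and the helper name `scopus_query` are all assumptions.

```python
# Algorithm names as listed in the chapter; the exact search phrases the
# authors typed into Scopus are not stated, so these are assumptions.
ALGORITHMS = [
    "convolutional neural network",
    "restricted Boltzmann machine",
    "deep belief network",
    "autoencoder",
    "self organizing map",
    "multilayer perceptron",
    "generative adversarial network",
    "recurrent neural network",
    "gated recurrent unit",
    "long short-term memory",
]


def scopus_query(term, first_year=2016, last_year=2021):
    """Build a query limited to title, abstract, and keywords for a year range."""
    return (
        f'TITLE-ABS-KEY("{term}") '
        f"AND PUBYEAR > {first_year - 1} AND PUBYEAR < {last_year + 1}"
    )


for name in ALGORITHMS:
    print(scopus_query(name))
```

Each printed string can be pasted into the Scopus advanced search form; the counts returned will differ from those in Table 1 if the search is run after 10 August 2021.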
