A New Feature Selection Method Based on Dragonfly Algorithm for Android Malware Detection Using Machine Learning Techniques

Mohamed Guendouz, Abdelmalek Amine
Copyright: © 2023 |Pages: 18
DOI: 10.4018/IJISP.319018

Abstract

Android is the most popular mobile OS, with the highest market share worldwide on mobile devices. Because of its popularity and wide availability among smartphone users around the world, it has become the primary target for cybercriminals, who take advantage of its open-source nature to distribute malware through applications in order to steal sensitive data. To cope with this serious problem, many researchers have proposed methods to detect malicious applications, and machine learning techniques are widely used for malware detection. In this paper, the authors propose a new feature selection method based on the dragonfly algorithm, named BDA-FS, to improve the performance of Android malware detection. Different feature subsets selected by the proposed method were used in combination with machine learning to build the classification model. Experimental results show that incorporating the dragonfly algorithm into Android malware detection achieved better classification accuracy with fewer features than machine learning without feature selection.

1. Introduction

Android, the Linux-based open-source mobile operating system, is the most widely used mobile OS in the world: it dominates the smartphone OS market with a 73% share and has over 2.5 billion active users. This success is due, on the one hand, to the open-source nature of Android and the wide availability of smartphones that run it and, on the other hand, to the large number of apps and games that are freely available and easily accessible to users. Figure 1 shows the number of available applications in the Google Play Store from December 2009 to March 2022.

Android applications are mainly available for download on the Google Play Store, which is the official Google app store, and on manufacturer-specific app stores such as those of Samsung, Huawei, and Xiaomi. Android applications are also available on many unofficial and insecure third-party websites in the form of APK files. Applications downloaded from these third-party websites can be very dangerous and might contain malicious code, since they are not verified by Google or any device manufacturer. It is therefore necessary to detect malware applications in order to protect users' personal data and device integrity.

Figure 1. Number of Available Applications in the Google Play Store from December 2009 to March 2022

The primary goal of mobile device malware is to gain access to user data stored locally on the device or in the cloud, as well as to user information used in sensitive financial transactions in mobile banking apps. Mobile malware can be distributed in a variety of ways, including infected file attachments, files shared via Bluetooth, and SMS phishing attacks. However, the primary malware distribution channel on mobile devices is currently app stores. According to G DATA's recent Mobile Security Report (G DATA, 2022), the company's security experts counted more than 2.5 million malware apps for Android devices in 2021. As a result of these factors, Android malware is becoming increasingly problematic for both enterprise and individual users.

To deal with these dangerous attacks, researchers have proposed various methods and techniques to effectively detect malware apps on Android. Many of these methods use popular machine learning classification algorithms to classify Android apps as benign or harmful. One of the most widely used techniques in the literature is to use Android permissions as features to train and build one or more classification models; techniques of this type are known as permission-based methods.
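To make the idea of using permissions as features concrete, the following is a minimal sketch of how the permissions requested by one app could be encoded as a binary feature vector. The permission vocabulary and the helper name `to_feature_vector` are assumptions chosen for illustration; the encoding actually used by the authors is not described in this preview.

```python
# Hypothetical permission vocabulary for illustration only; a real dataset
# would use a much longer list of Android permissions.
PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
]


def to_feature_vector(requested, vocabulary=PERMISSIONS):
    """Return a 0/1 list: 1 if the app requests the permission at that index."""
    requested = set(requested)
    return [1 if permission in requested else 0 for permission in vocabulary]


# Example app requesting two permissions from the vocabulary.
app_permissions = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
]
vector = to_feature_vector(app_permissions)
print(vector)  # [1, 0, 1, 0, 0]
```

Each app then becomes one row of a binary matrix, which is the form of input that a classifier or a feature selection step can work on. Permissions outside the vocabulary are simply ignored in this sketch.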

In permission-based malware detection methods, the complete set of features is generally used as input for training classification algorithms without prior feature selection. Because the number of Android permissions is large and can exceed 150 (Xu, Zhang & Zhu, 2013), using the whole feature set makes training more difficult and can decrease detection accuracy. Feature selection is therefore an essential stage in machine learning-based techniques. Obtaining an appropriate feature set not only helps improve classification accuracy but also helps mitigate the curse of dimensionality associated with most machine learning-based techniques.

In this paper, a novel permission-based machine learning method for Android malware detection, with feature selection using the dragonfly optimization algorithm, is presented. The main contributions of this paper are summarized as follows:

  • 5,000 malicious applications from different malware families and 5,000 benign Android applications from multiple categories were used to generate the dataset.

  • Android permissions were extracted from each application in the dataset and used to generate the feature vector.

  • A new feature selection method based on the dragonfly algorithm was proposed to select the most relevant permissions for Android malware detection using five machine learning algorithms.

  • The performance of our proposed system is demonstrated through experiments using various evaluation metrics.
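The feature selection step listed above can be illustrated with a simplified sketch of a binary dragonfly-style search over permission masks. This is not the authors' BDA-FS, whose details are not given in this preview. It loosely follows the commonly cited binary dragonfly formulation (separation, alignment, cohesion, food attraction, enemy distraction, and a V-shaped transfer function that turns the step value into a bit-flip probability). The fixed weights, the leave-one-out 1-NN fitness, the `alpha` trade-off, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def fitness(mask, X, y, alpha=0.9):
    """Lower is better: weighted 1-NN leave-one-out error plus feature ratio."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty subset is treated as the worst case
    errors = 0
    for i, row in enumerate(X):
        # Nearest neighbour by Hamming distance on the selected columns only.
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum(row[j] != X[k][j] for j in selected),
        )
        errors += y[nearest] != y[i]
    return alpha * errors / len(X) + (1 - alpha) * len(selected) / len(mask)


def binary_dragonfly(X, y, n_agents=8, n_iter=30, seed=0):
    """Return the best evaluated mask and its fitness (simplified sketch)."""
    rng = random.Random(seed)
    n = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_agents)]
    step = [[0.0] * n for _ in range(n_agents)]
    # Hypothetical fixed weights; published variants adapt them over time.
    s, a, c, f, e, w = 0.1, 0.1, 0.7, 1.0, 1.0, 0.9
    best_mask, best_fit = None, float("inf")
    for _ in range(n_iter):
        fits = [fitness(p, X, y) for p in pos]
        if min(fits) < best_fit:
            best_fit = min(fits)
            best_mask = pos[fits.index(best_fit)][:]
        food = best_mask                      # best mask evaluated so far
        enemy = pos[fits.index(max(fits))]    # worst mask in this iteration
        new_pos, new_step = [], []
        for i in range(n_agents):
            others = [k for k in range(n_agents) if k != i]
            bits, vel = [], []
            for j in range(n):
                sep = -sum(pos[i][j] - pos[k][j] for k in others)
                ali = sum(step[k][j] for k in others) / len(others)
                coh = sum(pos[k][j] for k in others) / len(others) - pos[i][j]
                v = (s * sep + a * ali + c * coh
                     + f * (food[j] - pos[i][j])
                     + e * (enemy[j] + pos[i][j])
                     + w * step[i][j])
                v = max(-6.0, min(6.0, v))  # clamp the step value
                # V-shaped transfer function: probability of flipping the bit.
                flip = rng.random() < abs(v / math.sqrt(v * v + 1.0))
                bits.append(1 - pos[i][j] if flip else pos[i][j])
                vel.append(v)
            new_pos.append(bits)
            new_step.append(vel)
        pos, step = new_pos, new_step
    return best_mask, best_fit


# Synthetic toy data (not from the paper): 20 apps, 6 binary "permissions",
# with the label defined as a copy of column 0.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 1) for _ in range(6)] for _ in range(20)]
y = [row[0] for row in X]

mask, score = binary_dragonfly(X, y)
print("selected columns:", [j for j, b in enumerate(mask) if b])
print("fitness:", round(score, 3))
```

The fitness combines classification error with the fraction of selected features, so a mask is rewarded both for accuracy and for using few permissions. In practice the 1-NN stand-in would be replaced by whichever classifier is being evaluated.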
