An Opcode-Based Malware Detection Model Using Supervised Learning Algorithms

Om Prakash Samantray, Satya Narayan Tripathy

Source Title: International Journal of Information Security and Privacy (IJISP) 15(4)

DOI: 10.4018/IJISP.2021100102

Abstract

Several available malware detection techniques are based on a signature-based approach. This approach detects known malware very effectively but may fail to detect unknown or zero-day attacks. In this article, the authors propose a malware detection model that uses the operation codes of malicious and benign executables as the feature. The proposed model uses the opcode extract and count (OPEC) algorithm to prepare the opcode feature vector for the experiment. The most relevant features are selected using the extra tree classifier feature selection technique and then passed through several supervised learning algorithms (support vector machine, naive Bayes, decision tree, random forest, logistic regression, and k-nearest neighbour) to build classification models for malware detection. The proposed model achieved a detection accuracy of 98.7%, which makes it better than many of the similar works discussed in the literature.

Introduction

Malware is a malicious program designed with the intent to damage sensitive information stored on computers or mobile devices. It is a generic name for different kinds of malicious software, such as viruses, ransomware, rootkits, botnets, trojans, worms, and adware. These computer pollutants enter a system and find vulnerabilities in the operating system to execute unintended code (Behera & Bhaskari, 2017). Malware writers design malware not only for fame but also for financial benefit. Many anti-malware and malware detection systems have been developed so far, but the need for an even more efficient detection strategy continues to motivate researchers in this domain. Signature-based detection systems may not detect unknown and zero-day attacks because they rely on databases of known signatures. Although signature-based detection is good at identifying well-known malicious code, it may not be fit for detecting obfuscated code and previously unknown malware. The situation is even more critical in the case of metamorphic and polymorphic malware, which are created to evade detection engines. Malware creators usually create such malware by inserting different code with similar functionality. The purpose of metamorphic and polymorphic malware is the same, but they use different obfuscation and propagation techniques. Packing, encryption, and compression are the most common obfuscation techniques used to change the appearance of malware so as to evade detection engines. Therefore, there is a need for a detection model that can use the code of the executable (in particular, its opcodes) as a feature to classify it as malware or benignware.

Malware analysis is an important step in malware research. It is used to understand the structure and behavior of malware and benign samples. Analysis can be done either statically or dynamically. If an executable is analysed without being executed, it is called static analysis. Abimannan & Kumaravelu (2019) have presented a detailed mathematical description of heuristic-based static malware analysis. Dynamic analysis is performed by executing the file in a safe and controlled environment to understand its behavior. These two analysis methods can also be combined to extract the best features from the samples for classification (Vidyarthi et al., 2017). In this article, the operation codes of the samples are extracted using static analysis.

To solve a classification problem, machine learning methods are applied to a feature set to train a model and then test it. These algorithms learn the patterns present in the training set and look for the same patterns in the test dataset to classify each input as either malware or benign (Shabtai et al., 2012). This article proposes an opcode-based malware detection model using machine learning classification algorithms.

The features (opcodes) for this experiment are extracted by disassembling the sample programs using the IDAPro tool. Disassembling a program yields its executable instructions in assembly language format, which comprise many operation codes (opcodes). An opcode is the part of an instruction that states the action to be performed. Executable files (malware or benign) usually contain opcodes such as SUB, ADD, AND, OR, XOR, INC, DEC, MOV, MOVZX, CALL, TEST, SBB, IMUL, CMP, RETN, PUSH, PUSHF, POP, NOP, JZ, JNZ, JMP, LEA, JB, FDIVP, etc. These opcodes disclose important differences between malicious and legitimate programs (Bilar, 2007). Therefore, operation codes can be used as a feature in malware classification.
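
To make the idea of an opcode count feature concrete, the following is a minimal Python sketch that counts mnemonics in a disassembly listing. It is not the authors' OPEC algorithm, whose details are not given in this preview. The listing layout (an optional `section:address` prefix, then the mnemonic, then operands, with `;` starting a comment), the short `VOCABULARY` list, and the function name `count_opcodes` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical vocabulary: a small subset of the opcodes named in the text.
VOCABULARY = ["MOV", "PUSH", "POP", "CALL", "XOR", "ADD", "RETN"]

# Assumed line shape: optional "section:address" prefix, then the mnemonic.
LINE_RE = re.compile(r"^\s*(?:\.?\w+:[0-9A-Fa-f]+\s+)?([A-Za-z]+)\b")


def count_opcodes(asm_text, vocabulary=VOCABULARY):
    """Return one count per vocabulary opcode, in vocabulary order."""
    wanted = set(vocabulary)
    counts = Counter()
    for line in asm_text.splitlines():
        code = line.split(";", 1)[0]  # drop any trailing comment
        match = LINE_RE.match(code)
        if match:
            mnemonic = match.group(1).upper()
            if mnemonic in wanted:
                counts[mnemonic] += 1
    return [counts[op] for op in vocabulary]


# Made-up listing used only to exercise the function.
sample = """
.text:00401000    push    ebp
.text:00401001    mov     ebp, esp
.text:00401003    xor     eax, eax        ; clear eax
.text:00401005    mov     ecx, [ebp+8]
.text:00401008    call    sub_401020
.text:0040100D    pop     ebp
.text:0040100E    retn
"""

vector = count_opcodes(sample)
print(dict(zip(VOCABULARY, vector)))
```

Each executable then becomes one fixed-length row of counts, which is the shape a supervised classifier expects. A real listing would need a parser matched to the disassembler's actual output format.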

The contributions of this work are:

1. Performing static analysis of malware and benign samples using the IDAPro disassembler to generate an output file containing the assembly language format of the input file.
2. Applying the opcode extract and count (OPEC) algorithm to the output file (.asm) to create the opcode count feature vector.
3. Applying the Extra Tree Classifier method to the feature vector to select relevant features for the experiment.
4. Implementing machine learning algorithms on the dataset and comparing their results.
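
Steps 3 and 4 can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation. The data is synthetic (random counts standing in for opcode counts, with the first five columns made artificially informative), the hyperparameters are arbitrary, and `GaussianNB` is an assumed choice because the naive Bayes variant is not specified here. The accuracies it prints say nothing about the results reported in the article.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an opcode count matrix (assumption, not real data):
# rows are samples, columns are opcodes, labels are 1 = malware, 0 = benign.
rng = np.random.default_rng(0)
n_samples, n_opcodes = 200, 20
y = rng.integers(0, 2, size=n_samples)
X = rng.poisson(lam=5.0, size=(n_samples, n_opcodes))
# Make the first five columns depend on the label so there is signal to find.
X[:, :5] += rng.poisson(lam=4.0 * y[:, None], size=(n_samples, 5))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The six classifier families named in the abstract, with assumed settings.
classifiers = {
    "support vector machine": SVC(),
    "naive Bayes": GaussianNB(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbour": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, clf in classifiers.items():
    # Extra-trees importances select features; selection is fit on training data only.
    selector = SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0))
    model = make_pipeline(selector, clf)
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Placing the selector inside the pipeline keeps the feature selection from seeing the test samples, which avoids an optimistic bias when the classifiers are compared.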
