Insider Threat Detection Using Supervised Machine Learning Algorithms on an Extremely Imbalanced Dataset

Naghmeh Moradpoor Sheykhkanloo, Adam Hall

Source Title: International Journal of Cyber Warfare and Terrorism (IJCWT) 10(2)

DOI: 10.4018/IJCWT.2020040101

Abstract

An insider threat can take many forms and fall under different categories, including the malicious insider, the careless, unaware, uneducated, or naïve employee, and the third-party contractor. Machine learning techniques have been studied in the published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. This article therefore addresses insider threat detection on an extremely imbalanced dataset, which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did reduce the time taken to build the model and the time taken to test the model. Additionally, the authors found that running the chosen classifiers with parameters other than the default ones has an impact in both the balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.

1. Introduction

Insider attacks present a considerable issue in the cyber-threat landscape, with 40% of organisations labelling the vector as the most damaging attack faced (Cole, 2017; Moradpoor, 2017). In 2016, the containment and remediation of reported insider threats cost affected organisations 4 million dollars on average (Ponemon Institute, 2016). In addition, insider threats are extremely common among cyber-incidents; in 2015, 55% of cyber-attacks were insider threat cases (Bradley, 2015). Despite the high cost and frequent occurrence of insider threat attacks, detection and mitigation remain a problem. In 2018, 90% of companies were regarded as vulnerable (Insiders, 2018). A further 38% of companies acknowledge that their insider threat detection and prevention capabilities are not adequate (Cole, 2017). This disparity demonstrates a significant gap between the current advancements in insider threat detection and the requirements of businesses. Given the availability of computational resources, it is feasible to use Machine Learning (ML) techniques to solve problems of greater complexity than has previously been possible. A strong precedent for this can be observed in recent history with the growth of the field of Big Data. It is also exemplified by the historic achievement of Google DeepMind (Hassabis, 2017) in creating a machine learning algorithm that masters the immensely complex board game Go (Silver, 2016). Most organisations have the resources to keep logs of employee interactions with technology. The data produced through logging could be digested into a format upon which predictions regarding insider threat cases could be made. That said, a data-driven approach to insider threat mitigation is not a new idea; it is a field experiencing an increasing rate of publication. However, vanguard attempts still report more effective models than later cases where machine learning has been applied (Gheyas, 2016).

In machine learning and data mining projects, an imbalanced dataset is a dataset in which the number of observations belonging to one class is considerably lower than the number belonging to the other class or classes. A predictive model employing conventional machine learning algorithms could be biased and inaccurate when employed on such datasets. This is purely because machine learning algorithms are designed to improve accuracy by reducing the error in the network. Therefore, they do not consider the class distribution, class proportion, or balance of the classes in their classification process. Bias or inaccuracy in a predictive machine learning model is especially problematic in scenarios where the minority class corresponds to malicious activity and anomaly detection is crucial. This includes scenarios such as occasional fraudulent transactions in banks, irregular insider threats, rare disease identification, natural disasters such as earthquakes, and periodic malicious activities on critical infrastructures (e.g. infrequent attacks on nuclear power plants or water supply systems in a city). Given the importance of these scenarios, an inaccurate classification by a predictive machine learning model could cost thousands of lives or impose huge costs on individuals and/or organisations. There are several techniques for solving such class imbalance problems using various sampling and non-sampling mechanisms, e.g. oversampling, undersampling, and SMOTE, as well as ensemble methods and cost-based techniques. However, the importance of an imbalanced dataset has not been clearly and adequately investigated in the literature, particularly for machine learning-based solutions for insider threat detection.
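
The effect described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy labels (990 benign, 10 malicious), the function names, and the `max_spread` parameter are all hypothetical, and `spread_subsample` is only an approximation of spread subsampling in which no class may keep more than `max_spread` times as many instances as the rarest class. The sketch shows that a classifier which always predicts the majority class scores high accuracy while detecting no minority-class cases, and that random undersampling evens out the class counts.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class instances that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


def spread_subsample(labels, max_spread=1.0, seed=0):
    """Return sorted indices of a random subsample in which no class has more
    than max_spread times the instance count of the rarest class.
    (Assumed behaviour for illustration; max_spread=1.0 gives equal class sizes.)"""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    rarest = min(len(idx) for idx in by_class.values())
    cap = int(rarest * max_spread)
    kept = []
    for idx in by_class.values():
        # Classes at or under the cap are kept whole; larger ones are sampled down.
        kept.extend(idx if len(idx) <= cap else rng.sample(idx, cap))
    return sorted(kept)


# Hypothetical toy data: 990 benign events (0) and 10 malicious events (1).
labels = [0] * 990 + [1] * 10

# A "classifier" that ignores its input and always predicts the majority class.
always_benign = [0] * len(labels)
acc = accuracy(labels, always_benign)
rec = recall(labels, always_benign)

# Undersample the majority class so both classes have the same size.
balanced_idx = spread_subsample(labels, max_spread=1.0)
balanced_counts = Counter(labels[i] for i in balanced_idx)

print(f"accuracy={acc:.2f} recall={rec:.2f} balanced={dict(balanced_counts)}")
```

In this toy setting, accuracy alone hides the fact that every malicious event is missed, which is why class-sensitive metrics such as recall are usually reported alongside it on imbalanced data.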
