An Integrated Machine Learning Framework for Fraud Detection: A Comparative and Comprehensive Approach

Karim Ouazzane, Thekla Polykarpou, Yogesh Patel, Jun Li
Copyright: © 2022 |Pages: 17
DOI: 10.4018/IJISP.300314

Abstract

The research develops a practical Machine Learning framework that takes a comparative and comprehensive approach to sequence-learning and then detecting online banking payment fraud. The integrated framework combines exploratory analysis and feature engineering, multiple modelling and performance comparison, and model robustness, uncertainty and sensitivity analysis into a systematic approach for Machine Learning applications. For demonstration purposes, the framework is implemented on a set of real-life online banking transaction datasets obtained from a UK-based bank using three models: a Support Vector Machine, a Markov Model and an LSTM model. The models are applied to various feature combinations identified in the exploratory analysis, and the modelling covers noise ratios of the datasets, ranges of model parameter values and confidence intervals of the prediction results. The modelling results show that, overall, the LSTM model achieves the best performance, with an accuracy of 97.7%, indicating its advantage in modelling sequential data such as customer behaviours.
Article Preview

2. Recent Development In Banking Fraud Detection

In the fields of credit card fraud detection and computer intrusion, most of the work carried out so far uses learning techniques such as Support Vector Machines (SVM), Neural Networks (NN), Decision Trees, Logistic Regression and Markov models.

SVM binary classification is well suited to separating legitimate transactions from fraudulent ones. SVM has an advantage over NN because NN training is a non-convex problem with multiple local minima, in which optimisation tends to get stuck at a local minimum or saddle point, whereas SVM training is a convex problem (Cristianini and Taylor, 2000; Vapnik, 1999). Another weakness of a Neural Network model is the large number of hyperparameters that must be tuned (Cristianini and Taylor, 2000). Sahin and Duma (2011) explore the use of SVM for credit card fraud detection based on feature engineering and profiling techniques, and then compare the model with a decision tree. The results show that SVM generalises better than the decision tree on a limited dataset. However, its performance decreased as the dataset size grew.

Several studies (Benard, 2007; Ando, 2016; Li, 2017; Batani, 2017) address fraud detection as a sequence classification problem. A variant of Recurrent Neural Networks (RNN) called Long Short-Term Memory (LSTM) is used to profile customer behaviours over timesteps. It processes sequences of values and shares its parameters across different parts of the model (Goodfellow et al., 2013). Sharing statistical strength over timesteps and across different input sequence lengths allows it to generalise well on time-series problems. The state of the system at a given point in time can be given by:

$$S_t = f(S_{t-1}, X_t)$$
where $S_t$ is the state of the system at time $t$, $f$ is a deterministic function and $X_t$ is the input vector. Ando et al. (2016) apply RNNs and SVMs to profile fraudulent behaviours from web log data and show that the RNN model with LSTM outperforms the other models. Wise (2007) also demonstrates the advantage of LSTM over Feed-Forward Neural Networks and SVM models in modelling customer behaviours to better detect credit card fraud.
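The recurrent state update described above can be illustrated with a minimal sketch. It assumes the standard recurrence in which the new state depends on the previous state and the current input, S_t = f(S_{t-1}, X_t), and it uses a plain tanh layer for f rather than a full LSTM cell. The names `step`, `run_sequence`, `W_s`, `W_x` and `b`, the dimensions and the random inputs are all hypothetical and are not taken from the article. The point it shows is that the same parameters are reused at every timestep, so one model handles sequences of different lengths.

```python
import math
import random


def step(state, x, W_s, W_x, b):
    """One deterministic update S_t = f(S_{t-1}, X_t); f is assumed to be a tanh layer."""
    new_state = []
    for i in range(len(state)):
        total = b[i]
        total += sum(W_s[i][j] * state[j] for j in range(len(state)))
        total += sum(W_x[i][k] * x[k] for k in range(len(x)))
        new_state.append(math.tanh(total))
    return new_state


def run_sequence(xs, W_s, W_x, b):
    """Apply step over a whole sequence, reusing the same parameters at each timestep."""
    state = [0.0] * len(b)  # initial state S_0, assumed to be all zeros
    for x in xs:
        state = step(state, x, W_s, W_x, b)
    return state


random.seed(0)
STATE_DIM, INPUT_DIM = 4, 3  # hypothetical sizes

# Shared parameters: created once, used for every timestep of every sequence.
W_s = [[random.gauss(0, 0.5) for _ in range(STATE_DIM)] for _ in range(STATE_DIM)]
W_x = [[random.gauss(0, 0.5) for _ in range(INPUT_DIM)] for _ in range(STATE_DIM)]
b = [0.0] * STATE_DIM

# Two synthetic "customer" sequences of different lengths (random stand-ins for features).
short_seq = [[random.gauss(0, 1) for _ in range(INPUT_DIM)] for _ in range(5)]
long_seq = [[random.gauss(0, 1) for _ in range(INPUT_DIM)] for _ in range(12)]

final_short = run_sequence(short_seq, W_s, W_x, b)
final_long = run_sequence(long_seq, W_s, W_x, b)
print(len(final_short), len(final_long))
```

Both sequences end in a state vector of the same size, which a downstream classifier could consume. An LSTM replaces the single tanh layer with gated updates, but the parameter sharing across timesteps works the same way.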
