Using an Artificial Neural Network to Improve Email Security

Mohamed Abdulhussain Ali Madan Maki (Ahlia University, Bahrain) and Suresh Subramanian (Ahlia University, Bahrain)
DOI: 10.4018/978-1-7998-2418-3.ch006

Abstract

Email is one of the most widely used features of the internet and the most convenient method of transferring messages electronically. However, email productivity has decreased due to phishing attacks, spam emails, and viruses. Filtering the email flow has recently become a challenging task for researchers because of the techniques spammers use to avoid detection. This research proposes an email spam filtering system that filters spam emails using an artificial back-propagation neural network (BPNN). The Enron1 dataset was used; after preprocessing, the TF-IDF algorithm was used to extract features and convert them into frequency values. To select the best features, the mutual information technique was applied. Classifier performance was measured using BoW, n-gram, and chi-squared methods. The BPNN model was compared with Naïve Bayes and support vector machine classifiers on accuracy, precision, recall, and F1-score. The results show that the proposed email spam system achieved 98.6% accuracy with cross-validation.
Chapter Preview

Background

Theoretical Background

Machine learning (ML) is the development of algorithms that permit machines to learn. ML has been used in medical diagnosis, bioinformatics, financial fraud detection, stock market analysis, classifying DNS, speech recognition, computer games, and spam filtering (Bhuiyan et al., 2018; Manikandan, 2018).

A neural network (NN) is a biologically inspired programming paradigm that enables a computer to learn from observational data. NN algorithms are now widely used for problems such as text categorization and image and speech recognition.

However, extracting emails and classifying them requires knowledge of Natural Language Processing (NLP) to normalize the datasets and to extract and select the features that feed the classifiers (Ndumiyana, Magomelo, and Sakala, 2013; Jayanthi and Subhashini, 2016).
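As a rough illustration of the feature-extraction step, the sketch below computes TF-IDF weights for a few messages in plain Python. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as log(N / df)), which may differ from the exact formula used in the chapter. The function names and the toy messages are hypothetical and are not taken from the Enron1 dataset; feature selection by mutual information is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy messages for illustration only.
emails = [
    "Win a free prize now",
    "Meeting agenda for Monday",
    "Free meeting room available now",
]
vectors = tf_idf(emails)
```

In this toy corpus a term that appears in only one message, such as "win", receives a higher weight than a term shared by two messages, such as "free", which is the behavior that makes TF-IDF useful for separating distinctive spam vocabulary from common words.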

NN is more efficient at detecting spam because it is a supervised learning method and its errors can be corrected. Naïve Bayes (NB), decision tree (DT), support vector machine (SVM), and k-nearest neighbors (KNN) are also good classifiers (Sharma, 2014).

The study will use BPNN to improve accuracy and performance in detecting email spam.
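
To make the idea of a back-propagation network concrete, here is a minimal sketch of a network with one hidden layer trained by stochastic gradient descent on a squared-error loss. It is not the chapter's model: the class name `TinyBPNN`, the layer size, the learning rate, the epoch count, and the two-feature toy data (standing in for the weights of a "spam-like" and a "ham-like" term) are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNN:
    """One hidden layer of sigmoid units and a single sigmoid output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, output

    def train(self, samples, labels, epochs=3000, lr=0.5):
        for _ in range(epochs):
            for x, target in zip(samples, labels):
                hidden, output = self.forward(x)
                # Error term at the output (squared-error loss, sigmoid unit).
                delta_out = (output - target) * output * (1.0 - output)
                # Propagate the error back to the hidden units before
                # the output weights are changed.
                delta_hidden = [delta_out * w * h * (1.0 - h)
                                for w, h in zip(self.w2, hidden)]
                # Gradient-descent updates for both layers.
                self.w2 = [w - lr * delta_out * h
                           for w, h in zip(self.w2, hidden)]
                self.b2 -= lr * delta_out
                for j, d in enumerate(delta_hidden):
                    self.w1[j] = [w - lr * d * xi
                                  for w, xi in zip(self.w1[j], x)]
                    self.b1[j] -= lr * d

    def predict(self, x):
        """Return the network's estimate that x is spam (between 0 and 1)."""
        return self.forward(x)[1]


# Hypothetical features: [weight of a spam-like term, weight of a ham-like term].
samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = legitimate

model = TinyBPNN(n_inputs=2, n_hidden=3)
model.train(samples, labels)
spam_score = model.predict([0.9, 0.1])
ham_score = model.predict([0.1, 0.9])
```

After training, `spam_score` should sit above 0.5 and `ham_score` below it. A realistic filter would feed the network the selected TF-IDF features of each email rather than two hand-made numbers, and would be evaluated with cross-validation as the abstract describes.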
