Deep Learning Based Sentiment Analysis for Phishing SMS Detection

Aakanksha Sharaff, Ramya Allenki, Rakhi Seth
DOI: 10.4018/978-1-7998-8061-5.ch001

Abstract

Sentiment analysis categorizes and identifies text-based content; the process of assigning documents to one of a set of predefined classes is commonly known as text classification. Attackers send malicious content disguised as advertisement links and use it to attack the user's system and gain information. To protect the system from this type of phishing attack, spam messages must be identified and classified. This chapter discusses and compares various classification models used for phishing SMS detection through sentiment analysis. The SMS data, collected from Kaggle, is labeled as ham or spam. Deep learning techniques, namely a Convolutional Neural Network (CNN), a CNN with 7 layers, and a CNN with 11 layers, are implemented and produce different results. These results are evaluated against baseline machine learning algorithms: Naive Bayes, Decision Trees, Support Vector Machine (SVM), and Artificial Neural Network (ANN). In this evaluation, CNN achieved the highest accuracy, 99.47%, as a classification model.

Introduction

Text Classification

Text classification is one of the most important parts of text analysis. It is defined as the process of interpreting and extracting important information from textual data, which can be of any type, such as SMS, Twitter data, emoji, and other short messages. Classification is also a major part of sentiment analysis, which measures people's attitudes from the text through which they share their views. Views differ with user intent: content seen on the internet is sometimes inappropriate, such as abusive language or pornographic content, and sentiment analysis also deals with classifying such data. It likewise helps policymakers understand market trends that depend solely on users' reviews, feedback, and ratings. From a research point of view, major challenges that can be addressed through sentiment analysis include spam filtering, phishing attack detection, categorization, and summarization. Over the decades, spam and phishing classification has been among the most researched topics, using techniques such as machine learning and deep learning. A good text classifier efficiently categorizes large sets of text documents in a reasonable amount of time with acceptable accuracy. Many techniques and algorithms for automatic text categorization have been devised.
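
As a rough illustration of what such a classifier does, the sketch below trains a tiny multinomial Naive Bayes model (one of the baseline algorithms named in the abstract) on four invented messages. It is not the chapter's implementation: the toy messages, the whitespace tokenizer, the add-one smoothing, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under the model."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, not the Kaggle dataset used in the chapter.
toy_messages = [
    "win a free prize now",
    "free entry claim your prize",
    "are we meeting for lunch today",
    "see you at home tonight",
]
toy_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(toy_messages, toy_labels)
```

Calling `predict(model, "claim your free prize")` on this toy model picks the spam class, because every word in the query occurs only in the spam examples.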

Key Terms in this Chapter

Long Short-Term Memory: A deep learning architecture with feedback (recurrent) connections. Its distinguishing feature is that it can process whole sequences of data rather than single data points.

Punctuation Count: The total number of punctuation characters present in ham and spam messages.
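
A minimal sketch of how this feature could be computed for a single message, assuming that "punctuation" means the ASCII characters in Python's `string.punctuation`. The function name `punctuation_count` is hypothetical.

```python
import string


def punctuation_count(message):
    """Count the ASCII punctuation characters in one message."""
    return sum(1 for ch in message if ch in string.punctuation)
```

Applied to every message in the dataset, this gives one number per message that can be compared between the ham and spam classes.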

Pooling: It is used to decrease the resolution of the feature map while preserving the features that are required for classification.
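
The sketch below shows one common form, one-dimensional max pooling over a list of numbers. The choice of max pooling, the default window size and stride of 2, and the name `max_pool_1d` are assumptions for illustration; the chapter preview does not say which pooling variant is used.

```python
def max_pool_1d(feature_map, pool_size=2, stride=2):
    """Keep the largest value in each window of the feature map."""
    if pool_size <= 0 or stride <= 0:
        raise ValueError("pool_size and stride must be positive")
    last_start = len(feature_map) - pool_size
    return [
        max(feature_map[i:i + pool_size])
        for i in range(0, last_start + 1, stride)
    ]
```

With the defaults, a feature map of six values is reduced to three, each one the strongest activation in its window.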

Frequency: The number of times a word appears during text processing.
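
Word frequencies of this kind can be sketched with the standard library's `collections.Counter`. The regular expression used as a tokenizer and the name `word_frequencies` are assumptions for this example.

```python
import re
from collections import Counter


def word_frequencies(messages):
    """Count how often each lowercase word occurs across all messages."""
    counts = Counter()
    for message in messages:
        # Assumption: a word is a run of letters, digits, or apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", message.lower()))
    return counts
```

The resulting counter also offers `most_common(n)`, which is convenient for listing the most frequent words in each class.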

Message Length: The length of each ham and spam message, which is later used to find the maximum length.
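
A short sketch of this feature, assuming length is measured in characters (it could equally be measured in words or tokens; the preview does not specify). Both function names are hypothetical.

```python
def message_lengths(messages):
    """Return the length in characters of each message."""
    return [len(message) for message in messages]


def max_message_length(messages):
    """Return the longest message length, or 0 for an empty collection."""
    return max(message_lengths(messages), default=0)
```

The maximum length is typically what fixes the input size when messages are padded to a common length before being fed to a neural network.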

Gated Recurrent Unit: An advanced variant of the recurrent neural network (RNN). A GRU uses gates to control the flow of information, in a two-step process involving a reset gate and an update gate.
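
To make the two gates concrete, here is a minimal sketch of one GRU time step with scalar input and scalar hidden state. It follows one common formulation; some references swap the roles of `z` and `1 - z` in the final blend. The parameter names in the dictionary `p` are hypothetical, and real models use weight matrices rather than single numbers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU step for scalar input x and scalar previous state h_prev.

    p is a dict of scalar parameters (hypothetical names):
    w_* weight the input, u_* weight the previous state, b_* are biases.
    """
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Update gate: how much of the candidate replaces the previous state.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Candidate state, computed from the input and the reset-scaled state.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend the previous state and the candidate according to the update gate.
    return (1.0 - z) * h_prev + z * h_cand
```

Processing a sequence means calling `gru_step` once per element and passing each returned state in as `h_prev` for the next call.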
