Northern Bald Ibis Algorithm-Based Novel Feature Selection Approach


Ravi Kumar Saidala
DOI: 10.4018/IJSSCI.2019100102

Abstract

Email has become one of the most popular and flexible web- and mobile-based applications for communication. For decades, the most severe problem in email applications has been unwanted email. Electronic spam, also referred to as spam email, consists of unsolicited and unwanted messages. Keeping an email mailbox clean by detecting and eliminating all spam mails is a challenging task. Classification-based email filtering is one of the best approaches used by many researchers to deal with the spam email filtering problem. In this work, the NOA optimization algorithm and the SVM classifier are used to obtain an optimal feature subset of the Enron-spam dataset and to classify that subset. NOA is a recently developed metaheuristic algorithm that mimics the energy-saving flight pattern of the Northern Bald Ibis (Threskiornithidae). Performance comparisons have been made with other existing methods, and the analysis and comparison of the classification results demonstrate the superiority of the proposed novel feature selection approach.

1. Introduction

Electronic mail, or email, has become one of the most popular and attractive web and mobile applications. It is a widely used feature of the internet that allows users worldwide to communicate by sending and receiving different kinds of messages through an email address. More plainly, a user can send and receive text files, images, PDF files, etc., either to an individual or to a group of individuals, seamlessly through a network (Golan et al., 2015). With the cheaper bandwidth rates offered by network service providers, even novice users can use email applications with ease. With the advent of mobile phones, and especially mobile apps, email usage has grown tremendously. Most existing mail providers offer email services at no cost up to a notable amount of storage space; in fact, this is one of the main reasons that email service providers account for a notable share of internet traffic. According to the statistics, there are around 6 billion active email users across the globe using email on a regular basis for different purposes, of which around 2 billion people actively use email for personal or business purposes (Stolfo et al., 2019; Ashminov & Stein, 2019).

On the one hand, this seems to be one of the greatest achievements of the current tech-driven world; unfortunately, it has also become a business strategy for spammers, who flood users' mailboxes with business-related emails and profit from it (Hatton & John, 2017). Spam mails force users to spend time classifying mails into their desired categories and segregating them. Most users are unhappy with spam emails because they fill the inbox with unwanted messages and waste valuable time every time the mailbox is opened. Cleaning a user's mailbox by detecting and eliminating all spam mails manually is not a recommended approach, so automated spam filtering tools are needed. These tools must analyse all incoming mails effectively and categorize them as relevant or irrelevant. So far, many automated mail classifiers have been developed and are extensively used by different mail service providers to classify incoming emails into different categories, but many of them do not adequately address the problem of eliminating spam emails from the inbox (Khan et al., 2015; Yu & Xu, 2008).

Most real-world datasets contain numerous features that may or may not be relevant to the solution; only the relevant features should be extracted from the dataset. Removing unwanted and redundant features from the dataset has a direct impact on the results. The performance of machine learning strategies depends on the type and number of features extracted from the given problem; in fact, these strategies are largely determined by how the features of the dataset are selected to obtain a proper solution. In general, feature extraction is performed either by a human expert or by automated feature extraction tools such as Principal Component Analysis, Deep Belief Networks, fuzzy-based systems, rule-based techniques, etc. (Idris et al., 2014; Idris et al., 2015; Wu, 2009). In this paper, we present a new automated optimal feature extraction method for classifying the Enron-spam dataset. A recently introduced metaheuristic algorithm in computational intelligence, the Northern Bald Ibis optimization algorithm (NOA), is used to extract the optimal feature subset, as sketched below.
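To illustrate the kind of wrapper-style evaluation that drives such metaheuristic feature selection, the following Python sketch scores a candidate binary feature mask by the cross-validated accuracy of an SVM and wraps it in a simple random search. This is a minimal sketch under stated assumptions: the function names (subset_fitness, random_search_select), the SVM hyperparameters, and the random-search loop are illustrative placeholders, not the NOA update equations described in this paper; the feature matrix X and labels y are assumed to have been prepared from the Enron-spam corpus beforehand.

```python
# Minimal sketch of wrapper-based feature-subset evaluation with an SVM.
# The search loop is a plain random-search placeholder, NOT the NOA update
# rules from the paper; X (numpy feature matrix) and y (labels) are assumed
# to be prepared from the Enron-spam corpus in advance.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def subset_fitness(mask, X, y):
    """Fitness of a binary feature mask: mean 5-fold SVM accuracy.
    Empty subsets receive zero fitness."""
    if not mask.any():
        return 0.0
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # illustrative settings
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

def random_search_select(X, y, n_iters=50, rng=None):
    """Placeholder search: samples random binary masks and keeps the best.
    A metaheuristic such as NOA would replace this loop with its own
    position-update equations while reusing subset_fitness()."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    best_mask, best_fit = None, -1.0
    for _ in range(n_iters):
        mask = rng.random(n_features) < 0.5          # random candidate subset
        fit = subset_fitness(mask, X, y)
        if fit > best_fit:
            best_mask, best_fit = mask, fit
    return best_mask, best_fit
```

In a NOA-based selector, each search agent would encode such a mask and be updated according to the algorithm's flight-formation rules; only the fitness evaluation above would remain unchanged, and the best subset found would then be classified with the SVM.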

The remainder of the paper is organized as follows. Section II outlines the review of the literature. Section III presents the existing optimal feature selection method, in which an improved whale optimization algorithm is described. Section IV presents the proposed optimal feature selection method, where the standard Northern Bald Ibis Algorithm and the SVM classification technique are described. The experimental result analysis is furnished in Section V. Conclusions and future scope are presented in the last section.
