Image Spam Detection Scheme Based on Fuzzy Inference System

Copyright: © 2017 |Pages: 19
DOI: 10.4018/978-1-68318-013-5.ch007

Abstract

The evasion techniques used by image spam impose new challenges for e-mail spam filters. Effective image spam detection requires the selection of discriminative image features and a suitable classification scheme. Existing research on image spam detection utilizes only visual features such as color, appearance, shape and texture, while no effort has been made to employ statistical noise features. Further, most image spam classification schemes assume the existence of a clear-cut demarcation between the features extracted from genuine image and image spam datasets. In this chapter, we attempt to solve these issues by proposing a novel server-side solution called F-ISDS (Fuzzy Inference System based Image Spam Detection Scheme). F-ISDS considers statistical noise features along with the standard image features and meta-data features. F-ISDS employs dimensionality reduction using Principal Component Analysis (PCA) to map a selected set of n features into a set of m principal components. Based on the selected significant principal components, input/output membership functions and rules are designed for the Fuzzy Inference System (FIS) classifier. FIS provides a computationally simple and intuitive means of performing image spam detection. The e-mail server can tag each e-mail with this knowledge so that the client can make a decision according to its local policy. Further, Linear Regression Analysis is used to model the relationship between the selected principal components and the extracted features for the classification phase. Experimental results confirm the efficacy of the proposed solution.
Chapter Preview

7.2. Proposed F-ISDS

The proposed model consists of a Feature Extraction Module, a PCA Module, a Linear Regression Analysis Module and an FIS-based classifier, as shown in Figure 1.

Figure 1. Proposed F-ISDS model

Given a batch of image attachments from an e-mail server, the Feature Extraction Module first extracts significant features from each image. During the training phase, the selected set of n features is mapped onto a set of m principal components using the PCA Module. Linear Regression Analysis is used to model the relationship between the principal components and the extracted features. Based on the selected significant principal component values, input/output membership functions and rules are designed for the Fuzzy Inference System (FIS) based classifier. In the classification phase, the features are extracted from the test image dataset and are used to predict the principal components, using the linear equations modeled during the training phase. The predicted principal component values are then fed to the designed FIS, which classifies each image as Genuine (Ham) or Spam.
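The training and classification phases described above can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The feature matrix is synthetic random data, the choice of n = 6 features and m = 2 principal components is arbitrary, and the helper names (`fit_pca`, `fit_linear_map`, `predict_pcs`, `ramp_up`, `fis_score`), the ramp-shaped membership functions, the four-rule base and the 0.5 decision threshold are all hypothetical. The preview does not specify the actual membership functions or rules, so a simple zero-order Sugeno-style inference with weighted-average defuzzification stands in for the FIS.

```python
import numpy as np

def fit_pca(X, m):
    """Return the feature mean and the top-m principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:m]

def fit_linear_map(X, Z):
    """Least-squares fit of Z ~ [1, X] @ W (one linear equation per component)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    W, *_ = np.linalg.lstsq(A, Z, rcond=None)
    return W

def predict_pcs(X, W):
    """Predict principal components from raw features with the fitted equations."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return A @ W

def ramp_up(x, lo, hi):
    """Membership in 'HIGH': 0 at or below lo, 1 at or above hi, linear between."""
    return float(np.clip((x - lo) / (hi - lo), 0.0, 1.0))

def fis_score(pc, bounds):
    """Assumed rule base over two components; returns a spam score in [0, 1]."""
    high = [ramp_up(v, lo, hi) for v, (lo, hi) in zip(pc, bounds)]
    low = [1.0 - h for h in high]
    # (firing strength via min-AND, crisp output level); 0 = ham, 1 = spam.
    rules = [
        (min(high[0], high[1]), 1.0),
        (min(high[0], low[1]), 0.7),
        (min(low[0], high[1]), 0.3),
        (min(low[0], low[1]), 0.0),
    ]
    total = sum(w for w, _ in rules)
    # Weighted-average defuzzification.
    return sum(w * out for w, out in rules) / total

# Training phase on synthetic data (200 images, n = 6 features, m = 2).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 6))
mean, comps = fit_pca(X_train, m=2)
Z_train = (X_train - mean) @ comps.T
W = fit_linear_map(X_train, Z_train)
# Membership ranges taken from the observed training component values.
bounds = [(z.min(), z.max()) for z in Z_train.T]

# Classification phase: features -> predicted components -> FIS -> label.
X_test = rng.normal(size=(5, 6))
Z_test = predict_pcs(X_test, W)
labels = ["Spam" if fis_score(z, bounds) >= 0.5 else "Ham" for z in Z_test]
print(labels)
```

Because the principal components are an affine function of the features, the fitted linear equations reproduce the PCA projection on the training data. In a real deployment the direction that counts as "HIGH means spam" would have to be determined from labeled training images, since the sign of a principal component is arbitrary.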
