Visual Feature-Based Image Spam Filters

Source Title: Advanced Image-Based Spam Detection and Filtering Techniques

DOI: 10.4018/978-1-68318-013-5.ch006

Abstract

This chapter details visual-feature-based image spam filters, reviews the literature on these filters, and discusses their limitations. These methods are generally computationally efficient and more accurate in the presence of various kinds of noise than OCR-based detection schemes, as they do not include any text recognition stage (Lamia et al., 2012). The previously discussed near-duplicate spam detection methods are likely to perform well in abstracting base templates when given enough examples of the spam templates in use (Mehta et al., 2008). However, the generalization ability of those methods will be limited. Visual-feature-based spam detection methods are generally built on high-level and/or low-level image features (see Chapter 3 of this book) related to the color, shape, and texture characteristics of spam images; hence they generalize better (Lamia et al., 2012). Most of these techniques exploit the text-intensive and noisy nature of spam images.

Chapter Preview

6.1. Previous Work

This section reviews notable work on image-feature-based techniques for spam detection.

Aradhye et al. (2005) exploit the text-intensive nature of image spam by calculating three features: 1) the extent-of-text feature, the fraction of the total image area that falls within text regions; 2) the color saturation feature, the fraction of the pixels in the image for which the difference max(R, G, B) - min(R, G, B) exceeds a threshold T (here, T = 50); and 3) the color heterogeneity feature. Figure 1 shows the distribution of the color saturation feature for spam and ham images; the two classes are well separated on this feature. No comparable separation is observed for the color heterogeneity feature. The authors claimed approximately 80% detection accuracy.

Figure 1. Distribution of color saturation feature for spam and ham images
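
The color saturation feature described above can be illustrated with a minimal sketch. It assumes the image is already available as a flat sequence of (R, G, B) tuples with 8-bit channels; the function name `color_saturation_fraction` is hypothetical and is not taken from Aradhye et al. (2005).

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # assumed 8-bit (R, G, B) values


def color_saturation_fraction(pixels: Iterable[Pixel], threshold: int = 50) -> float:
    """Fraction of pixels whose max(R, G, B) - min(R, G, B) exceeds `threshold`.

    The default threshold of 50 follows the value quoted in the text.
    Returns 0.0 for an empty image.
    """
    total = 0
    saturated = 0
    for r, g, b in pixels:
        total += 1
        if max(r, g, b) - min(r, g, b) > threshold:
            saturated += 1
    return saturated / total if total else 0.0


if __name__ == "__main__":
    # One strongly colored pixel out of four exceeds the threshold.
    sample = [(255, 0, 0), (10, 10, 10), (100, 60, 50), (100, 50, 50)]
    print(color_saturation_fraction(sample))  # 0.25
```

A single scalar per image such as this one can then be passed to any classifier alongside the other features.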

The method proposed by Nhung and Phuong (2007) uses computationally efficient edge-based feature extraction to compute a vector of similarity measures (L1 distances) from an image to a small set of templates. Edge Directions (ED) and the Edge Orientation Autocorrelogram (EOAC) serve as edge-based features that are invariant to translation and scale. ED is a histogram of edge angles and reflects global shape information. The scheme exploits the text-intensive nature of image spam, since text elements have distinctive shape characteristics that differentiate them from the background and other elements. Figure 2(b) shows the output of edge detection with the Sobel operator for the sample spam image in Figure 2(a).

Figure 2. Edge direction feature for sample spam image
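
As a rough illustration of the ED feature and the L1 similarity measure, the sketch below computes a normalized histogram of Sobel edge orientations for a grayscale image given as a list of rows. The bin count, the magnitude threshold, and the function names are assumptions made for this example; EOAC and the template-selection step of Nhung and Phuong (2007) are not reproduced.

```python
import math
from typing import List, Sequence

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def edge_direction_histogram(gray: Sequence[Sequence[float]],
                             bins: int = 8,
                             min_magnitude: float = 100.0) -> List[float]:
    """Normalized histogram of gradient orientations in [0, pi) over edge pixels.

    `bins` and `min_magnitude` are illustrative choices, not values from the paper.
    Border pixels are skipped because the 3x3 kernel does not fit there.
    """
    hist = [0.0] * bins
    height = len(gray)
    width = len(gray[0]) if height else 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            if math.hypot(gx, gy) < min_magnitude:
                continue  # not an edge pixel
            angle = math.atan2(gy, gx) % math.pi  # fold opposite directions together
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
    horizontal_edge = [[0] * 5 for _ in range(3)] + [[255] * 5 for _ in range(3)]
    h1 = edge_direction_histogram(vertical_edge)
    h2 = edge_direction_histogram(horizontal_edge)
    print(l1_distance(h1, h2))
```

Because each histogram is normalized by the number of edge pixels, the comparison does not depend on image size; a feature vector for classification could be formed from the L1 distances to each template histogram.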

The authors used the SVM classifier in the Weka tool and experimented on a personal dataset only. They claim an overall accuracy of 80% for the scheme. Using edge-based features alone allows fast processing and captures regularities in the shapes of text-intensive spam images, but it may fail to generalize.

In the same year, Byun et al. (2007) considered four spam image properties for image-based spam detection: color moment, color heterogeneity, conspicuousness, and self-similarity. They applied multi-class characterization instead of single-class characterization to improve detection robustness, together with the maximal figure-of-merit (MFoM) learning algorithm to design the classifiers. Spam images are first categorized as either text-intensive synthetic (artificially modified) images with diverse background regions or non-synthetic images with no artificial modifications. Figure 3(a)-(f) shows the distribution of first- and second-order color moments in spam and ham images. The first-order moments show wider separation than the second-order central moments here.

Figure 3. Distribution of first and second order color moments in spam and ham images
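
The first- and second-order color moments can be sketched as a per-channel mean and standard deviation. This is a minimal illustration that assumes RGB input as a list of tuples; the color space and exact normalization used by Byun et al. (2007) are not stated in this section, and the function name `color_moments` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def color_moments(pixels: Sequence[Tuple[float, float, float]]) -> List[Tuple[float, float]]:
    """Return (mean, standard deviation) for each of the three channels.

    The mean is the first-order moment; the standard deviation is the square
    root of the second-order central moment (population variance).
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("image has no pixels")
    moments = []
    for channel in range(3):
        values = [pixel[channel] for pixel in pixels]
        mean = sum(values) / n
        variance = sum((value - mean) ** 2 for value in values) / n
        moments.append((mean, math.sqrt(variance)))
    return moments


if __name__ == "__main__":
    sample = [(0, 10, 100), (10, 10, 200)]
    for name, (mean, std) in zip("RGB", color_moments(sample)):
        print(name, mean, std)
```

The six resulting numbers form a compact color descriptor that can be concatenated with the other features.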

The authors calculated color heterogeneity by first scaling the image by the maximum possible intensity in the RGB channels and then converting the scaled image to an indexed image using minimum variance quantization. The RMS error between the original image and the indexed image is used as the color heterogeneity feature. We found this feature not to be significant in our experiments, although natural images have more color heterogeneity and hence lower RMS errors than spam images. Computing the conspicuousness feature (based on the high-contrast property of spam images) and the self-similarity feature (based on the uniform-background property of spam images) is computationally expensive. The authors claimed a detection rate of 81.5% with 5.6% misclassification of legitimate images, and good performance compared with the scheme of Aradhye et al. (2005).
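
The scale, quantize, and RMS-error steps of the color heterogeneity feature can be sketched as follows. Note the substitution: this example uses simple uniform per-channel quantization as a stand-in for minimum variance quantization, so its values will differ from those of the original method. The function name and the `levels` parameter are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def color_heterogeneity(pixels: Sequence[Tuple[int, int, int]],
                        levels: int = 4,
                        max_intensity: int = 255) -> float:
    """RMS error between a scaled image and a coarsely quantized copy of it.

    Each channel is scaled to [0, 1] by `max_intensity`, snapped to the nearest
    of `levels` evenly spaced values (uniform quantization, NOT minimum variance
    quantization), and compared with the scaled original.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    if not pixels:
        raise ValueError("image has no pixels")
    step = 1.0 / (levels - 1)
    squared_error = 0.0
    count = 0
    for pixel in pixels:
        for channel in pixel:
            scaled = channel / max_intensity
            quantized = round(scaled / step) * step
            squared_error += (scaled - quantized) ** 2
            count += 1
    return math.sqrt(squared_error / count)


if __name__ == "__main__":
    two_tone = [(0, 0, 0), (255, 255, 255)]
    mid_gray = [(128, 128, 128)]
    print(color_heterogeneity(two_tone, levels=2))
    print(color_heterogeneity(mid_gray, levels=2))
```

An image whose colors already coincide with the quantization levels yields zero error, while colors that fall between levels raise the error; a palette-adaptive quantizer would change which images score low.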
