
Spam Image Clustering for Identifying Common Sources of Unsolicited Emails

Chengcui Zhang, Xin Chen, Wei-Bang Chen, Lin Yang, Gary Warner

Source Title: New Technologies for Digital Crime and Forensics: Devices, Applications, and Software

DOI: 10.4018/978-1-60960-515-5.ch006

Abstract

In this article, we propose a spam image clustering approach that uses data mining techniques to study the image attachments of spam emails, with the goal of aiding the investigation of spam clusters or phishing groups. Spam images are first modeled based on their visual features. In particular, the foreground text layout, foreground picture illustrations, and background textures are analyzed. After the visual features are extracted from spam images, we use an unsupervised clustering algorithm to group visually similar spam images into clusters. The clustering results are evaluated by visual validation, since there is no prior knowledge of the actual sources of the spam images. Our initial results show that the proposed approach is effective in identifying the visual similarity between spam images and can thus provide important indications of the common source of spam images.

Introduction

Spamming is a problem that affects people all over the world. Spam is unsolicited email that has been sent to many people. There can be legal spam, where the sender gives proper contact information and offers an option to stop receiving the messages. However, in almost all situations, spam is illegal: it is mail that the recipient did not ask to receive and did not give the sender permission to send. Spam falsifies the sender information to prevent anyone from finding out where it was sent from. Botnets are networks of compromised machines that send spam continuously. Today, botnets are the main choice for cyber criminals who seek to conceal their identities by using third-party computers as vehicles for their crimes (www.cnn.com/2007/TECH/11/29/fbi.botnets). Spam may attempt to sell a product or convey a message, or it may try to trick recipients into infecting their machines, for example by luring them to a website that can infect them.

Spam causes many problems for internet users. More than 90% of the emails sent on the internet are spam. Billions of dollars of counterfeit software, electronics, shoes, watches, and other goods are sold through spam advertisements, causing huge financial losses to the legitimate companies. Spam emails claiming to be from banks may also lure users into giving up their usernames and passwords. Besides spreading software piracy and viruses, spam is also the primary means of phishing and identity theft. Therefore, spam email analysis is one of the most important topics in cyber security. The most effective way of controlling spam emails at the moment is filtering (Carreras & Màrquez, 2001; Clark, Koprinska, & Poon, 2003; Drucker, Wu, & Vapnik, 1999; Sanpakdee, Walairacht, & Walairacht, 2006). However, filters can only differentiate spam emails from non-spam emails; they cannot tell the origins of spam. To hide their origins, escape detection and spam filter analysis, and conceal the fact that relatively few organizations create the vast majority of these unsolicited emails, criminals use a variety of intentional obscuring techniques. For example, one technique is to present text primarily as an image, which avoids traditional computer-based filtering of the text. Spam images are sent for two reasons: 1) for advertisement purposes; 2) to hide the textual contents of an email from spam filters. When a message contains no words, spam filters cannot determine the nature of the message.

Spam images are harder to detect than text spam. They are created by embedding text into images and applying content-obscuring techniques to defeat spam blocking. Spammers use the following methods to defeat traditional anti-spam technologies such as fingerprinting (e.g., md5 (Rivest, 1992)), OCR, and URL blocklists.

1. Text can be embedded in an image that appears as normal text to the recipient, but spam blocking technologies will never be able to “see” the text because it is actually an image.
2. Spammers vary the space between words and lines and also add random speckles to make messages look different to different recipients, even though all of them carry the same text. In this way, they evade fingerprinting technology such as md5 (Rivest, 1992) by making the images appear unique to standard spam analysis.
3. The use of different colors and varying font sizes defeats OCR-based spam detection. Splitting one word into two halves with a gap in between also deceives OCR techniques.
4. Botnets are also becoming more efficient and can produce a large number of random images within a short time.
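
The fingerprint evasion described in item 2 can be illustrated with a minimal sketch. It is not taken from the chapter: the image is modeled, as an assumption, as a flat buffer of grayscale pixel bytes, and the helper names (`md5_fingerprint`, `add_speckles`, `fraction_changed`) are hypothetical. It shows that a handful of random speckles yields a completely different md5 digest even though almost every pixel is unchanged.

```python
import hashlib
import random


def md5_fingerprint(pixels: bytes) -> str:
    """Return the md5 hex digest of a raw pixel buffer."""
    return hashlib.md5(pixels).hexdigest()


def add_speckles(pixels: bytes, count: int, seed: int) -> bytes:
    """Invert `count` distinct, randomly chosen pixels (hypothetical speckle model)."""
    rng = random.Random(seed)
    noisy = bytearray(pixels)
    for index in rng.sample(range(len(noisy)), count):
        noisy[index] ^= 0xFF  # inverting a byte always changes its value
    return bytes(noisy)


def fraction_changed(a: bytes, b: bytes) -> float:
    """Fraction of pixel positions at which two equal-sized buffers differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Assumed stand-in for a spam image: a 200x100 all-white grayscale canvas.
WIDTH, HEIGHT = 200, 100
base_image = bytes([255]) * (WIDTH * HEIGHT)

# Two "recipients" receive the same image with different random speckles.
variant_a = add_speckles(base_image, count=20, seed=1)
variant_b = add_speckles(base_image, count=20, seed=2)

print("base     :", md5_fingerprint(base_image))
print("variant a:", md5_fingerprint(variant_a))
print("variant b:", md5_fingerprint(variant_b))
print("pixels differing between variants:",
      f"{fraction_changed(variant_a, variant_b):.4%}")
```

Because an exact-match fingerprint treats each variant as a new image, grouping such images requires features that tolerate small pixel-level changes, which is the motivation for clustering on visual features.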
