Spam Filtering and Detection: State of the Art and Overview

Yasmin Bouarara (Dr. Tahar Molay University of Saida, Algeria)
Copyright: © 2019 | Pages: 11
DOI: 10.4018/978-1-5225-7338-8.ch010


In today's world of globalization and borderless technology, the emergence of the internet and the rapid development of telecommunications have made the world a global village. Email has become immensely popular and is now the main means of communication because it is cheap, reliable, fast, and easy to access. It allows anyone with a mailbox (BAL) and an email address to exchange messages (images, files, and text documents) from anywhere in the world via the internet. Unfortunately, this technology has also become a major source of malicious activity, in particular unwanted email (spam), which has increased dramatically over the past decade. According to a 2012 report from the Radicati Group, which provides quantitative and qualitative research on e-mail, security, and social networks, 70-80% of email traffic consists of spam. The goal of the chapter is to present the state of the art on spam and spam techniques, and the disadvantages of this phenomenon.
Chapter Preview


With the explosion of Internet use, unsolicited mail has moved from our home mailbox to our electronic mailbox. Nowadays, social networking sites such as Facebook, Myspace, and Twitter have become one of the main ways for users to keep track of and communicate with their friends online. Meanwhile, the number of electronic mailboxes is increasing. Each user has at least one email address, and there are at least 800 million electronic mailboxes (BALE) worldwide. Such a large reachable audience is of course a boon to advertisers, but email is also a favored channel for spammers, scammers, hackers, and political and publicity campaigns.

In this chapter we give an overview of the preliminary concepts of spam detection and of the different spam detection techniques that exist in the literature.

Spam History

The term “SPAM” originates from a 1970 Monty Python’s Flying Circus skit. In this skit, all the restaurant’s menu items devolve into SPAM. When the waitress repeats the word SPAM, a group of Vikings in the corner sing “SPAM, SPAM, lovely SPAM, wonderful SPAM”, drowning out other conversation, until they are finally told to shut up.

Although the first spam message had already been sent via telegram in 1864, and the first commercial spam e-mail was sent in 1978, the term spam had not yet been applied to this practice. In the 1980s, it was adopted to describe certain users of bulletin board systems (BBS, computer systems running software that allows users to dial into the system over a phone line or Telnet) who would repeat “SPAM” a huge number of times to scroll other users’ text off the screen, and it was used in the same way in early chat room services such as AOL in its early days (Glasner, 2001).


Spam is considered to be an unsolicited commercial electronic message (Figure 1). It is often a source of scams, computer viruses, and offensive content that takes up valuable time and increases costs for consumers, businesses, and governments (Cormack, 2007).

Figure 1.

A model of spam email


The Different Types of Spam

The most common form of spam is of course email spam. Nevertheless, spam also takes other forms:

Spam Voice Over IP

VoIP spam, also called SPIT or vishing, is a newer kind of spam delivered by telephone. It takes the form of anonymous calls, placed at any time of day or night, that try to extract personal information in the manner of phishing (Saberi, 2007).

The Spam Messages in the Discussion Forums

This is an advertising message (containing hyperlinks of a commercial nature) left on forums. The goal is the same as that of spam received by email: to advertise for free (Saberi, 2007).

Spam in Blogs (SIG)

It is called SPLOG (a contraction of spam and blog). It is a very popular technique that consists of leaving links to advertising sites on blogs visited by Internet users (Fumera, 2007).


Phishing, called filoutage or hameçonnage in French, is illustrated in Figure 2. It is a technique by which attackers pose as familiar major corporations or financial institutions by sending fraudulent e-mails, in order to obtain bank account passwords or credit card numbers. In this case the hacker could create a false social network page (Facebook, Twitter, etc.) that appears entirely legitimate. Then, when you try to log in to the fake page, it saves your user name and password [A3].

Figure 2.

A phishing model (Chirita, 2005)

