Spam Detection Approaches with Case Study Implementation on Spam Corpora

Biju Issac
ISBN13: 9781609600150|ISBN10: 1609600150|EISBN13: 9781609600174
DOI: 10.4018/978-1-60960-015-0.ch012
MLA

Issac, Biju. "Spam Detection Approaches with Case Study Implementation on Spam Corpora." Cases on ICT Utilization, Practice and Solutions: Tools for Managing Day-to-Day Issues, edited by Mubarak S. Al-Mutairi and Lawan Ahmed Mohammed, IGI Global, 2011, pp. 194-212. https://doi.org/10.4018/978-1-60960-015-0.ch012

APA

Issac, B. (2011). Spam Detection Approaches with Case Study Implementation on Spam Corpora. In M. Al-Mutairi & L. Mohammed (Eds.), Cases on ICT Utilization, Practice and Solutions: Tools for Managing Day-to-Day Issues (pp. 194-212). IGI Global. https://doi.org/10.4018/978-1-60960-015-0.ch012

Chicago

Issac, Biju. "Spam Detection Approaches with Case Study Implementation on Spam Corpora." In Cases on ICT Utilization, Practice and Solutions: Tools for Managing Day-to-Day Issues, edited by Mubarak S. Al-Mutairi and Lawan Ahmed Mohammed, 194-212. Hershey, PA: IGI Global, 2011. https://doi.org/10.4018/978-1-60960-015-0.ch012

Abstract

E-mail has come to be regarded as one of the most efficient and convenient means of communication as the number of Internet users has grown rapidly. E-mail spam, also known as junk e-mail, UBE (unsolicited bulk e-mail) or UCE (unsolicited commercial e-mail), is unwanted e-mail sent to e-mail users. Spam has become a serious problem for most users because it clutters their mailboxes and wastes their time: they must delete the spam before they can read legitimate messages. It also costs users on dial-up connections money, wastes network bandwidth and disk space, and exposes recipients to harmful and offensive material. This chapter first discusses existing spam detection technologies and then focuses on a case study. Although many anti-spam solutions have been implemented, the Bayesian spam detection approach looks quite promising. A case study of a spam detection algorithm is presented and its implementation in Java is discussed, along with performance test results on two independent spam corpora, Ling-spam and Enron-spam. We apply the Bayesian calculation to single-keyword sets and multiple-keyword sets, together with their keyword contexts, to improve spam detection accuracy. The use of the Porter stemmer algorithm to stem keywords is also discussed; stemming can improve spam detection efficiency by reducing keyword searches.
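
To make the Bayesian calculation for single keywords concrete, the following is a minimal Java sketch, not the chapter's actual implementation. It assumes equal prior probabilities for spam and legitimate mail, add-one smoothing, and the standard naive Bayes rule for combining per-keyword probabilities. The class name BayesSpamSketch and all method names are hypothetical, and tokenization and stemming are assumed to have been done beforehand.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of single-keyword Bayesian spam scoring.
final class BayesSpamSketch {
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> hamCounts = new HashMap<>();
    private int spamMessages = 0;
    private int hamMessages = 0;

    // Record which keywords occur in a labelled message (each keyword counted once per message).
    void train(List<String> tokens, boolean isSpam) {
        Map<String, Integer> counts = isSpam ? spamCounts : hamCounts;
        for (String token : new HashSet<>(tokens)) {
            counts.merge(token, 1, Integer::sum);
        }
        if (isSpam) {
            spamMessages++;
        } else {
            hamMessages++;
        }
    }

    // P(spam | keyword) by Bayes' rule, assuming P(spam) = P(ham) = 0.5.
    // Add-one smoothing keeps the result strictly between 0 and 1.
    double wordSpamProbability(String token) {
        double pWordGivenSpam = (spamCounts.getOrDefault(token, 0) + 1.0) / (spamMessages + 2.0);
        double pWordGivenHam = (hamCounts.getOrDefault(token, 0) + 1.0) / (hamMessages + 2.0);
        return pWordGivenSpam / (pWordGivenSpam + pWordGivenHam);
    }

    // Combine per-keyword probabilities: prod(p) / (prod(p) + prod(1 - p)),
    // evaluated in log space to avoid underflow on long messages.
    double messageSpamProbability(List<String> tokens) {
        double logSpam = 0.0;
        double logHam = 0.0;
        for (String token : new HashSet<>(tokens)) {
            double p = wordSpamProbability(token);
            logSpam += Math.log(p);
            logHam += Math.log(1.0 - p);
        }
        return 1.0 / (1.0 + Math.exp(logHam - logSpam));
    }

    public static void main(String[] args) {
        BayesSpamSketch filter = new BayesSpamSketch();
        filter.train(Arrays.asList("cheap", "pills", "offer"), true);
        filter.train(Arrays.asList("cheap", "offer", "now"), true);
        filter.train(Arrays.asList("meeting", "agenda", "monday"), false);
        filter.train(Arrays.asList("project", "meeting", "notes"), false);
        System.out.println(filter.messageSpamProbability(Arrays.asList("cheap", "offer")));
        System.out.println(filter.messageSpamProbability(Arrays.asList("meeting", "agenda")));
    }
}
```

In this sketch a message would be flagged as spam when messageSpamProbability exceeds a chosen threshold such as 0.5; a keyword never seen in training scores exactly 0.5 and so does not shift the result either way.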
