ML-EC2: An Algorithm for Multi-Label Email Classification Using Clustering

Aakanksha Sharaff, Naresh Kumar Nagwani
DOI: 10.4018/IJWLTT.2020040102

Abstract

This work proposes ML-EC2 (multi-label email classification using clustering), a multi-label variant of email classification. ML-EC2 is a hybrid algorithm that combines text clustering, text classification, frequent-term calculation (based on latent Dirichlet allocation), and a taxonomic term-mapping technique; it is an example of classification by means of text clustering. It addresses the problem in which each email cluster represents a single class label while being associated with a set of cluster labels. It is a multi-label, text-clustering-based classification algorithm in which an email cluster can be mapped to more than one email category when its cluster label matches more than one category term. The algorithm is helpful when only a vague idea of the topic is available. Entropy and the Davies-Bouldin index are used as performance measures to evaluate the proposed algorithm.
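
The multi-label mapping step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the taxonomy, the cluster label terms, and the function names `map_cluster_to_categories` and `classify_clusters` are all hypothetical, and the cluster labels are hard-coded here, whereas the abstract says they come from a frequent-term calculation based on latent Dirichlet allocation. The sketch only shows how one cluster can receive several categories when its label terms match the terms of more than one category.

```python
from typing import Dict, List, Set

# Hypothetical taxonomy (an assumption for illustration): category -> category terms.
TAXONOMY: Dict[str, Set[str]] = {
    "finance": {"invoice", "payment", "budget"},
    "meetings": {"meeting", "schedule", "agenda"},
    "travel": {"flight", "hotel", "schedule"},
}

# Hypothetical cluster labels: cluster id -> frequent terms describing that cluster.
# Assumed to be already extracted; the extraction step itself is not shown.
CLUSTER_LABELS: Dict[int, List[str]] = {
    0: ["Invoice", "payment"],
    1: ["schedule", "agenda"],
    2: ["newsletter"],
}


def map_cluster_to_categories(
    cluster_labels: List[str], taxonomy: Dict[str, Set[str]]
) -> List[str]:
    """Return every category whose terms overlap the cluster's label terms."""
    labels = {term.lower() for term in cluster_labels}
    # A category matches if it shares at least one term with the cluster labels,
    # so a single cluster may be mapped to several categories (multi-label).
    return sorted(cat for cat, terms in taxonomy.items() if labels & terms)


def classify_clusters(
    clusters: Dict[int, List[str]], taxonomy: Dict[str, Set[str]]
) -> Dict[int, List[str]]:
    """Map each cluster id to its (possibly empty) list of matching categories."""
    return {
        cid: map_cluster_to_categories(labels, taxonomy)
        for cid, labels in clusters.items()
    }


if __name__ == "__main__":
    for cid, cats in classify_clusters(CLUSTER_LABELS, TAXONOMY).items():
        print(cid, cats)
```

In this toy data, cluster 1 shares the term "schedule" with both the "meetings" and "travel" categories, so it is mapped to both, while cluster 2 matches no category term and stays unlabeled. Exact term overlap is the simplest possible matching rule; the term-mapping technique in the article may differ.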

Literature Review

Managing the huge volume of emails that users receive is a challenging problem that needs to be solved effectively and efficiently. Considerable research has been carried out in the field of email mining; some of the relevant studies are summarized below.

Park & An (2010) proposed an email multi-category classification approach that uses semantic features and a dynamic category hierarchy reconstruction method, in which the user reorganizes all email messages into categories. Guan & Yuan (2013) reviewed existing work on mislabeled-data detection techniques for pattern classification and grouped them into three types: local learning-based, ensemble learning-based, and single learning-based methods. Armentano & Amandi (2014) presented an approach for labeling incoming emails based on user preference; a set of experiments using Google’s webmail system, Gmail, was performed to obtain a good rate of acceptance of the agent's interactions. Alsmadi & Alhami (2015) introduced an algorithm for clustering and classifying an email text corpus, and proposed a model that classifies emails by subject and folder using N-grams. Islam et al. (2009) proposed a new email classification technique based on the analysis of a grey list (GL), which uses multi-classifier ensembles of statistical learning algorithms.
