Spam Detection on Social Media Using Semantic Convolutional Neural Network

Gauri Jain, Manisha Sharma, Basant Agarwal
Copyright: © 2018 | Volume: 8 | Issue: 1 | Pages: 15
ISSN: 1947-9115 | EISSN: 1947-9123 | EISBN13: 9781522544661 | DOI: 10.4018/IJKDB.2018010102
Cite Article

MLA

Jain, Gauri, et al. "Spam Detection on Social Media Using Semantic Convolutional Neural Network." IJKDB, vol. 8, no. 1, 2018, pp. 12-26. http://doi.org/10.4018/IJKDB.2018010102

APA

Jain, G., Sharma, M., & Agarwal, B. (2018). Spam Detection on Social Media Using Semantic Convolutional Neural Network. International Journal of Knowledge Discovery in Bioinformatics (IJKDB), 8(1), 12-26. http://doi.org/10.4018/IJKDB.2018010102

Chicago

Jain, Gauri, Manisha Sharma, and Basant Agarwal. "Spam Detection on Social Media Using Semantic Convolutional Neural Network." International Journal of Knowledge Discovery in Bioinformatics (IJKDB) 8, no. 1 (2018): 12-26. http://doi.org/10.4018/IJKDB.2018010102

Abstract

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. The task is challenging, especially for text limited to a small number of characters. Effective spam detection requires a larger number of efficient features to be learned. In the current article, the use of a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with an added semantic layer on top of it. The resultant model is known as a semantic convolutional neural network (SCNN). The semantic layer is built by training randomly initialized word vectors with Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word when that word is missing from the Word2vec vocabulary. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms the state-of-the-art results, with 98.65% accuracy on the SMS Spam dataset and 94.40% accuracy on the Twitter dataset.
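
The fallback lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `embed_tokens`, the plain dictionaries standing in for the Word2vec vocabulary and for the WordNet/ConceptNet lookups, and the random vector used when no similar word is found are all assumptions made for the example.

```python
import random
from typing import Dict, List, Sequence

Vector = List[float]


def embed_tokens(
    tokens: Sequence[str],
    vectors: Dict[str, Vector],
    related_words: Dict[str, List[str]],
    dim: int,
    seed: int = 0,
) -> List[Vector]:
    """Map each token to a vector, falling back to a related word when needed.

    `vectors` stands in for a trained Word2vec vocabulary (hypothetical).
    `related_words` stands in for WordNet/ConceptNet lookups (hypothetical).
    """
    rng = random.Random(seed)
    matrix: List[Vector] = []
    for token in tokens:
        if token in vectors:
            # The token is in the embedding vocabulary: use its vector directly.
            matrix.append(vectors[token])
            continue
        # Out-of-vocabulary token: try the first related word that has a vector.
        substitute = next(
            (w for w in related_words.get(token, []) if w in vectors), None
        )
        if substitute is not None:
            matrix.append(vectors[substitute])
        else:
            # Assumption: with no usable related word, use a small random vector.
            matrix.append([rng.uniform(-0.25, 0.25) for _ in range(dim)])
    return matrix
```

Under these assumptions, a token such as "gratis" that is absent from the vocabulary but listed as related to "free" would receive the vector for "free". The resulting list of vectors is the kind of input matrix a CNN text classifier would then convolve over.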
