Parallel Hybrid BBO Search Method for Twitter Sentiment Analysis of Large Scale Datasets Using MapReduce

Ashish Kumar Tripathi, Kapil Sharma, Manju Bala
Copyright: © 2019 | Pages: 17
DOI: 10.4018/IJISP.201907010107

Abstract

Sentiment analysis is an important part of data mining for investigating user perception. Twitter is one of the most popular social platforms for expressing thoughts in the form of tweets. Tweets are now widely used to analyze the sentiments of users and to support decision making. Although both clustering and classification methods are used for Twitter sentiment analysis, meta-heuristic based clustering methods have shown better performance owing to the subjective nature of tweets. However, sequential meta-heuristic based clustering methods are computationally intensive for large scale datasets. Therefore, in this paper, a novel MapReduce based K-means biogeography based optimizer (MR-KBBO) is proposed, which combines the strength of the biogeography based optimizer with the MapReduce model to efficiently cluster large scale data. The proposed method is validated against four state-of-the-art MapReduce based clustering methods, namely parallel K-means, parallel K-means particle swarm optimization, MapReduce based artificial bee colony optimization, and dynamic frequency based parallel k-bat algorithm, on four large scale Twitter datasets. Further, the speedup measure is used to illustrate the computational performance on a varying number of nodes. Experimental results demonstrate that the proposed method is efficient in sentiment mining for large scale Twitter datasets.
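
To make the MapReduce clustering idea in the abstract concrete, the following is a minimal sketch of a single K-means iteration written as a map phase and a reduce phase in plain Python. It is not the authors' MR-KBBO method: it contains no biogeography based optimizer, it does not use Hadoop, and all function names (`map_phase`, `reduce_phase`, `kmeans_iteration`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(points: Sequence[Point], centroids: Sequence[Point]) -> List[Tuple[int, Point]]:
    """Map step: emit (index of nearest centroid, point) for each point in one data split."""
    pairs = []
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        pairs.append((nearest, p))
    return pairs


def reduce_phase(pairs: Sequence[Tuple[int, Point]],
                 centroids: Sequence[Point]) -> List[Point]:
    """Reduce step: average the points assigned to each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # centroids with no points stay unchanged
    for idx, members in groups.items():
        dim = len(members[0])
        new_centroids[idx] = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
    return new_centroids


def kmeans_iteration(splits: Sequence[Sequence[Point]],
                     centroids: Sequence[Point]) -> List[Point]:
    """One iteration: run the map step per split (sequentially here), then reduce."""
    pairs: List[Tuple[int, Point]] = []
    for split in splits:  # on a cluster, each split would go to a different node
        pairs.extend(map_phase(split, centroids))
    return reduce_phase(pairs, centroids)


# Toy data: two splits standing in for data blocks on two nodes (illustrative values only).
splits = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
updated = kmeans_iteration(splits, [(1.0, 1.0), (9.0, 9.0)])
```

In this sketch the map step is independent for each split, which is the property that allows the work to be distributed across nodes; only the reduce step needs to see all assignments for a given centroid.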

Introduction

Over the last decade, enormous growth in digital data has been observed (Gantz J, 2018). Social sites such as Instagram, Facebook, and Twitter are major sources of this data. The availability of so much data has attracted user-focused industries to analyze the sentiments of users when making business strategies. Thus, efficient data mining methods are required for sentiment analysis of social media. Twitter, one of the popular social media platforms, provides a rich platform for sentiment analysis. Twitter has approximately 200 million users, and nearly 400 million tweets are posted every day. Users often share their personal experiences about products or companies. Since the maximum length of a tweet is 140 characters, short symbols such as emoji are used to express sentiments. The study of tweets can reveal profound viewpoints and emotions about any subject (Asur, S., & Huberman, B. A. 2010). Sentiment analysis methods are mainly classified into three categories: machine learning based methods, hybrid methods, and lexicon based methods (Medhat, W., Hassan, A., & Korashy, H. 2014). Lexicon based methods require prior knowledge of a sentiment lexicon to predict the sentiment. However, for shorthand and emoji based texts, lexicon based methods fail to perform well (Khan, A. Z., Atique, M., & Thakare, V. M. 2015). Pandey et al. (Pandey, A. C., Rajpoot, D. S., & Saraswat, M. 2017) proposed a hybrid cuckoo search based method for Twitter sentiment analysis and concluded that emoticons are good predictors of sentiment for short texts. Further, Canuto et al. (Canuto, S., Gonçalves, M. A., & Benevenuto, F. 2016) used meta level features for the prediction of sentiments. Bravo-Marquez et al. (Bravo-Marquez, F., Mendoza, M., & Poblete, B. 2013) proposed a supervised approach that combines the strengths of emotions and polarities to improve Twitter opinion prediction. 
Furthermore, Mohammad et al. (Mohammad, S. M., Zhu, X., Kiritchenko, S., & Martin, J. 2015) employed a supervised classifier to analyze the emotion stimulus, emotion state, and intent of tweets for the US election. An ontology based method for sentiment analysis of tweets was introduced by Kontopoulos et al. (Kontopoulos, E., Berberidis, C., Dergiades, T., & Bassiliades, N. 2013), in which a sentiment grade was allocated to each distinct perception in the tweet. Agarwal et al. (Agarwal, B., Mittal, N., Bansal, P., & Garg, S. 2015) presented a novel approach to sentiment analysis that uses common sense information taken from a ConceptNet based ontology. Furthermore, the SentiCircle method was introduced by Saif et al. (Saif, H., He, Y., Fernandez, M., & Alani, H. 2016), in which the context specific polarity of words was determined. Qiu et al. (Qiu, G., Liu, B., Bu, J., & Chen, C. 2009, July) proposed a semi-supervised algorithm for sentiment mining. Furthermore, Pandarachalil et al. (Pandarachalil, R., Sendhilkumar, S., & Mahalakshmi, G. S. 2015) introduced a distributed unsupervised approach for the extraction of lexicons. Likewise, Fernández-Gavilanes et al. (Fernández-Gavilanes, M., Álvarez-López, T., Juncal-Martínez, J., Costa-Montenegro, E., & González-Castaño, F. J. 2016) proposed an unsupervised algorithm to predict the sentiment polarity of informal texts using a linguistic sentiment propagation model.
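
The limitation of lexicon based methods on shorthand and emoji based texts, noted earlier in this introduction, can be illustrated with a minimal sketch. The tiny word list `LEXICON` and the functions `lexicon_score` and `polarity` are hypothetical and are not taken from any of the cited works; the sketch only assumes that a lexicon based method sums the prior polarities of known words.

```python
import re

# Hypothetical toy lexicon: word -> prior polarity (an assumption for illustration).
LEXICON = {
    "good": 1, "great": 1, "love": 1,
    "bad": -1, "terrible": -1, "hate": -1,
}


def lexicon_score(text: str) -> int:
    """Sum the prior polarities of all alphabetic tokens found in the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


def polarity(text: str) -> str:
    """Map the summed score to a coarse polarity label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A tweet written in standard words is scored from the lexicon.
standard = polarity("I love this phone, great battery")
# Shorthand ("gr8") and emoticons are not in the lexicon, so no sentiment is detected.
shorthand = polarity("gr8 phone :)")
emoji_only = polarity("😍😍")
```

Under these assumptions, the first tweet is labelled positive, while the shorthand and emoji-only tweets fall back to neutral because none of their tokens appear in the lexicon.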
