1. Introduction
One way to help individuals and organizations make informed decisions, such as choosing wisely among available options, is to draw upon the opinion of the crowd. Traditionally, many of us have depended on other people’s opinions, particularly those of family members, friends and relatives, when making decisions on critical issues (Pang & Lee, 2008; Saif, He, & Alani, 2012; Kharde & Sonawane, 2016; Xia, Zong, & Li, 2011; Cambria, Schuller, Xia, & Havasi, 2013). However, with rapid technological advances and the increasing ubiquity of the Internet, many of us are now showing interest in social platforms, which make it relatively easy to learn the thinking not only of family members and friends but also of strangers (including experts who are willing to offer their educated advice) (Godbole, Srinivasaiah, & Skiena, 2007; Tan, Lee, Tang, Jiang, Zhou, & Li, 2011).
For instance, around 6,000 tweets are posted on Twitter every second; on average, this amounts to roughly 500 million tweets daily, or about 200 billion tweets annually. Platforms such as Facebook, Yelp and Amazon likewise accumulate a large volume of text and opinions every day. Such numbers mean a vast amount of text and data from all around the globe. Consequently, it has become crucial for individuals and organizations to analyze these data meaningfully so that they can profit from these opinions, for example by using them to enhance their reputation (Balahur & Jacquet, 2015; Kumar, Morstatter, & Liu, 2014; Isah, Trundle, & Neagu, 2014; Jiang & Kotzias, 2016).
Sentiment analysis (SA), a process by which the sentiment expressed in accumulated tweets can be detected automatically, is an increasingly popular means of analyzing “big data” such as the tweets generated on Twitter. Such analysis also allows the polarity of the text (positive, negative or neutral) to be aggregated. Briefly, in order to classify the polarity of the accumulated text via sentiment classification (West, Paskov, Leskovec, & Potts, 2014; Cogburn & Espinoza-Vasquez, 2011; Gamallo & Garcia, n.d.), SA entails five fundamental steps: (1) collecting the data to be analyzed; (2) preprocessing the data; (3) extracting features from the data; (4) performing sentiment classification on the data; and (5) presenting the results.
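The five steps can be illustrated with a minimal Python sketch. It is not the method of any of the cited works: the word lists POSITIVE and NEGATIVE, the hard-coded example texts, and all function names are hypothetical, and a simple count of lexicon hits stands in for a real feature extractor and classifier.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would use a curated resource or a trained model.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}


def collect():
    """Step 1: collect the data (hard-coded here instead of a live feed)."""
    return [
        "I love this phone, the camera is great!",
        "Terrible battery and a poor screen :(",
        "Just arrived today.",
    ]


def preprocess(text):
    """Step 2: lowercase, strip URLs and @mentions, split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def extract_features(tokens):
    """Step 3: count positive and negative lexicon hits."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return {"pos": pos, "neg": neg}


def classify(features):
    """Step 4: assign a polarity label from the feature counts."""
    score = features["pos"] - features["neg"]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def run_pipeline():
    """Run steps 1-4 and return the per-text labels plus a label tally."""
    labels = [classify(extract_features(preprocess(t))) for t in collect()]
    return labels, Counter(labels)


if __name__ == "__main__":
    labels, summary = run_pipeline()
    # Step 5: present the results.
    for label, count in sorted(summary.items()):
        print(f"{label}: {count}")
```

Keeping each step in its own function means that any stage, for example the classifier, can be swapped for a stronger one without touching the others.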
In essence, SA can be conducted at four different levels: the Word, Sentence, Document and Feature/Aspect levels (Karlgren & Ericsson, 2013; Recupero & Cambria, 2014; Irsov & Cardie, 2014). At the Document level, the aim is to derive a single sentiment polarity for the entire document by determining the sentiment polarities of all the sentences in the document and then summarizing them. At the Sentence level, the sentiment polarity of a sentence is computed by first identifying the sentiment polarity of every word in the sentence and then aggregating these polarities (Tan et al., 2011; Vijendra & Laxman, 2013; Vijendra, Sahoo, & Ashwini, 2010). At the Word level, the sentiment polarity of each individual word is determined. At the Aspect/Feature level, the main concern is to identify and extract product features from the source data. In this approach, the entities at which the sentiment is directed have to be identified. For example, if the sentiment analysis covers phone reviews, the aspects/features may include the camera, the screen, and the phone speaker.
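The relationship between the four levels can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than an implementation from the cited works: LEXICON, ASPECTS, the example review, and all function names are hypothetical, polarities are combined by simple summation, and an aspect is credited with the score of every sentence that mentions it.

```python
import re

# Hypothetical word-level polarity scores (+1 positive, -1 negative).
LEXICON = {"great": 1, "sharp": 1, "love": 1, "dim": -1, "tinny": -1, "bad": -1}
# Hypothetical aspect terms for a phone review.
ASPECTS = {"camera", "screen", "speaker"}


def word_polarity(word):
    """Word level: look up one word; unknown words count as neutral (0)."""
    return LEXICON.get(word, 0)


def sentence_polarity(sentence):
    """Sentence level: sum the polarities of the words in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(word_polarity(w) for w in words)


def split_sentences(document):
    """Naive sentence splitter on '.', '!' and '?' (an assumption)."""
    return [s for s in re.split(r"[.!?]+", document) if s.strip()]


def document_polarity(document):
    """Document level: sum the polarities of all sentences."""
    return sum(sentence_polarity(s) for s in split_sentences(document))


def aspect_polarity(document):
    """Aspect level: credit each sentence's score to the aspects it mentions."""
    scores = {}
    for sentence in split_sentences(document):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for aspect in ASPECTS & words:
            scores[aspect] = scores.get(aspect, 0) + sentence_polarity(sentence)
    return scores


def label(score):
    """Map a numeric score to a polarity label."""
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


if __name__ == "__main__":
    review = "The camera is great. The screen is dim. The speaker sounds tinny."
    print("document:", label(document_polarity(review)))
    for aspect, score in sorted(aspect_polarity(review).items()):
        print(f"{aspect}: {label(score)}")
```

In this example the document as a whole comes out negative while the camera aspect is positive, which is the kind of distinction that aspect-level analysis is meant to expose.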