A New SVM Method for Recognizing Polarity of Sentiments in Twitter

Sanjiban Sekhar Roy (VIT University, India), Marenglen Biba (University of New York – Tirana, Albania), Rohan Kumar (VIT University, India), Rahul Kumar (VIT University, India) and Pijush Samui (NIT Patna, India)
DOI: 10.4018/978-1-5225-2128-0.ch009

Abstract

Online social platforms such as weblogs, microblogs, and social networks are used intensively every day to express individuals' thoughts. This allows scientists to collect huge amounts of data and extract significant knowledge about the sentiments of a large number of people, at a scale that was essentially impractical a few years ago. Sentiment analysis therefore now has the potential to learn sentiments towards persons, objects, and events. Twitter has increasingly become a significant social networking platform where people post messages of up to 140 characters, known as 'tweets'. Tweets have become the preferred medium for the marketing sector because users can instantly signal a customer success or a public relations disaster, far more quickly than a web page or traditional media can. In this paper, we analyze Twitter data and predict positive and negative tweets with a high accuracy rate using a support vector machine (SVM).

Introduction

There has been a colossal surge in user-generated, opinion-rich data since the Web 2.0 era. Millions of people post opinions on all aspects of life every day. Social networks such as Facebook and LinkedIn and the microblogging website Twitter are the mainstays of this massive, continuous stream of user-generated content on a wide range of topics. Easy availability of such data via APIs or publicly available data sets has led to new opportunities in the field of Sentiment Analysis and Opinion Mining. Sentiment Analysis is the process of computationally identifying and extracting opinions in natural language texts. Also known as Opinion Mining, Sentiment Analysis uses Natural Language Processing and text data mining (text analysis) to understand opinions expressed in text. It is applied not only to social media data but also to product reviews, news articles, and other public opinion texts. Given its direct applicability to real-world problems such as understanding customer opinion, financial prediction, and disaster management, it is no surprise that Sentiment Analysis remains one of the most researched topics today. There are three different classification levels in sentiment analysis: document level, sentence level, and aspect level. The character limitation on Twitter posts may suggest sentence-level classification as the most suitable; however, given the informal nature of tweets, we expect the sentiment to be compact and explicit. Hence, our paper focuses on the document-level approach to Sentiment Analysis. Twitter is considered one of the best sources of opinion-rich data because a large number of people share and discuss their thoughts and opinions on the platform. We therefore focus on tweets from Twitter in this paper. Polarity classification is the fundamental task in Sentiment Analysis. It can be achieved through multiple approaches. 
Lexicon-based methods decide the polarity of a document based on the polarities of the individual words and phrases in it. Machine Learning based methods aim at building classifiers that resolve polarity and identify which category a document belongs to. We focus on the Machine Learning based approach. Over the years, a large amount of wide-reaching research has been carried out in the area of sentiment classification on large texts, as explained by Pang and Lee (2008). There has been research on the effects of machine learning techniques such as Naive Bayes, Maximum Entropy, and Support Vector Machines in specific domains. More recently, similar techniques have been applied to the Twitter microblogging platform, as stated by Go and Huang (2009). Machine learning techniques such as support vector machines have been gaining popularity in recent years, as stated by Roy and Viswanatham (2016), Roy et al. (2015), Basu et al. (2015), Mittal et al. (2015) and Das et al. (2014). Challenges include the wide range of topics to be classified, the use of informal language on social media platforms, and the message length limitation of 140 characters per tweet imposed by Twitter. In our experiments, we perform two-way classification into positive and negative labels, unlike other approaches such as multi-label or multi-class classification, as stated by Liu and Chen (2015). Support Vector Machines are a group of related learning methods that aim to recognize inherent patterns in data, as stated by Boser et al. (1992). They are one of several kernel-based machine learning techniques. SVMs have been successfully applied to varied fields. SVMs can be used for both classification and regression analysis. However, given our focus on sentiment analysis on microblogging data, we limit our investigation to classification techniques.
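
To make the two-way classification setting concrete, the following is a minimal sketch of a linear SVM for tweet polarity, written in plain Python. It is not the method evaluated in this chapter. It assumes binary bag-of-words features, no bias term, and a constant learning rate, and it minimizes the L2-regularized hinge loss by subgradient descent. The training tweets and the names `tokenize`, `score`, `train_linear_svm`, `predict`, and `TRAIN_TWEETS` are hypothetical and chosen only for illustration.

```python
# Toy training set (invented for illustration): label +1 = positive, -1 = negative.
TRAIN_TWEETS = [
    ("i love this phone", +1),
    ("i hate this phone", -1),
    ("great service and happy customers", +1),
    ("terrible service and angry customers", -1),
    ("what a wonderful day", +1),
    ("what an awful day", -1),
]


def tokenize(text):
    """Lowercase, split on whitespace, and return the set of distinct tokens."""
    return set(text.lower().split())


def score(weights, tokens):
    """Linear decision value w . x for binary bag-of-words features."""
    return sum(weights.get(tok, 0.0) for tok in tokens)


def train_linear_svm(examples, epochs=20, lr=0.1, lam=0.01):
    """Subgradient descent on lam/2 * ||w||^2 + hinge loss (assumed settings)."""
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            tokens = tokenize(text)
            margin = label * score(weights, tokens)
            # Regularization step: shrink every weight towards zero.
            for tok in weights:
                weights[tok] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for tok in tokens:
                    weights[tok] = weights.get(tok, 0.0) + lr * label
    return weights


def predict(weights, text):
    """Map the sign of the decision value to a polarity label."""
    return "positive" if score(weights, tokenize(text)) > 0 else "negative"


if __name__ == "__main__":
    model = train_linear_svm(TRAIN_TWEETS)
    for tweet in ["love", "hate", "i love this wonderful day"]:
        print(tweet, "->", predict(model, tweet))
```

In practice one would more likely rely on an established SVM implementation with a richer feature representation; the sketch only shows how word-level weights learned from labeled tweets yield a positive or negative decision.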
