Machine Learning Approaches for Sentiment Analysis

Basant Agarwal (Malaviya National Institute of Technology, India) and Namita Mittal (Malaviya National Institute of Technology, India)
DOI: 10.4018/978-1-5225-1759-7.ch070


Opinion mining, or sentiment analysis, is the study of people's opinions or sentiments, as expressed in text, towards entities such as products and services. It has always been important to know what other people think. With the rapid growth in the availability and popularity of online review sites, blogs, forums, and social networking sites, the need to analyse and understand these reviews has arisen. The main approaches to sentiment analysis can be categorized into semantic orientation-based approaches, knowledge-based approaches, and machine learning algorithms. This chapter surveys the machine learning approaches applied to sentiment analysis applications. Its main emphasis is on research that applies machine learning methods to sentiment classification, mostly at the document level. Machine learning-based approaches work in the following phases, each of which is discussed in detail in this chapter: (1) feature extraction, (2) feature weighting schemes, (3) feature selection, and (4) machine learning methods. The chapter also discusses the standard, freely available benchmark datasets and evaluation methods for sentiment analysis. The authors conclude with a comparative study of some state-of-the-art methods for sentiment analysis and some possible future research directions in opinion mining and sentiment analysis.
Chapter Preview

1. Introduction

It has always been important to know what other people think. With the rapid growth in the popularity and availability of online review sites, blogs, forums, and social networking sites, the need to analyse and understand these reviews has arisen. Companies and individuals can use the opinions given in these reviews for better decision making. For example, a user can learn the pros and cons of a product's features, which helps in deciding whether to purchase it. E-commerce companies can use users' opinions to improve product quality and to follow current trends. Opinion mining, or sentiment analysis, is the study of people's opinions and sentiments, as expressed in text, towards entities such as products and services (Liu, 2012). Sentiment analysis research can be categorized into document-level, sentence-level, and aspect/feature-level sentiment analysis. Document-level sentiment analysis classifies a review document as having positive or negative polarity, treating the document as a single unit. Sentence-level sentiment analysis extracts the opinion or sentiment expressed in an individual sentence. Aspect-based sentiment analysis deals with methods that identify the entities in the text about which an opinion is expressed (Liu, 2012), and then identify the sentiments expressed about those entities. Other important tasks in sentiment analysis and opinion mining research include opinion summarisation, opinion retrieval, and spam review detection. Solutions to the challenges raised by these problems come from NLP, cognitive science, information retrieval, machine learning, and related fields. The research challenges and existing solutions are presented in detail by Liu (2012). Detailed surveys of sentiment analysis research across various techniques are presented by Liu (2012) and Cambria et al. (2012). However, these surveys do not discuss machine learning approaches for sentiment analysis in detail, which is the objective of this chapter. Machine learning approaches have been widely applied to sentiment classification, mostly at the document level.
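To make document-level classification concrete, the following is a minimal sketch of a machine learning pipeline that passes through feature extraction, feature weighting, feature selection, and a classifier. It is an illustration only, not a method taken from the chapter: the toy reviews, the stopword list, the function names, and the choice of binary weighting with a Naive Bayes classifier are all assumptions made for this example.

```python
import math
from collections import Counter

# Toy labelled reviews, invented for illustration only.
TRAIN = [
    ("a great cast and an excellent storyline", "pos"),
    ("nice cinematography and a great story", "pos"),
    ("a dull plot and terrible acting", "neg"),
    ("terrible pacing and a boring story", "neg"),
]
STOPWORDS = frozenset({"a", "an", "and"})  # hypothetical stopword list


def extract_features(text):
    """Phase 1: feature extraction (lowercased unigrams)."""
    return text.lower().split()


def weight_features(tokens):
    """Phase 2: feature weighting, here binary presence (1 if the term occurs)."""
    return {tok: 1 for tok in tokens}


def select_features(weighted_docs, min_df=1):
    """Phase 3: feature selection by document frequency, dropping stopwords."""
    df = Counter()
    for doc in weighted_docs:
        df.update(doc.keys())
    return {t for t, n in df.items() if n >= min_df and t not in STOPWORDS}


def train(data):
    """Phase 4: fit a Naive Bayes model on the selected, weighted features."""
    weighted = [weight_features(extract_features(text)) for text, _ in data]
    vocab = select_features(weighted)
    doc_counts = Counter(label for _, label in data)
    term_counts = {label: Counter() for label in doc_counts}
    for doc, (_, label) in zip(weighted, data):
        for term, w in doc.items():
            if term in vocab:
                term_counts[label][term] += w
    return vocab, doc_counts, term_counts


def classify(text, model):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    vocab, doc_counts, term_counts = model
    doc = weight_features(extract_features(text))
    n_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total = sum(term_counts[label].values())
        score = math.log(doc_counts[label] / n_docs)
        for term, w in doc.items():
            if term in vocab:  # terms outside the selected vocabulary are ignored
                smoothed = (term_counts[label][term] + 1) / (total + len(vocab))
                score += w * math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(classify("an excellent cast", model))
print(classify("a boring and dull plot", model))
```

Each phase is a separate function so that one choice can be swapped without touching the others, for example replacing binary weighting with term frequency, or the document-frequency filter with a statistical selection criterion.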

Sentiment analysis research faces several challenges. Firstly, the words used to express sentiment are domain specific. For example, the word “unpredictable” has a positive orientation in the movie review domain but may have a negative orientation in the car review domain. Secondly, it is difficult to identify the subjective portion of a review, because the same words can appear in both subjective and objective sentences. Consider “author used very crude language” and “crude oil is extracted from sea beds”: the word “crude” expresses sentiment in the first sentence, whereas the second sentence is purely objective (Verma et al., 2009). Thirdly, thwarted expectations are difficult to handle. In some cases most of the text carries one polarity, and then the polarity of the overall text is suddenly reversed. For example, “This film has a great cast. It has excellent storyline and nice cinematography. However it can’t hold up the audiences”. Most review analysis research is based on movie reviews and product reviews. Movie review sentiment classification faces the challenge of handling factual content, which is generally mixed with the actual opinions. People often discuss the general traits of the actors and the plot of the movie, and relate the movie to their everyday lives. It is very difficult to extract the opinion from a review that discusses the qualities of the artists at length and only at the end reveals that the movie as a whole is disliked. One of the biggest challenges in movie review analysis is handling negated opinions. The product review domain differs significantly from the movie review domain. In product reviews, the reviewer generally writes both positive and negative opinions, because some features of the product are liked and others are disliked. Such reviews are difficult to classify as positive or negative. In addition, product review datasets generally contain more comparative sentences than movie review datasets, and these sentences are difficult to classify.
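The thwarted-expectations problem can be illustrated with a short sketch. The code below counts positive and negative words using a tiny lexicon and applies it to the film review quoted above. The lexicon and the function name are hypothetical, chosen only for this illustration, and the approach is deliberately naive.

```python
import re

# Hypothetical toy lexicons, invented for this illustration.
POSITIVE = {"great", "excellent", "nice"}
NEGATIVE = {"terrible", "boring", "dull"}


def naive_polarity(text):
    """Score a text as (#positive words - #negative words) and map it to a label."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return score, "pos"
    if score < 0:
        return score, "neg"
    return score, "neutral"


review = (
    "This film has a great cast. It has excellent storyline and nice "
    "cinematography. However it can't hold up the audiences"
)
print(naive_polarity(review))  # (3, 'pos'), although the reviewer's verdict is negative
```

The three positive words outvote the final sentence, which contains no lexicon word at all, so the counter labels the review positive. Any approach that treats every word as equally informative regardless of its position or of discourse markers such as "however" is exposed to the same failure.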
