Extracting Sentiment Patterns from Syntactic Graphs

Alexander Pak, Patrick Paroubek
Copyright: © 2013 | Pages: 18
DOI: 10.4018/978-1-4666-2806-9.ch001

Abstract

Sentiment analysis and opinion mining have become key topics in research on social media and social networks. Polarity classification, i.e. determining whether a text expresses a positive or a negative attitude, is a basic task of sentiment analysis. Building on traditional information retrieval techniques, such as topic detection, many researchers use a bag-of-words or an n-gram model to represent the analyzed text. Despite its simplicity, such a representation loses the latent information contained in the relations between words in a sentence. The authors consider this information important for sentiment analysis and therefore propose a novel method for representing a text based on graphs extracted from sentence linguistic parse trees. The new method preserves information about word relations and can replace a standard n-gram model. In this chapter, the authors describe their approach and present the results of experimental evaluations that demonstrate the benefits of their text representation. The experiments cover English and French; however, the approach is generic and can easily be adapted to other languages.

Introduction

The growing interest in sentiment analysis is associated with the rise of blogs and social networks, where users post and share information about their likes and dislikes, preferences, and lifestyle. Many websites allow users to leave an opinion about a certain object or topic. For example, users of the IMDb website can write a review of a movie they have watched and rate it on a 5-star scale. As a result, given a large number of reviews and rating scores, IMDb reflects the general opinions of Internet users on movies. Many related Web resources, such as cinema schedule websites, use information from IMDb, including the average rating, to describe movies. Thus, IMDb reviews influence the choices of other users, who tend to select movies with higher ratings.

Another example is social networks. Users of Twitter or Facebook often post messages, visible to their friends, with opinions on various consumer goods, such as electronic products and gadgets. Jansen (2009) called Twitter "electronic word of mouth." The companies that produce or sell those products are interested in monitoring current trends and analyzing people's interests. Such information can influence their marketing strategy or bring changes to the product design to meet customers' needs.

Therefore, there is a need for algorithms and methods to automatically collect and process opinionated texts. Such methods are expected to classify texts by their polarity (positive or negative), estimate the strength of the sentiment, and determine the opinion target and holder, where the target is the object or subject of the opinion and the holder is usually, but not necessarily, the author of the text (Toprak, et al., 2010).

Bag-of-words is one of the earliest models of text representation and is still widely used in sentiment analysis. In this approach, a text is usually represented as a set of unigrams (or bigrams), disregarding their order and relations within the text. Common machine learning techniques, such as Naive Bayes or SVM, are then used to classify the sentiment of the given text. The accuracy of such approaches can be quite high, especially when advanced feature selection techniques and additional opinion lexicons are used. Nevertheless, we think that this model should be improved or replaced by one that can identify more complex sentiment expressions rather than only simple ones such as good movie or bad acting.
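To make the pipeline concrete, the following minimal sketch shows an n-gram polarity classifier of the kind described above; the toy training examples and the scikit-learn components are our own illustration, not the setup used in the chapter.

# Minimal sketch of an n-gram polarity classifier (illustrative data and tools).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["good movie", "great acting", "bad acting", "terrible movie"]
train_labels = ["positive", "positive", "negative", "negative"]

# Represent each text as a bag of unigrams and bigrams, discarding word order.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    MultinomialNB(),
)
model.fit(train_texts, train_labels)

print(model.predict(["surprisingly good movie"]))  # e.g. ['positive']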

One of the problems of the bag-of-words representation is the information loss that occurs when a text is represented as a collection of unrelated terms. These relations are often very important and may change the degree and the polarity of the sentiment expressed in the text. We illustrate this problem with a simple example. Consider the phrase: "This book is bad." The sentiment of this phrase is obviously negative, and a standard classifier based on a unigram model will easily classify this sentence correctly, provided it has a good training dataset. Now let us make the sentence a little more complex: "This book is not bad." In this case, a simple unigram model will probably fail. A bigram model will still work, capturing not bad as a term with positive polarity. If we make the sentence more complex still: "This book is surprisingly not that bad," both unigram and bigram models will fail. To make them work, a more sophisticated treatment of negations is needed.

Besides handling negations, the n-gram model has trouble capturing long dependencies. A bigram model will capture "I like" as a positive pattern in a sentence such as "I like fish," but not in "I really like fish." If we move to a more refined polarity classification, i.e. identifying not only the polarity of a text (positive or negative) but also its degree (low/high or even more precise), the n-gram model cannot provide sufficient information. The gap is visible in the small sketch below.
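The following self-contained Python snippet simply lists the bigrams of the example sentences; it is only an illustration of the point made above: "not bad" disappears once "surprisingly ... that" is inserted, and "I like" disappears once "really" is inserted.

# Illustrative bigram extraction showing the gaps discussed above.
def ngrams(text, n):
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

for sentence in ["This book is not bad.",
                 "This book is surprisingly not that bad.",
                 "I like fish.",
                 "I really like fish."]:
    cleaned = sentence.rstrip(".")
    print(cleaned, "->", ngrams(cleaned, 2))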

In order to solve the problems of the n-gram model, we propose to use the dependency parse tree of a sentence to generate a text representation. A dependency tree is a graph representation of a sentence in which nodes correspond to the words of the sentence and edges represent syntactic relations between them, such as object, subject, or modifier. Figure 1 depicts the dependency parse tree of the sentence "I do not like fish very much."
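As a rough illustration of the kind of word-relation features a dependency parse makes available, the sketch below prints (head, relation, dependent) triples for the example sentence. spaCy and its small English model are used here purely as a convenient stand-in parser; the chapter does not prescribe this tool.

# Sketch: extract word-relation features from a dependency parse (spaCy as stand-in).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed
doc = nlp("I do not like fish very much.")

# Each (head, relation, dependent) triple keeps the syntactic link between
# words instead of discarding it, as an n-gram model does.
for token in doc:
    if token.dep_ != "ROOT":
        print(f"{token.head.text} --{token.dep_}--> {token.text}")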
