Polarity Classification of Arabic Sentiments

Mohammed N. Al-Kabi (Information Technology Faculty, Zarqa University, Zarqa, Jordan), Heider A. Wahsheh (College of Computer Science, King Khaled University, Abha, Saudi Arabia) and Izzat M. Alsmadi (University of New Haven, West Haven, CT, USA)
DOI: 10.4018/IJITWE.2016070103

Abstract

Sentiment analysis, or opinion mining, is closely associated with social media and usually aims to automatically identify the polarity of the views that social media users express about different aspects of life. The polarity of a sentiment reflects its author's point of view on a certain issue. This study presents a new method for identifying the polarity of Arabic reviews and comments, whether they are written in Modern Standard Arabic (MSA) or in one of the Arabic dialects, and whether or not they include emoticons. The proposed method is called Detection of Arabic Sentiment Analysis Polarity (DASAP). A modest dataset of Arabic comments, posts, and reviews was collected from online social network websites (namely Facebook, blogs, YouTube, and Twitter). This dataset is used to evaluate the effectiveness of DASAP by means of Receiver Operating Characteristic (ROC) prediction quality measurements.

1. Introduction

Millions of Arab Internet users are active on Online Social Networks (OSNs). Every day, Arab users of social media generate millions of interactions and a huge amount of opinionated data about different topics. Besides text comments, this wealth of opinionated Arabic data includes images, audio clips, and videos that users post through their accounts to express their sentiments and opinions about a wide range of topics.

Sentiment analysis (SA) is used to automatically analyse opinionated data about different aspects of life, such as products, sportsmen, actors, actresses, politicians, and governments. Opinionated data includes the reviews and comments posted by social media users. Nowadays, many companies analyse these posts to enhance their services and products, and as a result these parties no longer need to conduct surveys and opinion polls. SA is an essential tool that companies, political campaigns, governments, politicians, candidates, economists, movie stars, sportsmen, and others use to make the right decisions. In business, for example, customers usually read the posts of consumers of a certain product before deciding to buy it, and the points of view of different owners strongly affect their decision to buy the product or to look for alternatives.

SA is not a straightforward task, since the polarity of a review is not always fixed but depends on the domain it belongs to. Consider a simple example: the word “cheap” is considered positive in the economics domain, while in the politics domain it is considered negative. SA is therefore highly dependent on context and domain. Furthermore, the authors of these posts (reviews) use slang, emoticons, and lengthened words (words with repeated letters), and their posts contain a higher percentage of spelling mistakes than newspaper articles do, owing to the nature of the devices used to write these posts and to the authors themselves. Previous experience of specialists in this field reveals that people understand comments and reviews differently, so different persons draw different conclusions from the same text. It is therefore even harder for software or website tools to identify the polarities of these comments and reviews precisely.
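Two of the difficulties described above, lengthened words and domain-dependent polarity, can be made concrete with a minimal Python sketch. It is an illustration only and is not taken from DASAP: the function names, the toy lexicon, and the rule of collapsing any run of three or more identical characters to a single character are all assumptions made for this example.

```python
import re

# Assumption for this sketch: a run of 3+ identical characters is treated as
# lengthening and collapsed to a single character. Real normalisers may keep
# two characters instead, because some words contain legitimate doubles.
_REPEATED_RUN = re.compile(r"(.)\1{2,}")


def collapse_lengthened(text: str) -> str:
    """Collapse runs of three or more identical characters to one character."""
    return _REPEATED_RUN.sub(r"\1", text)


# Hypothetical toy lexicon: the same word carries a different polarity
# depending on the domain (+1 positive, -1 negative).
DOMAIN_POLARITY = {
    "economics": {"cheap": 1},
    "politics": {"cheap": -1},
}


def word_polarity(word: str, domain: str) -> int:
    """Look up a word's polarity in the given domain; 0 means unknown/neutral."""
    normalised = collapse_lengthened(word.lower())
    return DOMAIN_POLARITY.get(domain, {}).get(normalised, 0)


if __name__ == "__main__":
    print(collapse_lengthened("cheeeeap"))          # lengthened English word
    print(collapse_lengthened("جمييييل"))            # lengthened Arabic word
    print(word_polarity("cheeeeap", "economics"))
    print(word_polarity("cheeeeap", "politics"))
```

The lookup normalises a word before consulting the lexicon, so a lengthened spelling and its standard spelling resolve to the same entry, while the domain key selects which polarity applies.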

Most SA studies address English reviews and comments, and there are relatively few SA studies on other languages such as French and Arabic (Ghorbel & Jacot, 2011). Arabic sentiment analysis is the core topic of this study, and it is used to identify the polarity of the collected Arabic sentiments. The scope of this study includes image, audio, and video posts generated by social media users, in addition to the analysis of Arabic text posts. Because our paper also covers Arabic multimedia posts, examples of similar research include the studies of (Morency, Mihalcea, & Doshi, 2011) and (Refaee & Rieser, 2014). Determining the polarity of video posts is much harder than determining the polarity of text posts; to enhance the accuracy of video post annotations, body language, for example, can be taken into account. This study presents a supervised classification of the collected Arabic dataset. Our paper proposes a new method called Detection of Arabic Sentiment Analysis Polarity (DASAP), using a modest dataset of Arabic comments, posts, and reviews of different types: text, images, videos, and audio. This dataset was collected from social network websites (namely Facebook, blogs, YouTube, and Twitter). Receiver Operating Characteristic (ROC) prediction quality measurements are used to evaluate the effectiveness of the proposed approach.
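To make the ROC-based evaluation mentioned above more concrete, the following Python sketch shows how ROC points and the area under the curve (AUC) are commonly computed for a binary polarity classifier. It is a generic illustration under stated assumptions, not the evaluation code used for DASAP: the function names are hypothetical, labels are assumed to be 1 (positive) or 0 (negative), and the scores in the usage example are invented.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points obtained by
    lowering the decision threshold over the scores, highest score first."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Invented example: 1 = positive comment, 0 = negative comment.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print(auc(roc_points(labels, scores)))
```

An AUC of 1.0 corresponds to a classifier that ranks every positive example above every negative one, while 0.5 corresponds to random ranking.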

The rest of this paper is organized as follows: Section 2 presents work related to this study, and Section 3 presents the methodology followed to accomplish it. Section 4 presents the data collection. Section 5 describes the text and image lexicons, while Section 6 presents the opinion extraction methods. Section 7 reports the experiments and results. Section 8 presents the conclusion and future work.
