Opinion Mining: A Tool for Understanding Customers – Challenges and Approaches

Rawan T. Khasawneh (Jordan University of Science and Technology, Jordan) and Izzat Alsmadi (Texas A&M University - San Antonio, USA)
Copyright: © 2017 | Pages: 14
DOI: 10.4018/978-1-5225-1686-6.ch010


In recent years, social media sites have become very popular communication tools among Internet users, with a significant amount of information exchanged via computers, smartphones, and other devices. The Internet is no longer only a source of information for users to search; regular users are now a major source of Internet information themselves, posting daily life activities, sharing pictures online, and expressing their opinions about products, news, political debates, and more. This noticeable growth of opinion-rich resources and user-generated content makes it worthwhile to use information technologies to collect, analyze, and understand human factors and behaviors. This chapter has three main sections. The first introduces the field of opinion mining in general, along with a detailed exploration of its definitions and goals. The second discusses the challenges related to opinion mining. The last explores the available opinion mining approaches, along with possible future directions.
Chapter Preview


Several recent indicators show the importance and significance of web information. While classical web information can be broadly classified under “pure science”, where the Internet was more like a large open library, the current face of the Internet is largely seen as a major “social science” source of information. Online Social Networks (OSNs) are by far the most popular websites on the Internet today, and the number of their users is overwhelming. For example, Figure 1 shows that more than one-third of the world population are active OSN users. Statistics also show that those numbers are continuously rising. Entities and individuals, young or adult, are all trying to establish a visible presence in OSNs. Users post details and information related to their own daily life activities. In addition, they interact with activities posted by their peers or friends.

Figure 1. Active OSN users


The classical term “mining” is used in two main categories: data mining for structured data and text mining for unstructured data. While these categories have much in common, each also has some unique attributes. Based on the techniques and algorithms used, opinion mining generally falls within text mining, since the data in opinion mining is also unstructured. Nonetheless, the containers that opinions are extracted from are typically more heterogeneous and dispersed than typical text mining sources. The typical goals of text and data mining (e.g. clustering, classification, association, or prediction) are also relevant in opinion mining. However, opinion mining is concerned with a specific type of classification related to opinions. Typically, such classification takes one of two possible classes (e.g. with or against, positive or negative, like or dislike, pro or con, Democrat or Republican, etc.). This is why opinion mining is also called polarity or sentiment analysis. In some cases, a third, neutral class is added to obtain a balanced classification model (i.e. positive, negative, and neutral). Many research papers have shown that if algorithms are not precise, the majority of opinions will fall into this middle, neutral class, which can reduce the value of the final findings of the opinion mining process.
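The three-class polarity scheme described above can be illustrated with a minimal sketch. It assumes a simple lexicon-based scorer, which is only one of many possible approaches and is not claimed to be the method used in the chapter. The word lists, the `classify_polarity` function, and its `threshold` parameter are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical toy lexicons (assumption); a real system would use a much
# larger, curated resource.
POSITIVE = {"good", "great", "love", "excellent", "like"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "dislike"}


def classify_polarity(text, threshold=1):
    """Label a text as 'positive', 'negative', or 'neutral'.

    The score is the number of positive lexicon hits minus the number of
    negative hits. Scores that do not reach the threshold in either
    direction fall into the neutral class.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

In this sketch, an opinionated post whose words are missing from the lexicon (for instance one that relies on a word such as "awesome") is scored zero and labelled neutral, and raising `threshold` pushes even more posts into that class. This is one simple way to see how an imprecise algorithm can inflate the neutral class.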

In several earlier papers (e.g. Khasawneh et al. 2015, Al-Kabi et al. 2014), researchers collected and evaluated different Arabic-language datasets from OSNs. They proposed automatic methods and developed tools to automate the process of collecting opinions, preprocessing the data, and finally judging whether each post represents a positive, negative, or neutral opinion. The researchers described some of the challenges usually associated with such automatic tools. While some challenges are shared across all languages, others vary from one language to another. For example, in the case of Arabic, researchers noticed a trend among OSN users to use slang that varies from one country to another, and even within the same country. Users mix slang with the standard language. They also mix Arabic with English terms. Further, they make heavy use of icons, symbols, and figures to express particular opinions. All of these issues can make the final automatic sentiment judgement harder or lower the overall accuracy of such systems. To improve the accuracy of sentiment detection systems in general, the researchers made two broad recommendations. The first is that sentiment detection is an evolutionary process that should be repeated and extended frequently to improve the quality of future predictions. The second is that the size and scope of the evaluated dataset matter: datasets should be large and cover different domains. Researchers should also combine automatic detection with manual verification or annotation.
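As a rough illustration of the preprocessing challenges mentioned above (emoticons and mixed Arabic and English text), the following sketch shows one possible normalization step. It is an assumption-laden example and not the tooling from the cited papers: the emoticon table, the `EMO_POS` and `EMO_NEG` placeholder names, and the functions `preprocess`, `script_of`, and `is_mixed` are all hypothetical. The only external fact relied on is that the basic Arabic Unicode block spans U+0600 to U+06FF.

```python
import re

# Hypothetical emoticon table (assumption); real posts use far more symbols.
EMOTICONS = {":)": "EMO_POS", ":-)": "EMO_POS", ":(": "EMO_NEG", ":-(": "EMO_NEG"}

ARABIC = re.compile(r"[\u0600-\u06FF]")  # basic Arabic Unicode block
LATIN = re.compile(r"[A-Za-z]")
PUNCTUATION = ".,!?;:\"'()،؟"  # includes the Arabic comma and question mark


def preprocess(post):
    """Split a post on whitespace, replace known emoticons with
    placeholder tokens, strip surrounding punctuation, and lowercase."""
    tokens = []
    for raw in post.split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = raw.strip(PUNCTUATION)
        if word:
            tokens.append(word.lower())
    return tokens


def script_of(token):
    """Return a coarse script label for a single token."""
    if token in EMOTICONS.values():
        return "emoticon"
    if ARABIC.search(token):
        return "arabic"
    if LATIN.search(token):
        return "latin"
    return "other"


def is_mixed(tokens):
    """True if the tokens contain both Arabic-script and Latin-script words."""
    scripts = {script_of(t) for t in tokens}
    return "arabic" in scripts and "latin" in scripts
```

A post flagged by `is_mixed` could, for example, be routed to manual annotation rather than judged automatically, in line with the recommendation to combine automatic detection with manual verification. Note that this sketch does nothing about dialectal slang, which would need language-specific resources.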
