Feature Optimization in Sentiment Analysis by Term Co-occurrence Fitness Evolution (TCFE)

Sudarshan S. Sonawane, Satish R. Kolhe
DOI: 10.4018/IJITWE.2019070102

Abstract

The opinion of a target audience is central to assessing reviews, business decisions, surveys, and other factors that require decision making. Feature selection is a critical task for building robust, accurate classifiers while reducing training time. Models are therefore needed that select optimal features and so maximize accuracy in opinion mining. With this scope for improvement in view, this article proposes an n-gram feature selection approach in which optimal features are chosen by term co-occurrence fitness. A genetic algorithm evolves candidate solutions toward maximal accuracy at minimal computational cost in determining sentiment. Evaluations indicate that the proposed solution outperforms separate filter-oriented feature selection models for sentiment classification.

Introduction

Analysis of user-generated content has become vital for organizations seeking insight from their data. Sentiment analysis has gained momentum, and many companies worldwide now use user-centric data to analyze issues and identify insights that can enhance the user experience.

Analyzing user opinions is critical to decision making, and many companies worldwide have benefited from it. Sentiment analysis reveals users' opinions and consumer behavior patterns. The opinion of a single individual may be negligible, but the viewpoints of the broader set of consumers are essential for evaluating brand perception, since the divergent views of customers can be observed through this process (Hu & Liu, 2004).

In natural language processing (NLP), opinion mining and sentiment analysis have gained considerable importance (Liu, 2012), and in web mining in particular, the analysis and data mining of social media messages has become an important part.

The study (Cambria, Das, Bandyopadhyay, & Feraco, 2017) describes opinion mining in terms of the following components:

  • Entity (E) (Deng & Wiebe, 2015): the target of the opinion, such as an entity, an event, a service, or an individual task.

  • Entity aspects (A): aspects of the entity, such as those related to a product, an organization, or the outcome of an event or service.

  • Opinion articulated time (T): the time at which the opinion is expressed.

  • Sentiment (S): the polarity of the opinion, such as positive, negative, or neutral, toward the entity or one of its aspects.
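
The four components listed above can be held together as a single record. The following is a minimal sketch, not a structure taken from the cited study: the class names `Opinion` and `Sentiment`, the field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Sentiment(Enum):
    """Sentiment polarity (S). The three values mirror the list above."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Opinion:
    """One opinion record; field names are hypothetical."""
    entity: str            # E: what the opinion is about
    aspect: str            # A: the aspect of the entity being judged
    time: datetime         # T: when the opinion was expressed
    sentiment: Sentiment   # S: polarity toward the entity or aspect


# Made-up example record, for illustration only.
example = Opinion(
    entity="phone",
    aspect="battery life",
    time=datetime(2019, 7, 1),
    sentiment=Sentiment.NEGATIVE,
)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch; it lets opinions be used as dictionary keys or set members when aggregating over many reviews.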

It is evident from the items listed above that sentiment analysis is useful along each of these dimensions, all of which bear on the overall opinion of representatives of the target set. The majority of sentiment analysis solutions contributed so far are machine-learning solutions that focus on these opinion aspects.

Feature representation usually affects the efficacy of machine learning approaches (LeCun, Bengio & Hinton, 2015), and a focused effort is needed on feature selection strategies that deliver better outcomes (Mohammad, Kiritchenko & Zhu, 2013). Learning mechanisms are effective at choosing discriminative features (Bengio, Courville & Vincent, 2013), and ensembles of learners (LeCun, Bengio & Hinton, 2015) create more opportunities to learn from the chosen data at distinct levels, which can lead to deep learning. However, the majority of contemporary contributions to opinion mining derive optimal features from individual elements of the training data, such as terms (subjects, predicates, or sentiment lexicons).

In this paper, an improved n-gram feature selection approach that chooses optimal features by term co-occurrence fitness evolution is proposed. The proposed model is compared with a genetic rank aggregation model. Performance analysis shows that the proposed model is robust and achieves higher classification accuracy than the genetic rank aggregation model.
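
To make the raw material of such an approach concrete, the sketch below extracts n-grams from tokenized documents and counts how often pairs of n-grams occur in the same document. It is an illustration under stated assumptions only: the function names `ngrams` and `cooccurrence_counts`, the document-level co-occurrence window, and the toy sentences are hypothetical, and the paper's actual fitness function and genetic evolution step are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cooccurrence_counts(docs, n=2):
    """Count, for each unordered pair of distinct n-grams, the number of
    documents in which both occur (document-level co-occurrence is an
    assumption of this sketch)."""
    pair_counts = Counter()
    for tokens in docs:
        # Sort the unique n-grams so each pair has one canonical key.
        unique = sorted(set(ngrams(tokens, n)))
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Toy corpus, made up for illustration.
docs = [
    "the battery life is great".split(),
    "the battery life is poor".split(),
]
counts = cooccurrence_counts(docs, n=2)
```

Counts of this kind could then feed a scoring function that ranks candidate features; how that score is defined and evolved is specific to the proposed model.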

The authors of (Turney, 2002; Taboada, Brooke, Tofiloski, Voll & Stede, 2011) focused on a repository of sentiment words that incorporates intensification, sentiment polarity, and negation to compute the sentiment polarity of every sentence. Turney (2002) adopted a representative lexicon-oriented method. In the first stage, phrases are extracted whose POS tags conform to patterns based on predefined conditions. In the second stage, PMI (pointwise mutual information), which measures the degree of dependency between two terms, is computed. In the final stage, the polarity is averaged over all phrases of the review and taken as the review's sentiment polarity.
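
The second and final stages described above can be sketched as follows, using the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))) estimated from counts. This is a minimal sketch rather than Turney's implementation: the function names `pmi` and `review_polarity` and their parameters are assumptions, and the way counts and per-phrase scores are obtained is left out.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) between two terms.

    count_xy: number of contexts containing both terms
    count_x, count_y: number of contexts containing each term
    total: total number of contexts
    """
    if min(count_xy, count_x, count_y, total) <= 0:
        raise ValueError("all counts must be positive")
    # p(x, y) / (p(x) * p(y)) simplifies to count_xy * total / (count_x * count_y).
    return math.log2((count_xy * total) / (count_x * count_y))


def review_polarity(phrase_scores):
    """Average the per-phrase polarity scores of one review."""
    if not phrase_scores:
        raise ValueError("a review needs at least one scored phrase")
    return sum(phrase_scores) / len(phrase_scores)
```

A PMI of zero means the two terms co-occur exactly as often as independence would predict; positive values indicate that they appear together more often than chance.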
