Using Latent Fine-Grained Sentiment for Cross-Domain Sentiment Analysis

Kwun-Ping Lai, Jackie Chun-Sing Ho, Wai Lam
Copyright: © 2021 | Pages: 17
DOI: 10.4018/IJKBO.2021070103

Abstract

The authors investigate the problem of multi-source cross-domain sentiment classification under the constraint of little labeled data. They propose a novel model capable of capturing sentiment terms with both strong and weak polarity from various source domains, which are useful for knowledge transfer to the unlabeled target domain. They further propose a two-step training strategy with different granularities that helps the model identify sentiment terms with different degrees of sentiment polarity. Specifically, the coarse-grained training step captures strong sentiment terms from the whole review, while the fine-grained training step focuses on latent fine-grained sentence sentiment, which is helpful under the constraint of little labeled data. Experiments on a real-world product review dataset show that the proposed model performs well even under the little-labeled-data constraint.
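
As a rough illustration only (not the authors' implementation), the following sketch shows how such a two-step schedule could be organized in PyTorch: a coarse-grained pass trains a classifier on whole reviews with their review-level labels, and a fine-grained pass reuses the review label as a noisy target for each sentence, weighting sentences by the model's own confidence to approximate the latent sentence sentiment. All class and function names here are hypothetical.

```python
# Hypothetical sketch of the two-step (coarse- then fine-grained) training idea.
# Not the authors' code: the encoder, loss weighting, and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReviewClassifier(nn.Module):
    """A deliberately simple review encoder plus binary sentiment classifier."""

    def __init__(self, vocab_size=10000, emb_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, emb_dim)   # averages token embeddings
        self.scorer = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2)
        )

    def forward(self, token_ids):
        return self.scorer(self.embed(token_ids))            # logits: [negative, positive]


def coarse_step(model, optimizer, review_tokens, review_label):
    """Coarse-grained step: learn from the review-level label of the whole review."""
    loss = F.cross_entropy(model(review_tokens), review_label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def fine_step(model, optimizer, sentence_token_lists, review_label):
    """Fine-grained step: treat sentence sentiment as latent by reusing the review
    label as a noisy per-sentence target, weighted by the model's confidence."""
    losses = []
    for sent_tokens in sentence_token_lists:
        logits = model(sent_tokens)
        confidence = logits.softmax(dim=-1).max().detach()    # how sure the model is
        losses.append(confidence * F.cross_entropy(logits, review_label))
    loss = torch.stack(losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, coarse_step would first be run over each labeled source-domain review, and fine_step would then be applied to the same reviews split into sentences.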

Introduction

Online shopping has become the modern way of shopping due to its great convenience. Besides connecting to customers, e-commerce companies also provide services for store owners to open virtual stores in the online marketplace and sell products to customers. Customers often leave reviews to share their experience with the products. These product reviews serve as an important information source for both store owners and potential buyers. Store owners can adjust their retail strategy based on the reviews, and potential buyers usually base their purchasing decisions on the overall degree of satisfaction expressed in them. Sentiment polarity classification, i.e., classifying reviews as positive or negative, has therefore become an active research topic. Supervised learning generally performs well on a variety of tasks, sentiment classification included, provided that sufficient labeled data is available. Unfortunately, labeled data can be scarce for less popular domains, and it may be impractical to obtain the amount of labeled data that traditional supervised learning methods require. One possible solution to the scarce-labeled-data problem is cross-domain sentiment classification, which applies transfer learning between a source domain with abundant labeled data and a target domain with insufficient or even no labeled data. This requires a model that is less dependent on labeled domain-specific data. Learning a shared feature space (Blitzer, Dredze, and Pereira (2007); Blitzer, McDonald, and Pereira (2006); Bollegala, Mu, and Goulermas (2015); Pan, Ni, Sun, Yang, and Chen (2010)) between the source and the target domain is one possible direction for solving the domain discrepancy problem. Other promising solutions include domain adversarial training (Ajakan, Germain, Larochelle, Laviolette, and Marchand (2014); Ganin et al. (2016); Li, Wei, Zhang, and Yang (2018); Li, Zhang, Wei, Wu, and Yang (2017)) and applying large pre-trained models (Myagmar, Li, and Kimura (2019)). However, these methods still require a large amount of data from the source domain. This requirement might not be restrictive for large retail companies, but it is a completely different story for small to medium-scale online store owners, for whom obtaining a few thousand properly annotated product reviews from a single domain is unrealistic.

A more natural setting is to reduce the burden on any single source domain by spreading the information across multiple source domains. Naturally, the amount of labeled data required from each source domain decreases by a factor of m, where m is the number of available source domains. Researchers have recently proposed the multi-source cross-domain sentiment classification task (Guo, Shah, and Barzilay (2018); Ruder and Plank (2017); Wu and Huang (2016); Yang, Shen, Chen, and Li (2020)), which aims at transferring the knowledge learnt from various source domains to the target domain. However, the authors observe that the total amount of labeled data required in these experiments remains large, and model performance under little labeled data is unknown.

In this paper, the authors investigate a setting under the constraint of little labeled data, which is more applicable in real-world practice. The amount of labeled data for each source domain is restricted, while the target domain has no labeled data at all. This reflects a realistic setting for cross-domain sentiment classification. In addition, the task uses coarse-grained labels representing the overall sentiment polarity of whole reviews rather than fine-grained sentence-level labels. This new problem setting addresses the real-world situation that online store owners often face nowadays. While the constraint of little labeled data makes the problem harder, the multiple-source setting provides information diversity that helps the model capture different factors useful for determining sentiment polarity in the target domain.
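
Purely to make this setting concrete (the domain names, review texts, and per-domain budget below are illustrative assumptions, not the paper's benchmark), the data layout could look like the following: a handful of review-level labels per source domain and an entirely unlabeled target domain.

```python
# Illustrative layout of the little-labeled-data, multi-source setting.
# Domain names, review texts, and the per-domain budget are made up for illustration.
from typing import Dict, List, Tuple

LabeledReview = Tuple[str, int]  # (review text, review-level polarity: 1 = positive, 0 = negative)

source_data: Dict[str, List[LabeledReview]] = {
    "books":       [("Great plot and memorable characters.", 1), ("Dull and overlong.", 0)],
    "electronics": [("Battery easily lasts all day.", 1), ("Screen died within a week.", 0)],
    "kitchen":     [("Blends smoothly every time.", 1), ("Leaks from the base.", 0)],
}

# Target domain: reviews only, no labels at all.
target_unlabeled: List[str] = [
    "Arrived quickly, but the fabric feels cheap.",
]

m = len(source_data)        # number of source domains
per_domain_budget = 20      # e.g., only a few dozen labeled reviews allowed per source domain
```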
