Sentiment Analysis in the Light of LSTM Recurrent Neural Networks

Subarno Pal, Soumadip Ghosh, Amitava Nag
Copyright © 2018 | Pages: 7
DOI: 10.4018/IJSE.2018010103

Abstract

Long short-term memory (LSTM) is a special type of recurrent neural network (RNN) architecture designed as an improvement over simple RNNs for modeling temporal sequences and their long-range dependencies more accurately. In this article, the authors work with different types of LSTM architectures for sentiment analysis of movie reviews. It has been shown that LSTM RNNs are more effective than deep neural networks and conventional RNNs for sentiment analysis. Here, the authors explore different architectures associated with LSTM models to study their relative performance on sentiment analysis. A simple LSTM is first constructed and its performance is studied. In subsequent stages, LSTM layers are stacked one upon another, which shows an increase in accuracy. Later, the LSTM layers are made bidirectional to convey data both forward and backward through the network. The authors hereby show that a layered deep LSTM with bidirectional connections achieves better accuracy than the simpler versions of LSTM used here.
Article Preview

1. Introduction

Sentiment analysis is a computational method of identifying and categorizing opinions expressed in a text, and it remains a very active field of research (Manning et al., 2008). Text obtained from sources such as user reviews and microblogs expresses the user’s view of, or attitude towards, a particular product or event. Sentiment analysis of short texts is challenging because they are contextually limited: decisions must be made on the basis of the small number of words the user has provided. We treat sentiment analysis as a supervised learning process in which each data element (a text review) is labeled as either ‘positive’ or ‘negative’ (Pang, Lee, & Vaithyanathan, 2002). Machine learning models are trained with word embeddings on these labeled datasets, and their performance is measured by classification accuracy.
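As a concrete illustration of this supervised setup, the sketch below loads pre-labeled movie reviews and pads them to a fixed length so they can be fed through a word-embedding layer. The choice of the Keras IMDB dataset and every hyper-parameter here are illustrative assumptions, not the authors' exact configuration.

# Hedged sketch of the supervised setup described above: movie reviews labeled
# 0 (negative) or 1 (positive), padded to a fixed length for a word-embedding layer.
# The Keras IMDB dataset and all hyper-parameters are illustrative assumptions.
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # keep only the most frequent words
max_len = 200        # reviews are short texts; pad/truncate each one to this length

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
x_train = pad_sequences(x_train, maxlen=max_len)
x_test = pad_sequences(x_test, maxlen=max_len)
# y_train / y_test already hold the ‘positive’ / ‘negative’ labels as 1 / 0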

An artificial neural network, a computational model developed on the basis of the structure and functions of biological neural networks, has achieved great success over other machine learning techniques in sentiment analysis (Yoon, 2014; Socher, Pennington, Huang, Ng, & Manning, 2011; Xiong, Zhong, & Socher, 2002). Deep neural networks (DNNs) have recently achieved significant performance gains in a variety of NLP tasks such as language modeling (Bengio, Ducharme, Vincent, & Jauvin, 2003), sentiment analysis (Socher et al., 2013), syntactic parsing (Collobert & Weston, 2008), and machine translation (Lee, Cho, & Hofmann, 2016). A recurrent neural network (RNN) is a special type of neural network in which connections between units form a directed cycle, allowing the model to exhibit dynamic temporal behavior. An RNN has an input layer, a variable number of hidden layers, and finally one output layer. Basic RNNs are a network of neuron-like nodes, each with a directed (one-way) connection to other nodes, in which every connection (synapse) has a modifiable real-valued weight. These weights are updated through successive training iterations of the network. RNNs are widely used in handwriting recognition and speech recognition, and in practice they are often more effective than other neural network variants for text classification.
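The structure just described (input layer, recurrent hidden layer, output layer) can be written down compactly. The following is a minimal sketch, assuming a Keras Sequential model and illustrative layer sizes; it is not the authors' reported architecture.

# Minimal sketch of a basic RNN classifier: an input (embedding) layer,
# one recurrent hidden layer, and a single-unit output layer. Sizes are illustrative.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

vocab_size, max_len = 10000, 200  # illustrative values matching the earlier sketch

model = Sequential([
    Embedding(vocab_size, 128, input_length=max_len),  # input layer: word embeddings
    SimpleRNN(64),                                      # hidden layer with recurrent (cyclic) connections
    Dense(1, activation='sigmoid'),                     # output layer: probability the review is positive
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])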

A special variation of the RNN, the long short-term memory (LSTM) network, is discussed next. LSTMs have shown striking accuracy in language modeling and speech recognition, and we vary different forms of LSTM for our text classification task. An LSTM network contains LSTM units along with the input and output network layer units. An LSTM unit is capable of remembering values for either long or short time periods (Hochreiter & Schmidhuber, 1997), and it uses no activation function within its recurrent components. The stored value is therefore not iteratively squashed over time, which mitigates the vanishing gradient problem. LSTM blocks contain three or four “gates” that control information flow, implemented using the logistic function to compute a value between 0 and 1. An “input” gate controls the extent to which a new value flows into the memory, a “forget” gate controls the extent to which a value remains in memory, and an “output” gate controls the extent to which the value in memory is used to compute the output activation of the block.
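Gated LSTM units of this kind are available as ready-made layers in common deep-learning libraries, so the three variants compared in this article (a simple LSTM, stacked LSTM layers, and a stacked bidirectional LSTM) can be assembled in a few lines. The sketch below is an assumption about how such models might look in Keras; layer widths and other details are illustrative, not the authors' reported architectures.

# Hedged sketch of the three LSTM variants the article compares; all sizes are illustrative.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Bidirectional, Dense

vocab_size, max_len = 10000, 200  # illustrative values

def simple_lstm():
    # single LSTM layer: gated memory cells replace the plain recurrent units
    return Sequential([
        Embedding(vocab_size, 128, input_length=max_len),
        LSTM(64),
        Dense(1, activation='sigmoid'),
    ])

def stacked_lstm():
    # stacked variant: the lower layer returns its full output sequence to the layer above
    return Sequential([
        Embedding(vocab_size, 128, input_length=max_len),
        LSTM(64, return_sequences=True),
        LSTM(64),
        Dense(1, activation='sigmoid'),
    ])

def stacked_bidirectional_lstm():
    # bidirectional variant: each layer reads the review both forward and backward
    return Sequential([
        Embedding(vocab_size, 128, input_length=max_len),
        Bidirectional(LSTM(64, return_sequences=True)),
        Bidirectional(LSTM(64)),
        Dense(1, activation='sigmoid'),
    ])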
