Improving Polarity Classification for Financial News Using Semantic Similarity Techniques

Tan Li Im (Universiti Malaysia Sabah, Kota Kinabalu, Malaysia), Phang Wai San (Universiti Malaysia Sabah, Kota Kinabalu, Malaysia), Patricia Anthony (Lincoln University, Christchurch, New Zealand) and Chin Kim On (Universiti Malaysia Sabah, Kota Kinabalu, Malaysia)
Copyright: © 2018 |Pages: 16
DOI: 10.4018/IJIIT.2018100103

Abstract

This article discusses polarity classification for financial news articles. The proposed Semantic Sentiment Analyser (SSA) uses semantic similarity techniques, sentiment composition rules, and the Positivity/Negativity (P/N) ratio to perform polarity classification. An experiment was conducted to compare the performance of three semantic similarity metrics, namely HSO, LESK, and LIN, in finding a word that is semantically similar to the input word. The best-performing technique (HSO) was incorporated into the sentiment analyser to find possible polarity carriers in the analysed text before polarity classification is performed. The performance of the proposed SSA was evaluated using a set of manually annotated financial news articles. The results showed that the proposed SSA achieved an F-Score of 90.89% for the all-cases classification.

Introduction

In stock market trading, it is common practice for investors to analyse and study potential companies before making any investment decision. Financial news articles often contain information about finance, company updates, and business reports, which can provide cues for investors to buy, hold, or sell a particular stock. A financial news article can be classified as positive, negative, or neutral. An article reporting negatively about a particular company may have a negative impact on that company’s stock price. Conversely, an article reporting positively about a particular company may have a positive impact on that company’s stock price. Since there are many financial news articles online, reading, analysing, and tagging each article is time-consuming. Sentiment analysis tools can be used to quickly determine the sentiment of a given article by extracting meaningful information from its natural language text.

Sentiment in news articles and its impact have been studied in the financial domain. Niederhoffer (1971) found that markets reacted differently to news of different categories. Engle and Ng (1993) asserted that news influences the stock market asymmetrically, with negative news appearing to exert a greater impact on volatility than positive news. In another study, Mian and Sankaraguruswamy (2012) concluded that the positive stock price response to good news increases with sentiment, while the opposite is true of the negative stock price response to bad news. In addition, the effect of sentiment on the stock price response is more significant for bad news.

This paper describes the Semantic Sentiment Analyser (SSA), a sentiment analysis system for polarity classification of financial news articles. The SSA performs bipolar classification, which classifies a financial news article as either positive or negative. The first step of the polarity classification process is to extract phrases according to their Part-of-Speech (POS) using the Stanford RNN parser (Kong & Smith, 1994). Next, the parsed words are tagged for their polarity using the Subjectivity Lexicon from Wilson et al. (2005). A semantic similarity technique is used to find similar words for input words that are not found in the lexicon. Finally, the polarity of the financial news article is determined using a set of sentiment composition rules in combination with the Positivity/Negativity (P/N) ratio.
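The lexicon lookup, similarity fallback, and ratio-based decision described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon entries, the function names, the `find_similar` callback, and the decision rule (positive when positives divided by negatives is at least 1) are all assumptions made for illustration. POS-based phrase extraction and the sentiment composition rules are omitted because this preview does not specify them.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Toy polarity lexicon (hypothetical entries, not taken from the Subjectivity Lexicon).
TOY_LEXICON: Dict[str, str] = {
    "gain": "positive",
    "profit": "positive",
    "loss": "negative",
    "decline": "negative",
}


def tag_polarity(
    word: str,
    lexicon: Dict[str, str],
    find_similar: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the polarity of a word, or None if it carries no known polarity."""
    if word in lexicon:
        return lexicon[word]
    # Fallback: ask a (hypothetical) similarity component for a known similar word.
    if find_similar is not None:
        similar = find_similar(word)
        if similar is not None and similar in lexicon:
            return lexicon[similar]
    return None


def pn_ratio(polarities: Iterable[str]) -> float:
    """Assumed P/N ratio: number of positive tags divided by number of negative tags."""
    tags = list(polarities)
    positives = tags.count("positive")
    negatives = tags.count("negative")
    if negatives == 0:
        # Assumption: no negatives means infinitely positive, or neutral (1.0) if empty.
        return float("inf") if positives > 0 else 1.0
    return positives / negatives


def classify(
    words: List[str],
    lexicon: Dict[str, str],
    find_similar: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Bipolar decision; ties are arbitrarily resolved as positive in this sketch."""
    tags = [tag_polarity(w, lexicon, find_similar) for w in words]
    ratio = pn_ratio(t for t in tags if t is not None)
    return "positive" if ratio >= 1.0 else "negative"
```

For example, `classify(["profit", "gain", "loss"], TOY_LEXICON)` tags two positive words and one negative word, so the assumed ratio is 2 and the article is labelled positive. Passing a `find_similar` callback lets an out-of-lexicon word such as "slump" inherit the polarity of a similar lexicon word.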

The objectives of this research are threefold: first, to develop a semantic sentiment analyser using a lexical approach; second, to implement semantic similarity to enrich the sentiment in the analysed text; and third, to compare three semantic similarity metrics to determine the best metric for the model. This work advances the state of the art by comparing three semantic similarity metrics in order to select the one that most improves the analyser’s classification ability. In addition, the implementation of semantic similarity enriches and expands the lexicon by continuously adding new polarity-carrying words to it.

The focus of this paper is on the semantic aspect of the proposed SSA. Semantic similarity is used in this work to enrich the sentiment in the analysed text. The objective of the semantic enrichment process is to find as many polarity carriers (words that carry sentiment) as possible and use them as input to the sentiment analyser for further processing. Three semantic similarity metrics were considered, namely HSO (Hirst & St-Onge, 1998), LIN (Lin, 1998), and LESK (Lesk, 1986). These three metrics use WordNet as their backbone to measure the semantic relatedness between two words. WordNet is used in this work because it is a powerful lexical resource containing words represented in lemma form, a representation that conforms to the setup of the proposed SSA. In the experimental evaluation, the performance of the three similarity metrics was compared, and HSO outperformed the other two. Hence, HSO was used as the semantic similarity metric for the SSA. The overall performance of the SSA was also evaluated, and the results are described in the final section.
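The enrichment idea can be sketched in Python as follows. The sketch makes several assumptions: the pairwise scores are a hand-made stand-in for a WordNet-based metric such as HSO, LIN, or LESK, and the function names and the 0.5 threshold are hypothetical choices rather than details from the paper.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical pairwise scores standing in for a WordNet-based relatedness metric.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("slump", "decline"): 0.9,
    ("slump", "profit"): 0.1,
    ("surge", "gain"): 0.8,
}


def toy_similarity(a: str, b: str) -> float:
    """Symmetric lookup in the toy score table; unknown pairs score 0.0."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), 0.0))


def most_similar_entry(
    word: str,
    lexicon: Dict[str, str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> Optional[str]:
    """Return the lexicon word most similar to `word`, or None if below threshold."""
    best_word: Optional[str] = None
    best_score = threshold
    for candidate in lexicon:
        score = similarity(word, candidate)
        if score >= best_score:
            best_word, best_score = candidate, score
    return best_word


def enrich(
    word: str,
    lexicon: Dict[str, str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,
) -> Optional[str]:
    """Look up a word's polarity, adding it to the lexicon if a similar entry exists."""
    if word in lexicon:
        return lexicon[word]
    match = most_similar_entry(word, lexicon, similarity, threshold)
    if match is None:
        return None
    lexicon[word] = lexicon[match]  # the new word inherits the polarity of its match
    return lexicon[word]
```

Because the similarity function is passed in as a parameter, the three metrics could in principle be swapped and compared without changing the enrichment logic.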
