2. Literature Review
Pang et al. (2008) presented a wide-ranging and detailed review of traditional automatic sentiment detection techniques, including many sub-components. In general, sentiment detection techniques can be roughly divided into lexicon-based methods and machine-learning methods, as described by Boiy et al. (2009). Lexicon-based methods, discussed by Taboada et al. (2010), rely on a sentiment lexicon, a collection of known and precompiled sentiment terms. Machine-learning approaches make use of syntactic and/or linguistic features. Hybrid approaches are very common, with sentiment lexicons playing a key role in the majority of them. However, such approaches are often inflexible regarding the ambiguity of sentiment terms. Maynard et al. (2011) discussed opinion mining from microposts and the challenges it poses for NLP systems, and suggested refined techniques to handle them.
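To make the lexicon-based idea concrete, the following is a minimal sketch in Python, not the method of any of the cited works. The lexicon entries, the function names `lexicon_score` and `classify`, and the simple sum-of-polarities rule are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy lexicon: term -> polarity. Real sentiment lexicons
# contain thousands of precompiled, often weighted, terms.
LEXICON = {
    "good": 1, "great": 1, "awesome": 1,
    "bad": -1, "poor": -1, "terrible": -1,
}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the polarities of all known sentiment terms in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

def classify(text):
    """Map the summed score to a coarse sentiment label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because each term carries a fixed polarity regardless of context, a sketch like this also illustrates the inflexibility noted above: an ambiguous term receives the same score in every sentence.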
In our paper, an application for political opinion mining is developed using GATE (Cunningham et al., 2012), a freely available toolkit for language processing. ANNIE (Cunningham, 2011), the default named entity recognition system and a part of GATE, is used for tokenization, gazetteer lookup, sentence splitting, and POS tagging, which are essential for further processing and analysis. Before analysis, the tweets are pre-processed to remove non-opinionated and irrelevant tweets, and Java Suggester, an open-source Java program, is used for spell checking. A phonetic dictionary of the 5,000 most commonly used words was developed using the Metaphone algorithm (Philips, 1990) to identify words by their pronunciation. For example, spellings such as ‘gud’ for ‘good’ or ‘awsum’ for ‘awesome’ are common in microposts, and basic spell checkers fail to identify them correctly; the dictionary was built for this purpose.
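The phonetic-dictionary lookup can be sketched roughly as follows. This is an illustrative Python sketch and not the implementation used in the paper: `phonetic_key` is a crude consonant-skeleton key standing in for Metaphone (which applies many more pronunciation rules), and the three-word vocabulary and all function names are hypothetical.

```python
VOWELS = set("aeiou")

def phonetic_key(word):
    """Crude stand-in for Metaphone (NOT the real algorithm): keep the
    first letter, drop later vowels, and skip any letter that repeats
    the last kept letter."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    for c in letters[1:]:
        if c in VOWELS:
            continue
        if c != key[-1]:
            key.append(c)
    return "".join(key).upper()

def build_phonetic_dictionary(vocabulary):
    """Index each known word under its phonetic key."""
    index = {}
    for word in vocabulary:
        index.setdefault(phonetic_key(word), []).append(word)
    return index

def normalize(token, index, known_words):
    """Return the token if it is a known word, else the first known word
    that shares its phonetic key, else the token unchanged."""
    lowered = token.lower()
    if lowered in known_words:
        return lowered
    candidates = index.get(phonetic_key(lowered), [])
    return candidates[0] if candidates else token

# Hypothetical vocabulary; the paper's dictionary covers 5,000 common words.
VOCABULARY = ["good", "awesome", "great"]
INDEX = build_phonetic_dictionary(VOCABULARY)
KNOWN = set(VOCABULARY)
```

With this toy vocabulary, `normalize("gud", INDEX, KNOWN)` resolves to `good` and `normalize("awsum", INDEX, KNOWN)` to `awesome`, because each misspelling shares a key with its intended word. A production system would use a full Metaphone implementation and a rule for choosing among several candidates with the same key.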