1. Introduction
Twitter, a micro-blogging social networking website, has a large and rapidly growing user base. It provides a rich bank of data in the form of 'tweets', each of which must be written within 140 characters. Go et al. (2009) found that the average tweet is 14 words, or 78 characters, long. Applications that rely on Twitter data include the analysis of disasters (Sen et al., 2015; Brynielsson et al., 2013), detection of diseases (Dai and Bikdash, 2015; Grover and Aujla, 2015; Grover et al., 2014; Aramaki et al., 2011), political elections (Asur and Huberman, 2010), movie reviews (Bollen et al., 2011) and the stock market (Tumasjan et al., 2010). Tweets frequently contain abbreviations, orthographic mistakes, emoticons and hashtags, which users rely on to express a message in few words. Much research has analyzed tweets posted during disasters, and most prior studies have focused on extracting situational information, that is, information that helps in gaining a high-level understanding of the circumstances (Sarter and Woods, 1991; Vieweg et al., 2010). For instance, several studies have developed classifiers to distinguish situational tweets from non-situational ones (Sen et al., 2015; Imran et al., 2013), while others have attempted to summarize situational tweets not only in English (Sen et al., 2015; Nguyen et al., 2015) but also in other languages such as Hindi (https://en.wikipedia.org/wiki/2015_Gurdaspur_attack; Sharma et al., 2015).