Real-Time Unspecified Major Sub-Events Detection in the Twitter Data Stream That Cause the Change in the Sentiment Score of the Targeted Event

Ritesh Srivastava (NSIT, Delhi University, New Delhi, India) and M.P.S. Bhatia (NSIT, Delhi University, New Delhi, India)
DOI: 10.4018/IJITWE.2017100101

Abstract

Twitter behaves as a social sensor of the world. The tweets provided by the Twitter Firehose exhibit the properties of big data (i.e., volume, variety, and velocity). With millions of users on Twitter, its virtual communities now mirror real-world communities, and real-world events are frequently discussed on Twitter. This work performs real-time analysis of the tweets related to a targeted event (e.g., an election) to identify the sub-events that occurred in the real world, were discussed on Twitter, and caused a significant change in the aggregated sentiment score of the targeted event over time. This type of analysis can enrich the real-time decision-making ability of the event bearer. The proposed approach uses a three-step process: (1) real-time sentiment analysis of tweets; (2) Bayesian change-point detection to determine the sentiment change points; and (3) detection of the major sub-events that influenced the sentiment of the targeted event. The approach is evaluated on Twitter data from the 2015 Delhi election.
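Step (2) can be illustrated with a minimal sketch. The abstract does not state which Bayesian change-point formulation is used, so the code below is only one simple possibility, not the authors' method: an offline, single change-point model in which each segment of the sentiment series has a constant unknown mean with a conjugate normal prior and known noise variance. The function names, the prior parameters, and the sample scores are all hypothetical.

```python
import math


def segment_log_evidence(xs, prior_mean=0.0, prior_var=1.0, noise_var=0.05):
    """Log marginal likelihood of xs when x_i ~ N(mu, noise_var)
    and the segment mean mu ~ N(prior_mean, prior_var) is integrated out."""
    n = len(xs)
    d = [x - prior_mean for x in xs]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    # Quadratic form d^T C^{-1} d for C = noise_var * I + prior_var * 1 1^T.
    quad = (sum_d2 - prior_var * sum_d ** 2 / (noise_var + n * prior_var)) / noise_var
    # log det(C) via the matrix determinant lemma.
    log_det = n * math.log(noise_var) + math.log(1.0 + n * prior_var / noise_var)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def change_point_posterior(scores, **prior):
    """Posterior over k = 1..n-1, where the mean changes between
    scores[k-1] and scores[k]; the prior over k is uniform."""
    n = len(scores)
    log_post = [
        segment_log_evidence(scores[:k], **prior)
        + segment_log_evidence(scores[k:], **prior)
        for k in range(1, n)
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical aggregated sentiment scores, one per time window.
scores = [0.42, 0.38, 0.45, 0.40, 0.41, -0.20, -0.25, -0.18, -0.22, -0.21]
posterior = change_point_posterior(scores)
best_k = 1 + max(range(len(posterior)), key=posterior.__getitem__)
print("most probable change point before index", best_k)
```

A streaming setting would need an online variant that updates the posterior as each new window arrives; the offline form is shown here only because it keeps the arithmetic easy to follow.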

Introduction

In the late 2000s, the popularity of microblogging sites such as Twitter substantially changed the research direction of automatic event detection. Twitter produces a continuous flow of user-generated online micro-texts that covers almost all aspects of the real world. Consequently, Twitter has attracted many researchers to perform event detection in real time in order to stay informed of “what is going on in the real world”. Twitter behaves as a social sensor of the real world, sensing public sentiment on a variety of topics and events. The tweets provided by the Twitter Firehose exhibit the properties of big data (i.e., volume, variety, and velocity). With millions of users on Twitter, its virtual communities now mirror real-world communities. Thus, real-world events are frequently discussed on Twitter.

Recently, Twitter data has been used in many tasks that predict, monitor, and analyze real-world events, such as breaking news tracking (Jackoway, Samet, & Sankaranarayanan, 2011), election prediction (Srivastava, Kumar, Bhatia, & Jain, 2015), natural disasters such as earthquakes (Sakaki, Okazaki, & Matsuo, 2010), and crime, radicalization, and terrorism (Weimann, 2014).

Automatic event detection from textual data can be defined as the process of identifying the occurrence of novel events by analyzing temporally ordered textual streams (Yang, Pierce, & Carbonell, 1998). With the evolution of Web 2.0 and the sudden increase in computer-mediated communication, automatic event detection gained attention in the early 2000s. However, event detection has long been studied as Topic Detection and Tracking (TDT) (Allan, 2002, 2012; Yang et al., 1998). Most of the earlier works that addressed event detection were initially implemented on traditional news-oriented textual material from a variety of broadcast news media (AlSumait, Barbará, & Domeniconi, 2008; Brants, Chen, & Farahat, 2003; Fiscus & Doddington, 2002).

Event detection in Twitter data streams can be categorized as specified (targeted) event detection or unspecified (untargeted) event detection (Atefeh & Khreich, 2015). In targeted event detection, sufficient prior information or clues are available to help the detection process. This information may include the time, place, domain, features, and description of the event. In untargeted event detection, by contrast, there is no prior information to guide the detection, so the process must rely on the temporal signal of the Twitter data stream. Sub-events are intermediate temporal events that are initiated in the real world and are associated with a specified event. For example, a US presidential election is a targeted event, and the presidential debates and the controversies that arise are sub-events related to it. These sub-events draw subjective tweets from users and hence affect the sentiment of the targeted event.
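To make "relying on the temporal signal" concrete, the following minimal sketch flags time bins whose tweet volume rises far above the recent baseline. This is a generic z-score burst detector offered purely as an illustration, not the method of this work; the function name `detect_bursts`, the `history` and `threshold` parameters, and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def detect_bursts(counts, history=5, threshold=3.0):
    """Return indices of time bins whose tweet count is at least
    `threshold` standard deviations above the mean of the preceding
    `history` bins."""
    bursts = []
    for i in range(history, len(counts)):
        window = counts[i - history:i]
        mu = mean(window)
        # Floor the deviation at 1.0 so a perfectly flat history
        # does not cause a division by zero.
        sigma = max(pstdev(window), 1.0)
        if (counts[i] - mu) / sigma >= threshold:
            bursts.append(i)
    return bursts


# Hypothetical tweet counts per time bin for one keyword-filtered stream.
counts = [20, 22, 19, 21, 20, 22, 95, 110, 24, 21]
print(detect_bursts(counts))
```

Note that once a burst enters the trailing window it inflates the baseline, so the bins that follow the onset may not be flagged; a more careful detector would exclude flagged bins from the history.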

The use of predefined related keywords is very common in specified event detection on Twitter (Li, Ritter, Cardie, & Hovy, 2014; Popescu & Pennacchiotti, 2010; Sakaki et al., 2010; Sankaranarayanan, Samet, Teitler, Lieberman, & Sperling, 2009). In our work, we also perform a targeted search, using a set of keywords to narrow the Twitter data stream down to the specific event. Figure 1(a) depicts this scenario, in which the global Twitter stream is narrowed by pulling only the tweets related to the targeted event E. During the observation period of the targeted event, many sub-events also emerge in the Twitter stream. We use different colors to represent different sub-events in Figure 1(a). The discussion duration of a sub-event is the time from the origin of the sub-event to the time at which discussion about it stops. The sub-events also influence the sentiment score of the targeted event, as shown in Figure 1(b). Figure 1(b) shows that the average sentiment score for event E in the observation window results from the subjective tweets about five sub-events together with some ordinary subjective tweets.
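The keyword narrowing, the per-window average sentiment, and the discussion duration described above can be sketched as follows. This is a simplified illustration under stated assumptions: tweets are represented as hypothetical `(timestamp_seconds, text, sentiment_score)` tuples with scores already assigned by some sentiment classifier, matching is plain case-insensitive substring search, and the keywords and sample tweets are invented for the example.

```python
from collections import defaultdict


def matches_event(text, keywords):
    """True if any keyword occurs in the tweet text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def windowed_sentiment(tweets, keywords, window_seconds):
    """Average sentiment score of event-related tweets per time window.
    Returns {window_index: mean_score}, ordered by window index."""
    buckets = defaultdict(list)
    for timestamp, text, score in tweets:
        if matches_event(text, keywords):
            buckets[timestamp // window_seconds].append(score)
    return {w: sum(s) / len(s) for w, s in sorted(buckets.items())}


def discussion_duration(timestamps):
    """Time from the first to the last tweet about one sub-event."""
    return max(timestamps) - min(timestamps)


# Hypothetical keywords and tweets for a targeted event E.
keywords = ["delhi election", "delhipolls"]
tweets = [
    (10, "Delhi election rally today, great turnout", 0.8),
    (50, "Watching cricket", 0.5),
    (70, "delhi election debate was disappointing", -0.6),
    (110, "#DelhiPolls candidate speech impressed me", 0.4),
    (130, "New scandal hits the Delhi election", -0.9),
]
result = windowed_sentiment(tweets, keywords, window_seconds=60)
for window, avg in result.items():
    print(window, round(avg, 2))
```

In a live setting the same filter would be applied to the incoming stream rather than to a list, and the windowed averages would form the time series on which sentiment change points are sought.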
