Enabling Rapid Classification of Social Media Communications During Crises

Muhammad Imran (Qatar Computing Research Institute, Doha, Qatar), Prasenjit Mitra (The Pennsylvania State University, University Park, PA, USA) and Jaideep Srivastava (Qatar Computing Research Institute, Doha, Qatar)
DOI: 10.4018/IJISCRAM.2016070101

Abstract

Social media platforms such as Twitter, used by affected people during crises, are considered a vital source of information for crisis response. However, rapid crisis response requires real-time analysis of online information. When a disaster happens, supervised machine learning, among other data processing techniques, can help classify online information in real time. However, a scarcity of labeled data leads to poorly performing classifiers. Labeled data from past events is often available. Can this past labeled data be reused to train classifiers? We study the usefulness of labeled data from past events by observing the performance of classifiers trained on different combinations of training sets obtained from past disasters. Moreover, we propose two approaches (target labeling and active learning) to boost the classification performance of a learning scheme. We perform extensive experimentation on real crisis datasets and show the utility of past labeled data for training machine learning classifiers that process sudden-onset crisis-related data in real time.

Introduction

In the last few years, the use of social media platforms during disasters and emergencies has increased. In particular, microblogging platforms such as Twitter provide active communication channels during the onset of mass convergence events such as natural disasters (Palen et al., 2009; Hughes et al., 2009; Starbird et al., 2010; Vieweg et al., 2010). Studies show that Twitter has been used during crises to spread news about casualties and damage, donation offers and requests, and alerts, including multimedia information such as videos and photos (Cameron et al., 2012; Imran et al., 2013a; Qu et al., 2011). Many studies show the significance of this online information for crisis response and management (Vieweg et al., 2014; Sakaki et al., 2010; Neubig et al., 2011). Moreover, it has been observed that these messages are usually communicated more quickly than disaster information shared via traditional channels such as news websites. For instance, the first tweet to report on the 2013 Westgate Mall attack was posted within a minute of the initial onslaught. Given the importance of crisis-related messages for time-critical situational awareness, disaster-affected communities and professional responders may benefit from an automatic system that extracts relevant information from social media.

Among the benefits that encourage responding organizations to use social media data is the timeliness of information when no other information sources are available, especially at the beginning of a crisis (Tapia et al., 2013). For this reason, real-time insight into an ongoing situation plays an important role in enabling emergency responders to act rapidly. To identify informative, actionable, and tactical pieces of information in a growing stack of social media messages, and to inform decision-making processes as early as possible, messages need to be processed as soon as they arrive. Given the large volume of messages, we need to classify them. That is, we need to put them into different informational categories, such as food needs, supplies requests, financial support requests, and logistics, so that disaster-response professionals can quickly examine each bin to identify urgent needs.

Different approaches can be employed to filter and classify these online messages. For instance, many humanitarian organizations rely on the volunteers of the Digital Humanitarian Network (DHN) to analyze messages one by one and find useful information for disaster response. However, given the amount of information that needs to be processed and the scarcity of volunteers, we would ideally like the messages to be categorized automatically, leaving volunteers free to spend their time on higher-order tasks. Despite advances in natural language processing, full automation is still not feasible.

In this paper, we propose to use a hybrid approach in which humans and machines work together to perform complex tasks (e.g., classification of tweets). Most automatic classifiers that achieve high accuracy on classification tasks are based on supervised machine learning techniques, in which humans provide a set of training samples consisting of positive and negative examples for each classification category. One semi-automated system of this kind, similar in purpose to DHN, is AIDR (Artificial Intelligence for Disaster Response) (Imran et al., 2014).
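
To make the supervised setting concrete, the following is a minimal sketch of a text classifier trained on human-labeled messages and then applied to a new message. It is not the system described in this article. The choice of a multinomial Naive Bayes model, the whitespace tokenizer, the class and function names, and the example messages are all assumptions made purely for illustration. The category names are adapted from those mentioned above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; the learning algorithm is an assumption.
    """

    def __init__(self):
        self.class_counts = Counter()            # number of messages per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # all words seen during training

    def train(self, labeled_messages):
        """Update counts from an iterable of (text, category) pairs."""
        for text, label in labeled_messages:
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        """Return the category with the highest log-posterior score."""
        total_messages = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total_messages)  # log prior
            words_in_class = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                smoothed = self.word_counts[label][word] + 1
                score += math.log(smoothed / (words_in_class + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented messages standing in for human-labeled data from a past event.
past_event_labels = [
    ("we need food and clean water", "food_needs"),
    ("families need food urgently", "food_needs"),
    ("please donate money to the relief fund", "financial_support"),
    ("donate funds to support victims", "financial_support"),
    ("road closed trucks cannot deliver supplies", "logistics"),
    ("bridge closed deliver supplies by boat", "logistics"),
]

classifier = NaiveBayesClassifier()
classifier.train(past_event_labels)

# Classify a message from a (hypothetical) new crisis.
print(classifier.predict("children need water and food"))
```

Because `train` only accumulates counts, it can be called again when volunteers label further messages, which is one simple way to picture humans and machines working together in such a hybrid setup.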
