In the last few years, the use of social media platforms during disasters and emergencies has increased. In particular, microblogging platforms such as Twitter provide active communication channels during the onset of mass convergence events such as natural disasters (Palen et al., 2009; Hughes et al., 2009; Starbird et al., 2010; Vieweg et al., 2010). Studies show that, during crises, Twitter has been used to spread news about casualties and damage, donation offers and requests, and alerts, including multimedia content such as videos and photos (Cameron et al., 2012; Imran et al., 2013a; Qu et al., 2011). Many studies show the significance of this online information for crisis response and management (Vieweg et al., 2014; Sakaki et al., 2010; Neubig et al., 2011). Moreover, it has been observed that these messages are usually communicated more quickly than disaster information shared via traditional channels such as news websites. For instance, the first tweet to report on the 2013 Westgate Mall attack was posted within a minute of the initial onslaught.¹ Given the importance of crisis-related messages for time-critical situational awareness, disaster-affected communities and professional responders may benefit from using an automatic system to extract relevant information from social media.
One benefit that encourages responding organizations to use social media data is the timeliness of the information, especially at the beginning of a crisis, when no other information sources are available (Tapia et al., 2013). Real-time insights into an ongoing situation therefore play an important role in enabling emergency responders to react rapidly. To identify informative, actionable, and tactical pieces of information in a growing stack of social media messages, and to inform decision-making processes as early as possible, messages need to be processed as soon as they arrive. Given the large volume of messages, we need to classify them into informational categories such as food needs, supply requests, financial support requests, and logistics, so that disaster-response professionals can quickly examine each bin to identify urgent needs.
Different approaches can be employed to filter and classify these online messages. For instance, many humanitarian organizations rely on the volunteers of the Digital Humanitarian Network (DHN)² to analyze messages one by one to find useful information for disaster response. However, given the amount of information that needs to be processed and the scarcity of volunteers, we would ideally like the messages to be categorized automatically, leaving volunteers free to spend their time on higher-order tasks. Despite advances in natural language processing, full automation is still not feasible.
In this paper, we propose to use a hybrid approach in which humans and machines work together to perform complex tasks (e.g., classification of tweets). Most automatic classifiers that achieve high accuracy on classification tasks are based on supervised machine learning techniques, in which humans provide a set of training samples consisting of positive and negative examples for each classification category. An example of a semi-automated system with characteristics similar to the DHN is AIDR (Artificial Intelligence for Disaster Response) (Imran et al., 2014).
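To make the supervised setting described above concrete, the following is a minimal sketch of how human-labeled messages can train an automatic classifier that then assigns new messages to informational categories. It is an illustration only, not the method used by AIDR or by this paper: the classifier (multinomial Naive Bayes with add-one smoothing), the function names `tokenize`, `train`, and `classify`, the category labels, and the example messages are all assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


def train(examples):
    """Build a Naive Bayes model from (text, label) pairs labeled by humans."""
    label_counts = Counter()            # number of training messages per category
    word_counts = defaultdict(Counter)  # per-category token frequencies
    vocab = set()                       # all tokens seen during training
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log-probability for the message."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the category.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical human-labeled training messages (not real data).
labeled_messages = [
    ("we need food and water urgently", "food_needs"),
    ("families need food in the shelter", "food_needs"),
    ("please donate money to the relief fund", "financial_support"),
    ("send donations to support victims", "financial_support"),
]

model = train(labeled_messages)
print(classify(model, "children need food"))
print(classify(model, "donate money to victims"))
```

In a hybrid setting, the labeled pairs would come from volunteers, and messages the classifier handles poorly could be routed back to them for additional labels; that routing step is not shown here.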