1. Introduction
With the rapid development of information and communication technology, people can easily share their opinions and get the latest news on social network platforms. Since user-generated content is not verified before being posted, readers cannot tell whether it is true or false. Unverified false messages are regarded as rumors, and they cause problems in many aspects of daily life. For example, people may receive messages that masquerade as coming from the government or a company and ask for personal information. People can only judge whether such messages are real by checking whether the text content is relevant. To distinguish real messages from false ones, rumor detection has become an increasingly important research topic in social media. Nowadays, third-party fact-checking services such as FactCheck.org and Snopes.com are used for message verification. These services usually rely on manual labeling, which requires considerable human effort. In the face of rapid information dissemination, they alone cannot verify messages in time.
There are numerous ways to disguise unverified false messages, or rumors, as real messages. Since it is very common for a social media post to contain both images and text, in this paper we define a rumor as a message in which the multimedia content and the surrounding text do not match. Our research problem is defined as follows: given a social media post with images and their surrounding text, determine whether there is a mismatch between the semantic information in the images and that in the surrounding text. In recent research, deep learning methods have been widely used to construct models and learn features for rumor detection. For example, the method proposed by Jin et al. (2017), which serves as the baseline in our experiments, uses an RNN with an attention mechanism (att-RNN) to fuse multimodal features, including text and images. It achieved an accuracy of 68.2% on the Twitter dataset of the MediaEval task. Its visual neuron attention mechanism gives each neuron different weights for different words. However, the relations between the visual features of images and the text features are not well addressed. To better address the relations between images and the surrounding text, we propose to improve rumor detection with image captioning and multi-cell RNNs with self-attention. Firstly, it is important to aggregate multimodal content into an effective feature that reveals the mismatch between text and images. Instead of simply adjusting the weights of different visual features with an attention mechanism driven by the RNN results of the text, we first translate images into the most relevant caption words as a more coherent feature representation. This helps to closely connect the semantic meanings of images and text. Secondly, we design a novel type of multi-cell bidirectional RNN that incorporates a self-attention mechanism to identify the more important features from different sources. Finally, different feature fusion approaches are used to improve the performance of rumor detection.
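To make the idea of self-attention followed by feature fusion more concrete, the following is a minimal sketch in plain Python. It is not the model used in this paper: the attention form (scaled dot-product without learned projections), the mean pooling, and fusion by concatenation are all assumptions chosen for illustration, and the names `self_attention_pool` and `fuse_concat` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention_pool(states):
    """Apply scaled dot-product self-attention to a sequence of vectors,
    then average the attended vectors into one fixed-size feature.

    Assumption: queries, keys, and values are the states themselves
    (no learned projection matrices), which keeps the sketch short.
    """
    d = len(states[0])
    scale = math.sqrt(d)
    attended = []
    for query in states:
        weights = softmax([dot(query, key) / scale for key in states])
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, states)) for i in range(d)]
        )
    n = len(attended)
    return [sum(vec[i] for vec in attended) / n for i in range(d)]


def fuse_concat(text_vec, caption_vec):
    """Fuse two feature vectors by concatenation (one possible fusion choice)."""
    return list(text_vec) + list(caption_vec)


# Toy inputs standing in for RNN outputs of the post text and of the
# generated image caption; the numbers are made up for illustration.
text_states = [[0.2, 0.1], [0.4, -0.3], [0.0, 0.5]]
caption_states = [[0.1, 0.1], [0.3, 0.2]]
fused = fuse_concat(self_attention_pool(text_states),
                    self_attention_pool(caption_states))
print(len(fused))  # length of the fused feature vector
```

In a trained model the fused vector would be passed to a classifier that predicts whether the text and the image caption match; that classifier is omitted here.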
The main contributions of this paper are as follows. Firstly, we propose a novel multimodal feature fusion approach to rumor detection based on an image captioning model that represents image semantics as textual descriptions. The sequence-to-sequence model can extract more meaningful descriptions from images than simple convolutional approaches. To the best of our knowledge, our method is the first to apply image captioning to extract image semantics for rumor detection. Secondly, instead of a single layer, which might not fully capture the relations among words, we design a novel way of stacking bidirectional RNNs called Multi-cell Bi-RNN, which adds more cells in each of the forward and backward directions to learn more deeply. The proposed approach obtains better performance than the baseline model, which shows its potential for rumor detection.
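One way to read the Multi-cell Bi-RNN idea is as a bidirectional RNN in which each direction runs a small stack of cells at every time step. The sketch below follows that reading in plain Python. It is an interpretation for illustration, not the architecture defined later in the paper: the simple tanh (Elman) cell, the random untrained weights, and the names `make_cell`, `run_direction`, and `multi_cell_birnn` are all assumptions.

```python
import math
import random


def make_cell(input_size, hidden_size, rng):
    """Create weights for one tanh RNN cell (random, untrained, for illustration)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_x": matrix(hidden_size, input_size),
        "W_h": matrix(hidden_size, hidden_size),
        "b": [0.0] * hidden_size,
    }


def cell_step(cell, x, h):
    """One recurrent step: h_new = tanh(W_x x + W_h h + b)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(wx_row, x))
            + sum(w * hi for w, hi in zip(wh_row, h))
            + bias
        )
        for wx_row, wh_row, bias in zip(cell["W_x"], cell["W_h"], cell["b"])
    ]


def run_direction(cells, sequence, hidden_size):
    """Run a stack of cells over a sequence in the given order.
    At each time step the input passes through every cell in turn."""
    states = [[0.0] * hidden_size for _ in cells]
    outputs = []
    for x in sequence:
        layer_input = x
        for i, cell in enumerate(cells):
            states[i] = cell_step(cell, layer_input, states[i])
            layer_input = states[i]
        outputs.append(layer_input)
    return outputs


def multi_cell_birnn(sequence, input_size, hidden_size, num_cells, seed=0):
    """Bidirectional RNN with `num_cells` stacked cells per direction.
    Returns one vector of size 2 * hidden_size per time step."""
    rng = random.Random(seed)

    def stack():
        return [
            make_cell(input_size if i == 0 else hidden_size, hidden_size, rng)
            for i in range(num_cells)
        ]

    forward_cells, backward_cells = stack(), stack()
    forward = run_direction(forward_cells, sequence, hidden_size)
    # Process the reversed sequence, then reverse the outputs so that
    # position t of both directions refers to the same input token.
    backward = run_direction(backward_cells, sequence[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Toy word vectors; the values are made up for illustration.
words = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
outputs = multi_cell_birnn(words, input_size=2, hidden_size=4, num_cells=2)
print(len(outputs), len(outputs[0]))  # time steps, features per step
```

Setting `num_cells=1` reduces this to an ordinary single-layer bidirectional RNN, which makes the contrast with the single-layer case explicit.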
The rest of the paper is structured as follows. In Sec. 2, we review related research. The proposed method is presented and discussed in Sec. 3. In Sec. 4, we show the experimental results and compare them with existing methods. Finally, we conclude the paper in Sec. 5.