With the development of deep learning, scholars have gradually recognized its advantages over traditional machine learning: stronger representational capability, higher accuracy, and broader applicability.
Ma et al. (2018) used a recurrent neural network (RNN) model for rumor detection on Weibo, modeling text vectors with long short-term memory (LSTM) to account for the contextual information of the text. Liu et al. (2017) employed a character-level convolutional neural network (CNN) that treats text as a character sequence, maps each character to a vector, and then extracts features with convolutional and pooling layers. D. Lin et al. (2019) utilized LSTM to capture sequential contextual features of content for learning the falsehood of information and used CNN to learn the relationship. Zhou et al. (2018) employed CNN to automatically construct rumor features and used a gated recurrent unit (GRU) to explore information between Weibo posts. Moreover, graph convolutional networks (GCN; Bai et al., 2021) and generative adversarial networks (Guo et al., 2021) have been effectively employed in rumor detection.
In deep learning models, different types of information differ in importance, so researchers have introduced attention mechanisms into rumor detection. To focus on the most informative content, T. Chen et al. (2018) introduced attention mechanisms into RNNs to capture implicit and explicit features from repetitive and variable Twitter information. Peng and Wang (2021) explored the temporal sequence background and sentiment polarity features of rumor lifecycles, utilizing a CNN model with spatial attention mechanisms for rumor detection and classification.
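To make the idea of weighting information by importance concrete, the following is a minimal sketch of dot-product attention pooling over a sequence of hidden states. It is not the implementation of any of the cited works; the function names, the toy vectors, and the query vector are all assumptions chosen for illustration, and a real model would learn the query and compute hidden states with an RNN or CNN.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot-product relevance to `query`
    and return (weights, weighted sum of the hidden states)."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(len(query))
    ]
    return weights, pooled


# Hypothetical 2-dimensional hidden states for three posts in a thread.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical query vector; in a trained model this would be learned.
query = [1.0, 1.0]

weights, pooled = attention_pool(states, query)
```

In this toy input the third state aligns best with the query, so it receives the largest weight and dominates the pooled representation that a downstream classifier would consume.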
However, the use of images to propagate rumors has become increasingly prevalent, making it difficult to identify rumors effectively from textual features alone. To tackle this issue, Weibo rumor detection has incorporated image data alongside textual features, which helps identify misinformation more effectively. Qian et al. (2021) employed a co-attention approach in which textual and visual features enhance each other, and fused the output information of every four layers of BERT with image information. Yang et al. (2019) proposed a dual-stream attention mechanism for target location perception, which can better acquire contextual information. Huang et al. (2023) modeled spatial and temporal structures to capture information dissemination and proposed a rumor detection method named the spatial–temporal structure neural network (STS-NN). The STS-NN model consists of three components: a spatial capturer, a temporal capturer, and an integrator. All three components share parameters, allowing them to work together efficiently to identify rumors based on information dissemination. Lv et al. (2023) introduced a transformer-based model that employs an end-to-end approach to fuse multimodal feature representations into the same data domain. The model effectively captures dependencies across multiple levels of multimodal content while mitigating the impact of multimodal heterogeneity.
Wan et al. (2023) developed a method that uses sliding intervals to intercept only the necessary data instead of processing the entire sequence. To address the hyper-parameter selection issues that arise from integrating multiple optimization objectives, they employed convex optimization techniques, avoiding the excessive computational cost of enumeration. Throughout training, detection time, accuracy, and stability were continuously adjusted and optimized as training objectives, enhancing the model's adaptability and generalizability. H. Li, Huang, et al. (2023) adopted bidirectional LSTM (Bi-LSTM) to extract user and text features and employed GCN to extract high-order propagation features. The complementary and alignment relationships between different features were also considered to achieve better fusion. S. Li, Wang, et al. (2023) utilized a dynamic graph attention network to encode temporal knowledge structures, together with an adaptive spatio-temporal and knowledge fusion network whose adaptive aggregation of knowledge information enables better integration of propagation structure information and knowledge structure information.
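The general idea of working on sliding intervals rather than on a full sequence can be sketched as follows. This is only an illustration of generic windowing, not the procedure of Wan et al. (2023); the function name, the `width` and `step` parameters, and the toy post sequence are assumptions.

```python
def sliding_intervals(events, width, step):
    """Yield consecutive windows of up to `width` items, advancing by `step`.

    A sequence shorter than `width` yields a single, shorter window.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = max(len(events) - width, 0)
    for start in range(0, last_start + 1, step):
        yield events[start:start + width]


# Hypothetical time-ordered post identifiers in one propagation thread.
posts = list(range(10))

# Each window could be scored by a detector, stopping early once it is confident.
windows = list(sliding_intervals(posts, width=4, step=3))
```

Because each window is bounded in size, the cost of scoring one interval does not grow with the length of the thread, which is the property that makes early detection on long propagation sequences practical.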