Rumor Detection on Social Platforms Using Multi-modal Multi-layer Attention Networks

doi:10.12005/orms.2025.0303

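Abstract

Abstract (Chinese version): Online social networking platforms have become an important channel for obtaining information, but the rumors that spread on them mislead users and cause confusion, undermining social stability. The emergence of large language models has further lowered the cost of generating and fabricating information, making rumors easier to produce. Reducing the impact of rumors requires continued development of rumor detection techniques, with a focus on text and images in online social networks that may carry deceptive information. Previous studies have paid little attention to the writing style of rumors, to image tampering, and to the deeper associations between the text and image modalities. This study therefore proposes a rumor detection framework called the multi-modal multi-layer attention network. The framework uses a graph convolutional network to capture the writing style features of rumor and non-rumor text. In addition to extracting image semantic features in the conventional way, it applies error level analysis to identify tampered regions of images. Inspired by the Transformer encoder, the model learns high-dimensional cross-modal features through a multi-modal feature encoder and feeds them into a classifier for rumor detection. Finally, the study verifies the effectiveness of the model on public datasets from social platforms in China and abroad, and analyzes the model.

Keywords (Chinese version): public opinion management, deep learning, rumor detection, social platforms, multi-modal, attention mechanism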

Abstract: In today’s rapidly evolving information age, social media has become an indispensable platform for disseminating information to a broad and diverse audience. However, the surge of content on these platforms has also had negative consequences, particularly the confusion and misinformation caused by rumors. Posts on these platforms often combine text and images, and this multimodal nature makes it difficult for users to judge the authenticity of information, so rumors spread widely and are readily believed, which threatens social stability. The emergence of large language models such as ChatGPT has significantly lowered the barriers to generating and spreading information, making rumors easier to create. There is therefore a pressing need to keep advancing rumor detection technology to mitigate the harm that rumors cause. Existing rumor detection methods have focused primarily on extracting features from text and images. However, the writing style of rumors, the possibility of image tampering, and the deeper relationships between modalities remain critical areas that need attention. This study addresses these challenges with a deep learning framework called the multi-modal multi-layer attention network (MMAN). The framework integrates multiple data modalities and uses multi-layer attention mechanisms to uncover the complex patterns in deceptive content. The goal is to improve the accuracy and efficiency of rumor detection systems, thereby reducing the harm that misinformation does to individuals and society.
This study constructs an MMAN framework for rumor detection and analyzes it along several dimensions. The model uses the TF-IDF and PMI algorithms to build a text segment-word network, and then employs a graph convolutional network to capture writing style features associated with rumors and non-rumors. In addition to extracting image semantic features with conventional methods, it applies error level analysis to detect tampered regions of images and extract the corresponding features. Inspired by Transformer encoders, the study constructs multi-modal feature encoders to learn high-dimensional features across modalities. The model is trained with the AdamW optimizer, combined with early stopping to limit computational cost. Hyperparameters are tuned by grid search to find the combination that gives the best detection accuracy on rumor posts. The model’s performance is then validated on datasets from two major social platforms and compared against baseline models. The study also visualizes the attention weight matrices at the end of the text and image feature extraction sub-networks to help interpret the model. t-SNE dimensionality reduction is used to visualize the feature sequences output by the core modules, allowing a detailed analysis of what each module contributes. Finally, the model’s robustness is evaluated by introducing noisy data and combining original data with noisy data from different modalities, to assess its resilience against interference.
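The graph construction step described above can be sketched as follows. This is a minimal illustration, assuming a common recipe for text graph convolutional networks: segment-word edges weighted by TF-IDF and word-word edges weighted by positive PMI over sliding windows. The function name `build_text_word_graph`, the window size, the whitespace tokenizer, and the IDF smoothing are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def build_text_word_graph(docs, window=3):
    """Return {(node_a, node_b): weight} for a text segment-word graph.

    Segment nodes are ("doc", i); word nodes are ("word", w).
    Hypothetical helper: tokenization and smoothing are illustrative choices.
    """
    tokenized = [d.split() for d in docs]
    n_docs = len(tokenized)

    # Document frequency of each word, used for the IDF term.
    df = Counter(w for toks in tokenized for w in set(toks))

    edges = {}
    # Segment-word edges weighted by TF-IDF.
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        for w, c in tf.items():
            idf = math.log(n_docs / df[w]) + 1.0  # +1 keeps weights positive
            edges[(("doc", i), ("word", w))] = (c / len(toks)) * idf

    # Sliding windows supply the co-occurrence statistics for PMI.
    windows = []
    for toks in tokenized:
        if len(toks) <= window:
            windows.append(toks)
        else:
            windows.extend(toks[j:j + window]
                           for j in range(len(toks) - window + 1))
    n_win = len(windows)

    word_count = Counter()
    pair_count = Counter()
    for win in windows:
        uniq = sorted(set(win))  # sorted so each pair has one canonical order
        word_count.update(uniq)
        pair_count.update(combinations(uniq, 2))

    # Word-word edges weighted by PMI; only positive values are kept.
    for (a, b), c in pair_count.items():
        pmi = math.log(c * n_win / (word_count[a] * word_count[b]))
        if pmi > 0:
            edges[(("word", a), ("word", b))] = pmi
    return edges
```

The weighted edges returned here would populate the adjacency matrix that a graph convolutional network consumes. For Chinese posts, a word segmenter would need to replace the whitespace split.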
This study provides a viable deep learning approach for rumor detection, developing and validating a model that outperforms baseline models. The experimental results show that the model achieves high accuracy and efficiency in detecting rumors on two major social media platforms, both domestic and international. Ablation experiments, conducted by selectively removing modules of the model, verify the contribution of each module in handling different data types and indicate that the model generalizes well across social platforms. The robustness tests show that the model has some resistance to interference, but its performance declines significantly on noisy text data. This decline is attributed to the model’s focus on rumor/non-rumor texts on social media platforms. In terms of application, deploying the model in a real-time rumor detection system has significant potential. It can support social media regulation by providing users with timely and accurate rumor alerts, thereby helping to curb the spread of misinformation.
This study offers a potential pathway for rumor detection, explores advanced feature extraction techniques in particular, and leaves room to further optimize the model’s performance and robustness. The text feature extraction part of the model may be overly focused on the specific domain of rumor detection. Introducing pre-trained models in the future could therefore improve its generalization ability and its resistance to textual noise. In the Weibo dataset, many images may not be directly related to the text content of posts, which could lead to poor image feature extraction in the initial stages. More sophisticated feature extraction methods could therefore be considered to obtain more effective image features from the outset. We extend our heartfelt gratitude for the invaluable data sources used in this study and for the pioneering contributions in the fields of rumor detection and deep learning. We also sincerely thank the expert reviewers and editors for their meticulous efforts, which have significantly improved the quality and rigor of this research.

Key words: opinion management, deep learning, rumor detection, social platforms, multi-modal, attention mechanism

CLC number: TP391

Cite this article: ZHANG Yaozeng, MA Jing. Rumor Detection on Social Platforms Using Multi-modal Multi-layer Attention Networks[J]. Operations Research and Management Science, 2025, 34(10): 17-23.

