Abstract
[Objective] This paper aims to improve the accuracy and timeliness of rumor event detection on Weibo. [Methods] We propose a Weibo rumor event detection method based on a hierarchical semantic feature learning model (BCGA). First, we extract the semantic features of each individual text in an event with the pre-trained BERT model. Second, we dynamically partition the event propagation data by time domain and use a convolutional neural network to mine the semantic correlation features of the text set in each time domain. Third, we feed the semantic correlation features of each time domain into a deep bidirectional gated recurrent neural network to learn the deep semantic temporal features of the event propagation process. Finally, we integrate an attention mechanism so that the model focuses on the parts of the semantic temporal features that carry rumor characteristics. [Results] Experiments on the public Weibo dataset show that the model reaches a detection accuracy of 95.39% with a detection delay within 12 hours. [Limitations] The model requires a certain number of reposts and comments, and its detection performance is less pronounced for events that are not popular enough. [Conclusions] The hierarchical semantic feature learning model realizes a learning process from local to global semantics and improves the performance of Weibo rumor event detection.
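The time-domain grouping and attention steps named in the abstract can be illustrated with a small sketch. The abstract does not state the partitioning rule or the exact attention form, so the code below assumes equal-width time windows and dot-product attention against a hypothetical query vector. The names `split_into_intervals` and `attention_pool` are illustrative only and are not the authors' BCGA implementation; the BERT, CNN, and bidirectional GRU layers are omitted.

```python
import math


def split_into_intervals(timestamps, num_intervals):
    """Assign each post index to one of `num_intervals` equal-width time windows.

    Assumption: equal-width windows stand in for the paper's dynamic partitioning,
    whose exact rule is not given in the abstract.
    """
    start, end = min(timestamps), max(timestamps)
    # Fall back to a width of 1.0 when all posts share the same timestamp.
    width = (end - start) / num_intervals or 1.0
    buckets = [[] for _ in range(num_intervals)]
    for idx, ts in enumerate(timestamps):
        # Clamp so the latest post lands in the last window.
        k = min(int((ts - start) / width), num_intervals - 1)
        buckets[k].append(idx)
    return buckets


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Weighted sum of per-window feature vectors.

    Weights are softmax(query . feature); `query` is a hypothetical learned
    vector, fixed here for illustration.
    """
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Usage: six posts (timestamps in hours) grouped into two windows.
buckets = split_into_intervals([0, 1, 2, 10, 11, 20], num_intervals=2)
# Two toy window-level feature vectors; a zero query gives uniform weights.
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], query=[0.0, 0.0])
```

In a full model the window-level vectors would come from the recurrent layer and the query would be trained, so windows carrying stronger rumor signals would receive larger weights.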
Authors
Huang Xuejian (黄学坚)
Ma Tinghuai (马廷淮)
Wang Gensheng (王根生)
(College of Software, Nanjing University of Information Science & Technology, Nanjing 210044, China; VR College of Modern Industry, Jiangxi University of Finance and Economics, Nanchang 330013, China; College of Humanities, Jiangxi University of Finance and Economics, Nanchang 330013, China)
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
CSSCI
CSCD
PKU Core Journal (北大核心)
2023, Issue 5, pp. 81-91 (11 pages)
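Funding
This work is one of the research outcomes of the following projects:
National Key Research and Development Program of China (Grant No. 2021YFE0104400)
National Natural Science Foundation of China (Grant No. 72061015)
Science and Technology Project of the Jiangxi Provincial Department of Education (Grant No. GJJ200539)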
Keywords
Rumor Detection
Deep Learning
Semantic Features
Temporal Data
Hierarchical Semantic