Abstract
[Objective] This study examines whether content-based deep learning rumor detection models actually identify the key semantics of rumors. [Methods] Using Chinese and English benchmark datasets for rumor detection, we apply two interpretability tools, LIME (based on local surrogate models) and SHAP (based on cooperative game theory), to analyze the key features identified by the BERT model and to judge whether those features reflect the characteristics of rumors. [Results] The key features computed by the interpretability tools differ considerably across models and datasets, and no semantic relationship between the features the model treats as important and rumors can be discerned. [Limitations] The numbers of datasets and models examined in this study are very limited. [Conclusion] Deep learning-based rumor detection models merely fit the features of the training set and lack sufficient generalization and interpretability for diverse real-world scenarios.
Authors
贺国秀
任佳渝
李宗耀
林晨曦
蔚海燕
He Guoxiu; Ren Jiayu; Li Zongyao; Lin Chenxi; Yu Haiyan (Faculty of Economics and Management, East China Normal University, Shanghai 200062, China)
Source
《数据分析与知识发现》
EI
CSSCI
CSCD
PKU Core (Peking University core journal list)
2024, Issue 4, pp. 1-13 (13 pages)
Data Analysis and Knowledge Discovery
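Funding
National Natural Science Foundation of China (Grant No. 72204087)
Shanghai Philosophy and Social Science Planning Youth Project (Grant No. 2022ETQ001)
One of the research outputs of a project supported by the Fundamental Research Funds for the Central Universities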