Abstract
Burst topic detection in microblogs is an important research area in online public opinion analysis, and detecting burst topics quickly and accurately from massive microblog data remains an urgent problem. To address the incomplete extraction of feature words from microblogs, a burst topic detection method based on the co-occurrence of burst words is proposed. First, candidate burst words are extracted according to document frequency and word frequency. Then, burst words are extracted according to microblog influence, text information, and the growth rate of word weight. Finally, burst topic detection is completed through the co-occurrence of burst words. Experimental results show that the burst word co-occurrence method improves the precision, recall, and F value of microblog burst topic detection.
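The three steps in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives no formulas, so the growth-ratio filter, the thresholds, and the function names `candidate_burst_words` and `cooccurrence_topics` are all assumptions made for illustration. The second step (weighting by microblog influence and text information) is omitted because the record does not describe it; here a simple document-frequency growth test stands in for both extraction stages, and topics are taken to be connected components of the burst word co-occurrence graph.

```python
from collections import Counter, defaultdict
from itertools import combinations


def candidate_burst_words(current_docs, previous_docs, min_doc_freq=2, min_growth=2.0):
    """Assumed criterion: keep words that are frequent in the current time
    window and whose document frequency grew versus the previous window."""
    cur = Counter(w for doc in current_docs for w in set(doc))
    prev = Counter(w for doc in previous_docs for w in set(doc))
    return {
        w for w, df in cur.items()
        # +1 smoothing avoids division by zero for words unseen before
        if df >= min_doc_freq and df / (prev[w] + 1) >= min_growth
    }


def cooccurrence_topics(docs, burst_words, min_cooccur=2):
    """Link burst words that share at least min_cooccur posts, then return
    the connected components of that graph as topics (an assumption)."""
    pair_counts = Counter()
    for doc in docs:
        present = sorted(set(doc) & burst_words)
        pair_counts.update(combinations(present, 2))

    graph = defaultdict(set)
    for (a, b), n in pair_counts.items():
        if n >= min_cooccur:
            graph[a].add(b)
            graph[b].add(a)

    topics, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        topics.append(component)
    return topics
```

Each post is assumed to be already segmented into a list of words; calling `cooccurrence_topics(current, candidate_burst_words(current, previous))` on two adjacent time windows returns a list of word sets, one per detected topic.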
Author
魏景璇
WEI Jing-xuan (Modern Educational Technology Center, Binzhou Polytechnic, Binzhou 256603, China)
Source
Journal of Binzhou University (《滨州学院学报》)
2020, No. 4, pp. 74-79 (6 pages)
Funding
Supported by the Natural Science Foundation of Shandong Province (ZR2014FL010).
Keywords
burst topics
co-occurrence of burst words
candidate burst words
burst words