A New Social Media Topic Mining Method Based on Co-word Network

Cited by: 11
Abstract: In-depth mining of the text data in social media supports effective subsequent spatio-temporal analysis. This paper proposes a new social media topic mining method based on a co-word network and community detection. The method uses term frequency-inverse document frequency (TF-IDF) analysis to automatically select the keywords related to the topics. A co-word network is then built with microblogs as nodes, according to whether two microblogs contain the same keywords, and the Louvain community detection algorithm is applied to this network to group the microblogs into topic clusters. The method is unsupervised and does not require the number of clusters to be specified. Experiments show that it outperforms the commonly used latent Dirichlet allocation (LDA) model in both precision and recall. As a case study, the method is applied to keyword-matched microblogs collected during the 2012 Beijing rainstorm, and the dataset is mined and analyzed in time and space; the results demonstrate that the method is effective in real-world applications.
Authors: WANG Yandong; FU Xiaokang; LI Mengmeng (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China; Faculty of Geomatics, East China University of Technology, Nanchang 330013, China)
Source: Geomatics and Information Science of Wuhan University (《武汉大学学报(信息科学版)》); indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 12, pp. 2287-2294 (8 pages)
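Funding: National Key Research and Development Program of China (2016YFB0501403); National Natural Science Foundation of China (41271399); Special Fund for Public Welfare Industry Research in Surveying, Mapping and Geoinformation (201512015)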
Keywords: co-word network; social media; Louvain community detection; topic mining
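
The pipeline outlined in the abstract (TF-IDF keyword selection, a co-word network with microblogs as nodes, then community detection) can be sketched roughly as follows. This is only a minimal illustration under several assumptions and is not the authors' implementation: the sample posts, the function names, the per-post `top_k` cutoff, and the edge weight (number of shared keywords) are all hypothetical. To keep the sketch dependency-free, the grouping step uses connected components as a stand-in for the Louvain algorithm; in practice a Louvain implementation, for example `louvain_communities` in NetworkX, would take its place.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest TF-IDF tokens of each tokenized post (assumed cutoff)."""
    n_docs = len(docs)
    doc_freq = Counter(token for doc in docs for token in set(doc))
    keywords = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            token: (count / len(doc)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(set(ranked[:top_k]))
    return keywords


def build_coword_network(keywords):
    """Link two posts that share a keyword; weight = number of shared keywords."""
    edges = {}
    for i, j in combinations(range(len(keywords)), 2):
        shared = len(keywords[i] & keywords[j])
        if shared:
            edges[(i, j)] = shared
    return edges


def connected_components(n_nodes, edges):
    """Stand-in grouping step (NOT Louvain): connected components via union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = defaultdict(set)
    for node in range(n_nodes):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=min)


# Hypothetical, already-tokenized posts (not data from the paper).
posts = [
    ["rain", "flood", "subway", "closed"],
    ["flood", "subway", "station", "water"],
    ["airport", "flight", "delayed", "rain"],
    ["flight", "delayed", "airport", "stranded"],
]

keywords = tfidf_keywords(posts)
edges = build_coword_network(keywords)
groups = connected_components(len(posts), edges)
print(edges)   # {(0, 1): 1, (2, 3): 2}
print(groups)  # [{0, 1}, {2, 3}]
```

Connected components merge any posts joined by a chain of shared keywords, so on a real dataset they would tend to collapse into one large group. A modularity-based method such as Louvain splits densely connected regions apart instead, and it does so without a preset number of clusters, which is the advantage the abstract highlights.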