An Overlapping Microblog Community Detection Algorithm via Core Tags
Cited by: 7
Abstract: Traditional microblog community detection algorithms produce communities with low cohesion and offer no control over the degree of overlap. To address these problems, we propose Tag Cut, a top-down (divisive) strategy for overlapping microblog community detection based on core tags. First, tags are weighted using their co-occurrence relationships and inverse user frequency, and each user's tag set is expanded according to the intra- and inter-correlations between tags. Next, the users in the overall community who hold a given tag are extracted as a temporary group, and an evaluation function assesses the quality of the resulting partition. Finally, the most suitable core tag is selected, and the distance between its group and the other groups determines whether that group becomes a new group or is merged into an existing one. These steps are iterated until the stopping requirements are met. Each community produced by the algorithm consists of several groups that each have a core tag, and the partition is refined using users' declared and implicit interests, the follow patterns among users, and the practical usefulness of the result. Experiments on real data show that the method yields highly cohesive communities with a controllable degree of overlap, and that the resulting communities are meaningful in practice.
Source: Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2017, No. 4, pp. 769-776 (8 pages)
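Funding: National Natural Science Foundation of China (No.61363058, No.61163039); Gansu Province Youth Science and Technology Fund (No.145RJYA259, No.1606RJYA269); Natural Science Research Fund of Gansu Province (No.145RJZA232); Open Fund of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (No.IIP2014-4)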
Keywords: microblog network; overlapping community detection; core tag; user attention relationship; tag cut
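
The abstract names inverse user frequency as one ingredient of the tag weights and describes extracting the users who hold a given tag as a temporary group. The sketch below illustrates those two steps only. It assumes an IDF-style definition, log(N / n_t), where N is the number of users and n_t the number of users holding tag t; the paper's actual formula, its co-occurrence term, and its evaluation function are not given in this record. The names `inverse_user_frequency`, `users_with_tag`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def inverse_user_frequency(user_tags):
    """Map each tag to log(N / n_t).

    Assumption: an IDF-style weight, not necessarily the paper's formula.
    N is the number of users, n_t the number of users holding tag t.
    """
    n_users = len(user_tags)
    # Count each tag at most once per user.
    holders = Counter(tag for tags in user_tags.values() for tag in set(tags))
    return {tag: math.log(n_users / n) for tag, n in holders.items()}


def users_with_tag(user_tags, tag):
    """Return the temporary group: every user who holds the given tag."""
    return {user for user, tags in user_tags.items() if tag in tags}


# Hypothetical toy data: user id -> declared tags.
user_tags = {
    "u1": {"music", "travel"},
    "u2": {"music", "coding"},
    "u3": {"music"},
    "u4": {"coding", "travel"},
}

weights = inverse_user_frequency(user_tags)
group = users_with_tag(user_tags, "coding")
```

Under this assumed definition, a tag held by most users (here `music`) gets a smaller weight than a rarer one (`coding`), so widely shared tags are less likely to be picked as core tags.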