Abstract
This paper proposes an overlapping community detection algorithm for microblog users that combines tag mean partition distance with social structure. First, starting from information theory and the notion of distance, a community pre-partition algorithm based on the mean partition distance of core tags is defined. Next, a structure attribute vector is defined from users' following and follower relationships and used to compute the user structure dissimilarity. The core-tag mean partition distance and the user structure dissimilarity are then combined with adjustable weights to obtain the comprehensive division dissimilarity. Finally, in each iteration, the group partitioned by the tag with the lowest comprehensive division dissimilarity is taken as a new community. Experiments show that the method can identify overlapping communities and has practical application value.
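The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration only: variation of information stands in for the tag partition distance, mean pairwise Jaccard distance over followee sets stands in for the structure dissimilarity, and `alpha` is a hypothetical weight. All function names and the toy data are likewise hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def variation_of_information(a, b, n):
    """Information-theoretic distance between the binary partitions
    {a, rest} and {b, rest} of users 0..n-1 (assumed stand-in metric)."""
    everyone = set(range(n))
    vi = 0.0
    for x in (a, everyone - a):
        for y in (b, everyone - b):
            joint = len(x & y) / n
            if joint > 0:
                vi -= joint * (math.log(joint / (len(x) / n))
                               + math.log(joint / (len(y) / n)))
    return vi


def mean_partition_distance(tag, tags, n):
    """Average distance from one tag's partition to every other tag's,
    scaled to [0, 1] by the binary-partition upper bound 2*ln(2)."""
    others = [t for t in tags if t != tag]
    if not others:
        return 0.0
    total = sum(variation_of_information(tags[tag], tags[t], n) for t in others)
    return total / len(others) / (2 * math.log(2))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as fully dissimilar."""
    union = a | b
    return 1.0 if not union else 1.0 - len(a & b) / len(union)


def structure_dissimilarity(group, follows):
    """Mean pairwise Jaccard distance between members' followee sets
    (assumed stand-in); groups with fewer than two members score 0.0."""
    pairs = list(combinations(sorted(group), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(follows.get(u, set()), follows.get(v, set()))
               for u, v in pairs) / len(pairs)


def pick_community(tags, follows, n, alpha=0.5):
    """Return (tag, members, score) for the tag with the lowest weighted
    combination of tag distance and structure dissimilarity."""
    def score(tag):
        return (alpha * mean_partition_distance(tag, tags, n)
                + (1 - alpha) * structure_dissimilarity(tags[tag], follows))
    best = min(tags, key=score)
    return best, tags[best], score(best)


# Toy data: user 0 carries two tags, so the resulting communities may overlap.
tags = {"ml": {0, 1, 2}, "travel": {3, 4}, "news": {0, 3}}
follows = {0: {5}, 1: {5}, 2: {5}, 3: {0}, 4: {1}}
print(pick_community(tags, follows, n=6, alpha=0.5))
```

Because each tag induces its own user group and a user may carry several tags, repeating this selection over the remaining tags could yield overlapping communities; how the paper iterates and terminates is not stated in the abstract.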
Authors
马慧芳
陈海波
赵卫中
邴睿
黄乐乐
MA Hui-fang; CHEN Hai-bo; ZHAO Wei-zhong; BING Rui; HUANG Le-le (Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730070, China; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; College of Information Engineering, Xiangtan University, Xiangtan, Hunan 411105, China)
Source
《电子学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2018, No. 11, pp. 2612-2618 (7 pages)
Acta Electronica Sinica
Funding
National Natural Science Foundation of China (No. 61762078, No. 61762080)
Research Project of the Guangxi Key Laboratory of Trusted Software (No. kx201705)
Keywords
overlapping community detection
core tag
mean partition distance (MPD)
structure dissimilarity
comprehensive division dissimilarity (CDS)