Abstract
Traditional methods that compute the similarity of microblog users from post content and the number of common friends are prone to large potential errors, while similarity models based on users' multi-source background information are computationally expensive and ignore users' interests. To address both problems, this paper proposes a comprehensive similarity calculation method that combines user interests and background information (BIBS). The method first extracts a user's interests from the user's tags; when tags are missing, it obtains the user's interests indirectly by clustering the important users in the user's following network, and then computes the interest similarity between users. Next, it computes background similarity from background attributes such as gender, age and location, so that the most similar users are mined hierarchically. Finally, experiments were conducted on Sina Weibo data. The results show that, compared with the microblog user recommendation algorithm based on multi-source information similarity (MISUR), the proposed method improves accuracy, recall and F-measure by 8.1%, 16.7% and 13.6% respectively while taking less time, which demonstrates the effectiveness and accuracy of BIBS.
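The two-stage idea in the abstract (interest similarity first, background similarity second) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the `User` fields, the use of Jaccard overlap for tag similarity, the equal weighting of gender, age and location, the `max_age_gap` and `shortlist_size` parameters, and the sample users. The fallback for missing tags (clustering important users in the following network) is not modeled; users without tags simply score 0.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical record; the paper's actual data schema is not given in the abstract.
    name: str
    tags: set = field(default_factory=set)
    gender: str = ""
    age: int = 0
    location: str = ""


def interest_similarity(a, b):
    """Jaccard overlap of tag sets (an assumed measure, not the paper's)."""
    if not a.tags or not b.tags:
        return 0.0  # the clustering fallback for missing tags is omitted here
    return len(a.tags & b.tags) / len(a.tags | b.tags)


def background_similarity(a, b, max_age_gap=50):
    """Equal-weight average of gender, age and location agreement (assumed weights)."""
    gender = 1.0 if a.gender == b.gender else 0.0
    age = max(0.0, 1.0 - abs(a.age - b.age) / max_age_gap)
    location = 1.0 if a.location == b.location else 0.0
    return (gender + age + location) / 3.0


def most_similar(target, candidates, shortlist_size=2):
    """Stage 1: shortlist by interest. Stage 2: pick the best background match."""
    shortlist = sorted(
        candidates,
        key=lambda u: interest_similarity(target, u),
        reverse=True,
    )[:shortlist_size]
    return max(shortlist, key=lambda u: background_similarity(target, u))


# Made-up sample users, for illustration only.
target = User("target", {"music", "travel", "food"}, "F", 25, "Chongqing")
candidates = [
    User("a", {"music", "travel"}, "M", 40, "Beijing"),
    User("b", {"music", "food", "film"}, "F", 27, "Chongqing"),
    User("c", {"sports"}, "F", 25, "Chongqing"),
]
best = most_similar(target, candidates)
print(best.name)
```

In this sketch, a user whose background matches perfectly but who shares no interests never reaches the second stage, which is one simple way to read "hierarchically" in the abstract.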
Authors
黄贤英
阳安志
刘小洋
刘广峰
Huang Xianying; Yang Anzhi; Liu Xiaoyang; Liu Guangfeng (College of Computer Science & Engineering, Chongqing University of Technology, Chongqing 400054, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2020, No. 1, pp. 66-70, 106 (6 pages in total)
Application Research of Computers
Funding
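Humanities and Social Sciences Research Project of the Chongqing Municipal Education Commission (17SKG144, 18SKGH110)
Youth Fund for Humanities and Social Sciences of the Ministry of Education of China (16YJC860010)
National Social Science Fund of China (17XXW004)
2018 Technology Innovation and Application Demonstration Project of the Chongqing Science and Technology Commission (cstc2018jscx-msybX0049)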
Keywords
Microblog
interest
user clustering
similarity calculation