Abstract
[Objective] This paper combines link prediction with machine learning to propose a new method for recommending future scientific research collaborations, with the aim of improving on the recommendation precision of methods based on link prediction alone. [Methods] We built a weighted author collaboration network, used different link prediction indices as input features, and trained a classifier with the Extremely Randomized Trees (ET) algorithm. We then used a traversal algorithm to find the optimal weight combination for linearly combining the classification results, and took the top-ranked predictions as the collaboration recommendations. [Results] We tested the method on SCI papers in nanotechnology published from 2008 to 2010. In recommending collaborations between cities, the improved ET method outperformed existing methods and achieved a good recommendation success rate. The method was also less affected by factors such as network structure, which makes it applicable in a wider range of settings. [Limitations] Scientific research collaboration is influenced by motivation, geography, language, and many other factors. The weighted author collaboration network does not capture multiple authors from the same city or institution on a single paper, nor does it reflect the factors above. [Conclusions] The improved algorithm produces more accurate collaboration recommendations than any single prediction index, and offers a reference for extending the approach to finer-grained levels such as universities, other institutions, and individuals.
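The pipeline described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The toy graph, the `future_edges` set, the choice of five indices (common neighbours, Jaccard, Adamic-Adar, resource allocation, preferential attachment), the weight grid, and all function names are assumptions made for illustration. Two simplifications are worth stating plainly: the graph here is unweighted, whereas the paper uses a weighted network, and the exhaustive weight search combines normalised index scores directly as a stand-in for the classifier outputs. In a fuller version the feature rows could be passed to a classifier such as scikit-learn's `ExtraTreesClassifier` before the combination step.

```python
from itertools import combinations, product
from math import log

# Hypothetical training-period collaborations between cities A..F.
train_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
               ("C", "D"), ("D", "E"), ("E", "F")]
# Hypothetical collaborations that appear in a later period.
future_edges = {frozenset(("A", "D")), frozenset(("D", "F"))}

def build_adj(edges):
    """Adjacency sets for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def candidate_pairs(adj):
    """All node pairs that are not yet connected."""
    return [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]

def link_features(adj, u, v):
    """Five classical link prediction indices for the pair (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "cn": float(len(common)),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbour always has degree >= 2, so log() is positive.
        "aa": sum(1.0 / log(len(adj[z])) for z in common),
        "ra": sum(1.0 / len(adj[z]) for z in common),
        "pa": float(len(adj[u]) * len(adj[v])),
    }

def normalise(rows):
    """Min-max scale each index to [0, 1] across all candidate pairs."""
    out = [dict() for _ in rows]
    for name in rows[0]:
        vals = [r[name] for r in rows]
        lo, hi = min(vals), max(vals)
        for o, val in zip(out, vals):
            o[name] = (val - lo) / (hi - lo) if hi > lo else 0.0
    return out

def top_k(pairs, rows, weights, k):
    """Rank pairs by the weighted linear combination and keep the top k."""
    scored = [(sum(weights[n] * r[n] for n in weights), p)
              for p, r in zip(pairs, rows)]
    scored.sort(key=lambda t: -t[0])
    return [p for _, p in scored[:k]]

def precision_at_k(recommended, future):
    """Share of recommended pairs that really collaborate later."""
    hits = sum(1 for p in recommended if frozenset(p) in future)
    return hits / len(recommended)

def search_weights(pairs, rows, future, k, steps=4):
    """Traverse every weight vector on a grid whose entries sum to 1."""
    names = list(rows[0])
    best_precision, best_weights = -1.0, None
    for combo in product(range(steps + 1), repeat=len(names)):
        if sum(combo) != steps:
            continue
        weights = {n: c / steps for n, c in zip(names, combo)}
        prec = precision_at_k(top_k(pairs, rows, weights, k), future)
        if prec > best_precision:
            best_precision, best_weights = prec, weights
    return best_precision, best_weights

adj = build_adj(train_edges)
pairs = candidate_pairs(adj)
rows = normalise([link_features(adj, u, v) for u, v in pairs])
best_precision, best_weights = search_weights(pairs, rows, future_edges, k=2)
recommended = top_k(pairs, rows, best_weights, k=2)
print(best_precision, best_weights, recommended)
```

In this sketch the weights are chosen and evaluated on the same toy data, which overstates performance. A realistic setup would select the weights on one time window and measure precision on a later one.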
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
Indexed in: CSSCI, CSCD
2017, Issue 4, pp. 38-45 (8 pages)
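Funding
This work is one of the research outcomes of the National Natural Science Foundation of China General Program project "Research on Analysis Methods and Applications of Scientific Structure Characteristics and Their Evolutionary Dynamics" (Grant No. 71173211).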
Keywords
Scientific Research Collaboration Network
Link Prediction
Machine Learning
Random Forest
Extremely Randomized Trees
Recommendation