Abstract
To disambiguate authors who share the same name in large academic databases, a name entity disambiguation model based on meta-path heterogeneous network embedding is proposed. Using a public dataset from DBLP, a large online academic search system, feature attributes such as author information, title, and conference or journal name are first extracted from academic publications. Word embeddings of these attributes, generated with the word2vec tool, are then fed into a GRU network for training, and a PHNet matrix network is constructed on which random walks are performed to capture the relationships between nodes of different types. Finally, similar nodes are grouped to complete the name disambiguation. Experimental results show that the method achieves a precision of 0.865, a recall of 0.792, and an F1 score of 0.815. The meta-path-based heterogeneous network embedding model outperforms the comparison models on precision, recall, and the other reported metrics. The proposed model therefore shows good application prospects for improving disambiguation accuracy in large academic databases.
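The random-walk step mentioned in the abstract can be illustrated with a minimal sketch of a meta-path-guided walk over a toy heterogeneous graph. This is not the authors' PHNet implementation: the graph data, the single-letter type prefixes (P for paper, A for co-author, V for venue), and the function names `node_type` and `metapath_walk` are all assumptions made for illustration.

```python
import random

# Toy heterogeneous graph (hypothetical data): node -> list of neighbours.
# The node type is encoded by the prefix: "P" paper, "A" co-author, "V" venue.
GRAPH = {
    "P1": ["A1", "V1"],
    "P2": ["A1", "V2"],
    "P3": ["A2", "V1"],
    "A1": ["P1", "P2"],
    "A2": ["P3"],
    "V1": ["P1", "P3"],
    "V2": ["P2"],
}


def node_type(node):
    """Return the type letter of a node (an assumed naming convention)."""
    return node[0]


def metapath_walk(graph, start, metapath, walk_length, rng):
    """Walk from `start`, moving only to neighbours whose type matches
    the next symbol of a cyclic meta-path such as "PAP" or "PVP"."""
    assert node_type(start) == metapath[0], "start must match the meta-path"
    assert metapath[0] == metapath[-1], "meta-path must be cyclic"
    cycle = metapath[:-1]  # "PAP" repeats as P, A, P, A, ...
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        step += 1
        wanted = cycle[step % len(cycle)]
        candidates = [n for n in graph[walk[-1]] if node_type(n) == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk


if __name__ == "__main__":
    rng = random.Random(0)
    # Paper -> co-author -> paper walks link papers that share a co-author.
    print(metapath_walk(GRAPH, "P1", "PAP", 5, rng))
    # Paper -> venue -> paper walks link papers published in the same venue.
    print(metapath_walk(GRAPH, "P1", "PVP", 5, rng))
```

Walks like these are typically passed to a skip-gram model so that papers appearing in similar contexts receive nearby embeddings, which can then be grouped into per-author clusters. Whether the paper follows exactly this recipe is not stated in the abstract.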
Authors
WANG Jianxia (王建霞); ZHANG Yuxuan (张玉璇); XU Yunfeng (许云峰)
School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China
Source
Journal of Hebei University of Science and Technology (《河北科技大学学报》)
CAS
2020, No. 3, pp. 233-241 (9 pages)
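Funding
China Scholarship Council Local Cooperation Project (201808130283)
Ministry of Education of China Artificial Intelligence Collaborative Education Project (201801003011)
Hebei University of Science and Technology University-level Project (82/1182108)
Hebei University of Science and Technology Research Project on Haze and Air Pollution Prevention and Control (82/1182169)
Hebei Province Science and Technology Support Program (17210104D, 18210109D)
Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2015099)
Hebei Province High-level Talent Funding Project (A2016002015)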
Keywords
natural language processing
computer neural network
entity disambiguation
network embedding
heterogeneous network