Abstract
To address the problem of computing sentence semantic similarity, a multi-feature fusion method is proposed that considers both the structural and the semantic information of sentences. Word-form, word-order, and sentence-length features are extracted and weighted using the analytic hierarchy process (AHP) to compute structural similarity. Semantic distance is defined by the shortest path in an ontology graph, and sentence semantic similarity is computed from this distance. The structural and semantic similarities are then combined by feature weighting to form the multi-feature fusion method. Experimental results show that the method achieves an F-measure of 72.5%, which is 12% higher than that of the traditional cosine similarity and keyword-based similarity algorithms.
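The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the Dice overlap for word form, the adjacent-inversion count for word order, the length ratio, the row geometric-mean approximation of AHP weights, the mapping `1 / (1 + distance)` from shortest-path distance to similarity, the toy `ONTOLOGY` graph, the pairwise comparison matrix, and the fusion weight `alpha` are all hypothetical and are not taken from the paper.

```python
from collections import deque
from math import prod

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    gm = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(gm)
    return [g / total for g in gm]

def word_form_sim(a, b):
    """Dice overlap of the two token sets (assumed formula)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def word_order_sim(a, b):
    """1 minus the share of adjacent shared-word pairs whose order flips (assumed)."""
    in_b = set(b)
    shared = [w for w in dict.fromkeys(a) if w in in_b]
    if len(shared) < 2:
        return float(len(shared))
    pos_b = [b.index(w) for w in shared]
    inversions = sum(1 for x, y in zip(pos_b, pos_b[1:]) if x > y)
    return 1 - inversions / (len(shared) - 1)

def length_sim(a, b):
    """Length ratio similarity (assumed formula)."""
    if not a and not b:
        return 1.0
    return 1 - abs(len(a) - len(b)) / (len(a) + len(b))

def structural_sim(a, b, weights):
    """Weighted sum of the three structural features."""
    feats = (word_form_sim(a, b), word_order_sim(a, b), length_sim(a, b))
    return sum(w * f for w, f in zip(weights, feats))

def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns None when no path exists."""
    if start == goal:
        return 0
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def word_sim(w1, w2, graph):
    """Map semantic distance d to similarity 1 / (1 + d) (assumed mapping)."""
    d = shortest_path_length(graph, w1, w2)
    return 0.0 if d is None else 1.0 / (1.0 + d)

def semantic_sim(a, b, graph):
    """Symmetric average of best-match word similarities (assumed aggregation)."""
    if not a or not b:
        return 0.0
    def directed(x, y):
        return sum(max(word_sim(w, v, graph) for v in y) for w in x) / len(x)
    return (directed(a, b) + directed(b, a)) / 2

def fused_sim(a, b, graph, weights, alpha=0.5):
    """Linear fusion of structural and semantic similarity; alpha is hypothetical."""
    return alpha * structural_sim(a, b, weights) + (1 - alpha) * semantic_sim(a, b, graph)

# Toy undirected ontology and a hypothetical, perfectly consistent AHP matrix.
ONTOLOGY = {"animal": ["dog", "cat"], "dog": ["animal"], "cat": ["animal"]}
PAIRWISE = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
WEIGHTS = ahp_weights(PAIRWISE)

s1 = ["the", "dog", "sleeps"]
s2 = ["the", "cat", "sleeps"]
score = fused_sim(s1, s2, ONTOLOGY, WEIGHTS)
```

Here `score` is a value in [0, 1]. Words missing from the toy ontology match only themselves, which keeps the sketch usable on arbitrary tokens; a real system would need a proper ontology and tokenizer.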
Authors
翟社平
李兆兆
段宏宇
李婧
董迪迪
ZHAI She-ping; LI Zhao-zhao; DUAN Hong-yu; LI Jing; DONG Di-di (School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
Source
《计算机工程与设计》
Peking University Core Journal
2019, No. 10, pp. 2867-2873, 2884 (8 pages)
Computer Engineering and Design
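Funding
Shaanxi Provincial Social Science Fund Project (2016N008)
Xi'an Social Science Planning Fund Project (17X63)
Natural Science Foundation of Shaanxi Province (2012JM8044)
Scientific Research Program of the Shaanxi Provincial Education Department (12JK0733)
Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (CXL2016-24)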
Keywords
sentence similarity
structure similarity
semantic similarity
ontology
analytic hierarchy process