Abstract
To improve the accuracy of text keyword extraction, this paper proposes a text keyword extraction method. The method combines factors that influence keyword extraction (word frequency, word length, word position, and part of speech) into a weight formula for candidate keywords. The relatively optimal weight coefficients of the formula are obtained by experiment, and the weight formula is then applied to the candidate keyword scoring formula of the TextRank algorithm to improve extraction accuracy. Experiments compared the precision, recall, and F-value of the OPW-TextRank algorithm and the TextRank algorithm on single-text keyword extraction. The results show that, with a window size of 6, the OPW-TextRank algorithm achieves higher precision than the TextRank algorithm. The proposed algorithm has some practical value in natural language processing systems built on text keyword extraction.
Author
徐立
XU Li (School of Software, Shangqiu Polytechnic, Shangqiu, Henan 476100, China; Suzhou Research Institute, University of Science and Technology of China, Suzhou, Jiangsu 215000, China)
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2019, Issue B06, pp. 142-145 (4 pages)
Computer Science