Abstract
This paper reviews several popular keyword extraction techniques and proposes a single-document keyword extraction algorithm: weighted TextRank based on a hidden Markov model. Three algorithms are compared: keyword extraction based on word frequency; keyword extraction based on part of speech, position, and frequency; and the weighted TextRank algorithm. The experimental results show that the weighted TextRank algorithm performs well on single-document extraction, and that its accuracy is higher when only a small number of keywords is extracted from a single article.
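As a rough illustration of the weighted TextRank idea named in the abstract, the sketch below ranks the words of one tokenized document on a co-occurrence graph. The abstract does not describe how the hidden Markov model produces the weights, so this sketch substitutes plain co-occurrence counts as edge weights; that substitution, the function name `weighted_textrank`, and the parameters `window`, `damping`, and `iterations` are all assumptions made for illustration, not the paper's method.

```python
from collections import defaultdict


def weighted_textrank(tokens, window=3, damping=0.85, iterations=50):
    """Rank words with TextRank on an undirected, weighted co-occurrence graph.

    Assumption: edge weights are co-occurrence counts, standing in for the
    HMM-derived weights that the abstract mentions but does not specify.
    """
    # Edge weight = how often two distinct words fall inside the same
    # span of `window` consecutive tokens.
    weights = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                weights[word][other] += 1.0
                weights[other][word] += 1.0

    # Iterate the weighted TextRank update a fixed number of times.
    scores = {word: 1.0 for word in weights}
    for _ in range(iterations):
        new_scores = {}
        for word in weights:
            rank = 0.0
            for neighbor, w in weights[word].items():
                # Each neighbor passes on its score in proportion to edge weight.
                out_sum = sum(weights[neighbor].values())
                rank += w / out_sum * scores[neighbor]
            new_scores[word] = (1 - damping) + damping * rank
        scores = new_scores

    # Words sorted from highest to lowest score.
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    doc = ["keyword", "extraction", "algorithm", "keyword",
           "graph", "keyword", "ranking"]
    print(weighted_textrank(doc)[:3])
```

Taking only the first few entries of the returned list corresponds to the small-keyword-count setting in which the abstract reports higher accuracy. For Chinese text, the tokens would first have to come from a word segmenter and a part-of-speech filter, which this sketch leaves out.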
Source
Information Technology (《信息技术》)
2015, No. 4, pp. 114-116, 120 (4 pages in total)