
The Method of Key Sentence Extraction Based on Improved TextRank
Abstract: Text mining usually analyzes a text through its keywords, an approach that tends to ignore the relationships between words and so reduces the accuracy of the mining results. TextRank is a principal method for extracting keywords or summaries. Built on a network graph, it takes the similarity between sentences into account but neglects the features of words. To address this, an improved TextRank algorithm is proposed that merges similar sentences and then selects key sentences using multiple word features. First, sentence similarity is computed and highly similar sentences are removed. Then, words are scored according to word frequency, word meaning, and word position, and a directed graph is constructed. Finally, the average score of each sentence is computed and the sentences are ranked to select the key sentences. Experimental results show that the improved algorithm is more accurate than the other algorithms compared, has lower time complexity, and resolves two problems: the one-sided description of a text by keywords and the redundancy of extracted summaries.
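The three steps named in the abstract (remove near-duplicate sentences, score words on a directed graph, rank sentences by average word score) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the formulas, so Jaccard overlap stands in for the sentence similarity measure, the word prior combines frequency and first-occurrence position only (the word-meaning feature is omitted), and all function names, the `threshold` value, and the damping factor are assumptions.

```python
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lower-case word tokens; a real system would use a proper segmenter."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def jaccard(a, b):
    """Jaccard overlap of two token lists (assumed stand-in similarity)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def drop_similar(sentences, threshold=0.7):
    """Step 1: keep a sentence only if it is not too similar to one already kept."""
    kept = []
    for s in sentences:
        toks = tokenize(s)
        if all(jaccard(toks, tokenize(k)) < threshold for k in kept):
            kept.append(s)
    return kept


def word_scores(token_lists, damping=0.85, iterations=50):
    """Step 2: PageRank-style scores on a directed word graph.

    The prior mixes word frequency with position (words first seen in
    earlier sentences weigh more). Both choices are assumptions.
    """
    freq = Counter(t for toks in token_lists for t in toks)
    if not freq:
        return {}
    n = len(token_lists)
    first_pos = {}
    for i, toks in enumerate(token_lists):
        for t in toks:
            first_pos.setdefault(t, i)
    prior = {w: freq[w] * (1.0 + (n - first_pos[w]) / n) for w in freq}
    total = sum(prior.values())
    prior = {w: p / total for w, p in prior.items()}

    # Directed edge u -> v whenever v immediately follows u in a sentence.
    out_edges = defaultdict(Counter)
    for toks in token_lists:
        for u, v in zip(toks, toks[1:]):
            if u != v:
                out_edges[u][v] += 1

    score = dict(prior)
    for _ in range(iterations):
        new = {w: (1.0 - damping) * prior[w] for w in prior}
        dangling = 0.0
        for u in prior:
            targets = out_edges.get(u)
            if not targets:
                dangling += score[u]  # no outgoing edges: redistribute below
                continue
            out_total = sum(targets.values())
            for v, weight in targets.items():
                new[v] += damping * score[u] * weight / out_total
        for w in new:
            new[w] += damping * dangling * prior[w]
        score = new
    return score


def key_sentences(sentences, top_k=2, threshold=0.7):
    """Step 3: rank the remaining sentences by their average word score."""
    kept = drop_similar(sentences, threshold)
    token_lists = [tokenize(s) for s in kept]
    scores = word_scores(token_lists)

    def average(toks):
        return sum(scores[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(zip(kept, token_lists), key=lambda p: average(p[1]), reverse=True)
    return [s for s, _ in ranked[:top_k]]
```

Calling `key_sentences(list_of_sentences, top_k=3)` returns up to three sentences from the input. Averaging the word scores, rather than summing them, keeps long sentences from winning on length alone, which matches the abstract's description of ranking by average score.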
Authors: CHEN Mengtong (陈梦彤); GU Xiaoyan (谷晓燕); LIU Tiantian (刘甜甜) (School of Information Management, Beijing Information Science & Technology University, Beijing 100192, China)
Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》), 2023, No. 1, pp. 15-20 (6 pages). Indexed in CAS and the Peking University Core Journals list.
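Funding: National Natural Science Foundation of China (71701020); National Key R&D Program of China (2019YFB1405003); Beijing Municipal Social Science Project (19YJB015).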
Keywords: key sentence extraction; improved TextRank algorithm; similar sentences merging; word feature