一种基于LexRank算法的改进的自动文摘系统 (cited by: 15)

Automatic Abstracting System Based on Improved LexRank Algorithm
Abstract: Automatic summarization is a key research topic in computational linguistics, and its study and application have attracted wide attention from related disciplines such as computer science, linguistics, and information science. This article first introduces the automatic summarization method based on the LexRank algorithm. To address that method's shortcomings, it improves the method in three respects: sentence similarity computation, sentence weight computation, and redundancy handling, so that the relevant influence factors can be adjusted dynamically according to the content of the input text. The resulting system can summarize single or multiple documents in either Chinese or English. Experiments on the evaluation corpora from Harbin Institute of Technology (HIT) and DUC show that the system improves summary quality to a certain degree compared with the original LexRank algorithm, and that it is fairly insensitive to noise in multi-document summarization, such as noise that may result from imperfect topical clustering of documents. Finally, the article discusses open problems in automatic summarization research and points out its development trends.
Source: Computer Science (《计算机科学》), CSCD, PKU Core, 2010, No. 5, pp. 151-154, 218 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (grants 60573057, 60473057, 90604007)
Keywords: Automatic abstracting; LexRank; Sentence similarity; Dynamic adjustment; Redundancy resolution
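The abstract names three components: sentence similarity, sentence weighting, and redundancy handling. As a rough illustration of how these fit together, the following is a minimal sketch of a generic LexRank-style extractive summarizer. It is not the improved system the article describes, whose details are not given in this record. The function names, the similarity threshold of 0.1, the damping factor of 0.85 (PageRank convention), the smoothed IDF, and the redundancy cutoff of 0.5 are all assumptions chosen for illustration, and the tokenizer only handles English (Chinese input would need a word segmenter).

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase alphanumeric tokens; a stand-in for real segmentation (assumption)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences):
    """Sparse TF-IDF vectors, treating each sentence as a document."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    df = Counter(t for bag in bags for t in bag)
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF (assumption)
    return [{t: tf * idf[t] for t, tf in bag.items()} for bag in bags]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank_scores(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Sentence weights from power iteration on a thresholded similarity graph."""
    vecs = tfidf_vectors(sentences)
    n = len(vecs)
    adj = [[1.0 if cosine(vecs[i], vecs[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps every node's degree nonzero
    degree = [sum(row) for row in adj]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [(1.0 - damping) / n
                  + damping * sum(adj[j][i] / degree[j] * scores[j] for j in range(n))
                  for i in range(n)]
    return scores


def summarize(sentences, k=2, redundancy=0.5):
    """Pick up to k top-ranked sentences, skipping near-duplicates of earlier picks."""
    vecs = tfidf_vectors(sentences)
    scores = lexrank_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = []
    for i in ranked:
        if all(cosine(vecs[i], vecs[j]) < redundancy for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [sentences[i] for i in sorted(chosen)]  # restore document order
```

Calling `summarize(list_of_sentences, k=3)` returns up to three sentences in their original order. The fixed threshold and cutoff here are exactly the kind of static parameters that the article proposes to adjust dynamically from the input text.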