Abstract
This paper applies a decaying word co-occurrence graph to multi-document summarization. The co-occurrence graph algorithm combines statistical and semantic analysis to identify the subject words of a document cluster and the links between its different topics. Sentence selection is optimized following the idea of Maximal Marginal Relevance (MMR), producing a summary that covers the main content of the document set while minimizing information redundancy. Evaluation on the DUC2005 test shows that the method achieves satisfactory results.
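The MMR-style sentence selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `mmr_select`, the trade-off parameter `lam`, the bag-of-words cosine similarity, and the precomputed `relevance` scores are all assumptions made for illustration. In the paper the relevance signal would presumably come from the co-occurrence graph, which the abstract does not specify in detail.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(sentences, relevance, k, lam=0.7):
    """Greedily pick up to k sentence indices by Maximal Marginal Relevance.

    relevance[i] is an assumed, precomputed importance score for sentence i.
    lam (hypothetical parameter) trades relevance against redundancy.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        # Redundancy: highest similarity to any sentence already chosen.
        redundancy = max(
            (cosine(bags[i], bags[j]) for j in selected), default=0.0
        )
        return lam * relevance[i] - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a larger `lam` the selection favors relevant sentences even when they overlap; with a smaller `lam` it penalizes sentences that repeat what has already been selected.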
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2009, No. 1, pp. 173-177 (5 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60573077)
Keywords
decaying word co-occurrence graph
latent semantic analysis
subject word
natural language processing