
Named Entity Relation Extraction Method Based on Seed Self-expansion (基于种子自扩展的命名实体关系抽取方法) Cited by: 25
Abstract: Extracting relations between named entities is an important research problem in information extraction. This paper proposes a method for automatically extracting named entity relations from large text collections. The method finds all named entity pairs that occur in the same sentence with the word distance between them within a given range, and converts their contexts into vectors. A small number of named entity pairs that hold the target relation are selected by hand as the initial relation seed set, and this seed set is expanded continuously through self-learning. The named entity pairs to be extracted are obtained by computing the similarity between the context vectors of candidate pairs and those of the relation seeds. Expanding the relation seed set in this bootstrapping fashion improves both recall and precision. In tests on the People's Daily (《人民日报》) corpus, called the PFR corpus in the original English abstract, the method achieves a weighted average F-Score of 0.813.
Source: Computer Engineering (《计算机工程》); indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心); 2006, No. 21, pp. 183-184, 193 (3 pages)
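Funding: National Natural Science Foundation of China (60442005); Key Project of the Science and Technology Research Fund of the Ministry of Education (105117)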
Keywords: named entity; relation extraction; self-learning
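
The seed self-expansion loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the term weighting, the similarity measure, the acceptance threshold, or the stopping rule, so the raw word counts, cosine similarity, `threshold`, and `max_rounds` used here are all assumptions, and the function names and toy data are hypothetical. Candidate pairs are assumed to have been collected already (same sentence, within the distance limit).

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse bag-of-words vectors (assumed measure)."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seeds(candidates, seeds, threshold=0.6, max_rounds=10):
    """Grow the seed set until no new pair is similar enough to any seed.

    candidates: dict mapping an entity pair to its context vector (Counter).
    seeds: hand-picked entity pairs known to hold the target relation.
    threshold, max_rounds: hypothetical parameters, not taken from the paper.
    """
    accepted = {s for s in seeds if s in candidates}
    for _ in range(max_rounds):
        new_pairs = {
            pair
            for pair, vec in candidates.items()
            if pair not in accepted
            and any(cosine(vec, candidates[s]) >= threshold for s in accepted)
        }
        if not new_pairs:
            break  # the seed set has stopped growing
        accepted |= new_pairs  # newly accepted pairs act as seeds next round
    return accepted


# Toy context vectors (invented for illustration only).
candidates = {
    ("A Corp", "Alice"): Counter(["chairman", "of", "appointed"]),
    ("B Corp", "Bob"): Counter(["chairman", "of", "elected"]),
    ("C Corp", "Carol"): Counter(["elected", "chairman", "board"]),
    ("Paris", "Dave"): Counter(["visited", "yesterday"]),
}
seeds = [("A Corp", "Alice")]

result = expand_seeds(candidates, seeds)
single_round = expand_seeds(candidates, seeds, max_rounds=1)
print(sorted(result))
```

In this toy data the ("C Corp", "Carol") pair is not similar enough to the original seed, but it is accepted in the second round once ("B Corp", "Bob") has joined the seed set, which is the recall gain that seed expansion is meant to provide. The unrelated ("Paris", "Dave") pair is never accepted.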



