Quantitative Identification of Dependency Relations Between Words (cited by: 3)

Published English title: To Identify the Dependent Relationship Between Words Quantificationally
Abstract: This paper extends and improves existing algorithms for the quantitative identification of dependency relations between words, taking full account of the probability distribution of terms. It explicitly distinguishes collocation, coordination, and subordination relations between terms and proposes a separate identification algorithm for each, according to its characteristics. It proposes a string-matching model. Taking into account the discrete distribution of the relative positions of two terms, the effect of the distance between them, and their probability distributions, it proposes a model of dependency strength between terms and uses that model to build a dependency tree over words. It also proposes an update strategy that prunes the constructed dependency tree and uncovers latent dependency relations. Application experiments show that the proposed algorithm identifies dependency relations between words effectively.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2005, No. 4, pp. 31-38 (8 pages)
Funding: National Natural Science Foundation of China (grant 60173027)
Keywords: computer application; Chinese information processing; word collocation; dependency relation; quantitative identification
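
The abstract describes a dependency-strength model that accounts for the distance between two terms and for their probability distributions, but this record does not give the formula. As a rough illustration of the general idea only, the sketch below computes a distance-weighted, PMI-style association score for ordered word pairs. The function name `association_scores`, the `1/d` distance weight, the window size, and the toy pre-segmented corpus are all assumptions made for this example; none of them is taken from the paper.

```python
import math
from collections import Counter


def association_scores(sentences, window=3):
    """Distance-weighted PMI-style score for ordered word pairs.

    Hypothetical illustration; NOT the paper's dependency-strength model.
    `sentences` is a list of token lists (already word-segmented).
    """
    unigram = Counter()
    pair_weight = Counter()
    total_tokens = 0
    for tokens in sentences:
        total_tokens += len(tokens)
        unigram.update(tokens)
        for i, left in enumerate(tokens):
            for d in range(1, window + 1):
                if i + d >= len(tokens):
                    break
                # Assumption: closer pairs count more, with weight 1/d.
                pair_weight[(left, tokens[i + d])] += 1.0 / d

    total_weight = sum(pair_weight.values())
    scores = {}
    for (left, right), weight in pair_weight.items():
        p_pair = weight / total_weight
        p_left = unigram[left] / total_tokens
        p_right = unigram[right] / total_tokens
        # Compare the weighted joint estimate with the independence baseline.
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


if __name__ == "__main__":
    # Toy corpus, invented for the example.
    corpus = [
        ["信息", "处理", "系统"],
        ["信息", "处理", "方法"],
        ["中文", "信息", "处理"],
    ]
    ranked = sorted(association_scores(corpus).items(), key=lambda kv: -kv[1])
    for pair, score in ranked:
        print(pair, round(score, 3))
```

Keeping the pairs ordered preserves word-order information, which matters when the aim is to tell directional relations such as subordination from symmetric ones such as coordination. A symmetric variant would simply sort each pair before counting.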