Discriminative Latent Model Based Chinese Multiword Expression Extraction (cited by: 2)

Abstract A Discriminative Latent Model (DLM) is proposed for extracting Multiword Expressions (MWEs) from Chinese text, with the aim of improving the performance of Machine Translation (MT) systems such as Template-Based MT (TBMT). For MT systems to become more practically useful, they need to be enhanced with MWE processing capability. As a step toward this goal, we propose DLM, a model developed for sequence labeling tasks that involve hidden structures, to extract MWEs for MT systems. DLM combines the advantages of existing discriminative models and can learn hidden structures in sequence labeling tasks. In our evaluations, DLM achieves precision of up to 90.73% for some types of MWEs, which is higher than that of state-of-the-art discriminative models. These results demonstrate that it is feasible to automatically identify many Chinese MWEs using our DLM tool. With the MWE processing model, the BLEU score of the MT system also increased by up to 0.3 in the closed test.
Author Xiao, Sun
Source China Communications [中国通信(英文版)], SCIE, CSCD, 2012, Issue 3, pp. 124-133 (10 pages)
Funding Supported by the Liaoning Province Doctor Startup Fund under Grant No. 20101021; the Fund of the State Ethnic Affairs Commission under Grant No. 10DL08; and the Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine.
Keywords natural language processing; multiword expressions; information processing; MT; DLM; model-based; extraction; China; machine translation system; Chinese text; discriminative model; automatic identification