Recognition of Uyghur event-anchored temporal expressions based on stemming (cited by: 6)

Uyghur event-anchored temporal expressions recognition using stemming method
Abstract: Uyghur event-anchored temporal expressions lack explicit temporal trigger words, which makes them hard to recognize and makes their boundaries hard to locate. To address these problems, a recognition method for event-anchored temporal expressions in agglutinative languages is proposed that combines statistical learning with stemming. Based on the morphological characteristics of Uyghur as a typical agglutinative language, the structure of temporal expressions is analyzed and classified, and machine learning is used to recast the recognition of hard-to-identify implicit event-anchored temporal expressions as a statistical sequence labeling problem. After analyzing the temporal elements of Uyghur events and Uyghur word formation, stem features specific to agglutinative languages are introduced: roots and morphological features are used in place of surface word forms. Experimental feature sets are then selected, and the results obtained with different feature sets and template files are compared. Experimental results show that the method reaches an F-measure of 85.37% on Uyghur event-anchored temporal expression recognition. The result is a useful reference for research on other agglutinative languages.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal; 2014, No. 2, pp. 625-630 (6 pages)
Funding: National Natural Science Foundation of China (61063026, 1262061, 61063043, 61262060); Key Project of the National Social Science Foundation of China (10AYY006)
Keywords: natural language processing (NLP); temporal expression; conditional random fields (CRFs); agglutinative language; event-anchored temporal expressions; feature selection; stemming
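
Illustrative note: the abstract describes recasting temporal expression recognition as sequence labeling over stem-based features. The Python sketch below shows, under stated assumptions, what that framing can look like: one feature dictionary per token built from stems rather than surface forms, and a decoder that turns B/I/O labels into phrase boundaries. The B/I/O scheme, the feature names, the `toy_stem` helper, and the hyphen-segmented English stand-in tokens are all hypothetical; the record does not give the paper's actual feature set, templates, or stemmer.

```python
from typing import Dict, List, Tuple


def toy_stem(token: str) -> Tuple[str, str]:
    """Split a token into (stem, suffix chain) at the first hyphen.

    Hypothetical stand-in for a real Uyghur stemmer: the hyphen marks a
    morpheme boundary that is assumed to be given in the input.
    """
    stem, _, suffixes = token.partition("-")
    return stem, suffixes


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Build a feature dict for token i from stems in a +/-1 window."""
    stem, suffixes = toy_stem(tokens[i])
    prev_stem = toy_stem(tokens[i - 1])[0] if i > 0 else "<BOS>"
    next_stem = toy_stem(tokens[i + 1])[0] if i < len(tokens) - 1 else "<EOS>"
    return {
        "stem": stem,
        "suffixes": suffixes,
        "prev_stem": prev_stem,
        "next_stem": next_stem,
    }


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert B/I/O tags to (start, end) token spans, end exclusive."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:  # a new phrase closes the open one
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:  # tolerate an I without a preceding B
                start = i
        else:  # "O" closes any open phrase
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# English stand-in tokens with hyphen-marked suffixes (not real Uyghur).
tokens = ["meeting", "start-ed-from", "after", "we", "leave-past"]
# Hand-written labels for illustration, not the output of a trained model.
tags = ["B", "I", "I", "O", "O"]

features = [token_features(tokens, i) for i in range(len(tokens))]
spans = bio_to_spans(tags)
```

In a full pipeline, per-token feature dictionaries of this kind would be passed to a sequence labeler such as a CRF, and the predicted tags decoded into phrase boundaries as `bio_to_spans` does here.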