Abstract
This paper proposes a classifier ensemble method based on linguistic knowledge evaluation. Using automatically acquired collocation resources and manually written evaluation rules, it fuses the maximal noun phrase (MNP) recognition results of support vector machines (SVMs) with those of cascaded conditional random fields (CRFs) based on the reduction method. Deterministic rules are then applied to the special structures on which the classifiers are error-prone, which improves the handling of boundary ambiguities caused by consecutive verbs and prepositions and by consecutive nouns. In the experiments, the method achieves a precision of 89.30% and a recall of 89.62%, and the F1-score on multi-word MNPs is 0.75% higher than that of the reduction method.
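The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the collocation table, the scoring function, the tie-breaking choice, and the toy sentence are all hypothetical, and the abstract does not specify how the evaluation rules are actually applied. The sketch assumes each classifier proposes one candidate MNP span and that the candidate whose boundary forms the stronger collocation with the preceding word is kept. A small helper also shows the standard F1 formula applied to precision and recall.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive

# Hypothetical collocation strengths: (word before the span, last word of the span)
COLLOCATIONS: Dict[Tuple[str, str], float] = {
    ("strengthen", "training"): 0.8,
    ("management", "training"): 0.3,
}

# Toy sentence, invented for illustration only
TOKENS: List[str] = ["strengthen", "management", "staff", "training"]


def score_span(tokens: List[str], span: Span,
               collocations: Dict[Tuple[str, str], float]) -> float:
    """Score a candidate by the collocation between the word preceding
    the span and the span's last word (assumed here to be its head)."""
    start, end = span
    if start == 0:
        return 0.0  # no preceding word to form a collocation with
    return collocations.get((tokens[start - 1], tokens[end - 1]), 0.0)


def choose_span(tokens: List[str], svm_span: Span, crf_span: Span,
                collocations: Dict[Tuple[str, str], float]) -> Span:
    """Keep the agreed span, or the higher-scoring one when the two
    classifiers disagree. Ties go to the CRF span (arbitrary choice)."""
    if svm_span == crf_span:
        return svm_span
    svm_score = score_span(tokens, svm_span, collocations)
    crf_score = score_span(tokens, crf_span, collocations)
    return svm_span if svm_score > crf_score else crf_span


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(choose_span(TOKENS, (1, 4), (2, 4), COLLOCATIONS))
    print(round(f1(0.8930, 0.8962), 4))
```

Applying the F1 formula to the reported precision and recall gives roughly 89.46%. That figure is derived here and is not stated in the abstract.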
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journals (北大核心)
2013, No. 6, pp. 16-22 (7 pages)
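Funding
Shanghai Philosophy and Social Science Planning Youth Project (2013EYY005)
Research Project of the National Language Resources Monitoring and Research Center (YZYS08-04)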
Keywords
maximal noun phrase recognition
linguistic knowledge evaluation
classifier ensemble
rules