Abstract
Most biomedical named entity recognition models rely on a single machine learning algorithm, which limits their recognition performance. This paper therefore proposes a method that fuses conditional random fields (CRFs) and maximum entropy (Maxent) classifiers through cascade generalization. The method exploits the correlation and complementarity among the base classifiers, combines them with an effective feature set, and applies a further round of learning to obtain a fused model. Experiments show that the fused model outperforms both the individual classifiers and most of the state-of-the-art systems from the JNLPBA workshop, reaching an F-measure of 70.7%, which indicates that the fusion method is effective.
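The fusion idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base-classifier outputs `crf_out` and `maxent_out` are hard-coded hypothetical label sequences rather than outputs of trained CRF and Maxent models, the `is_cap` feature is an invented example, and the second-level learner is a toy majority-vote table standing in for the relearning step. The F-measure here is computed per token, whereas the JNLPBA evaluation scores whole entities.

```python
from collections import Counter, defaultdict

def extend_features(token_feats, crf_labels, maxent_labels):
    """Cascade step: append both base-classifier labels to each token's features."""
    return [dict(f, crf=c, maxent=m)
            for f, c, m in zip(token_feats, crf_labels, maxent_labels)]

def train_meta(extended, gold):
    """Toy second-level learner (assumption): majority gold label per pattern."""
    table = defaultdict(Counter)
    for f, y in zip(extended, gold):
        table[(f["crf"], f["maxent"], f["is_cap"])][y] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}

def predict_meta(model, extended):
    """Look up each pattern; fall back to the CRF label for unseen patterns."""
    return [model.get((f["crf"], f["maxent"], f["is_cap"]), f["crf"])
            for f in extended]

def f_measure(gold, pred, outside="O"):
    """Token-level F-measure over non-outside labels (a simplification)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != outside)
    n_pred = sum(1 for p in pred if p != outside)
    n_gold = sum(1 for g in gold if g != outside)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: four tokens with one hand-made feature each.
feats = [{"is_cap": True}, {"is_cap": False}, {"is_cap": True}, {"is_cap": False}]
gold = ["B-protein", "O", "B-DNA", "O"]
crf_out = ["B-protein", "O", "B-protein", "O"]      # stand-in for CRF output
maxent_out = ["B-protein", "B-DNA", "B-DNA", "O"]   # stand-in for Maxent output

extended = extend_features(feats, crf_out, maxent_out)
meta = train_meta(extended, gold)
# Predicting on the training tokens only demonstrates the mechanics;
# a real evaluation would use held-out data.
fused = predict_meta(meta, extended)

print("CRF F:", f_measure(gold, crf_out))
print("Maxent F:", f_measure(gold, maxent_out))
print("Fused F:", f_measure(gold, fused))
```

In this toy setup the two base outputs make different mistakes, so the second-level table can learn which one to trust for each pattern. That is the complementarity the abstract refers to.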
Source
《大庆石油学院学报》
CAS
PKU Core Journals (北大核心)
2011, Issue 2, pp. 91-94, 122 (4 pages in total)
Journal of Daqing Petroleum Institute
Funding
Natural Science Foundation of Heilongjiang Province (F200603)
Keywords
conditional random fields
maximum entropy
classifier fusion (cascade generalization)
feature extraction
biomedical named entity recognition (bio-entity recognition)