Abstract
Multi-information-source scoring is used to resolve ambiguities that arise during the analysis stage. The score combines lexical, part-of-speech, syntactic, and semantic information. For training and test sets of different sizes, accuracy is reported both under the maximum likelihood principle and when multiple scored results are output. Experimental results show that, on both the training and test sets, both methods achieve higher accuracy when context is taken into account than when it is not. On the training set, accuracy also increases gradually as the corpus grows.
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journal (北大核心)
2001, No. 1, pp. 13-14, 32 (3 pages in total)
Funding
National "863" High-Tech Program of China (863-306-03-06-2)