Abstract
This paper discusses a design approach for a word segmentation system for natural language understanding (NLU). First, the system combines various traditional segmentation methods to produce all possible segmentation results, and at the same time builds a domain ontology knowledge base for word segmentation. Then, drawing on this knowledge base together with probabilistic statistics and clustering (lexical cohesion), it partitions the candidate segmentations and dispatches each one to the NLU module of the corresponding domain, where ambiguity is resolved through semantic analysis during the understanding process. Finally, the understanding results are fed back to the segmentation system, so that the system becomes self-feeding and self-improving. The distinguishing feature of this design is that the segmentation system and the NLU system it serves develop together in an extensible way as they run, gradually approaching an optimum.
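The pipeline outlined in the abstract (enumerate all candidate segmentations, assign each to a domain, keep the best one, feed the result back) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy lexicon, the two domain vocabularies, the function names, and in particular the scoring rule, where a simple vocabulary-overlap ratio stands in for the statistical, clustering, and semantic-analysis steps that the abstract does not specify.

```python
from collections import Counter
from functools import lru_cache
from typing import Dict, List, Set, Tuple


def all_segmentations(text: str, lexicon: Set[str]) -> List[List[str]]:
    """Return every split of `text` whose pieces are all lexicon entries."""

    @lru_cache(maxsize=None)
    def split_from(start: int) -> Tuple[Tuple[str, ...], ...]:
        if start == len(text):
            return ((),)  # exactly one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in lexicon:
                for rest in split_from(end):
                    results.append((word,) + rest)
        return tuple(results)

    return [list(seg) for seg in split_from(0)]


def best_domain(segmentation: List[str],
                domains: Dict[str, Set[str]]) -> Tuple[str, float]:
    """Pick the domain whose vocabulary covers the largest share of tokens.

    Hypothetical stand-in for the ontology-based domain assignment.
    """
    scores = {
        name: sum(word in vocab for word in segmentation) / len(segmentation)
        for name, vocab in domains.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


def segment(text: str, lexicon: Set[str], domains: Dict[str, Set[str]],
            feedback: Counter) -> List[str]:
    """Choose the candidate with the best domain score and record it."""
    candidates = all_segmentations(text, lexicon)
    if not candidates:
        raise ValueError("no segmentation covers the whole input")
    chosen = max(candidates, key=lambda seg: best_domain(seg, domains)[1])
    feedback.update(chosen)  # feedback step: accepted words gain weight
    return chosen


# Toy data, assumed for illustration only.
LEXICON = {"研究", "研究生", "生命", "命", "起源"}
DOMAINS = {"biology": {"研究", "生命", "起源"}, "education": {"研究生"}}

if __name__ == "__main__":
    counts = Counter()
    print(all_segmentations("研究生命起源", LEXICON))
    print(segment("研究生命起源", LEXICON, DOMAINS, counts))
```

In this sketch the feedback counter is only accumulated; a fuller system could use such counts to reweight later candidate scores, which is one possible reading of the self-improvement loop described above.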
Source
《计算机技术与发展》
2006, No. 5, pp. 7-9 (3 pages)
Computer Technology and Development
Funding
Innovation Fund for Technology-Based Small and Medium-Sized Enterprises, Ministry of Science and Technology (01c26226111002)
Keywords
natural language understanding (NLU)
word segmentation
ontology
clustering (lexical cohesion)
semantic analysis