Abstract
This paper first proposes a preprocessing method for segmenting Chinese sentences into words. All database accesses needed during segmentation are completed in the preprocessing phase, and the method can be applied without modification to any mechanical segmentation algorithm as well as to ambiguity resolution. Building on this preprocessing, the paper then implements an improved Maximum Matching (MM) method that follows the "longer word first" principle more fully and achieves better results in the mechanical segmentation stage.
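The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual data structures or its improved MM procedure, so everything below is an assumption made for illustration: the function names `precompute_matches` and `forward_max_match`, the `max_len` limit, the in-memory `set` standing in for the dictionary database, and the use of plain forward maximum matching rather than the paper's improved variant.

```python
def precompute_matches(sentence, dictionary, max_len=4):
    """Preprocessing pass: do every dictionary lookup once.

    Returns a table where table[i] lists the lengths of all dictionary
    words that start at position i of the sentence.
    (Hypothetical helper; the set `dictionary` stands in for the database.)
    """
    table = []
    for i in range(len(sentence)):
        lengths = [
            n
            for n in range(1, max_len + 1)
            if i + n <= len(sentence) and sentence[i:i + n] in dictionary
        ]
        table.append(lengths)
    return table


def forward_max_match(sentence, table):
    """Plain forward maximum matching that reads only the precomputed table.

    At each position it takes the longest known word; if no dictionary
    word starts there, it emits a single character.
    """
    words = []
    i = 0
    while i < len(sentence):
        n = max(table[i], default=1)  # longest match, or one character
        words.append(sentence[i:i + n])
        i += n
    return words


dictionary = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"
table = precompute_matches(sentence, dictionary)
print(forward_max_match(sentence, table))
```

Because the table already records every candidate word at every position, a different matching strategy or a disambiguation step could reuse it without querying the dictionary again, which is the property the abstract attributes to the preprocessing phase.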
Source
《微型电脑应用》
2002, Issue 1, pp. 13-15 (3 pages)
Microcomputer Applications