Abstract
Approximate (fuzzy) text matching is an important application in computer text processing and has been widely studied in Western countries. Chinese, however, has a distinctive complexity that makes it difficult for traditional fuzzy matching techniques to handle accurately and efficiently. This paper designs and tests a new fuzzy matching method for Chinese characters that can effectively match keywords and block sensitive words, including words written with near-homophones of their characters. The method is built on a matching table and a path state transition system: any input that satisfies a specified path is judged a successful match. The paper describes the C implementation of the algorithm in detail. Experiments indicate that the method is clearly defined, simple to implement, and low in cost, and that it can play a useful role in the ever-growing volume of text processing on the Internet.
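The abstract gives no implementation details, so the following is only a minimal C sketch of how a matching table combined with state transitions could work. It is not the authors' code. The table entries, the function names (`char_len`, `lookup`, `matches_keyword`), the use of toneless Pinyin, and the assumption that the source file and input text are UTF-8 encoded are all illustrative assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical matching table: Chinese character (UTF-8) -> toneless full
   Pinyin. Illustrative entries only; a real table would cover the whole
   character set. Assumes this source file is saved as UTF-8. */
struct pinyin_entry { const char *hanzi; const char *pinyin; };

static const struct pinyin_entry TABLE[] = {
    {"敏", "min"}, {"民", "min"}, {"感", "gan"}, {"干", "gan"}, {"词", "ci"},
};

/* Byte length of the UTF-8 character at p, clamped so that a truncated
   sequence never reads past the terminating NUL. */
static size_t char_len(const char *p) {
    unsigned char c = (unsigned char)*p;
    size_t want = c < 0x80 ? 1
                : (c >> 5) == 0x06 ? 2
                : (c >> 4) == 0x0E ? 3
                : (c >> 3) == 0x1E ? 4
                : 1; /* invalid lead byte: step over one byte */
    size_t have = 1;
    while (have < want && p[have] != '\0') have++;
    return have;
}

/* Pinyin of the len-byte character at p, or NULL if it is not in the table. */
static const char *lookup(const char *p, size_t len) {
    for (size_t i = 0; i < sizeof TABLE / sizeof TABLE[0]; i++)
        if (strlen(TABLE[i].hanzi) == len && memcmp(TABLE[i].hanzi, p, len) == 0)
            return TABLE[i].pinyin;
    return NULL;
}

/* Returns 1 if text contains consecutive characters whose Pinyin spell out
   keyword[0..n-1], otherwise 0. State k means the first k syllables have
   matched; reaching state n completes the path. Each start position is
   tried in turn (a simple O(length * n) scan). */
int matches_keyword(const char *text, const char *const *keyword, size_t n) {
    for (const char *start = text; *start; start += char_len(start)) {
        const char *p = start;
        size_t state = 0;
        while (state < n && *p) {
            size_t len = char_len(p);
            const char *py = lookup(p, len);
            if (py == NULL || strcmp(py, keyword[state]) != 0)
                break;          /* path broken: restart from next position */
            state++;            /* transition to the next state */
            p += len;
        }
        if (state == n)
            return 1;
    }
    return n == 0;              /* an empty keyword matches trivially */
}
```

With the keyword given as the syllables `{"min", "gan"}`, this sketch reports a match for both 敏感 and the sound-alike spelling 民干, because matching is done on Pinyin rather than on the characters themselves. Restarting from every position keeps the sketch short; a production matcher would more likely precompute failure transitions.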
Source
《通信技术》
2011, No. 6, pp. 89-91 (3 pages)
Communications Technology
Keywords
full Pinyin spelling of Chinese characters
fuzzy matching (approximate string matching)
state transition