Abstract
An automatic proofreading method for Chinese text that combines word matching with syntactic analysis is presented. The method mixes rule-based and statistical techniques and does not rely on a large-scale corpus. Taking into account how the original text was entered, it uses reverse maximum matching together with statistics gathered from the local text to locate scattered strings. These strings are then processed by word matching and syntactic analysis to produce candidate replacements for each error string, and the error strings are corrected through human-computer interaction. Experiments show that the system detects over 80% of errors with a false-alarm rate of about 5%, which largely meets practical proofreading requirements.
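The scattered-string detection step can be illustrated with a minimal sketch. It assumes that a scattered string (散串) is a run of consecutive single-character tokens left over after segmentation, and it covers only reverse maximum matching, not the local statistics or the syntactic analysis. The function names, the toy lexicon, the `max_len` limit, and the `min_run` threshold are all hypothetical choices for illustration and are not taken from the paper.

```python
def reverse_max_match(text, lexicon, max_len=4):
    """Segment text from right to left, taking the longest lexicon word each time."""
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()
    return tokens


def scattered_strings(tokens, min_run=2):
    """Collect runs of at least min_run consecutive single-character tokens."""
    runs, current = [], []
    for tok in tokens:
        if len(tok) == 1:
            current.append(tok)
        else:
            if len(current) >= min_run:
                runs.append("".join(current))
            current = []
    if len(current) >= min_run:
        runs.append("".join(current))
    return runs


# Toy lexicon (assumption); "校队" below is a deliberate typo for "校对".
lexicon = {"中文", "文本", "自动", "校对", "系统"}
tokens = reverse_max_match("中文文本自动校队系统", lexicon)
suspects = scattered_strings(tokens)
print(tokens)
print(suspects)
```

In this toy input the mistyped word does not match any lexicon entry, so it falls apart into single characters, and the resulting run is flagged as a candidate error region. A real system would need a full lexicon and further filtering, since legitimate single-character words also produce such runs.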
Source
《哈尔滨工业大学学报》
EI
CAS
CSCD
Peking University Core Journals
2001, No. 1, pp. 60-64 (5 pages)
Journal of Harbin Institute of Technology
Keywords
Scattered strings
Automatic Chinese proofreading system
Word matching
Syntactic analysis
Computer aided engineering
Error correction
Error detection
Natural language processing systems
Typewriters