Abstract
A region clustering method based on connected components is proposed to accurately extract the printed character regions of a page layout, and an error-correction and information-classification algorithm based on natural language understanding is proposed to address the low character recognition rate and the difficulty of classifying the recognized information. Each module of the system is also analyzed, and a complete implementation scheme is given. On 1589 randomly selected scanned bank cheque images, the recognition accuracy reaches 90.54%.
Source
《计算机工程》
EI
CAS
CSCD
PKU Core Journal (北大核心)
2005, Issue 9, pp. 163-166 (4 pages)
Computer Engineering
Keywords
Cheque recognition
Cheque processing
Layout analysis
Page structure analysis
Information classification