Abstract
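Objective: In offline handwritten character recognition systems, freely written characters are inevitably affected by uneven image background, image skew, touching characters, and varying character size. To ensure correct character segmentation and recognition, this paper discusses a preprocessing method for handwritten Chinese character images on EMS forms and presents the entire preprocessing pipeline for EMS form images. Methods: A straight line fitted by the least squares method is used to locate and segment the target image; a block-wise thresholding algorithm based on the Otsu threshold handles the uneven background of the target image and reduces noise interference. Results: The preprocessing method was tested on 1020 real EMS images, and the recognition accuracy reached 86.3%. Conclusion: The method offers a degree of flexibility and robustness to interference, and reduces the effect of image noise on Chinese character segmentation and recognition.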
Objective: In OCR systems, image preprocessing is particularly important for the recognition of unconstrained handwritten Chinese characters. Unavoidable factors such as image background, image skew, and touching characters introduce errors into character segmentation, recognition, and post-processing. This paper focuses on an image preprocessing method for the recognition of handwritten Chinese characters on EMS forms and presents the whole process. Methods: A straight line fitted by least squares was used to locate and segment the target image. A block-wise thresholding algorithm based on Otsu's method was adopted to remove the image background and suppress noise interference. Results: The proposed method was tested on 1024 real EMS envelope images, and the system achieved a recognition rate of 86.3%. Conclusion: The experimental results demonstrated that the proposed method effectively reduced the influence of image noise on the segmentation and recognition of handwritten Chinese characters.
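The two techniques named in the Methods, least-squares line fitting and block-wise Otsu thresholding, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the block size, the `min_contrast` cutoff for skipping flat blocks, and the convention that dark pixels are ink are all assumptions made for illustration, and the abstract does not say how the fitted line is used beyond locating and segmenting the target image.

```python
import math


def fit_line_least_squares(points):
    """Fit y = a*x + b to (x, y) points by ordinary least squares."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("cannot fit y = a*x + b (too few or vertical points)")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def skew_angle_degrees(points):
    """Angle of the fitted line against the x-axis (assumed use: skew estimate)."""
    a, _ = fit_line_least_squares(points)
    return math.degrees(math.atan(a))


def otsu_threshold(pixels):
    """Return t in 0..255 maximizing between-class variance (class 0: p <= t)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def block_binarize(image, block=32, min_contrast=30):
    """Binarize a grayscale image (list of rows, values 0..255) block by block.

    Each block gets its own Otsu threshold, so a background that varies
    across the image does not force one global cutoff. Blocks whose gray
    range is below min_contrast (an assumed heuristic) are left as background.
    Returns a 0/1 image where 1 marks dark (ink) pixels.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            rows = range(y0, min(y0 + block, h))
            cols = range(x0, min(x0 + block, w))
            pixels = [image[y][x] for y in rows for x in cols]
            if max(pixels) - min(pixels) < min_contrast:
                continue  # flat block: no ink assumed
            t = otsu_threshold(pixels)
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if image[y][x] <= t else 0
    return out
```

In this sketch, `fit_line_least_squares` would be given edge points sampled along a form border or a text baseline, and `block_binarize` would be run on the located region; a smaller `block` adapts to faster background changes at the cost of noisier per-block histograms.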
Source
《包装工程》
CAS
CSCD
PKU Core Journal (北大核心)
2014, Issue 21, pp. 80-85 (6 pages)
Packaging Engineering
Keywords
handwritten Chinese characters
recognition
image localization and segmentation
image preprocessing