Abstract
To address the low recognition accuracy that results from poor-quality text capture in optical character recognition (OCR), this paper proposes an OCR technique that combines traditional methods with neural networks. The method recognizes text in the target regions of an image in three stages: first, the input image is converted to a lossless bitmap file; next, the bitmap is preprocessed with orientation correction, denoising, and character segmentation; finally, text recognition is performed on the preprocessed text image. Experiments show that the proposed method reduces the data-processing load of the OCR system and improves recognition accuracy. It saves both time and hardware cost, and it effectively recognizes dense and blurred text in text images.
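The abstract names denoising and character segmentation as preprocessing steps but does not say which algorithms the authors use. As a rough illustration only, the sketch below swaps in two common textbook choices: a 3x3 median filter for denoising and a vertical projection profile for splitting characters. The function names `median_denoise` and `segment_columns`, and the representation of a binarized image as a list of rows of 0/1 values, are assumptions made for this example and do not come from the paper. Orientation correction and the neural-network recognition stage are not shown.

```python
from statistics import median


def median_denoise(img):
    """Apply a 3x3 median filter to a binary image (list of rows of 0/1).

    Assumption: this stands in for the unspecified denoising step.
    Border pixels are copied unchanged.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def segment_columns(img):
    """Split characters at empty columns using a vertical projection profile.

    Assumption: this stands in for the unspecified segmentation step.
    Returns a list of (start, end) column ranges, end exclusive.
    """
    w = len(img[0])
    # Number of foreground pixels in each column.
    profile = [sum(row[x] for row in img) for x in range(w)]
    spans, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a character begins
        elif count == 0 and start is not None:
            spans.append((start, x))       # a character ends
            start = None
    if start is not None:
        spans.append((start, w))
    return spans


# Two 3-pixel-wide "characters" plus one isolated noise pixel between them.
image = [[0] * 12 for _ in range(7)]
for y in range(1, 6):
    for x in (1, 2, 3, 8, 9, 10):
        image[y][x] = 1
image[3][6] = 1

print(segment_columns(image))                  # noise forms a spurious segment
print(segment_columns(median_denoise(image)))  # noise removed before splitting
```

Running segmentation before and after the filter shows why the order of the preprocessing steps matters: an isolated noise pixel is otherwise cut out as if it were a character.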
Authors
张焱
郭梦琰
王峰
邱雄
贺桢
蔡立志
张娟
ZHANG Yan; GUO Mengyan; WANG Feng; QIU Xiong; HE Zhen; CAI Lizhi; ZHANG Juan (School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China; Shanghai Computer Software Technology Development Center, Shanghai 200235, China)
Source
《智能计算机与应用》
2020, Issue 10, pp. 37-43 (7 pages)
Intelligent Computer and Applications
Keywords
Text image recognition
Convolutional neural network
Image processing