Abstract
To address the large parameter counts and slow recognition speed of current text recognition networks, an improved end-to-end text recognition network structure is proposed. The structure replaces the VGG model with MobileNet-V3, that is, standard convolutions are replaced by depthwise separable convolutions. A spatial attention module is also embedded in the network so that it pays more attention to the character regions of the input image. Experiments comparing the model with typical algorithms on several test datasets, including ICDAR2003, ICDAR2013, and SVT, show that the improved model reduces the number of network parameters to 1/6 of the original and increases speed by about 50% without reducing accuracy.
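The parameter saving from swapping a standard convolution for a depthwise separable one can be illustrated with a short calculation. The sketch below is not taken from the paper: the function names and the layer sizes (a 3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for illustration, and bias terms are ignored. It counts weights for a single layer only, so it does not reproduce the 1/6 whole-network figure reported above.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, not values from the paper.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, depthwise separable: {sep}, ratio: {sep / std:.4f}")
```

For a single layer the ratio equals 1/c_out + 1/k², so with a 3x3 kernel it approaches 1/9 as the number of output channels grows. The whole-network reduction depends on which layers are replaced and on the remaining components of the network.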
Authors
周兴杰
罗印升
宋伟
ZHOU Xingjie; LUO Yinsheng; SONG Wei (School of Mechanical Engineering, Jiangsu University of Technology, Changzhou 213001, China; School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China)
Source
《江苏理工学院学报》
2020, No. 6, pp. 44-49 (6 pages)
Journal of Jiangsu University of Technology
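Funding
Jiangsu Province Postgraduate Practice and Innovation Program project "Positioning and Mapping Design for Plant Protection Machines Based on the Fusion of Laser SLAM and Visual SLAM" (SJCX19_0691).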