Abstract
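To address the shortcomings of existing algorithms for segmenting touching and kerned characters, a character segmentation algorithm based on polyline cut paths is proposed. The algorithm first uses projection to separate touching and kerned characters from the other characters. Then, drawing on shape features specific to touching and kerned characters, it applies a path search with penalty weights to obtain the polyline cut path between them quickly and accurately. To avoid characters being wrongly fragmented during this process, recognition feedback is used to merge some character sub-images. Experimental results show that the algorithm adapts well to segmenting printed text that mixes Japanese and English characters and achieves satisfactory segmentation results.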
Segmentation of touching and kerned characters has long been the most difficult problem in character segmentation. This paper presents a novel approach that uses non-linear partitioning paths to segment touching and kerned characters. First, character projection is used to separate touching and kerned characters from the other characters. Then, to find the correct non-linear segmentation path between touching and kerned characters, a heuristic search for the minimal-penalty curved cut selects candidate paths from all possible segmentation paths, which removes redundant paths and reduces the computational cost. Because some characters may be split into several regions during this process, a merging procedure is then invoked to combine neighboring regions that belong to a single character. Experimental results demonstrate that the algorithm is robust in segmenting touching and kerned characters across different orientations and languages.
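The first two stages described above (projection, then a penalty-weighted search for a non-linear cut) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the penalty weights, the shape features, or the search heuristic, so the sketch substitutes a plain dynamic-programming search. The function names, the `ink_penalty` and `shift_penalty` values, and the rule that the path may move at most one column per row are all assumptions made for illustration. Images are binary lists of rows, with 1 marking ink.

```python
def column_projection(img):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*img)]


def split_by_projection(img):
    """Return (start, end) column spans separated by empty columns.

    `end` is exclusive. A span containing several characters is a
    candidate touching or kerned group that needs a non-linear cut.
    """
    proj = column_projection(img)
    spans, start = [], None
    for x, count in enumerate(proj):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(proj)))
    return spans


def min_penalty_cut(img, ink_penalty=10, shift_penalty=1):
    """Find a top-to-bottom cut path with minimal total penalty.

    The path visits one column per row and may shift by at most one
    column between rows (an assumed move rule). Crossing ink costs
    `ink_penalty`; each sideways shift costs `shift_penalty`.
    Both weights are hypothetical. Returns (total_penalty, path).
    """
    h, w = len(img), len(img[0])
    cost = [[0] * w for _ in range(h)]
    back = [[0] * w for _ in range(h)]
    for x in range(w):
        cost[0][x] = ink_penalty * img[0][x]
    for y in range(1, h):
        for x in range(w):
            best, best_px = None, x
            for px in (x, x - 1, x + 1):  # straight down is tried first
                if 0 <= px < w:
                    c = cost[y - 1][px] + (shift_penalty if px != x else 0)
                    if best is None or c < best:
                        best, best_px = c, px
            cost[y][x] = best + ink_penalty * img[y][x]
            back[y][x] = best_px
    # Trace back from the cheapest end column in the last row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    total = cost[h - 1][x]
    path = [x]
    for y in range(h - 1, 0, -1):
        x = back[y][x]
        path.append(x)
    path.reverse()
    return total, path


# Two kerned shapes: no column is empty, but a diagonal gap exists.
kerned = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1],
]
print(split_by_projection(kerned))  # a single span: projection cannot split it
print(min_penalty_cut(kerned))      # the cut follows the diagonal gap
```

In this toy image the projection step reports one unsplit span, while the path search threads the diagonal gap without crossing any ink. The recognition-feedback merging stage mentioned in the abstract is not sketched here, since it depends on a recognizer that the abstract does not specify.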
Source
《计算机应用研究》
CSCD
PKU Core (Peking University core journal list)
2011, No. 10, pp. 3998-4000 (3 pages)
Application Research of Computers
Keywords
character segmentation
character recognition
touching and kerned characters
polyline (curve-based) partitioning path