Abstract
This paper proposes a new method for recognizing caption text in videos. Because text in videos varies in font size, color, rendering style, and resolution, and often appears over complex backgrounds, recognizing overlaid text in videos remains an open problem. Most existing overlaid text recognition methods combine text binarization with a traditional OCR engine. However, binarization tends to introduce noise and to lose text stroke information. In addition, traditional OCR techniques mainly target high-resolution scans of printed documents, which have uniform backgrounds, little noise, and largely complete stroke information. Traditional OCR engines may therefore not be robust enough to recognize the binarized results of overlaid text images. To address this problem, Gabor features are extracted directly from the overlaid text images, without binarization, and used to train a two-level character recognizer. Experimental results show that the proposed method performs well on overlaid Chinese text recognition with multiple fonts.
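As a rough illustration of what extracting Gabor features from a grayscale (unbinarized) character image can look like, the sketch below filters the image with a small bank of oriented Gabor kernels and averages the response magnitude over a coarse grid. It is not the authors' implementation: the abstract gives no filter parameters, so the function names (`gabor_kernel`, `filter_same`, `gabor_features`), the kernel size, `sigma`, `wavelength`, `gamma`, the number of orientations, and the grid size are all assumptions chosen for illustration, and the two-level character recognizer is not shown.

```python
import numpy as np


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Real (cosine) Gabor kernel on a size x size grid; size should be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier wave varies along direction theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so a flat background produces (almost) no response.
    return kernel - kernel.mean()


def filter_same(image, kernel):
    """Linear filtering with zero padding, cropped back to the image size."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    shape = (ih + kh - 1, iw + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    top, left = kh // 2, kw // 2
    return full[top:top + ih, left:left + iw]


def gabor_features(gray, n_orient=4, grid=4, size=9, sigma=2.0, wavelength=4.0):
    """Feature vector of length n_orient * grid * grid (orientation-major)."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    feats = []
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        kernel = gabor_kernel(size, sigma, theta, wavelength)
        response = np.abs(filter_same(gray, kernel))
        # Pool the response magnitude over a grid x grid partition of the image.
        for i in range(grid):
            for j in range(grid):
                cell = response[rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
                feats.append(cell.mean())
    return np.array(feats)
```

Calling `gabor_features(img)` on a 2-D grayscale array (for example, a cropped character image at least `grid` pixels on each side) returns a fixed-length vector that could be fed to any classifier. Working on gray levels rather than a binarized image is the point the abstract makes: stroke information lost by thresholding never enters the features.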
Authors
田洁
王伟强
孙翼
TIAN Jie; WANG Weiqiang; SUN Yi (School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 101408, China)
Source
《中国科学院大学学报(中英文)》
CSCD
Peking University Core Journals (北大核心)
2018, Issue 3, pp. 402-408 (7 pages)
Journal of University of Chinese Academy of Sciences
Funding
Supported by the National Natural Science Foundation of China (61271434)