Abstract
When a thick bound document is scanned, its pages cannot lie flat against the scanner glass. This physical deformation degrades the scanned image in two ways: (1) a dark shadow appears on the side of the image nearest the binding (the spine), and (2) the text inside the shadow region is warped. We propose an algorithm that combines information from the scanned image with information about the geometric deformation to remove the shadow and restore the warped text. First, the shadow is removed by block-wise automatic-threshold binarization. Then the central lines of the text lines are extracted directly from the binarized image using a vertical projection function, valid bounding boxes, and markers; these central lines are used to estimate the global geometric parameters. Finally, the bent lines and warped words are corrected using the estimated geometric parameters and a piecewise quadrilateral mapping. Experiments show that the proposed algorithm gives satisfactory results.
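The shadow-removal step can be illustrated with a minimal sketch. It assumes that "automatic threshold" means Otsu's method applied independently to each block; the abstract does not name the thresholding rule, so that choice is an assumption, as are the function names `otsu_threshold` and `binarize_blockwise` and the parameters `block` and `min_contrast`. The image is a plain list of rows of 8-bit gray values.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for gray values in 0..255.

    Pixels with value <= t form the dark class (assumed to be ink).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class
    sum0 = 0    # sum of gray values in the dark class
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize_blockwise(image, block=32, min_contrast=30):
    """Binarize a grayscale image block by block (1 = ink, 0 = background).

    Each block gets its own threshold, so a dark shadow region is judged
    against its own local background rather than the bright page.
    Blocks whose gray range is below min_contrast are assumed to hold no
    text and are left as background (a hypothetical heuristic).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            rows = range(y0, min(y0 + block, height))
            cols = range(x0, min(x0 + block, width))
            pixels = [image[y][x] for y in rows for x in cols]
            if max(pixels) - min(pixels) < min_contrast:
                continue  # flat block: no ink detected
            t = otsu_threshold(pixels)
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if image[y][x] <= t else 0
    return out
```

The point of working per block is that a single global threshold chosen for the bright part of the page would classify the whole shadowed strip near the spine as ink, whereas a local threshold separates dark text from the merely darker paper around it.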
Source
《计算机辅助设计与图形学学报》
EI
CSCD
PKU Core Journal (北大核心)
2005, No. 1, pp. 42-48 (7 pages)
Journal of Computer-Aided Design & Computer Graphics
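Funding
National "863" High-Tech Research and Development Program of China (2001AA231031)
National Key Technologies R&D Program, Olympic Science and Technology Special Project (2001BA904B08)
National Key Basic Research Development Program of China (G1998030608)
Youth Innovation Fund of the Institute of Computing Technology, Chinese Academy of Sciences (200261804)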
Keywords
thick bound documents
central lines of the text
vertical projection function
valid bounding boxes
markers
geometric parameters