Abstract
Paper documents are converted to document images by image acquisition devices, and because of human factors and other causes the resulting images inevitably contain some skew. To make the images easier for a computer to process, the skew must be corrected first. Document layouts are complex and may contain text, images, graphics, and tables, so a general-purpose skew correction algorithm is hard to build. This paper proposes a content-based method for automatic skew correction of document images: wavelet transform, run-length smoothing, and thinning are used to extract the horizontal and vertical lines of tables, or the text lines, and a correction strategy suited to the layout type is then applied. Experiments show that the method is fast, accurate, and adaptable.
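The abstract names run-length smoothing and line extraction as steps toward estimating the skew angle. The following is a minimal sketch of those two ideas only, not the paper's implementation: the function names `rlsa_horizontal` and `estimate_skew_degrees`, the threshold parameter, and the least-squares line fit are all assumptions made for illustration, and the wavelet and thinning stages are omitted.

```python
import math


def rlsa_horizontal(row, threshold):
    """Horizontal run-length smoothing on one binary row (1 = foreground).

    Background runs of length <= threshold that lie between two foreground
    pixels are filled, which merges nearby characters into a text-line blob.
    """
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, value in enumerate(row):
        if value:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def estimate_skew_degrees(points):
    """Angle, in degrees, of the least-squares line through (x, y) points.

    Assumed input: pixel coordinates of one extracted (thinned) line, such
    as a table rule or a text-line centre line that is roughly horizontal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; slope is undefined")
    return math.degrees(math.atan2(sxy, sxx))
```

In this sketch the angle is measured in image coordinates, so rotating the image by the negative of the returned value would level the line. A vertical table rule would need the roles of x and y swapped before fitting.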
Source
Computer Simulation (《计算机仿真》)
CSCD
2006, Issue 12, pp. 192-196 (5 pages)
Keywords
Layout analysis
Document image processing
Skew correction