Abstract
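This paper studies resolution enhancement of document images. To address the low resolution, noise, paper degradation, and geometric distortion that affect digitized document images during acquisition, a new Bayesian maximum a posteriori (MAP) estimation algorithm is proposed for restoring and reconstructing document images. First, a clustering method automatically groups the characters in the document into classes. Then, based on prior knowledge about the identical characters in each class, such as frequency of occurrence and geometric properties, an energy function is used to obtain the final MAP-optimal solution. A novel iterative MAP algorithm repeatedly reuses the current estimate of the high-resolution image to approach the optimum, so that the final high-resolution character images are very sharp. Simulation results show that the proposed algorithm reliably improves character resolution, raises document recognition accuracy, and is computationally efficient. Building on this, the method can readily be applied to the reconstruction and restoration of multi-document or book images.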
Restoration of documents is a key step for applications in document processing, retrieval, and understanding, as well as in digital libraries, for example in book readers. In this paper, we present a method to restore document images using a Maximum a Posteriori (MAP) framework. The prior probability of the characters is learned from the training document images. The extraction of a single high-quality enhanced text image from a set of degraded images can benefit from strong prior knowledge. The restoration process should allow for discontinuities and discourage oscillations at the same time. These properties are represented in a total-variation-based prior model. Results indicate that our method is appropriate for document image restoration, where resolution enhancement is an added gain.
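The abstract gives no formulas, so the following is only a minimal sketch of what MAP restoration with a total-variation prior can look like, not the authors' algorithm. It assumes a hypothetical degradation model (2x2 block averaging plus noise), a smoothed TV term, and plain gradient descent; the function names and the parameters `lam`, `eps`, `step`, and `iters` are all illustrative choices that do not come from the paper.

```python
import numpy as np

S = 2  # assumed downsampling factor (hypothetical, not from the paper)

def downsample(x):
    """Assumed degradation operator: average each S-by-S block."""
    h, w = x.shape
    return x.reshape(h // S, S, w // S, S).mean(axis=(1, 3))

def downsample_adjoint(r):
    """Adjoint of block averaging: spread each value over its block."""
    return np.kron(r, np.ones((S, S))) / (S * S)

def grad(x):
    """Forward differences, with zero difference at the last column/row."""
    gx = np.zeros_like(x)
    gy = np.zeros_like(x)
    gx[:, :-1] = x[:, 1:] - x[:, :-1]
    gy[:-1, :] = x[1:, :] - x[:-1, :]
    return gx, gy

def grad_adjoint(px, py):
    """Adjoint of grad (the negative discrete divergence)."""
    out = np.zeros_like(px)
    out[:, :-1] -= px[:, :-1]
    out[:, 1:] += px[:, :-1]
    out[:-1, :] -= py[:-1, :]
    out[1:, :] += py[:-1, :]
    return out

def energy(x, ys, lam=0.05, eps=1e-2):
    """Negative log-posterior up to a constant: data fit + smoothed TV."""
    data = 0.5 * sum(np.sum((downsample(x) - y) ** 2) for y in ys)
    gx, gy = grad(x)
    return data + lam * np.sum(np.sqrt(gx ** 2 + gy ** 2 + eps))

def energy_grad(x, ys, lam=0.05, eps=1e-2):
    """Gradient of the energy with respect to the high-resolution image."""
    g = sum(downsample_adjoint(downsample(x) - y) for y in ys)
    gx, gy = grad(x)
    mag = np.sqrt(gx ** 2 + gy ** 2 + eps)
    return g + lam * grad_adjoint(gx / mag, gy / mag)

def map_restore(ys, lam=0.05, eps=1e-2, step=0.1, iters=200):
    """Iteratively refine a high-resolution estimate from degraded images."""
    # Start from the pixel-replicated mean of the observations.
    x = np.kron(np.mean(ys, axis=0), np.ones((S, S)))
    for _ in range(iters):
        x = x - step * energy_grad(x, ys, lam, eps)
    return x
```

Here `ys` is a list of equally sized low-resolution images of the same character, for instance the members of one cluster. The TV term penalizes oscillations while still permitting sharp stroke edges, which matches the behaviour the abstract asks of the prior. The fixed step of 0.1 is small enough for the defaults with about four observations; other settings of `lam`, `eps`, or the number of observations would likely need a different step or a line search.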
Source
《计算机仿真》
CSCD
PKU Core Journal (北大核心)
2011, No. 9, pp. 298-301 (4 pages)
Computer Simulation