Abstract
The digitization of ancient books currently faces several problems: full-color high-definition scans produce very large files, text becomes unclear after black-and-white conversion, text on the reverse side of a page bleeds through, text from the front and back of a page overlaps, and images contain excessive noise. To address these problems, this paper studies a digital processing method for ancient book texts based on an artificial immune algorithm. The method simulates models and principles from immunology and adopts a binary-coded Image Edge Detection Algorithm (IEDA) to track text edges. It locates the text or pictures of interest in the digital image of the ancient book, removes the parts that are not of interest, and discards redundant information. Measured results show that, compared with other methods, text processed by this method has no hollow strokes, the strokes are continuous, and the document size is only 1.82% of the original color file. The method has good application value for improving the reading experience of ancient book texts and for reducing their storage cost.
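The abstract gives only a high-level description of the binary-coded edge detection step, so the following is a minimal illustrative sketch and not the authors' IEDA. It assumes a grayscale page stored as nested lists, a fixed threshold of 128, a 4-neighbor binary code per pixel, and a hypothetical detector set (`EDGE_DETECTORS`) that positively selects ink pixels with at least one ink neighbor and at least one paper neighbor. The function names are introduced here purely for illustration.

```python
from typing import List

Image = List[List[int]]

def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]

def neighbor_code(binary: Image, r: int, c: int) -> int:
    """Pack the up, right, down, left neighbors into a 4-bit code.

    Neighbors outside the image are treated as paper (0).
    """
    height, width = len(binary), len(binary[0])
    code = 0
    for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1)):
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < height and 0 <= cc < width
        code = (code << 1) | (binary[rr][cc] if inside else 0)
    return code

# Hypothetical detector set (an assumption of this sketch): codes 1..14.
# Code 0b0000 is an isolated speck, treated here as noise.
# Code 0b1111 is a stroke interior, which is not an edge.
EDGE_DETECTORS = frozenset(range(1, 15))

def detect_edges(gray: Image, threshold: int = 128) -> Image:
    """Mark ink pixels whose neighbor code matches a detector."""
    binary = binarize(gray, threshold)
    return [
        [
            1 if pixel == 1 and neighbor_code(binary, r, c) in EDGE_DETECTORS else 0
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(binary)
    ]
```

On a 5x5 white page with a 3x3 dark block in the middle, `detect_edges` marks the eight boundary pixels of the block and leaves the center unmarked. A real page would need a more careful treatment than this sketch provides, for example diagonal connectivity for thin strokes and an adaptive threshold for bleed-through.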
Authors
JIAO Jiachen (焦佳琛)
BAO Nengsheng (包能胜)
JIANG Jiahua (姜佳华)
Key Laboratory of Intelligent Manufacturing, Shantou University, Ministry of Education, Shantou 515063, Guangdong, China
Department of Mechanical Engineering, College of Engineering, Shantou University, Shantou 515063, Guangdong, China
Source
Journal of Shantou University: Natural Science Edition (《汕头大学学报(自然科学版)》)
2021, No. 1, pp. 3-11 (9 pages)
Funding
Li Ka Shing Foundation Cross-Disciplinary Research Grant (2020LKSFG06D).
Keywords
immune algorithm
digitization of ancient books
edge detection
positive selection
image processing