
卷积神经网络在古籍汉字识别中的应用实践 (Cited by: 14)

CNN-Based Recognition of Chinese Characters in Ancient Books
Abstract: This article applies convolutional neural networks (CNNs) to the metadata processing of Chinese characters in ancient books for digital humanities, recasting ancient-book character recognition as a CNN classification problem. Because no training set was available, the authors built one with data generation techniques, trained the model on it, and then used the model to recognize characters in ancient books. Using the TensorFlow platform, about 240,000 training samples were generated for 773 Chinese characters. The network model can itself flag images it cannot recognize; this raises precision, and the flagged images can be passed directly to manual recognition, which makes the system more reliable. As a semi-automated tool for processing ancient-book metadata in digital humanities, it aims to make the use of ancient-book resources in digital humanities research more efficient and to reduce manual labor costs to some extent.
Authors: GUO Limin (郭利敏); GE Liang (葛亮); LIU Yueru (刘悦如)
Source: Library Tribune (《图书馆论坛》), indexed in CSSCI and the Peking University Core Journals list, 2019, No. 10, pp. 142-148 (7 pages)
Keywords: smart library; artificial intelligence; convolutional neural network; digital humanities; recognition of Chinese characters in ancient books
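
The abstract says the model can flag images it cannot recognize and hand them to manual recognition, but it does not say how. One common way to get this behavior, shown here only as an assumption and not as the authors' method, is to threshold the classifier's softmax confidence. The function names, the three sample labels, and the 0.9 threshold below are all hypothetical; the real model distinguishes 773 characters.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_or_reject(logits, labels, threshold=0.9):
    """Return (label, confidence); label is None when confidence is too low.

    A None label stands for "unrecognizable": the image would be routed
    to a human instead of being labeled automatically. The threshold is
    an illustrative value, not one reported in the article.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return labels[best], probs[best]

if __name__ == "__main__":
    labels = ["天", "地", "人"]  # hypothetical subset of the character classes
    # One score clearly dominates, so the prediction is accepted.
    print(classify_or_reject([8.0, 1.0, 0.5], labels))
    # Scores are nearly tied, so the image is sent for manual recognition.
    print(classify_or_reject([1.0, 0.9, 0.8], labels))
```

Raising the threshold trades coverage for precision: fewer images are labeled automatically, but those that are labeled are more likely to be correct, which matches the semi-automated workflow the abstract describes.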
