Abstract: This study reviews the latest contributions to Arabic Optical Character Recognition (OCR) over the last decade, to help interested researchers learn the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, the different types of OCR systems, the stages of an Arabic OCR system, researchers' contributions at each stage, and the evaluation metrics for OCR. It reviews the existing Arabic OCR datasets and their characteristics. Additionally, the study implemented some of the preprocessing and segmentation stages of Arabic OCR. It compares the performance of existing methods in terms of recognition accuracy. In addition to researchers' OCR methods, commercial and open-source systems are included in the comparison. Arabic is morphologically rich and written in a cursive script with dots and diacritics above and below the characters. Most existing approaches in the literature were evaluated on isolated characters or isolated words in a controlled environment, and few were tested on page-level scripts. Some comparative studies show that the accuracy of existing commercial Arabic OCR systems is low, under 75% for printed text, and that further improvement is needed. Moreover, most current approaches are offline OCR systems, and there has been no remarkable contribution to online OCR systems.
Funding: Supported by science and technology projects of Gansu State Grid Corporation of China (52272220002U).
Abstract: Optical Character Recognition (OCR) is a technology that uses image processing and character recognition algorithms to identify characters in an image. This paper studies in depth the recognition performance of OCR based on Artificial Intelligence (AI) algorithms, classifying and reviewing the different AI algorithms used for OCR. First, the mechanisms and characteristics of OCR based on artificial neural networks are summarized. Second, the paper explores machine learning-based OCR and concludes that the algorithms available for this form of OCR are still in their infancy, with low generalization and fixed recognition errors, albeit with better recognition performance and higher recognition accuracy. Finally, the paper explores several of the latest approaches, such as deep learning and pattern recognition algorithms. It concludes that OCR requires algorithms with higher recognition accuracy.
Abstract: Handwritten character recognition is considered more challenging than recognition of machine-printed characters because of the variety of human writing styles. Arabic is morphologically rich, and its characters are highly similar to one another. The Arabic language has 28 characters. Each character has up to four shapes, depending on its position in the word (beginning, middle, end, or isolated). This paper proposes 12 CNN architectures for recognizing handwritten Arabic characters. The proposed architectures were derived from popular CNN architectures, such as VGG, ResNet, and Inception, and adapted to character-size images. Experimental results on three well-known datasets showed that the proposed architectures significantly improved the recognition rate over the baseline models. The experiments also showed that data augmentation improved the models' accuracy on all tested datasets. The proposed model outperformed most existing approaches. The best results achieved were 93.05%, 98.30%, and 96.88% on the HIJJA, AHCD, and AIA9K datasets, respectively.
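The "up to four shapes" property mentioned in the abstract above can be inspected with a short sketch. It is not taken from the paper: it assumes that the positional variants encoded in Unicode's Arabic Presentation Forms-B block (U+FE70 to U+FEFF) are an acceptable stand-in for the written shapes, and the helper name `positional_forms` is hypothetical.

```python
import unicodedata


def positional_forms(letter):
    """Return the positional form tags Unicode records for one Arabic letter.

    Assumption: the compatibility decompositions in the Arabic Presentation
    Forms-B block are used as a proxy for the written shapes of the letter.
    """
    base = format(ord(letter), "04X")
    forms = set()
    for code_point in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(code_point)).split()
        # Keep single-letter entries such as "<initial> 0628"; skip ligatures
        # and unassigned code points (which decompose to an empty string).
        if len(parts) == 2 and parts[1] == base:
            forms.add(parts[0].strip("<>"))
    return forms


beh = positional_forms("\u0628")   # ARABIC LETTER BEH
alef = positional_forms("\u0627")  # ARABIC LETTER ALEF
print(sorted(beh))
print(sorted(alef))
```

BEH connects on both sides and so has all four forms, while ALEF does not connect to the following letter and has only two, which is why the abstract says "up to" four.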
Abstract: The purpose of the paper is to develop "Car Log", a mobile Android application that lets users track all the costs of a vehicle and add fuel cost data by taking a photo of the cash receipt from the gas station where the vehicle was refueled. OCR (optical character recognition) is the conversion of images of typed, handwritten, or printed text into machine-encoded text. Once the text is machine-encoded, it can be used in further machine processes, such as translation or text-to-speech conversion, helping people with simple everyday tasks. Users of the application will also be able to enter other kinds of costs, grouped into categories, and other charges. The Car Log application can quickly and easily visualize, edit, and add different costs for a car. It also supports multiple profiles, so that data can be entered for all the cars in a single family, for example, or in a small business. The test results are positive, so we intend to develop the application further into a cloud-ready application.
Funding: The results and knowledge included herein have been obtained owing to support from the following institutional grant: Internal Grant Agency of the Faculty of Economics and Management, Czech University of Life Sciences Prague, Grant No. 2023A0004, "Text Segmentation Methods of Historical Alphabets in OCR Development". https://iga.pef.czu.cz/. Funds were granted to T. Novák, A. Hamplová, O. Svojše, and A. Veselý from the author team.
Abstract: This study presents a single-class and multi-class instance segmentation approach applied to ancient Palmyrene inscriptions, employing two state-of-the-art deep learning algorithms, namely YOLOv8 and Roboflow 3.0. The goal is to contribute to the preservation and understanding of historical texts, showcasing the potential of modern deep learning methods in archaeological research. Our research culminates in several key findings and scientific contributions. We comprehensively compare the performance of YOLOv8 and Roboflow 3.0 in the context of Palmyrene character segmentation; this comparative analysis focuses mainly on the strengths and weaknesses of each algorithm. We also created and annotated an extensive dataset of Palmyrene inscriptions, a crucial resource for further research in the field. The dataset serves for training and evaluating the segmentation models. We employ comparative evaluation metrics to quantitatively assess the segmentation results, ensuring the reliability and reproducibility of our findings, and we present custom visualization tools for predicted segmentation masks. Our study advances the state of the art in semi-automatic reading of Palmyrene inscriptions and establishes a benchmark for future research. The availability of the Palmyrene dataset and the insights into algorithm performance contribute to the broader understanding of historical text analysis.
Abstract: This paper proposes a new multi-stage algorithm for recognizing isolated characters. Similar earlier work used only the center of gravity (this paper is an extended version of "A fast recognition system for isolated printed characters using center of gravity", LAP LAMBERT Academic Publishing 2011, ISBN: 978-38465-0002-6); here the principal axis is added to make the algorithm rotation invariant. In my previous work, published by LAP LAMBERT, I faced a major problem: a rotated character could not be recognized, which imposed the constraint that the document be well oriented. Here I use the principal axis to unify the orientation of the character set and of the characters in the scanned document. The algorithm can be applied to any isolated characters, such as Latin, Chinese, Japanese, and Arabic characters, but in this paper it is applied to Arabic characters. The approach takes normalized, isolated characters of the same size, makes each character's principal axis vertical, and extracts an image signature based on the character's center of gravity. The system then compares these values with a set of signatures for typical characters of the set and reports the closeness of match to every character in the set.
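The two quantities named in the abstract above, the center of gravity and the principal axis, can be computed from image moments. The following is a minimal sketch under stated assumptions, not the author's implementation: the image is a list of rows of 0/1 values, the principal-axis angle is taken from second-order central moments, and the function name `centroid_and_axis` is hypothetical.

```python
import math


def centroid_and_axis(image):
    """Return (cx, cy, angle) for a binary image given as rows of 0/1.

    angle is the orientation of the principal axis in radians, measured from
    the x axis in image coordinates (y grows downward).
    """
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("image has no foreground pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments of the foreground pixels.
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, angle


# A 5x5 diagonal stroke: centroid at (2, 2), axis at 45 degrees.
diagonal = [[1 if x == y else 0 for x in range(5)] for y in range(5)]
cx, cy, angle = centroid_and_axis(diagonal)
print(cx, cy, math.degrees(angle))
```

Rotating the character by the difference between 90 degrees and this angle would make its principal axis vertical before the signature is extracted; that rotation step is omitted here.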
Funding: This project is supported by the Municipal Science Foundation of Wuhan (No. T20001101005).
Abstract: An optical imaging system and a configuration characteristic algorithm are presented to reduce the difficulty of extracting intact, weak-contrast character images and of recognizing characters on fast-moving beer bottles. The system consists of a hardware subsystem (a rotating device, a CCD, a 16 mm focal-length lens, a frame grabber card, penetrating lighting, and a computer) and a software subsystem. The software subsystem performs pretreatment, character segmentation, and character recognition. In pretreatment, the original image is filtered with a preset threshold to remove isolated spots. Horizontal and vertical projections are then used to segment the characters. Subsequently, the configuration characteristic algorithm is applied to recognize the characters. The experimental results demonstrate that the system recognizes the characters on beer bottles accurately and effectively; the algorithm proved fast, stable, and robust, making it suitable for industrial environments.
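The projection-based segmentation step described in the abstract above can be sketched as follows. This is an illustrative version only: it assumes an already binarized image stored as rows of 0/1 values, and the helper names `projection` and `runs` are hypothetical rather than taken from the paper.

```python
def projection(image, axis):
    """Count foreground pixels per row (axis="h") or per column (axis="v")."""
    if axis == "h":
        return [sum(row) for row in image]
    return [sum(column) for column in zip(*image)]


def runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    spans, start = [], None
    for index, value in enumerate(profile):
        if value and start is None:
            start = index
        elif not value and start is not None:
            spans.append((start, index))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


# Toy image: one text line containing a 2-pixel-wide and a 1-pixel-wide glyph.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
line_spans = runs(projection(image, "h"))  # rows occupied by text
char_spans = runs(projection(image, "v"))  # columns occupied by each glyph
print(line_spans, char_spans)
```

Gaps in the horizontal profile separate text lines and gaps in the vertical profile separate characters, which is why the approach works for isolated printed characters but not for touching or cursive ones.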
Abstract: In today's digital era, text may appear in the form of images. This research addresses the problem by recognizing such text using a support vector machine (SVM). Much work has been done on handwritten character recognition for English, but very little on the under-resourced Hindi language. A method is developed for identifying Hindi characters that uses morphology, edge detection, histograms of oriented gradients (HOG), and SVM classes for summary creation. SVM rank uses the summary to extract essential phrases based on the following features: paragraph position, phrase position, numerical data, inverted commas, sentence length, and keywords. The primary goal of the SVM optimization function is to reduce the number of features by eliminating unnecessary and redundant ones. The second goal is to maintain or improve the classification system's performance. The experiment included news articles from various genres, such as Bollywood, politics, and sports. The proposed method's accuracy for Hindi character recognition is 96.97%, which compares well with baseline approaches, and system-generated summaries are compared with human summaries. The evaluation shows a precision of 72% at a compression ratio of 50% and a precision of 60% at a compression ratio of 25%; in comparison with state-of-the-art methods, this is a decent result.
Abstract: Optical character recognition for right-to-left, cursive languages such as Arabic is challenging and has received little attention from researchers compared with Latin-script languages. Moreover, the absence of a standard, publicly available dataset for several low-resource languages, including Pashto, has remained a hurdle to progress in language processing. Recognizing that a clean dataset is the fundamental requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding. With fully autonomous recognition of the cursive Pashto script in view, the first achievement of this research is a clean, standard dataset of isolated characters of the Pashto script. This paper introduces a database of isolated Pashto characters covering the forty-four letters of the alphabet in various font styles. To overcome the shortage of font styles, the graphics software Inkscape was used to generate sufficient image samples for each character. The dataset has been pre-processed, reduced to 32×32 pixels, and converted to a binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on GitHub and Kaggle, in both pixel and Comma Separated Values (CSV) formats.
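The pre-processing described in the abstract above (reduction to 32×32 pixels, then binarization to white text on a black background) might look like the sketch below. The details are assumptions, not the authors' pipeline: the input is taken to be a grayscale image with dark text on a light background stored as rows of 0 to 255 values, resizing uses nearest-neighbor sampling, the threshold of 128 is arbitrary, and `to_mnist_style` is a hypothetical name.

```python
def to_mnist_style(gray, size=32, threshold=128):
    """Resize to size x size by nearest-neighbor sampling and binarize.

    Dark input pixels (below threshold) become 1 (white text) and light
    pixels become 0 (black background), as in MNIST-style images.
    """
    height, width = len(gray), len(gray[0])
    result = []
    for row in range(size):
        source_row = gray[row * height // size]
        result.append([
            1 if source_row[col * width // size] < threshold else 0
            for col in range(size)
        ])
    return result


# Toy input: a 64x64 white page with a dark 32x32 square in the middle.
page = [[0 if 16 <= x < 48 and 16 <= y < 48 else 255 for x in range(64)]
        for y in range(64)]
sample = to_mnist_style(page)
print(len(sample), len(sample[0]), sample[16][16], sample[0][0])
```

Flattening each `sample` row by row would give one 1,024-value record, which is the kind of layout a CSV release of such a dataset typically uses.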
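Abstract: Objective: To rapidly identify and reassemble the key information in imaging data and PDF reports after the database files of a Picture Archiving and Communication System (PACS) are lost or damaged, so that patients can use them at follow-up visits. Methods: Deep learning-based optical character recognition and Pydicom were used to read the basic information in PDF and DICOM files, respectively, re-establish the links among patients, images, and reports, and write the linked data to the database. Results: Sampling verification showed that the accuracy, precision, and recall of the method on images of the same type were all 100%, with an F1 score of 1, and recognition performance was consistent across independent samples from different groups. The mean recognition time per report was about 0.14 s (t=-1.005, P=0.315), indicating that recognition time was also consistent across independent samples from different groups. Conclusion: The method can effectively shorten patient waiting time after a database failure and restore medical order within a short time. It can be used for emergency response after PACS database data loss, provides a basis for PACS data integration, and offers a new approach to medical image data recovery and data integration.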
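The precision, recall, and F1 figures reported in the abstract above follow the standard definitions, which the sketch below computes from raw counts. The counts in the usage lines are hypothetical examples, not data from the study.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from raw counts.

    Each ratio is defined as 0.0 when its denominator is zero.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: a perfect run, and a run with two errors of each kind.
perfect = precision_recall_f1(true_pos=50, false_pos=0, false_neg=0)
imperfect = precision_recall_f1(true_pos=8, false_pos=2, false_neg=2)
print(perfect, imperfect)
```

F1 is the harmonic mean of precision and recall, so it reaches 1 only when both are 100%, which matches the values reported above.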
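Abstract: Objective: To design a data acquisition platform for medical treatment equipment based on an optical character recognition (OCR) model, in order to automate the collection of medical data under emergency disaster-relief conditions. Methods: The platform was built on the "perception-network-platform" architecture of the medical Internet of Things. First, a Raspberry Pi 4B was selected as the edge node, and the hardware environment was assembled from a video capture card, a camera, a tablet computer, and other components. Second, the OCR model was optimized on the basis of a convolutional recurrent neural network (CRNN), and video stream processing and data extraction from medical terminals were achieved through hardware-software coordination. Finally, the FineBI tool was used for interactive interface design and database connection. Results: Experiments verified that the platform's hardware environment was reliable and stable, that the optimized OCR model achieved higher text recognition accuracy, and that the platform enabled fast, automated collection of medical equipment data. Conclusion: The platform can provide medical staff with comprehensive and accurate data on medical treatment equipment and helps improve the efficiency of medical treatment.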
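Abstract: Archiving of communication equipment room images is currently dominated by manual operation, which is inefficient and error-prone. Against this background, the article proposes an image archiving system for communication equipment rooms based on an Optical Character Recognition (OCR) model. The system automatically recognizes the text in each image, determines which equipment room location the image belongs to, and then classifies and archives the images by cabinet location, achieving automated management. In testing, the system's archiving accuracy exceeded 98%, significantly improving the efficiency of image archiving for communication equipment rooms.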
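Abstract: Objective: To improve the efficiency of digitizing handwritten data from paper-based original record forms for medical equipment quality control testing, replacing traditional manual entry with batch automatic entry of handwritten test data. Methods: An intelligent recognition system for original data record forms of medical equipment quality control testing was developed in Python, based on deep learning Optical Character Recognition (OCR). The deep learning OCR uses the Baidu AI Cloud OCR cloud service to batch-recognize electronic images of the quality control record forms, obtain structured recognition results for the test data, and export the results as spreadsheets. Results: The system supports intelligent recognition of 8 commonly used types of original record forms for medical equipment quality control testing. In experimental tests, the mean recognition time for the 8 types of forms was 5.45 s and the mean recognition accuracy was 95.94%. After the system was deployed, the time needed for digital entry of handwritten data from the record forms was significantly lower than with traditional manual entry, and the difference was statistically significant (P<0.001). Conclusion: The system is fast and accurate. It achieves batch, intelligent, automatic digital entry of original record forms for medical equipment quality control testing, saves considerable labor, improves the efficiency of organizing quality control test data, and lays a solid foundation for in-depth analysis of those data.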
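Abstract: Integrated rural housing-and-land archives are an important basis for confirming and registering rights to rural homesteads, use rights to collectively owned construction land, and house ownership. Converting signed and sealed paper archives into electronic archives for storage is of great importance for issuing real estate title certificates. Because tools that can recognize archive content and then classify and file it are currently lacking, an integrated rural housing-and-land archiving system based on Tesseract-OCR was designed and implemented. The system uses Optical Character Recognition (OCR) to recognize scanned archive images, trains and corrects the character library, extracts the text in the images, and stores the archive materials by category. The system was tested on part of the integrated housing-and-land archives of a county in Sichuan Province. The results show a recognition and archiving accuracy of 96.5%, which meets the archiving requirements for such archives, reduces the tedium of manual recognition and filing, greatly improves archiving efficiency, and improves the accuracy of archive classification.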
Abstract: License plate recognition (LPR) applies image processing and character recognition technology to identify vehicles by automatically reading their license plates. The work presented in this paper aims to create a computer vision system capable of taking a real-time input image from a static camera and identifying the license plate in the extracted image. The problem is examined in two stages: first, detection of the license plate region, its extraction from the background, and segmentation of the plate into sub-images; second, character recognition. The method used for plate region detection is based on the assumption that the license plate area contains a high concentration of small details, making it a region of high edge intensity. The Sobel filter, together with vertical and horizontal projections, is used to identify the plate region. Testing this stage gave an accuracy of 67.5%. The final stage of the LPR system is optical character recognition (OCR). The method adopted for this stage is based on template matching using correlation. Testing the OCR stage gave an overall recognition rate of 87.76%.
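Template matching using correlation, the recognition method named in the abstract above, can be illustrated with a small sketch. It assumes segmented glyphs already scaled to the template size and uses the Pearson correlation coefficient as the similarity score; the paper does not state its exact correlation measure, and the 3×3 templates and function names here are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-size images given as rows of numbers."""
    xs = [value for row in a for value in row]
    ys = [value for row in b for value in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator if denominator else 0.0


def classify(glyph, templates):
    """Return the label of the template that correlates best with the glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))


# Hypothetical 3x3 templates; real systems use larger glyph images.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing
best = classify(noisy_l, TEMPLATES)
print(best)
```

Because the score is computed pixel by pixel, this kind of matching is sensitive to shifts, scale, and rotation, so it depends on accurate plate segmentation and normalization in the earlier stage.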
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 61702046) and the National Key R&D Program of China (Grant No. 2017YFB1401500 and 2017YFB1400800).
Abstract: Card Recognition Systems (CRSs) are representative computer vision-based applications with a broad range of usage scenarios. For example, they can be used to recognize images containing business cards, personal identification cards, and bank cards. Although CRSs have been studied for many years, it is still difficult to recognize cards in camera-based images taken by ordinary devices such as mobile phones. Diverse viewpoints and complex backgrounds make the recognition task challenging. Existing systems that employ traditional image processing schemes are not robust to varied environments and are inefficient at handling natural images such as those taken by mobile phones. To tackle the problem, we propose a novel framework for card recognition that employs a Convolutional Neural Network (CNN)-based approach. The system localizes the foreground of the image using a Fully Convolutional Network (FCN). With the help of the foreground map, the system localizes the corners of the card region and applies a perspective transformation to reduce the effects of distortion. Text lines in the card region are detected and recognized using a CNN and Long Short-Term Memory (LSTM). To evaluate the proposed scheme, we collected a large dataset of 4,065 images covering a variety of shooting scenarios. Experimental results demonstrate the efficacy of the proposed scheme. Specifically, it achieves an accuracy of 90.62% in the end-to-end test, outperforming the state of the art.
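Abstract: In electronic bid evaluation, current auxiliary tendering and bid evaluation systems lack intelligence, leaving room for improvement in evaluation efficiency and accuracy. For example, existing systems perform poorly at recognizing information in images of tender and bid documents. To address this, a method is proposed that uses Optical Character Recognition (OCR) technology to obtain the content of tender and bid documents and applies grayscale conversion and image preprocessing to uploaded images. The method can substantially strengthen the system's intelligent assistance functions: it uses an official seal detection algorithm to determine how seals are used in the tender and bid documents and divides the bid text into blocks. This shortens evaluation time, reduces the workload of reviewing bids, addresses unfair review and low evaluation efficiency, and makes bid evaluation for tendering projects fairer, more impartial, and more open.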