32,434 articles found
Wheat Disease Image Recognition Algorithm Based on Vision Transformer
1
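Authors: 白玉鹏, 冯毅琨, 李国厚, 赵明富, 周浩宇, 侯志松. 《中国农机化学报》, PKU Core, 2024, No. 2, pp. 267-274 (8 pages)
Powdery mildew, Fusarium head blight, and rust are the three major diseases that reduce wheat yield. To improve the recognition accuracy of wheat disease images, a wheat disease image recognition algorithm based on Vision Transformer is constructed. First, images covering the three diseases (powdery mildew, Fusarium head blight, and rust) are collected by field photography, and the raw images are preprocessed to build a wheat disease image recognition dataset. Then, a recognition algorithm is built on an improved Vision Transformer, and the effects of different transfer learning strategies and of data augmentation on recognition performance are analyzed. The experiments show that full-parameter transfer learning and data augmentation markedly improve the convergence speed and recognition accuracy of the Vision Transformer model. Finally, Vision Transformer, AlexNet, and VGG16 are compared on the same dataset under the same time conditions. The results show that the Vision Transformer model achieves an average recognition accuracy of 96.81% on the three wheat diseases, 6.68% and 4.94% higher than AlexNet and VGG16, respectively.
Keywords: wheat disease; Vision Transformer; transfer learning; image recognition; data augmentation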
Dual-Path Vision Transformer for Assisted Diagnosis of Acute Ischemic Stroke
2
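Authors: 张桃红, 郭学强, 郑瀚, 罗继昌, 王韬, 焦力群, 唐安莹. 《电子科技大学学报》, EI CAS CSCD PKU Core, 2024, No. 2, pp. 307-314 (8 pages)
Acute ischemic stroke is a brain dysfunction caused by impaired blood supply to brain tissue, and digital subtraction angiography (DSA) is the gold standard for diagnosing cerebrovascular disease. Based on patients' frontal and lateral DSA images, the treatment outcome of acute ischemic stroke is graded, and a dual-path image classification model based on Vision Transformer, DPVF, is constructed. To speed up assisted diagnosis, the model follows the lightweight design ideas of EdgeViT. To keep the model lightweight while retaining high accuracy, a spatial-channel self-attention module is proposed, which helps the Transformer capture more comprehensive feature information and improves its representational power. In addition, to fuse the features of DPVF's two branches, a cross-attention module is built to cross-fuse the two branch outputs, encouraging the model to extract richer features and improving its performance. Experimental results show that DPVF reaches an accuracy of 98.5% on the test set, meeting practical needs.
Keywords: acute ischemic stroke; vision Transformer; dual-branch network; feature fusion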
Collaborative positioning for swarms: A brief survey of vision, LiDAR and wireless sensors based methods
3
Authors: Zeyu Li, Changhui Jiang, Xiaobo Gu, Ying Xu, Feng Zhou, Jianhui Cui. 《Defence Technology(防务技术)》, SCIE EI CAS CSCD, 2024, No. 3, pp. 475-493 (19 pages)
As positioning sensors, edge computation power, and communication technologies continue to develop, a moving agent can now sense its surroundings and communicate with other agents. By receiving spatial information from both its environment and other agents, an agent can use various methods and sensor types to localize itself. With its high flexibility and robustness, collaborative positioning has become a widely used method in both military and civilian applications. This paper introduces the fundamental concepts and applications of collaborative positioning, and reviews recent progress in the field based on camera, LiDAR (Light Detection and Ranging), wireless sensor, and their integration. The paper compares the current methods with respect to their sensor type, summarizes their main paradigms, and analyzes their evaluation experiments. Finally, the paper discusses the main challenges and open issues that require further research.
Keywords: Collaborative positioning; Vision; LiDAR; Wireless sensors; Sensor fusion
Exploring Deep Learning Methods for Computer Vision Applications across Multiple Sectors: Challenges and Future Trends
4
Authors: Narayanan Ganesh, Rajendran Shankar, Miroslav Mahdal, Janakiraman SenthilMurugan, Jasgurpreet Singh Chohan, Kanak Kalita. 《Computer Modeling in Engineering & Sciences》, SCIE EI, 2024, No. 4, pp. 103-141 (39 pages)
Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than other traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
Keywords: Neural network; machine vision; classification; object detection; deep learning
Early Detection of Colletotrichum Kahawae Disease in Coffee Cherry Based on Computer Vision Techniques
5
Authors: Raveena Selvanarayanan, Surendran Rajendran, Youseef Alotaibi. 《Computer Modeling in Engineering & Sciences》, SCIE EI, 2024, No. 4, pp. 759-782 (24 pages)
Colletotrichum kahawae (Coffee Berry Disease) spreads through spores that can be carried by wind, rain, and insects affecting coffee plantations, and causes 80% yield losses and poor-quality coffee beans. The deadly disease is hard to control because wind, rain, and insects carry spores. Colombian researchers utilized a deep learning system to identify CBD in coffee cherries at three growth stages and classify photographs of infected and uninfected cherries with 93% accuracy using a random forest method. If the dataset is too small and noisy, the algorithm may not learn data patterns and generate accurate predictions. To overcome the existing challenge, early detection of Colletotrichum kahawae disease in coffee cherries requires automated processes, prompt recognition, and accurate classifications. The proposed methodology selects CBD image datasets through four different stages for training and testing. XGBoost is used to train a model on datasets of coffee berries, with each image labeled as healthy or diseased. Once the model is trained, the SHAP algorithm is used to determine which features were essential for making predictions with the proposed model. Some of these characteristics were the cherry’s colour, whether it had spots or other damage, and how big the lesions were. Virtual inception is important for classification to virtualize the relationship between the colour of the berry and the presence of disease. To evaluate the model’s performance and mitigate overfitting, a 10-fold cross-validation approach is employed. This involves partitioning the dataset into ten subsets, training the model on each subset, and evaluating its performance. In comparison to other contemporary methodologies, the model put forth achieved an accuracy of 98.56%.
Keywords: Computer vision; coffee berry disease; Colletotrichum kahawae; XGBoost; Shapley additive explanations
Clinical usefulness of the baby vision test in young children and its correlation with the Snellen chart
6
Authors: Ya-Lan Wang, Jia-Jun Wang, Xi-Cong Lou, Han Zou, Yun-E Zhao. 《International Journal of Ophthalmology(English edition)》, SCIE CAS, 2024, No. 2, pp. 348-352 (5 pages)
AIM: To investigate the efficacy of a new visual acuity (VA) screening method, the baby vision test for young children. METHODS: A total of 105 eyes of 65 children aged 2-8 y were included in the study. Acuity testing was conducted using a standardized recognition acuity chart (Snellen visual chart, at 3 m) and the baby vision model assessment. The baby vision device includes a screen, a near infrared camera and a computer. Children were seated at a measured distance of 33-40 cm from a display for testing. VA was estimated according to the highest resolution the children could follow. Decimal VA data were converted to logarithm of the minimum angle of resolution (logMAR) for statistical analysis. The VA results for each child were recorded and analyzed for consistency. RESULTS: The mean VA measured using the Snellen visual chart was 0.62±0.32, and that assessed using the baby vision test was 0.66±0.27. The 95% limit of agreement was -0.609 to 0.695, with 95.2% (100/105) plots within the 95% limits of agreement. VA values of the baby vision test were significantly correlated with those of the Snellen chart (R=0.274, P=0.005). CONCLUSION: The baby vision test can be used as a relatively reliable method for estimating VA in young children. This new acuity assessment might be a valid predictor of optotype-measured acuity later in preverbal children.
Keywords: baby vision test; acuity assessment; fix-and-follow system; Snellen chart
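The abstract of the baby vision test study above mentions converting decimal VA to logMAR before statistical analysis. A minimal sketch of that standard conversion, logMAR = log10(1 / decimal VA), follows; the function name is an illustrative assumption and is not taken from the paper.

```python
import math


def decimal_to_logmar(decimal_va: float) -> float:
    """Convert decimal visual acuity to logMAR.

    Uses the standard relation logMAR = log10(1 / decimal VA),
    which is the same as -log10(decimal VA).
    """
    if decimal_va <= 0:
        raise ValueError("decimal visual acuity must be positive")
    return math.log10(1.0 / decimal_va)


# Better acuity (larger decimal value) gives a smaller logMAR value.
for va in (1.0, 0.5, 0.1):
    print(f"decimal {va:.2f} -> logMAR {decimal_to_logmar(va):.3f}")
```

Because the mapping is logarithmic, the mean of converted values generally differs from the converted mean, which is one reason to convert each measurement before averaging.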
Enhancing ChatGPT’s Querying Capability with Voice-Based Interaction and CNN-Based Impair Vision Detection Model
7
Authors: Awais Ahmad, Sohail Jabbar, Sheeraz Akram, Anand Paul, Umar Raza, Nuha Mohammed Alshuqayran. 《Computers, Materials & Continua》, SCIE EI, 2024, No. 3, pp. 3129-3150 (22 pages)
This paper presents an innovative approach to enhance the querying capability of ChatGPT, a conversational artificial intelligence model, by incorporating voice-based interaction and a convolutional neural network (CNN)-based impaired vision detection model. The proposed system aims to improve user experience and accessibility by allowing users to interact with ChatGPT using voice commands. Additionally, a CNN-based model is employed to detect impairments in user vision, enabling the system to adapt its responses and provide appropriate assistance. This research tackles head-on the challenges of user experience and inclusivity in artificial intelligence (AI). It underscores our commitment to overcoming these obstacles, making ChatGPT more accessible and valuable for a broader audience. The integration of voice-based interaction and impaired vision detection represents a novel approach to conversational AI. Notably, this innovation transcends novelty; it carries the potential to profoundly impact the lives of users, particularly those with visual impairments. The modular approach to system design ensures adaptability and scalability, critical for the practical implementation of these advancements. Crucially, the solution places the user at its core. Customizing responses for those with visual impairments demonstrates AI’s potential to not only understand but also accommodate individual needs and preferences.
Keywords: Accessibility in conversational AI; CNN-based impair vision detection; ChatGPT; voice-based interaction; recommender system
Frequency and associated factors of accommodation and non-strabismic binocular vision dysfunction among medical university students
8
Authors: Jie Cai, Wen-Wen Fan, Yun-Hui Zhong, Cai-Lan Wen, Xiao-Dan Wei, Wan-Chen Wei, Wan-Yan Xiang, Jin-Mao Chen. 《International Journal of Ophthalmology(English edition)》, SCIE CAS, 2024, No. 2, pp. 374-379 (6 pages)
AIM: To investigate the frequency and associated factors of accommodation and non-strabismic binocular vision dysfunction among medical university students. METHODS: A total of 158 student volunteers underwent routine vision examination in the optometry clinic of Guangxi Medical University. Their data were used to identify the different types of accommodation and non-strabismic binocular vision dysfunction and to determine their frequency. Correlation analysis and logistic regression were used to examine the factors associated with these abnormalities. RESULTS: The results showed that 36.71% of the subjects had accommodation and non-strabismic binocular vision issues, with 8.86% being attributed to accommodation dysfunction and 27.85% to binocular abnormalities. Convergence insufficiency (CI) was the most common abnormality, accounting for 13.29%. Those with these abnormalities experienced higher levels of eyestrain (χ²=69.518, P<0.001). Linear correlations were observed between the difference of binocular spherical equivalent (SE) and the index of horizontal esotropia at a distance (r=0.231, P=0.004) and the asthenopia survey scale (ASS) score (r=0.346, P<0.001). Furthermore, the right eye's SE was inversely correlated with the convergence of positive and negative fusion images at close range (r=-0.321, P<0.001), the convergence of negative fusion images at close range (r=-0.294, P<0.001), the vergence facility (VF; r=-0.234, P=0.003), and the set of negative fusion images at far range (r=-0.237, P=0.003). Logistic regression analysis indicated that gender, age, and the difference in right and binocular SE did not influence the emergence of these abnormalities. CONCLUSION: Binocular vision abnormalities are more prevalent than accommodation dysfunction, with CI being the most frequent type. Greater binocular refractive disparity leads to more severe eyestrain symptoms.
Keywords: optometry clinic; non-strabismic binocular vision dysfunction; college students; convergence insufficiency
A Novel 6G Scalable Blockchain Clustering-Based Computer Vision Character Detection for Mobile Images
9
Authors: Yuejie Li, Shijun Li. 《Computers, Materials & Continua》, SCIE EI, 2024, No. 3, pp. 3041-3070 (30 pages)
6G is envisioned as the next generation of wireless communication technology, promising unprecedented data speeds, ultra-low latency, and ubiquitous connectivity. In tandem with these advancements, blockchain technology is leveraged to enhance computer vision applications’ security, trustworthiness, and transparency. With the widespread use of mobile devices equipped with cameras, the ability to capture and recognize Chinese characters in natural scenes has become increasingly important. Blockchain can facilitate privacy-preserving mechanisms in applications where privacy is paramount, such as facial recognition or personal healthcare monitoring. Users can control their visual data and grant or revoke access as needed. Recognizing Chinese characters from images can provide convenience in various aspects of people’s lives. However, traditional Chinese character text recognition methods often need higher accuracy, leading to recognition failures or incorrect character identification. In contrast, computer vision technologies have significantly improved image recognition accuracy. This paper proposes a secure end-to-end recognition system (SE2ERS) for Chinese characters in natural scenes based on convolutional neural networks (CNN) using 6G technology. The proposed SE2ERS model uses the Weighted Hyperbolic Curve Cryptograph (WHCC) for secure data transmission in the 6G network with the blockchain model. For data transmission within the computer vision system, a 6G gradient directional histogram (GDH) is employed for character estimation. With the deployment of WHCC and GDH in the constructed SE2ERS model, secure communication is achieved for data transmission over the 6G network. The proposed SE2ERS compares the performance of traditional Chinese text recognition methods and data transmission environment with 6G communication. Experimental results demonstrate that SE2ERS achieves an average recognition accuracy of 88% for simple Chinese characters, compared to 81.2% with traditional methods. For complex Chinese characters, the average recognition accuracy improves to 84.4% with our system, compared to 72.8% with traditional methods. Additionally, deploying the WHCC model improves data security, with a data encryption rate complexity ∼12% higher than the traditional techniques.
Keywords: 6G technology; blockchain; end-to-end recognition; Chinese characters; natural scene; computer vision algorithms; convolutional neural network
Association of age at diagnosis of diabetes with subsequent risk of age-related ocular diseases and vision acuity
10
Authors: Si-Ting Ye, Xian-Wen Shang, Yu Huang, Susan Zhu, Zhuo-Ting Zhu, Xue-Li Zhang, Wei Wang, Shu-Lin Tang, Zong-Yuan Ge, Xiao-Hong Yang, Ming-Guang He. 《World Journal of Diabetes》, SCIE, 2024, No. 4, pp. 697-711 (15 pages)
BACKGROUND: The importance of age on the development of ocular conditions has been reported by numerous studies. Diabetes may have different associations with different stages of ocular conditions, and the duration of diabetes may affect the development of diabetic eye disease. While there is a dose-response relationship between the age at diagnosis of diabetes and the risk of cardiovascular disease and mortality, whether the age at diagnosis of diabetes is associated with incident ocular conditions remains to be explored. It is unclear which types of diabetes are more predictive of ocular conditions. AIM: To examine associations between the age of diabetes diagnosis and the incidence of cataract, glaucoma, age-related macular degeneration (AMD), and vision acuity. METHODS: Our analysis used the UK Biobank. The cohort included 8709 diabetic participants and 17418 controls for ocular condition analysis, and 6689 diabetic participants and 13378 controls for vision analysis. Ocular diseases were identified using inpatient records until January 2021. Vision acuity was assessed using a chart. RESULTS: During a median follow-up of 11.0 years, 3874, 665, and 616 new cases of cataract, glaucoma, and AMD, respectively, were identified. A stronger association between diabetes and incident ocular conditions was observed where diabetes was diagnosed at a younger age. Individuals with type 2 diabetes (T2D) diagnosed at <45 years [HR (95% CI): 2.71 (1.49-4.93)], 45-49 years [2.57 (1.17-5.65)], 50-54 years [1.85 (1.13-3.04)], or 50-59 years of age [1.53 (1.00-2.34)] had a higher risk of AMD independent of glycated haemoglobin. T2D diagnosed at <45 years [HR (95% CI): 2.18 (1.71-2.79)], 45-49 years [1.54 (1.19-2.01)], 50-54 years [1.60 (1.31-1.96)], or 55-59 years of age [1.21 (1.02-1.43)] was associated with an increased cataract risk. Only T2D diagnosed at <45 years of age was associated with an increased risk of glaucoma [HR (95% CI): 1.76 (1.00-3.12)]. HRs (95% CIs) for AMD, cataract, and glaucoma associated with type 1 diabetes (T1D) were 4.12 (1.99-8.53), 2.95 (2.17-4.02), and 2.40 (1.09-5.31), respectively. In multivariable-adjusted analysis, individuals with T2D diagnosed at <45 years of age [β (95% CI): 0.025 (0.009, 0.040)] had a larger increase in LogMAR. The β (95% CI) for LogMAR associated with T1D was 0.044 (0.014, 0.073). CONCLUSION: A younger age at the diagnosis of diabetes is associated with a larger relative risk of incident ocular diseases and greater vision loss.
Keywords: Diabetes; Age at diagnosis; Cataract; Glaucoma; Age-related macular disease; Vision acuity
Diagnostic values of questionnaires of Convergence Insufficiency Symptom Survey and College of Optometrists Vision Development Quality of Life in the screening of convergence insufficiency
11
Authors: Ling Xiong, Qian Chen, Ye Wu. 《International Journal of Ophthalmology(English edition)》, SCIE CAS, 2024, No. 5, pp. 904-908 (5 pages)
AIM: To compare and analyse the diagnostic efficacy of the College of Optometrists Vision Development Quality of Life Questionnaire (COVD-QOL) and the Convergence Insufficiency Symptom Survey (CISS) in detecting convergence insufficiency and to compare their diagnostic value in clinical applications. METHODS: Using the diagnostic test method, 62 adult patients with convergence insufficiency (age: 24.74±3.75 y) and 62 normal participants (age: 23.61±3.13 y) who visited the Optometry Clinic of West China Hospital of Sichuan University from April 2021 to January 2023 were included. All subjects completed the CISS and COVD-QOL. Statistical analysis of the sensitivity and specificity of the CISS and COVD-QOL and comparison and joint experimental analysis of their diagnostic efficacy were performed. RESULTS: The sensitivity of the CISS and COVD-QOL for convergence insufficiency was 64.5% and 71.0%, respectively, while the specificity was 96.8% and 67.7%, respectively. Compared to the CISS alone, the combination of the CISS and COVD-QOL demonstrated lower sensitivity and specificity. The areas under the receiver operating characteristic curve of CISS, COVD-QOL and CISS combined with COVD-QOL were 0.806, 0.694 and 0.782, respectively. CONCLUSION: Considering the low sensitivity of the CISS and the low specificity of the COVD-QOL, it is recommended to supplement these questionnaires with other screening tests for the detection of convergence insufficiency.
Keywords: Convergence Insufficiency Symptom Survey; College of Optometrists Vision Development Quality of Life Questionnaire; convergence insufficiency; asthenopia
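The sensitivity and specificity figures reported in the CISS / COVD-QOL abstract above follow from a 2x2 confusion table. The sketch below shows the arithmetic. The cell counts are hypothetical: they are back-calculated from the reported percentages and the group sizes (62 patients, 62 controls) and are not stated in the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of healthy subjects the test flags as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, inferred from the reported percentages with
# 62 patients and 62 controls; the abstract does not list them.
tables = {
    "CISS": {"tp": 40, "fn": 22, "tn": 60, "fp": 2},
    "COVD-QOL": {"tp": 44, "fn": 18, "tn": 42, "fp": 20},
}

for name, t in tables.items():
    sens = 100 * sensitivity(t["tp"], t["fn"])
    spec = 100 * specificity(t["tn"], t["fp"])
    print(f"{name}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

Under these assumed counts the computed values round to the percentages in the abstract, which illustrates the trade-off it describes: the CISS misses more patients, while the COVD-QOL flags more healthy participants.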
Research on Garbage Image Classification Based on Vision Transformer and Transfer Learning
12
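Authors: 郭伟, 余璐, 宋莉. 《河南工程学院学报(自然科学版)》, 2024, No. 1, pp. 65-71 (7 pages)
To address low classification accuracy and poor performance on small-sample classes in garbage image classification, this study takes household garbage images as the research object and correct identification of household garbage categories as the goal. It uses the Vision Transformer model as the classification network architecture and applies transfer learning to train the model and run classification inference on the Huawei Cloud garbage classification dataset. Experimental results show that the attention-based classification model achieves higher classification accuracy, up to 96%, than the convolution-based ResNet and DenseNet classification models. The confusion matrix on the test set also shows that the Vision Transformer classifier maintains high accuracy on small-sample classes in a class-imbalanced dataset, so it has practical value for deployment and inference.
Keywords: garbage image classification; transfer learning; convolutional neural network; attention; Vision Transformer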
Application of Emotional Design in Head-Mounted Display Devices: The Case of Apple Vision Pro
13
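Author: 立川正博. 《科技视界》, 2024, No. 2, pp. 52-56 (5 pages)
Taking Apple's head-mounted display device, Apple Vision Pro, as an example, this paper analyzes the current state of development and the existing problems of head-mounted displays from the perspective of emotional design, and proposes ways to apply emotional design to such devices. Starting from the concept and importance of emotional design, and addressing the lack of attention to emotional needs in existing devices, it proposes application ideas at three levels: visceral, behavioral, and reflective. Using Apple Vision Pro as an example, it analyzes how emotional design is applied in that device, offering a reference for the design and development of head-mounted displays and insights for applying emotional design in other fields.
Keywords: emotional design; head-mounted display device; design; psychology; Vision Pro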
On the Impact of New Mixed Reality Products on Interior Design: The Case of Vision Pro
14
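Author: 陆辰. 《鞋类工艺与设计》, 2024, No. 7, pp. 189-191 (3 pages)
With the arrival of the 5G era, mixed reality technology has gradually come into public view. Apple's release of its new MR device, Vision Pro, in June 2023 signaled that mixed reality technology had reached a new level. This paper examines, with reference to this new MR device, the new influences and changes it brings to the interior design industry. It aims to use the new product to offer interior designers, software developers, and other practitioners fresh inspiration from the perspective of mixed reality technology, and to promote innovation in interior design.
Keywords: mixed reality technology; interior design; Vision Pro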
Crop Disease Recognition Method Based on an Improved Vision Transformer Network
15
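Authors: 王杨, 李迎春, 许佳炜, 王傲, 马唱, 宋世佳, 谢帆, 赵传信, 胡明. 《小型微型计算机系统》, CSCD PKU Core, 2024, No. 4, pp. 887-893 (7 pages)
Crop disease recognition methods based on DCNN models achieve high accuracy in laboratory settings but lack robustness to noise. To balance accuracy and robustness in crop disease recognition, this paper adds enhanced patch serialization and masked multi-head attention to the standard ViT model, addressing the standard ViT model's lack of local inductive bias and the tendency of self-attention over visual feature sequences to focus too much on each element itself. Experimental results show that the proposed EPEMMSA-ViT model learns from scratch more efficiently than the standard ViT model. When trained with pretrained weights, EPEMMSA-ViT reaches a classification accuracy of 99.63% on the data-augmented PlantVillage tomato subset. On a test set with added salt-and-pepper noise, its classification accuracy is 6.08%, 9.78%, 29.78%, and 12.41% higher than that of ResNet50, DenseNet121, MobileNet, and ConvNeXt, respectively. On a test set with added mean blur, it is 18.92%, 31.11%, 20.37%, and 19.58% higher than those four models, respectively.
Keywords: crop disease recognition; deep convolutional neural network; vision Transformer; self-attention; local inductive bias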
Partial Discharge Pattern Recognition of Cable Terminations Based on Vision Transformer
16
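Authors: 唐庆华, 方静, 李旭, 宋鹏先, 孟庆霖, 魏占朋. 《广东电力》, 2023, No. 11, pp. 138-145 (8 pages)
The defect type of a cable termination is generally closely related to the characteristics of its partial discharge signals, so defects can be classified by pattern recognition on those signals. The discharge pulse waveforms and time-frequency spectrogram features of four typical defects in 15 kV XLPE cable terminations are analyzed and processed to obtain data samples suitable for recognition. The data are then used to train a Vision Transformer model, LeNet5, AlexNet, and a support vector machine, and the recognition accuracy of the algorithms is compared. The results show that, given sufficient data, the Vision Transformer model achieves higher recognition accuracy than the other algorithms. The proposed method and conclusions can provide a reliable basis for insulation assessment of cable accessories and offer some practical guidance.
Keywords: cable termination; partial discharge; pattern recognition; Vision Transformer; data training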
Video Content Description Algorithm Based on S-YOLO V5 and Vision Transformer
17
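Authors: 徐鹏, 李铁柱, 职保平. 《印刷与数字媒体技术研究》, CAS PKU Core, 2023, No. 4, pp. 212-222 (11 pages)
Automatic generation of video content descriptions is a new cross-disciplinary learning task that combines computer vision, natural language processing, and related techniques. To address the poor readability of descriptions produced by current video description models, this study proposes a video content description algorithm based on S-YOLO V5 and Vision Transformer (ViT). First, keyframes are extracted with the neural network model KATNA so that the model is trained on as few frames as possible. Second, the S-YOLO V5 model extracts semantic information from the video frames, a pretrained ResNet101 model and a pretrained C3D model extract static and dynamic visual features, respectively, and the two kinds of features are fused. Then, drawing on the strong long-range encoding ability of the ViT architecture, an encoder is built to encode long-range dependencies in the fused features. Finally, the encoder output is fed to an LSTM decoder, which emits predicted words one by one to produce the final natural-language description. On the MSR-VTT dataset, the model scores 42.9, 28.8, 62.4, and 51.4 on BLEU-4, METEOR, ROUGE-L, and CIDEr, respectively; on the MSVD dataset, it scores 56.8, 37.6, 74.5, and 98.5. Compared with current mainstream models, the proposed model performs well on several evaluation metrics.
Keywords: video content description; S-YOLO V5; Vision Transformer; multi-head attention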
Surface Characteristics Measurement Using Computer Vision: A Review
18
Authors: AbdulWahab Hashmi, Harlal Singh Mali, Anoj Meena, Mohammad Farukh Hashmi, Neeraj Dhanraj Bokde. 《Computer Modeling in Engineering & Sciences》, SCIE EI, 2023, No. 5, pp. 917-1005 (89 pages)
Computer vision provides image-based solutions to inspect and investigate the quality of the surface to be measured. For any components to execute their intended functions and operations, surface quality is considered equally significant to dimensional quality. Surface Roughness (Ra) is a widely recognized measure to evaluate and investigate the surface quality of machined parts. Various conventional methods and approaches to measure the surface roughness are not feasible and appropriate in industries claiming 100% inspection and examination because of the time and effort involved in performing the measurement. However, machine vision has emerged as the innovative approach to executing the surface roughness measurement. It can provide economic, automated, quick, and reliable solutions. This paper discusses the characterization of the surface texture of surfaces of traditional or non-traditional manufactured parts through a computer/machine vision approach and assessment of the surface characteristics, i.e., surface roughness, waviness, flatness, surface texture, etc., machine vision parameters. This paper will also discuss multiple machine vision techniques for different manufacturing processes to perform the surface characterization measurement.
Keywords: Machine vision; surface roughness; computer vision; machining parameters; surface characterization
Human and Machine Vision Based Indian Race Classification Using Modified-Convolutional Neural Network
19
Authors: Vani A. Hiremani, Kishore Kumar Senapati. 《Computer Systems Science & Engineering》, SCIE EI, 2023, No. 3, pp. 2603-2618 (16 pages)
The inter-class face classification problem is more reasonable than the intra-class classification problem. To address this issue, we have carried out empirical research on classifying Indian people to their geographical regions. This work aimed to construct a computational classification model for classifying Indian regional face images acquired from south and east regions of India, referring to human vision. We have created an Automated Human Intelligence System (AHIS) to evaluate human visual capabilities. Analysis of AHIS response showed that face shape is a discriminative feature among the other facial features. We have developed a modified convolutional neural network to characterize the human vision response to improve face classification accuracy. The proposed model achieved mean F1 and Matthew Correlation Coefficient (MCC) of 0.92 and 0.84, respectively, on the validation set, outperforming the traditional Convolutional Neural Network (CNN). The CNN-Contoured Face (CNN-FC) model is developed to train contoured face images to investigate the influence of face shape. Finally, to cross-validate the accuracy of these models, the traditional CNN model is trained on the same dataset. With an accuracy of 92.98%, the Modified-CNN (M-CNN) model has demonstrated that the proposed method could facilitate the tangible impact in intra-classification problems. A novel Indian regional face dataset is created for supporting this supervised classification work, and it will be available to the research community.
Keywords: Data collection and preparation; human vision analysis; machine vision; Canny edge approximation method; color local binary patterns; convolutional neural network
Design of a High-Speed Image Acquisition and Transmission System Based on GigE Vision
20
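Authors: 李昂, 赵冬青, 储成群, 单彦虎, 程洪涛. 《舰船电子工程》, 2023, No. 6, pp. 116-120 (5 pages)
As demands on image quality keep rising, the volume of image data has grown sharply, and traditional data transmission interfaces increasingly struggle to move image data quickly enough, so an image acquisition and transmission system with higher throughput and more stable performance is needed. Exploiting the fast, stable transmission of Gigabit Ethernet and the combined ARM+FPGA architecture of the ZYNQ-7000 all-programmable system, a high-speed image acquisition and transmission system based on GigE Vision is designed. It performs format conversion, storage, and transmission of image data and outputs the data to the host computer through an RGMII interface. It also receives commands from the host computer over the GigE Vision protocol and controls the camera according to their content. Tests show that the system implements host control of the camera and image data transmission, with a transfer rate of 980 Mbps, meeting the requirements of high-speed data transmission stably and reliably.
Keywords: FPGA; ZYNQ-7000; IP core; image acquisition; GigE Vision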