2 articles found
1. Trends in Event Understanding and Caption Generation/Reconstruction in Dense Video: A Review
Authors: Ekanayake Mudiyanselage Chulabhaya Lankanatha Ekanayake, Abubakar Sulaiman Gezawa, Yunqi Lei. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 2941-2965 (25 pages)
Video description generates natural language sentences that describe the subject, verb, and objects of a target video. Video description has been used to help visually impaired people understand visual content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is reshaping computer vision (CV) and natural language processing (NLP), and there are hundreds of deep learning models, datasets, and evaluations that can address the gaps in current research. This article fills this gap by evaluating state-of-the-art approaches, focusing especially on deep learning and machine learning for video captioning in dense environments. It reviews classic techniques from existing machine learning and surveys deep learning models, along with a detailed account of benchmark datasets and their respective domains. The paper also reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists future directions and proposes work on context enhancement using key scene extraction with object detection in particular frames, in particular improving the context of video descriptions by analyzing key frame detection through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results come from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various key frames; these key frames identify dense events occurring in the video sequence.
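The abstract above surveys caption-quality metrics such as BLEU. As a minimal illustration of the idea behind BLEU only (clipped n-gram precision combined with a brevity penalty), and not the evaluation code used in the review, a pure-Python sketch might look like this:

```python
import math
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it appears in the reference."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = [tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)]
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    return overlap / max(len(cand), 1)

def bleu(candidate, reference, max_n=2):
    """BLEU-style score: geometric mean of n-gram precisions times a
    brevity penalty that punishes candidates shorter than the reference."""
    precisions = [ngram_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    bp = (1.0 if len(candidate) > len(reference)
          else math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Hypothetical caption pair, for illustration only.
cand = "a man is cutting a video frame".split()
ref = "a man is cutting video frames".split()
score = bleu(cand, ref)
```

Production evaluation would use multiple references, higher-order n-grams, and smoothing (as in the standard BLEU formulation); this sketch only shows the core computation the metric is built on.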
Keywords: video description, video to text, video caption, sentence reconstruction
2. A Deep Learning Approach to Mesh Segmentation (cited by 1)
Authors: Abubakar Sulaiman Gezawa, Qicong Wang, Haruna Chiroma, Yunqi Lei. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 5, pp. 1745-1763 (19 pages)
In the shape analysis community, decomposing a 3D shape into meaningful parts has become a topic of interest. 3D model segmentation is widely used in tasks such as shape deformation, partial shape matching, skeleton extraction, shape correspondence, shape annotation, and texture mapping. Numerous approaches have attempted to provide better segmentation solutions; however, most previous techniques used handcrafted features, which usually focus on a particular attribute of 3D objects and are therefore difficult to generalize. In this paper, we propose a three-stage approach that uses a multi-view recurrent neural network to automatically segment a 3D shape into visually meaningful sub-meshes. The first stage involves normalizing and scaling a 3D model to fit within the unit sphere and rendering the object from different views. Contrasting viewpoints, however, may not be associated with one another, and a 3D region can map to totally distinct outcomes depending on the viewpoint. To address this, we run each view through a shared-weight CNN and a Bolster block to create a probability boundary map. The Bolster block models the regional relationships between different views, which helps to improve and refine the data. In stage two, the feature maps generated in the previous step are correlated using a recurrent neural network to obtain compatible fine-detail responses for each view. Finally, a fully connected layer returns coherent edges, which are then back-projected onto the 3D object to produce the final segmentation. Experiments on the Princeton Segmentation Benchmark dataset show that our proposed method is effective for mesh segmentation tasks.
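Stage one of the pipeline described above normalizes and scales the mesh to fit within the unit sphere before rendering views. A minimal sketch of that preprocessing step is shown below; the function name and pure-Python form are illustrative assumptions, not the authors' implementation:

```python
import math

def normalize_to_unit_sphere(vertices):
    """Center a mesh's vertices at the origin and scale them so the
    farthest vertex lies on the unit sphere (radius 1)."""
    n = len(vertices)
    # Centroid of the vertex set, one coordinate at a time.
    centroid = [sum(coords) / n for coords in zip(*vertices)]
    centered = [[p[i] - centroid[i] for i in range(3)] for p in vertices]
    # Largest distance from the origin after centering.
    radius = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if radius == 0:
        return centered  # degenerate mesh: all vertices coincide
    return [[c / radius for c in p] for p in centered]

# Illustrative example: four corners of a cube.
points = normalize_to_unit_sphere([[1, 1, 1], [-1, -1, -1],
                                   [1, -1, 1], [-1, 1, -1]])
```

In practice this would operate on the vertex array of a loaded mesh (e.g. from an OBJ file); normalizing first makes the subsequent multi-view renders scale-invariant across models.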
Keywords: deep learning, mesh segmentation, 3D shape, shape features