Abstract
Image captioning, the automatic description of image content, is an important task in computer vision and natural language processing. It has broad practical value in everyday life and entertainment, intelligent transportation, and helping people with visual impairments understand visual content. Compared with perception tasks such as image classification and object detection, image captioning is a higher-level and more complex cognitive task, and it plays an important role in analyzing and understanding images. This paper gives a comprehensive overview of existing image captioning techniques. It discusses the datasets and evaluation metrics commonly used in image captioning, as well as the performance, advantages, and limitations of existing techniques.
Authors
Deng Xuran (邓旭冉)
Li Linghui (李灵慧)
Tang Sheng (唐胜)
Zhang Yongdong (张勇东)
Deng Xuran; Li Linghui; Tang Sheng; Zhang Yongdong (University of Science and Technology of China, Hefei 230026; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Source
Journal of Information Security Research (《信息安全研究》)
2019, No. 11, pp. 988-992 (5 pages)
Funding
National Natural Science Foundation of China (61572472, 61525206)