Abstract
Current multi-label image annotation methods label only what an image depicts (its ontology) and cannot simultaneously label what the image connotes (its implicit meaning). To address this, we propose a double-layer multi-label annotation model based on multi-task learning (MTL-DMAM). Ontology annotation and implicit-meaning annotation are treated as two related tasks, with ResNeXt-50 serving as the shared-feature backbone network. An attention mechanism is then used to build a separate branch for each task, which realizes double-layer annotation of images. To eliminate the influence of differences in object size within an image on the annotation results, the ELASTIC structure is added to the model, which further improves its performance. In comparative experiments on the single-task MS-COCO dataset, the proposed model outperforms most state-of-the-art models on the C-R, C-F1, O-R, and mAP indicators; on the multi-task traditional costume dataset, it outperforms all other compared models on 10 indicators. Finally, the Grad-CAM method is used to visualize the image regions that MTL-DMAM focuses on during annotation, and the results show that the model effectively learns the salient image features corresponding to each label.
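The abstract reports the indicators C-R, C-F1 and O-R without defining them. The sketch below assumes they follow the convention common in multi-label classification work: "C-" metrics are computed per class and then averaged over classes, while "O-" metrics are computed over all sample-label pairs at once. The function name `multilabel_recall_f1` and the toy data are illustrative only and do not come from the paper; mAP is not covered here.

```python
from typing import Dict, Sequence


def multilabel_recall_f1(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Per-class (C-) and overall (O-) precision, recall and F1.

    Both inputs are binary matrices of shape (samples, labels).
    Assumed convention, not taken from the paper itself.
    """
    n_labels = len(y_true[0])
    tp = [0] * n_labels      # correctly predicted labels, per class
    n_pred = [0] * n_labels  # predicted labels, per class
    n_true = [0] * n_labels  # ground-truth labels, per class

    for t_row, p_row in zip(y_true, y_pred):
        for k in range(n_labels):
            tp[k] += int(t_row[k] == 1 and p_row[k] == 1)
            n_pred[k] += int(p_row[k] == 1)
            n_true[k] += int(t_row[k] == 1)

    def safe_div(a: float, b: float) -> float:
        # Treat an undefined ratio (empty denominator) as 0.
        return a / b if b else 0.0

    # Per-class: average the class-wise ratios.
    c_p = sum(safe_div(tp[k], n_pred[k]) for k in range(n_labels)) / n_labels
    c_r = sum(safe_div(tp[k], n_true[k]) for k in range(n_labels)) / n_labels
    # Overall: pool the counts over all classes first.
    o_p = safe_div(sum(tp), sum(n_pred))
    o_r = safe_div(sum(tp), sum(n_true))

    return {
        "C-P": c_p,
        "C-R": c_r,
        "C-F1": safe_div(2 * c_p * c_r, c_p + c_r),
        "O-P": o_p,
        "O-R": o_r,
        "O-F1": safe_div(2 * o_p * o_r, o_p + o_r),
    }


if __name__ == "__main__":
    # Toy example: 3 images, 3 labels (hypothetical data).
    truth = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
    preds = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
    for name, value in multilabel_recall_f1(truth, preds).items():
        print(f"{name}: {value:.3f}")
```

In the toy data every predicted label is correct but two ground-truth labels are missed, so precision is 1 while recall is 2/3 under both averaging schemes. The two schemes diverge when label frequencies are unbalanced, which is why papers usually report both.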
Authors
赵海英
周伟
侯小刚
张小利
ZHAO Hai-ying; ZHOU Wei; HOU Xiao-gang; ZHANG Xiao-li (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Digital Media and Design Art, Beijing University of Posts and Telecommunications, Beijing 100876, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Source
《吉林大学学报(工学版)》
EI
CAS
CSCD
PKU Core
2021, No. 1, pp. 293-302 (10 pages)
Journal of Jilin University: Engineering and Technology Edition
Funding
Central Special Fund for Cultural Industry Development, application project (GSSKS-2015-035).
Keywords
artificial intelligence
traditional costume
multi-task learning
multi-label annotation
attention mechanism