Abstract
The difficulty of fine-grained image categorization lies in the subtle inter-class differences in local part information. To address the problem that existing methods overlook the importance of low-level features and thereby lose part diversity, this paper proposes a fine-grained image categorization method that combines a pyramid with a long short-term memory (LSTM) network. First, a feature pyramid and squeeze-and-excitation blocks are integrated to construct a bidirectional feature-passing path, which lets low-level features flow at only a slight cost in parameters and computation and thus extracts multi-level part features. Second, a part refinement pyramid guided by regions of interest suppresses the most salient regions and improves the diversity of part localization. Finally, attention gating is introduced into the LSTM network to regulate how much attention is paid to fine-grained information at each feature level, so as to mine fine-grained features and make them more discriminative. The method achieves classification accuracies of 90.8%, 95.9% and 95.4% on the CUB-200-2011, Stanford Cars and FGVC-Aircraft datasets, respectively, clearly outperforming current mainstream fine-grained image categorization methods and improving on the best results of the compared methods by 1.2%, 0.8% and 2.0%, respectively.
Authors
阳治民
宋威
YANG Zhi-min; SONG Wei (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2023, Issue 8, pp. 1771-1776 (6 pages)
Journal of Chinese Computer Systems
Funding
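Supported by the National Natural Science Foundation of China (62076110, 61673193)
Supported by the Natural Science Foundation of Jiangsu Province (BK20181341)
Supported by the China Postdoctoral Science Foundation (2017M621625)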
Keywords
multi-level features
bidirectional path
part refinement
attention gating
fine-grained image categorization