Abstract
Fine-grained images exhibit large intra-class variance and small inter-class variance, which makes Fine-Grained Image Categorization (FGIC) much more difficult than traditional image classification tasks. The application scenarios, task difficulties, algorithm development history, and commonly used datasets of FGIC are introduced, with the main focus on an overview of the related algorithms. Classification methods based on local detection usually rely on operations such as concatenation, summation, and pooling; their model training is relatively complex, and they have many limitations in practical applications. Classification methods based on linear features imitate the two neural pathways of human vision, which handle recognition and localization respectively, and achieve relatively better classification results. Classification methods based on attention mechanisms simulate the way humans observe the external world, first scanning the whole scene and then locking onto key regions to form an attention focus, which further improves classification results. Finally, in view of the shortcomings of current research, future research directions for FGIC are discussed.
Authors
SHEN Zhijun
MU Lina
GAO Jing
SHI Yuanhang
LIU Zhiqiang
(School of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236037, China; College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia 010011, China)
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2023, Issue 1, pp. 51-60 (10 pages)
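Funding
Scientific Research Project of Fuyang Normal University (2021KYQD0028)
Key Science and Technology Project of Inner Mongolia Autonomous Region (2021GG0090)
Doctoral Research Start-up Fund of Inner Mongolia Agricultural University (BJ2013B-1)
Open Project of the Inner Mongolia Discipline Inspection and Supervision Big Data Laboratory (IMDBD2020015)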
Keywords
Fine-Grained Image Categorization (FGIC)
deep learning
Convolutional Neural Network (CNN)
attention mechanism
computer vision