Abstract
Food image recognition is a research hotspot in computer vision, data mining, and food science and technology. When Chinese food image recognition methods based on convolutional neural networks (CNNs) extract visual features directly from images, the recognition rate is low because food images exhibit small inter-class differences and large intra-class differences. To address this, the CNN design is optimized and a FoodResNet18 model suited to Chinese food image recognition is proposed. The model incorporates asymmetric convolution to strengthen the learning of local skeleton information and embeds an attention module shared by the shallow and deep layers, which addresses the undifferentiated extraction of features across the whole image and improves feature-extraction efficiency from the local to the global level. Multiple experiments on VIREO Food-172, a typical Chinese food benchmark dataset in this field, verify the effectiveness of the FoodResNet18 model. While balancing recognition accuracy against model size, a fixed-step learning-rate decay strategy based on dynamic changes accelerates model convergence. Measured by top-1 and top-5 recognition rates, the final food image recognition accuracy reaches 85.26% and 96.21%, respectively, which is 10.06%, 9.89%, and 16.33% higher than the popular ResNet101, ResNet-18, and ResNet-34 models. This further indicates that the proposed method has good application prospects in small and medium-scale food image recognition systems.
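The top-1 and top-5 recognition rates reported in the abstract follow the usual top-k convention: a sample counts as correct if its true class is among the k highest-scoring predictions. The sketch below is a minimal illustration of that metric, not the authors' evaluation code. The function name `top_k_accuracy` and the toy scores are assumptions made for illustration, and plain Python lists stand in for model outputs.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score; ties keep ascending index order.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 3 samples, 4 classes (made-up scores, not results from the paper).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: hit at k=1
    [0.5, 0.1, 0.3, 0.1],  # true class 2: missed at k=1, hit at k=2
    [0.4, 0.3, 0.2, 0.1],  # true class 3: missed at k=1 and k=2
]
labels = [1, 2, 3]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

For a 172-class dataset such as VIREO Food-172, the same function would be called with k=1 and k=5 on the per-class scores of the test set.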
Authors
WANG Hai-yan (王海燕); ZHANG Miao (张渺); LIU Hu-lin (刘虎林); CHEN Xiao (陈晓)
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021, China
Source
Journal of Shaanxi University of Science & Technology (《陕西科技大学学报》)
Peking University Core Journal
2022, Issue 1, pp. 154-160 (7 pages)
Funding
National Natural Science Foundation of China (62031021).
Keywords
CNN
enhanced block
attention-residual module
food classification