Abstract
To address the poor recognition performance caused by dense occlusion of targets during visual grasping by service robots, a recognition method for densely occluded targets based on an improved YOLOv7 is proposed. First, to mitigate the recognition difficulty caused by the loss of feature information of densely occluded targets, depthwise over-parameterized convolution is used to construct a depthwise over-parameterized efficient aggregation network, in which different convolution kernels operate on each channel; this strengthens the network's perception ability and directs it toward the features of the unoccluded regions of the target. Second, to suppress the effect of hard-to-distinguish boundaries between densely occluded targets, a coordinate attention mechanism is embedded in the backbone network, so that the network captures target position information and attends more closely to the important regions of the feature map, which enhances its feature extraction capability. Finally, the Ghost network is used for a lightweight redesign that reduces the number of parameters and floating-point operations, lowering the model's memory footprint and improving its running efficiency. Comparison experiments were conducted on a self-built dataset and a public dataset. The results show that the improved model achieves an mAP of 92.9% on the self-built dataset and 87.8% on the public dataset, outperforming the original method and other commonly used methods. The proposed model reduces memory footprint while improving recognition accuracy and recognition efficiency, giving better overall performance.
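The lightweighting step can be illustrated with a rough parameter count. The sketch below compares a standard convolution with a Ghost module as generally described for GhostNet: a primary convolution produces a reduced set of intrinsic feature maps, and cheap depthwise operations generate the remaining "ghost" maps. The function names, the layer sizes, the ratio of 2, and the 3 x 3 cheap kernel are illustrative assumptions, not values reported in the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost module producing c_out feature maps (bias ignored).

    A primary k x k convolution produces c_out // ratio intrinsic maps; each
    intrinsic map then yields (ratio - 1) extra "ghost" maps through a cheap
    cheap_k x cheap_k depthwise convolution.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(256, 256, 3)
ghost = ghost_params(256, 256, 3, ratio=2, cheap_k=3)
print(standard, ghost, round(standard / ghost, 2))
```

With a ratio of 2, the weight count of such a layer is roughly halved, which is consistent with the direction of the reduction in parameters and memory footprint described in the abstract; the actual savings depend on which layers the authors replaced.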
Authors
CHEN Renxiang
QIU Tianran
YANG Lixia
YU Tengwei
JIA Fei
CHEN Cai
CHEN Renxiang; QIU Tianran; YANG Lixia; YU Tengwei; JIA Fei; CHEN Cai (Chongqing Engineering Laboratory of Traffic Engineering Application Robot, Chongqing Jiaotong University, Chongqing 400074, China; School of Business Administration, Chongqing University of Science and Technology, Chongqing 401331, China; Chongqing Intelligent Robot Research Institute, Chongqing 400714, China)
Source
《光学精密工程》
EI
CAS
CSCD
PKU Core Journals
2024, No. 10, pp. 1595-1605 (11 pages)
Optics and Precision Engineering
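Funding
National Natural Science Foundation of China (No. 51975079)
Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZD-M202200701)
Chongqing Natural Science Foundation Innovation and Development Joint Fund (No. CSTB2023NSCQ-LZX0127)
Chongqing Graduate Joint Training Base Project (No. JDLHPYJD2021007)
Chongqing Professional Degree Graduate Teaching Case Library (No. JDALK2022007)
Chongqing Graduate Research and Innovation Project (No. CYS23509)
Research Start-up Project of Chongqing University of Science and Technology (No. ckre202212030).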