Abstract
Aiming at the problems that the feature maps extracted by current mainstream image semantic segmentation algorithms have low resolution, that too much effective semantic information is lost during upsampling, and that the correlation between pixels and object regions is easily lost, an image semantic segmentation algorithm based on HRNet-OCR combined with a criss-cross attention mechanism is proposed. The method first adopts HRNet instead of ResNet as the feature extraction backbone to retain high-resolution information during feature extraction; it then incorporates the OCR (object-contextual representations) algorithm to perform an initial coarse segmentation of the image and determine the approximate regions of the target objects; finally, a criss-cross attention module is introduced to weight the degree of correlation between each pixel and the object regions, achieving accurate pixel classification and preserving the edge details of the segmented regions. Experimental results show that, compared with common segmentation algorithms such as FCN, PSPNet, and DeepLabv3+, the proposed algorithm improves mIoU by 5.37%, 3.09%, and 2.71% on the ADE20K, Cityscapes, and PASCAL VOC 2012 datasets, respectively, while effectively retaining detail information and substantially improving segmentation accuracy.
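To make the pixel-to-region weighting step more concrete, below is a minimal sketch of a criss-cross attention block (in the style of CCNet) written in PyTorch. The module name, the channel-reduction factor, and the learnable residual weight gamma are illustrative assumptions rather than the authors' exact implementation; in the pipeline described above, such a block would be applied to the OCR-augmented feature map before the final pixel classification.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative criss-cross attention block: each pixel attends to all pixels
# in its own row and column, which is how the correlation between a pixel and
# the surrounding object regions can be weighted at low cost.
class CrissCrossAttention(nn.Module):
    def __init__(self, in_channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(in_channels, in_channels // reduction, 1)
        self.key = nn.Conv2d(in_channels, in_channels // reduction, 1)
        self.value = nn.Conv2d(in_channels, in_channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)

        # Row attention energies: (b, h, w, w)
        energy_row = torch.matmul(q.permute(0, 2, 3, 1), k.permute(0, 2, 1, 3))
        # Column attention energies: (b, w, h, h) -> (b, h, w, h)
        energy_col = torch.matmul(q.permute(0, 3, 2, 1), k.permute(0, 3, 1, 2))
        energy_col = energy_col.permute(0, 2, 1, 3)

        # Joint softmax over the criss-cross (row + column) positions.
        attn = F.softmax(torch.cat([energy_row, energy_col], dim=-1), dim=-1)
        attn_row, attn_col = attn[..., :w], attn[..., w:]

        # Aggregate values along the row and the column of each pixel.
        out_row = torch.matmul(attn_row, v.permute(0, 2, 3, 1))      # (b, h, w, c)
        out_col = torch.matmul(attn_col.permute(0, 2, 1, 3),          # (b, w, h, h)
                               v.permute(0, 3, 2, 1))                 # (b, w, h, c)
        out = out_row + out_col.permute(0, 2, 1, 3)                   # (b, h, w, c)

        return self.gamma * out.permute(0, 3, 1, 2) + x  # residual connection

Applied to a (b, c, h, w) feature map, the block returns a map of the same shape whose pixels have been re-weighted by their row and column context; stacking it (as in CCNet) propagates context to the full image. Unlike the original CCNet formulation, this simplified sketch does not mask out the duplicated centre position in the column branch.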
Authors
Hu Hang; Niu Xiaowei; Zuo Hao; Jin Chongyang (School of Electronic and Information Engineering, Chongqing Three Gorges University, Chongqing 404100)
Source
Modern Computer (《现代计算机》)
2022, Issue 18, pp. 23-29 (7 pages)
Funding
National Key R&D Program of China (2021YFB3901405)
Special Project of the Ministry of Science and Technology (2021YFB3901400)
General Program of the Chongqing Science and Technology Bureau (cstc2019jcyj-msxm1328)
Science and Technology Project of the Chongqing Municipal Education Commission (KJQN202101215, KJQN202101226)
Open Fund of the Chongqing Key Laboratory of Geological Environment Monitoring and Disaster Early Warning in the Three Gorges Reservoir Area (ZD2020A0301)