A crowd counting network based on multi-scale pyramid Transformer
Abstract: In dense crowd scenes, complex backgrounds and large variations in target scale lead to low crowd counting accuracy. To address this problem, a crowd counting network based on a multi-scale pyramid Transformer (MSPT-Net) is proposed. In the feature extraction stage, a pyramid Transformer backbone based on depthwise separable self-attention is designed; it captures both local and global information in the image, which mitigates the loss of counting accuracy caused by complex backgrounds. A feature pyramid fusion module and a regression head with multi-scale receptive fields are designed to fuse shallow detail features with deep semantic features of dense crowd images, enhancing the network's ability to capture targets of different scales. The model is trained with deep supervision and validated on three public datasets. The experimental results show that, under both fully supervised and weakly supervised learning strategies, MSPT-Net achieves higher counting accuracy than mainstream crowd counting methods on dense crowd images with complex backgrounds and large target scale variations, while keeping a smaller parameter count and lower computational cost.
Authors: ZHANG Shaole; LEI Tao; WANG Yingbo; ZHOU Qiang; XUE Mingyuan; ZHAO Weiqiang (School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China; School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, China; Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, China; China Electronics Technology Group Corporation Northwest Group Corporation Xi'an Branch, Xi'an 710065, China)
Source: CAAI Transactions on Intelligent Systems (indexed in CSCD and the Peking University Core Journals list), 2024, No. 1, pp. 67-78 (12 pages)
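Funding: National Natural Science Foundation of China (62271296, 62201334); Key Research and Development Program of Shaanxi Province (2021ZDLGY08-07); Shaanxi Science Fund for Distinguished Young Scholars (2021JC-47).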
Keywords: dense crowd; crowd counting; multi-scale; pyramid; Transformer; self-attention; density map; deep supervision
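The abstract and keywords refer to density maps without describing how they relate to a count. As general background on density-map-based crowd counting (not a description of MSPT-Net itself), the sketch below places one unit-mass Gaussian at each annotated head position, so that the sum over the map equals the number of people. The function names, kernel size, and sigma are illustrative assumptions, and mean absolute error is included only as a commonly used counting metric; the record does not state which metrics the paper reports.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=7, sigma=1.5):
    """Build a density map from in-bounds head annotations given as (row, col).

    Each head contributes a total mass of exactly 1. Kernel cells that fall
    outside the image are dropped and the remaining cells are renormalized
    (an assumption made here so that border heads still count as one person).
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        cells = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy][dx]))
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def count_from_density(dmap):
    """The estimated crowd count is the sum of all density values."""
    return sum(sum(row) for row in dmap)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Three hypothetical head annotations on a 16 x 16 image, one at a corner.
heads = [(2, 3), (10, 10), (0, 0)]
dmap = density_map(16, 16, heads)
estimated = count_from_density(dmap)
```

In a fully supervised setting a network would typically regress such a map pixel by pixel, whereas a weakly supervised setting would use only the image-level count; how MSPT-Net implements either strategy is not specified in this record.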