Abstract
Crowd counting aims to accurately predict the number, distribution, and density of people in real-world scenes. Such scenes, however, commonly exhibit complex backgrounds, widely varying target scales, and cluttered crowd distributions, all of which make the task highly challenging. To address these problems, this paper proposes CSANet, an encoder-decoder crowd counting network that fuses channel and spatial attention. The model uses a multi-level encoder-decoder structure to extract multi-scale semantic features and to fully integrate spatial context information, which addresses pedestrian scale variation and cluttered distribution in complex scenes. To reduce the impact of complex backgrounds on counting performance, channel and spatial attention are introduced into the feature fusion process: feature weights are increased for crowd regions to highlight regions of interest and decreased for weakly correlated background regions to suppress background noise, which ultimately improves the quality of the predicted crowd density map. To verify the effectiveness of the algorithm, experiments were conducted on several classic crowd counting datasets. The results show that, compared with existing crowd counting algorithms, CSANet has good multi-scale feature extraction and background noise suppression capabilities, which substantially improves counting accuracy and robustness in dense scenes.
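The idea of re-weighting fused features with channel and then spatial attention can be illustrated with a small toy sketch. This is not the CSANet implementation: the abstract does not specify the attention modules, so the learned layers are replaced here by parameter-free sigmoid gates, fusion is assumed to be element-wise addition of two same-shaped feature maps, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate computed from its global average.

    feat is a list of C channels, each an H x W nested list of floats.
    Assumption: a real module would learn this gate; here it is a fixed sigmoid.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each pixel by a gate computed from its mean over channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    mask = [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[feat[k][i][j] * mask[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def fuse_with_attention(decoder_feat, encoder_feat):
    """Fuse two same-shaped feature maps, then apply both gates in sequence.

    Assumption: fusion by element-wise addition, channel gate before spatial gate.
    """
    fused = [
        [[a + b for a, b in zip(row_d, row_e)] for row_d, row_e in zip(ch_d, ch_e)]
        for ch_d, ch_e in zip(decoder_feat, encoder_feat)
    ]
    return spatial_attention(channel_attention(fused))


# One channel, one row: a strong response next to a weak one.
decoder = [[[3.0, 0.5]]]
encoder = [[[1.0, 0.5]]]
result = fuse_with_attention(decoder, encoder)
```

In this sketch the strongly activated position keeps a larger fraction of its fused value than the weak one, which mirrors the stated goal of highlighting crowd regions and suppressing weakly correlated background.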
Authors
YU Ying (余鹰); PAN Cheng (潘诚); ZHU Huilin (朱慧琳); QIAN Jin (钱进); TANG Hong (汤洪)
College of Software, East China Jiaotong University, Nanchang 330013, China
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2022, Issue 11, pp. 2547-2556 (10 pages)
Funding
National Natural Science Foundation of China (62163016, 62066014)
Natural Science Foundation of Jiangxi Province (20212ACB202001, 20202BABL202018)
Keywords
crowd counting
encoder-decoder network
attention
feature fusion
deep learning