Abstract
The rapid development and deployment of intelligent surveillance devices generates massive amounts of monitoring data that are difficult to process manually. In addition, with the wide adoption of RGB-IR dual-mode cameras in recent years, infrared surveillance video can now be used to assist related computer vision tasks. To retrieve the same pedestrian more accurately across visible and infrared surveillance videos, a visible-infrared person re-identification algorithm based on feature fusion is proposed. The algorithm first designs a Transformer-based feature extractor to generate discriminative features from the data of both modalities. Then, to exploit the complementary information between the two modalities, a bidirectional multi-modal attention method is proposed that aligns the features of the different modalities while fusing their complementary semantic information; a classifier then performs the final identification. Experiments on public datasets show that the proposed algorithm has better generalization ability and robustness than most existing algorithms, with a prediction accuracy of 99.86% on the SYSU-MM01 dataset and 94.13% on the LLCM dataset.
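The abstract does not give the details of the bidirectional multi-modal attention step, so the following is only a minimal sketch of one common way such a step can be realized: each modality's token features attend to the other's via scaled dot-product cross-attention, and the two attended results are fused. The function names (`cross_attention`, `bidirectional_fusion`), the residual addition, the mean pooling, and the omission of learned projection matrices are all assumptions made for illustration, not the authors' design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens):
    # Scaled dot-product attention without learned projections (an assumption):
    # every query token forms a weighted average of the context tokens.
    d = query_tokens.shape[-1]
    scores = query_tokens @ context_tokens.T / np.sqrt(d)  # (n_query, n_context)
    weights = softmax(scores, axis=-1)
    return weights @ context_tokens, weights

def bidirectional_fusion(vis_tokens, ir_tokens):
    # Direction 1: visible tokens gather information from infrared tokens.
    vis_from_ir, w_vis_ir = cross_attention(vis_tokens, ir_tokens)
    # Direction 2: infrared tokens gather information from visible tokens.
    ir_from_vis, w_ir_vis = cross_attention(ir_tokens, vis_tokens)
    # Hypothetical fusion: residual add, mean-pool each stream, concatenate.
    vis_fused = (vis_tokens + vis_from_ir).mean(axis=0)
    ir_fused = (ir_tokens + ir_from_vis).mean(axis=0)
    return np.concatenate([vis_fused, ir_fused]), w_vis_ir, w_ir_vis

# Toy inputs standing in for the feature extractor's output tokens.
rng = np.random.default_rng(0)
vis = rng.standard_normal((6, 8))  # 6 visible tokens, feature dimension 8
ir = rng.standard_normal((5, 8))   # 5 infrared tokens, feature dimension 8
fused, w_vis_ir, w_ir_vis = bidirectional_fusion(vis, ir)
```

In a full pipeline the fused vector would be passed to a classifier over person identities; a trained model would also use learned query, key, and value projections, which are left out here to keep the sketch short.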
Authors
申汶山
王洁
黄琴
SHEN Wenshan; WANG Jie; HUANG Qin (School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan 030031, Shanxi, China; Shanxi University Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030000, Shanxi, China)
Source
《山西师范大学学报(自然科学版)》
2024, Issue 1, pp. 45-53 (9 pages)
Journal of Shanxi Normal University (Natural Science Edition)