Abstract
To obtain action information from video more efficiently, we propose a human action recognition algorithm that combines temporal convolution with a two-stream convolutional network. Multi-layer temporal convolution extracts dynamic information from the video, yielding two-dimensional deep dynamic features. A two-stream CNN is constructed, with the deep dynamic features replacing optical-flow features as the input to the motion stream. The per-stream classification scores are fused by a weighted average to determine the action category of the video. Tested on the public datasets UCF101, HMDB51, and NTU-RGBD-60, the algorithm reaches top accuracies of 94.2%, 70.9%, and 89.1% (cross-subject protocol), respectively. At accuracy comparable to the classical algorithms ECO (Efficient Convolutional Network) and TSM (Temporal Shift Module), its average parallel speed is 2.1 and 3.6 times higher, respectively. The proposed algorithm improves computational efficiency and is more practical.
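The pipeline the abstract describes has two stages: multi-layer temporal convolution collapses a clip of frames into a single two-dimensional "deep dynamic feature" map, and the two streams' class scores are then combined by a weighted average. A minimal NumPy sketch of those two stages follows; the function names, kernel sizes, and the 0.5 fusion weight are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def temporal_conv_layer(frames, kernel):
    """One temporal convolution layer: convolve along the time axis only.

    frames: array of shape (T, H, W); kernel: 1-D array of length k.
    Returns an array of shape (T - k + 1, H, W).
    """
    t, _, _ = frames.shape
    k = len(kernel)
    out = np.empty((t - k + 1,) + frames.shape[1:])
    for i in range(t - k + 1):
        # Weighted sum of k consecutive frames: contracts the time window.
        out[i] = np.tensordot(kernel, frames[i:i + k], axes=(0, 0))
    return out

def dynamic_feature(frames, kernels):
    """Stack temporal conv layers until the time axis collapses to one
    2-D map (the 'deep dynamic feature' fed to the motion stream).
    Assumes the kernel lengths are chosen so T reduces exactly to 1.
    """
    x = frames
    for kern in kernels:
        x = temporal_conv_layer(x, kern)
    return x.squeeze(0)

def fuse_two_stream_scores(spatial_scores, motion_scores, w_spatial=0.5):
    """Weighted-average fusion of per-class scores from the two streams.

    w_spatial weights the spatial (appearance) stream; the motion stream
    gets 1 - w_spatial. The 0.5 default is an assumption for illustration.
    Returns the fused score vector and the predicted class index.
    """
    spatial_scores = np.asarray(spatial_scores, dtype=float)
    motion_scores = np.asarray(motion_scores, dtype=float)
    fused = w_spatial * spatial_scores + (1.0 - w_spatial) * motion_scores
    return fused, int(np.argmax(fused))

# Example: a 5-frame clip collapses to one 2-D map via two length-3 kernels
# (5 -> 3 -> 1), then the two streams' softmax scores are fused.
frames = np.random.rand(5, 4, 4)
feat = dynamic_feature(frames, [np.ones(3) / 3, np.ones(3) / 3])
fused, label = fuse_two_stream_scores([0.2, 0.7, 0.1], [0.1, 0.3, 0.6])
```

In a real model the temporal kernels would be learned per channel rather than fixed averaging weights; the sketch only shows how the time axis is contracted to yield a 2-D input for the motion stream.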
Authors
Gao Qingji; Xu Da; Luo Qijun; Xing Zhiwei (Institute of Robotics, Civil Aviation University of China, Tianjin 300300, China)
Source
Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal
2024, No. 9, pp. 175-181 and 189 (8 pages)
Funding
National Natural Science Foundation of China (U1533203)
Scientific Research Program of Tianjin Municipal Education Commission (2019KJ117)