Motion Enhanced Model Based on High-Level Spatial Features

Abstract: Action recognition has become a research hotspot in computer vision. Compared with other deep learning methods, the two-stream convolutional network achieves better performance in action recognition: it divides the network into a spatial stream and a temporal stream, feeding video frame images and dense optical flow into the two streams, respectively, to obtain category labels. However, the two-stream network has a notable drawback: it uses dense optical flow as the input of the temporal stream, and current optical flow extraction algorithms are computationally expensive and extremely time-consuming, so they cannot meet the requirements of real-time tasks. In this paper, instead of dense optical flow, Motion Vectors (MVs) extracted from the compressed domain are used as temporal features, which greatly reduces extraction time. However, the motion pattern that MVs contain is coarser, which leads to lower accuracy. We propose two strategies to improve accuracy: first, an accumulation strategy is used to enhance the motion information and continuity of MVs; second, knowledge distillation is used to fuse spatial information into the temporal stream so that more information (e.g., motion details, colors) becomes available. Experimental results show that the accuracy of the MV stream is greatly improved by the proposed strategies, and the final human action recognition accuracy is maintained without using optical flow.
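
The abstract names two concrete strategies: accumulating motion vectors (MVs) to strengthen motion continuity, and distilling knowledge from the spatial (RGB) stream into the MV-based temporal stream. The minimal PyTorch sketch below illustrates both ideas under stated assumptions; the cumulative-sum accumulation rule, the temperature, and the loss weighting are illustrative choices, not the authors' implementation.

```python
# Illustrative sketch only (assumed details, not the paper's code):
# - accumulate_motion_vectors: one plausible accumulation rule (running sum).
# - distillation_loss: standard soft-target distillation so the MV (temporal)
#   student learns from the spatial-stream teacher.
import torch
import torch.nn.functional as F

def accumulate_motion_vectors(mv_frames: torch.Tensor) -> torch.Tensor:
    """mv_frames: (T, 2, H, W) horizontal/vertical MV components per frame.
    Returns a tensor of the same shape holding the running (accumulated) motion,
    which carries longer-range, more continuous movement than raw per-frame MVs."""
    return torch.cumsum(mv_frames, dim=0)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.5):
    """Blend hard-label cross-entropy with a soft-target KL term so spatial-stream
    knowledge (teacher) is fused into the MV-based temporal stream (student)."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft
```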
Source: Computers, Materials & Continua (SCIE, EI), 2022, No. 12, pp. 5911-5924 (14 pages).
Funding: This work is supported by the Inner Mongolia Natural Science Foundation of China under Grant No. 2021MS06016 and the CERNET Innovation Project (NGII20190625).