Journal Article

A Hierarchical Approach Based on Deep Learning for Human Interactive-action Recognition

Cited by: 4
Abstract: This paper decomposes human interactive behavior into four levels of increasing complexity (pose, atomic action, composite action, and interaction) and proposes a hierarchical, layer-by-layer recognition method with three layers. In the bottom layer, a pyramidal stacked denoising auto-encoder network is trained to recognize the poses of a person in the raw video with high accuracy, producing a pose sequence. In the middle layer, hidden Markov models (HMMs) of the atomic actions are built, and an evaluation demarcation algorithm is proposed to detect the atomic actions contained in the pose sequence and to speed up the computation. In the top layer, a context-free grammar (CFG) based description takes the atomic-action sequence as input and recognizes the composite actions and interactions it contains; a new spatial predicate set is proposed, and face orientation is introduced to describe activities. Activity videos were captured with Kinect, and the experimental results on this dataset show that the method recognizes human interactive behavior accurately.
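The three-layer pipeline in the abstract can be sketched in code. The sketch below is a minimal illustration under invented toy parameters, not the paper's implementation: layer 1 (the pyramidal stacked denoising auto-encoder) is abstracted away and assumed to have already turned the video into discrete pose symbols, layer 2 scores tiny discrete HMMs with the standard forward algorithm (the paper's evaluation demarcation speedup is omitted), and layer 3 applies two hand-written CFG-style productions. All action names, pose symbols, and probabilities are illustrative assumptions.

```python
# Toy pose alphabet assumed to come from the layer-1 pose recognizer:
# 0 = arm down, 1 = arm mid, 2 = arm up.

def hmm_likelihood(obs, pi, A, B):
    """Forward algorithm for a discrete-observation HMM: returns P(obs | model)."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]           # initialization
    for o in obs[1:]:                                          # induction
        alpha = [B[s][o] * sum(alpha[t] * A[t][s] for t in range(n))
                 for s in range(n)]
    return sum(alpha)                                          # termination

# Layer 2: one toy HMM per atomic action; hidden states mirror the 3 pose levels.
# "raise_arm" biases transitions upward (0 -> 1 -> 2), "lower_arm" is its mirror.
ATOMIC_HMMS = {
    "raise_arm": dict(pi=[0.8, 0.1, 0.1],
                      A=[[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.1, 0.1, 0.8]],
                      B=[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
    "lower_arm": dict(pi=[0.1, 0.1, 0.8],
                      A=[[0.8, 0.1, 0.1], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
                      B=[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
}

def recognize_atomic(pose_segment):
    """Label a pose segment with the atomic action whose HMM scores it highest."""
    return max(ATOMIC_HMMS,
               key=lambda a: hmm_likelihood(pose_segment, **ATOMIC_HMMS[a]))

# Layer 3: CFG-style productions rewriting atomic actions into composite
# actions and interactions (both rules are invented for this sketch).
PRODUCTIONS = {
    ("raise_arm", "lower_arm"): "wave",      # composite action
    ("wave", "wave"): "mutual_greeting",     # interaction between two people
}

def parse_interaction(symbols):
    """Repeatedly rewrite adjacent pairs by the productions until none match."""
    symbols = list(symbols)
    changed = True
    while changed:
        changed = False
        for i in range(len(symbols) - 1):
            if (symbols[i], symbols[i + 1]) in PRODUCTIONS:
                symbols[i:i + 2] = [PRODUCTIONS[(symbols[i], symbols[i + 1])]]
                changed = True
                break
    return symbols
```

For example, four pose segments recognized as `raise_arm, lower_arm, raise_arm, lower_arm` reduce first to two `wave` composite actions and then to a single `mutual_greeting` interaction, mirroring the pose → atomic action → composite action → interaction hierarchy.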
Source: Journal of Xiamen University (Natural Science), 2016, Issue 3, pp. 413-419 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60975084)
Keywords: human action recognition; deep learning; hidden Markov model (HMM); context-free grammar (CFG); Kinect

References (16)

  • 1 POPPE R. A survey on vision-based human action recognition[J]. Image and Vision Computing, 2010, 28(6): 976-990.
  • 2 AGGARWAL J K, RYOO M S. Human activity analysis: a review[J]. ACM Computing Surveys, 2011, 43(3): 1-43.
  • 3 SHEIKH Y, SHEIKH M, SHAH M. Exploring the space of a human action[C]//2005 IEEE International Conference on Computer Vision (ICCV). Beijing: IEEE, 2005: 144-149.
  • 4 NATARAJAN P, NEVATIA R. Coupled hidden semi-Markov models for activity recognition[C]//2007 IEEE Workshop on Motion and Video Computing (WMVC). Austin: IEEE, 2007: 10.
  • 5 OLIVER N, HORVITZ E, GARG A. Layered representations for human activity recognition[C]//2002 IEEE International Conference on Multimodal Interfaces (ICMI). Pittsburgh, PA: IEEE, 2002: 3-8.
  • 6 JOO S W, CHELLAPPA R. Attribute grammar-based event recognition and anomaly detection[C]//2006 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). New York: IEEE, 2006: 107.
  • 7 GUPTA A, SRINIVASAN P, JIANBO S, et al. Understanding videos, constructing plots: learning a visually grounded storyline model from annotated videos[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Miami, FL: IEEE, 2009: 2012-2019.
  • 8 HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.
  • 9 VINCENT P, LAROCHELLE H, LAJOIE I, et al. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion[J]. Journal of Machine Learning Research, 2010, 11: 3371-3408.
  • 10 ALLEN J F. Rethinking logics of action and time[C]//2013 International Symposium on Temporal Representation and Reasoning (TIME). Pensacola, FL: IEEE, 2013: 3-4.

Secondary References (22)

  • 1 SU Yi, WU Wenhu, ZHENG Fang, et al. Research on speech recognition based on support vector machines[C]//The 6th National Conference on Man-Machine Speech Communication. Shenzhen, 2001.
  • 2 NING Huazhong, HAN Tony Xu, WALTHER D B, et al. Hierarchical space-time model enabling efficient search for human actions[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2009, 19(6): 808-820.
  • 3 GUPTA A, SRINIVASAN P, SHI Jianbo, et al. Understanding videos, constructing plots: learning a visually grounded storyline model from annotated videos[C]//Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. 2009: 2012-2019.
  • 4 WU Jianxin, OSUNTOGUN A, CHOUDHURY T, et al. A scalable approach to activity recognition based on object use[C]//Proceedings of the 11th IEEE International Conference on Computer Vision. 2007.
  • 5 LIU Jingen, ALI S, SHAH M. Recognizing human actions using multiple features[C]//Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. 2008.
  • 6 YAO Bangpeng, LI Feifei. Modeling mutual context of object and human pose in human-object interaction activities[C]//Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. 2010: 17-24.
  • 7 AKSOY E, ABRAMOV A, WORGOTTER F, et al. Categorizing object-action relations from semantic scene graphs[C]//Proceedings of the 2010 IEEE International Conference on Robotics and Automation. 2010: 398-405.
  • 8 JIANG Yugang, LI Zhenguo, CHANG Shih-Fu. Modeling scene and object contexts for human action retrieval with few examples[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2011, 21(5): 674-681.
  • 9 PIRSIAVASH H, RAMANAN D. Detecting activities of daily living in first-person camera views[C]//Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012: 2847-2854.
  • 10 LI Wanqing, ZHANG Zhengyou, LIU Zicheng. Action recognition based on a bag of 3D points[C]//Proceedings of the 3rd IEEE International Workshop on CVPR for Human Communicative Behavior Analysis. 2010: 9-14.

Co-cited Literature (3)

Co-citing Literature (52)

Citing Literature (4)

Secondary Citing Literature (30)
