Research on a Speed Tracking Algorithm for a Memory-Prunable Bionic Robot

Cited by: 2
Abstract: Reinforcement learning algorithms tend to require large training networks, run for a long time, and overfit. To address these problems, a memory-prunable reinforcement learning bionic model (H-RLM) is proposed as the learning mechanism of a two-wheeled robot. The algorithm takes the minimum mean square error between the neural network output and the expected output as its cost function, combines the Hessian matrix with a Markov decision model to search for the optimum, and selects the optimal behavior corresponding to the maximum evaluation value. This preserves the completeness of what the network learns in the early training stage, loosens the system's constraints on initial conditions, and improves the generalization ability of the control algorithm. Speed tracking experiments on a two-wheeled robot were carried out with both H-RLM and a reinforcement learning algorithm. The results show that H-RLM improves network learning efficiency, eliminates the effect of delay, reduces the output error, and achieves good dynamic performance.
Source: Modern Electronics Technique (《现代电子技术》), Peking University Core Journal, 2017, No. 15, pp. 141-145 (5 pages)
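Funding: National Natural Science Foundation of China (61203343); Natural Science Foundation of Hebei Province (E2014209106); Science and Technology Research Project for Higher Education Institutions, Hebei Provincial Department of Education (QN2016102, QN2016105); Graduate Innovation Project of North China University of Science and Technology (2016S10)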
Keywords: reinforcement learning; prunable bionic model; Hessian matrix; two-wheeled robot
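
The abstract names the ingredients of H-RLM (a mean square error cost, Hessian-based pruning, and choosing the behavior with the largest evaluation value) but does not give the update rules. As a rough illustration only, the sketch below assumes a single linear layer and an Optimal Brain Damage-style saliency, 0.5 * H_kk * w_k^2, computed from the diagonal of the Hessian of the cost. The function names, the linear model, and the saliency rule are all assumptions made for this example and are not taken from the paper.

```python
def mse_cost(weights, inputs, targets):
    """Cost J = 1/(2N) * sum((w . x - t)^2) for a linear model (assumed form)."""
    n = len(inputs)
    total = 0.0
    for x, t in zip(inputs, targets):
        y = sum(w * xi for w, xi in zip(weights, x))
        total += (y - t) ** 2
    return total / (2 * n)


def hessian_diagonal(inputs):
    """Diagonal of the Hessian of J for a linear model: H_kk = mean(x_k^2)."""
    n = len(inputs)
    dim = len(inputs[0])
    return [sum(x[k] ** 2 for x in inputs) / n for k in range(dim)]


def prune_least_salient(weights, inputs):
    """Zero the nonzero weight with the smallest saliency 0.5 * H_kk * w_k^2.

    This is an Optimal Brain Damage-style rule, assumed here for illustration.
    Raises ValueError if every weight is already zero.
    """
    h_diag = hessian_diagonal(inputs)
    saliency = [0.5 * h * w * w for h, w in zip(h_diag, weights)]
    active = [k for k, w in enumerate(weights) if w != 0.0]
    k_min = min(active, key=lambda k: saliency[k])
    pruned = list(weights)
    pruned[k_min] = 0.0
    return pruned, k_min


def select_action(evaluations):
    """Return the action whose evaluation value is largest (greedy choice)."""
    return max(evaluations, key=evaluations.get)
```

Calling `prune_least_salient` repeatedly removes one weight at a time, and `mse_cost` can be checked after each step to decide when to stop pruning. `select_action` takes a dictionary that maps hypothetical action names such as `"accelerate"` or `"brake"` to their evaluation values.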