
Reinforcement Learning Based on Human-Computer Interaction and Its Experimental Study (cited by: 1)

HCI-BASED REINFORCEMENT LEARNING ALGORITHM AND EXPERIMENT
Abstract: This paper studies a reinforcement learning algorithm with human-computer interaction (HCI) capability. Through the interaction, the operator evaluates the performance of the learning results, which gives the agent a measure of the distance between its current state and the goal state. The operator's prior and professional knowledge is thereby incorporated effectively, so the agent can search the state space more efficiently and the learning of complex tasks is simplified. A number-guessing game is used as the example: the proposed learning framework trains an agent to guess numbers. Experimental results show that, compared with a conventional reinforcement learning method, the HCI-based algorithm substantially improves learning efficiency and speeds up the convergence of the learning process.
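The idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the task size `N`, the hidden number `TARGET`, the two-action move set, the simulated `operator_feedback` function, and all hyperparameters are assumptions made for illustration, and plain one-step tabular Q-learning stands in for whatever update rule the authors used (the keywords suggest eligibility traces, which are omitted here). The operator's evaluation is modeled as a signal telling the agent whether its new guess is closer to or farther from the goal.

```python
import random

N = 10              # guesses range over 0..N-1 (assumed task size)
TARGET = 7          # hidden number, known only to the simulated operator
ACTIONS = (-1, +1)  # lower or raise the current guess by one


def operator_feedback(old_guess, new_guess, target=TARGET):
    """Simulated operator evaluation (an assumption): +1 if the new guess is
    closer to the target, -1 if farther, 0 if the distance is unchanged."""
    old_d, new_d = abs(old_guess - target), abs(new_guess - target)
    return (old_d > new_d) - (old_d < new_d)


def step(state, action_index):
    """Apply an action and clip the guess to the valid range."""
    return min(N - 1, max(0, state + ACTIONS[action_index]))


def greedy_action(q, state):
    """Index of the highest-valued action (ties go to action 0)."""
    return 0 if q[state][0] >= q[state][1] else 1


def train(use_feedback, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3,
          max_steps=50, seed=0):
    """One-step tabular Q-learning; returns the Q-table and steps per episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N)]
    steps_per_episode = []
    for _ in range(episodes):
        state = rng.choice([s for s in range(N) if s != TARGET])
        steps = 0
        while state != TARGET and steps < max_steps:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = greedy_action(q, state)
            nxt = step(state, a)
            # Sparse task reward, optionally augmented by the operator signal.
            reward = 1.0 if nxt == TARGET else 0.0
            if use_feedback:
                reward += operator_feedback(state, nxt)
            future = 0.0 if nxt == TARGET else max(q[nxt])
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            state = nxt
            steps += 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


def greedy_steps(q, start):
    """Follow the greedy policy from start; return the number of steps to the
    target, or None if it is not reached within N steps."""
    state = start
    for n_steps in range(N + 1):
        if state == TARGET:
            return n_steps
        state = step(state, greedy_action(q, state))
    return None


if __name__ == "__main__":
    for flag in (False, True):
        _, steps = train(use_feedback=flag)
        print("with feedback" if flag else "baseline", "total steps:", sum(steps))
```

Running the script prints the total number of environment steps used with and without the simulated operator signal, which gives a rough feel for the efficiency comparison the abstract describes; the numbers depend entirely on the assumed setup and say nothing about the paper's reported results.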
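Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2003, No. 3, pp. 363-369 (7 pages). Indexed in: EI, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (No. 60275042)
Keywords: Artificial Intelligence, Knowledge Base, Intelligent Learning System, Human-Computer Interaction (HCI), Reinforcement Learning, Agent, Experiment, Eligibility Trace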

