
A Factored SARSA(λ) Algorithm of Reinforcement Learning

Cited by: 8
Abstract: Based on a factored representation of states, this paper proposes a new SARSA(λ) reinforcement learning algorithm. The basic idea is to derive a state-similarity heuristic from the features of each state and then to cluster (aggregate) the state space according to that heuristic. This greatly reduces the complexity of searching and computing over the state space, so the algorithm is well suited to solving Markov decision processes (MDPs) with large state spaces.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2001, No. 1, pp. 88-92 (5 pages)
Keywords: reinforcement learning; state aggregation; Markov decision processes (MDPs); SARSA(λ) learning
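
The idea summarized in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the similarity heuristic, so the code below simply assumes that two states are similar when they agree on one chosen feature. The corridor environment, the function names (`aggregate`, `corridor_step`, `choose_action`, `run_episode`, `train_corridor`), and all parameter values are hypothetical. It runs tabular SARSA(λ) with accumulating eligibility traces, keyed by cluster rather than by raw state.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)  # move left / right along a corridor (hypothetical toy task)
GOAL = 4            # hypothetical goal position

def aggregate(state):
    """Assumed similarity heuristic: states that share feature 0 form one cluster."""
    return (state[0],)

def corridor_step(state, action, rng):
    """Toy factored MDP: state = (position, noise_bit); only position matters."""
    position = min(max(state[0] + action, 0), GOAL)
    noise_bit = rng.randint(0, 1)  # irrelevant feature, ignored by aggregate()
    done = position == GOAL
    return (position, noise_bit), (1.0 if done else 0.0), done

def choose_action(q, cluster, epsilon, rng):
    """Epsilon-greedy over the cluster's Q-values, with ties broken at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(cluster, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(cluster, a)] == best])

def run_episode(q, rng, alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, max_steps=1000):
    """One episode of SARSA(lambda) in which Q and traces are indexed by cluster."""
    traces = defaultdict(float)
    state = (0, rng.randint(0, 1))
    cluster = aggregate(state)
    action = choose_action(q, cluster, epsilon, rng)
    for _ in range(max_steps):
        next_state, reward, done = corridor_step(state, action, rng)
        next_cluster = aggregate(next_state)
        next_action = choose_action(q, next_cluster, epsilon, rng)
        # On-policy TD error: bootstrap from the action actually chosen next.
        target = reward if done else reward + gamma * q[(next_cluster, next_action)]
        delta = target - q[(cluster, action)]
        traces[(cluster, action)] += 1.0  # accumulating trace
        for key in list(traces):
            q[key] += alpha * delta * traces[key]
            traces[key] *= gamma * lam
        if done:
            break
        state, cluster, action = next_state, next_cluster, next_action

def train_corridor(episodes=300, seed=0):
    """Train on the toy corridor and return the cluster-level Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        run_episode(q, rng)
    return q

if __name__ == "__main__":
    q_table = train_corridor()
    for key in sorted(q_table):
        print(key, round(q_table[key], 3))
```

In this toy task the raw state space has 10 states (5 positions times 2 noise values), while the Q-table only ever holds entries for at most 5 clusters, which is the kind of reduction the abstract describes. A feature-selection rule like this one is only the simplest possible stand-in for a similarity heuristic.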


