Transformation from Nonstationary Discounted Markov Decision Processes to Stationary Discounted Markov Decision Processes

Abstract: This paper studies the transformation of nonstationary discounted Markov decision processes (MDPs) into stationary ones. Assuming countable state and action spaces and bounded reward functions, it constructs, for a given nonstationary discounted MDP, a specially structured stationary discounted MDP and proves that the two models are equivalent. The nonstationary problem is thereby reduced to an equivalent stationary one, so the results on ε-optimal policies and optimal policies for stationary discounted MDPs also hold in the nonstationary case. Because the stationary discounted model has been studied thoroughly, this yields correspondingly complete conclusions for the nonstationary discounted model.
Source: Journal of Kunming Institute of Technology (《昆明工学院学报》), 1997, No. 6, pp. 30-36 (7 pages)
Keywords: nonstationary discounted MDP; stationary discounted MDP; transformation of models; (S t, ε)-optimal policy; optimal policy; ε-optimal policy
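
The abstract does not spell out the construction. For orientation, one standard way to make a nonstationary discounted MDP stationary is to fold the time index into the state. The sketch below illustrates that idea and may differ from the construction used in the paper. All symbols ($S_t$, $A_t$, $p_t$, $r_t$, $\beta$, and the primed model) are notation assumed here, as are a constant discount factor $\beta \in [0,1)$ and rewards that are bounded uniformly in $t$.

```latex
% Assumed nonstationary model: at stage t = 0, 1, 2, ...
%   countable state space S_t, countable action sets A_t(i),
%   transition law p_t(j | i, a), reward r_t(i, a) bounded uniformly in t,
%   constant discount factor \beta \in [0, 1).
% Time-augmented stationary model (primed quantities):
\begin{align*}
S' &= \{(t, i) : t \ge 0,\ i \in S_t\}, \\
A'\bigl((t, i)\bigr) &= A_t(i), \\
p'\bigl((s, j) \mid (t, i), a\bigr) &=
  \begin{cases}
    p_t(j \mid i, a), & s = t + 1, \\
    0, & \text{otherwise},
  \end{cases} \\
r'\bigl((t, i), a\bigr) &= r_t(i, a).
\end{align*}
% S' is a countable union of countable sets, hence countable, and r' is
% bounded, so the primed model is again a discounted MDP with countable
% states and actions and bounded rewards, but with time-independent data.
% For a Markov policy \pi = (f_0, f_1, \dots) and the stationary policy
% f'((t, i)) = f_t(i), the value from (t, i) is the tail value of \pi:
\[
V'_{f'}\bigl((t, i)\bigr)
  = \mathbb{E}_{\pi}\Bigl[\,\sum_{n=t}^{\infty} \beta^{\,n-t}\,
      r_n\bigl(X_n, f_n(X_n)\bigr) \Bigm| X_t = i\Bigr].
\]
```

Under this correspondence a Markov policy of the nonstationary model becomes a stationary policy of the augmented model, which is why optimality results for stationary discounted MDPs can be read back in the nonstationary setting.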