Abstract
A central problem in the HAMs family of hierarchical reinforcement learning (HRL) methods is that the state space is a joint space, formed by the cross-product of the machine states of the HAM and the states of the original MDP. Subroutine-based state abstraction does not fully resolve this problem. This paper analyzes the problem in detail and gives a formal description of the HAMs model in terms of "policy-coupled" semi-Markov decision processes (SMDPs). It then presents formal definitions of HAMs-based homomorphisms and proves several practically useful theorems, showing that HAMs-based homomorphisms can effectively address the problem. On this basis, it summarizes several key observations on applying homomorphisms to state abstraction. Finally, the proposed method is used to analyze and validate a typical example.
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2008, Issue 11, pp. 2074-2082 (9 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China, General Program (Grant No. 60503048)
Keywords
hierarchical reinforcement learning
hierarchies of abstract machines
homomorphism