Abstract
Hierarchical reinforcement learning currently has three main approaches, namely Option, HAM, and MAXQ, and none of them has an effective solution to the problem of generating hierarchies automatically. Addressing the first approach, this paper presents an algorithm for automatic Option generation. The algorithm takes as input the state space explored by the Agent in the initial learning phase and clusters it using an artificial immune network. On each of the resulting state subsets, a set of intra-option policies is learned through experience replay, and the Options are generated from these policies. Simulation experiments demonstrate the validity of the algorithm.
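The pipeline described in the abstract (cluster the explored states, then learn one internal policy per cluster by experience replay) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The artificial immune network clustering is not reproduced, and the cluster memberships are simply given as input. The function name `learn_option_policy`, the corridor environment, the pseudo-reward of 1 for leaving a cluster, and all parameter values are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def learn_option_policy(transitions, cluster, actions,
                        sweeps=50, alpha=0.5, gamma=0.9, seed=0):
    """Learn a greedy internal policy for one state cluster by replaying
    stored (state, action, next_state) transitions with Q-learning.

    Assumption (not from the paper): the option terminates when the agent
    leaves the cluster, and that step earns a pseudo-reward of 1.
    """
    rng = random.Random(seed)
    q = defaultdict(float)
    # Only experience that starts inside the cluster is replayed.
    inside = [t for t in transitions if t[0] in cluster]
    for _ in range(sweeps):
        rng.shuffle(inside)
        for s, a, s2 in inside:
            if s2 in cluster:
                target = gamma * max(q[(s2, b)] for b in actions)
            else:
                target = 1.0  # left the cluster: the option terminates
            q[(s, a)] += alpha * (target - q[(s, a)])
    return {s: max(actions, key=lambda b: q[(s, b)]) for s in cluster}

# Hypothetical 6-state corridor; actions move one step left (-1) or right (+1).
actions = (-1, 1)
transitions = [(s, a, min(5, max(0, s + a)))
               for s in range(6) for a in actions]

# Cluster labels stand in for the output of the clustering step.
clusters = [{0, 1, 2}, {3, 4, 5}]

# One option per cluster: initiation set plus the learned internal policy.
options = [
    {"initiation_set": c,
     "policy": learn_option_policy(transitions, c, actions)}
    for c in clusters
]
```

In this sketch each option steers the agent toward the boundary of its own cluster, so the left cluster learns to move right and the right cluster learns to move left. How the paper actually defines subgoals and termination conditions is not stated in the abstract.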
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2005, Issue 34, pp. 4-6, 15 (4 pages in total)
Funding
Ministry-level Basic Research Program project
Keywords
hierarchical reinforcement learning, Option, artificial immune network, experience replay