Abstract
Reinforcement learning (RL) is an important branch of machine learning in which an agent learns without explicit supervisory signals, adjusting its actions according to external signals obtained through interaction with the environment; as a result, learning is relatively slow. Q-learning, a typical RL method used in multi-agent systems, converges slowly, especially as the state space and action space grow. To improve its learning efficiency and convergence speed by making full use of environment information and relevant expert experience, this paper proposes a Q-learning algorithm with prior knowledge. A fuzzy integrated decision-making method processes expert experience and environment information to obtain prior knowledge for Q-learning, which is then used to optimize the initial state of learning. Simulation results on a typical robot soccer system show that the algorithm starts learning from a better foundation and therefore approaches the optimal state faster, with learning efficiency and convergence speed clearly better than those of ordinary Q-learning.
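The idea of seeding Q-learning with prior knowledge can be illustrated with a minimal sketch. The abstract does not give the paper's membership functions, factor weights, or fuzzy operator, so everything below is an assumption made for illustration: the weighted-average operator, the names `fuzzy_prior`, `init_q_table`, and `q_update`, and the toy numbers. Each state has a membership matrix (factors by actions) that encodes how strongly each expert or environment factor favors each action; the weighted combination becomes the initial Q-values instead of zeros, and learning then proceeds with the standard tabular Q-learning update.

```python
def fuzzy_prior(weights, membership):
    """Weighted-average fuzzy evaluation (assumed operator): B = W . R.

    weights    -- one weight per factor (normalized here to sum to 1)
    membership -- membership[f][a] in [0, 1]: how strongly factor f favors action a
    """
    total = sum(weights)
    n_actions = len(membership[0])
    return [
        sum((w / total) * row[a] for w, row in zip(weights, membership))
        for a in range(n_actions)
    ]


def init_q_table(memberships, weights, scale=1.0):
    """Build an initial Q-table from per-state membership matrices (hypothetical helper)."""
    return [[scale * b for b in fuzzy_prior(weights, m)] for m in memberships]


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place; returns the new Q(s, a)."""
    td_target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q[s][a]


# Toy data (illustrative only): 2 states, 2 factors, 3 actions.
weights = [0.6, 0.4]
memberships = [
    [[0.9, 0.1, 0.0], [0.5, 0.5, 0.0]],  # state 0: both factors favor action 0
    [[0.0, 0.2, 0.8], [0.0, 0.5, 0.5]],  # state 1: both factors favor action 2
]
q = init_q_table(memberships, weights)
```

With such a table, a greedy policy already follows the expert preference before any learning step, which is one plausible reading of the "better learning foundation" the abstract describes; subsequent calls to `q_update` refine the values from experience.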
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2005, No. 7, pp. 981-984 (4 pages)
Journal of Tsinghua University (Science and Technology)
Funding
Supported by the Natural Science Foundation of Shandong Province (Y2002G18)
Keywords
machine learning
Q-learning
fuzzy integrated decision-making
multi-agent system