Abstract
This paper studies path planning for an unmanned ground vehicle (UGV) within the framework of the deep Q-network (DQN) algorithm. To improve exploration efficiency, the DQN algorithm, which is designed for continuous states, is adapted and applied to discrete states. To balance exploration and exploitation, Gaussian noise is added only to the output layer of the network, and a progressive reward function is designed. Experiments are carried out in the Gazebo simulation environment. The simulation results show that: (1) the strategy quickly plans a collision-free route from the initial point to the target point, and it converges faster than the Q-learning, DQN and noisynet_DQN algorithms; (2) the strategy generalizes across initial points, target points and obstacles, which verifies its effectiveness and robustness.
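The two design choices named in the abstract (noise only in the output layer, and a progressive reward) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no network sizes, noise parameterization, or reward formula, so the class `NoisyOutputQNet`, the function `progressive_reward`, the parameters `sigma0` and `goal_radius`, the one-hot state encoding, and all numeric constants are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)


class NoisyOutputQNet:
    """Tiny Q-network sketch: plain hidden layer, Gaussian noise on the output layer only."""

    def __init__(self, n_states, n_actions, n_hidden=16, sigma0=0.5):
        # Hidden layer is deterministic (no noise is injected here).
        self.W1 = rng.normal(0.0, 0.1, (n_states, n_hidden))
        self.b1 = np.zeros(n_hidden)
        # Output layer: mean weights plus a per-weight noise scale (assumed initialization).
        self.W2_mu = rng.normal(0.0, 0.1, (n_hidden, n_actions))
        self.W2_sigma = np.full((n_hidden, n_actions), sigma0 / np.sqrt(n_hidden))
        self.b2 = np.zeros(n_actions)

    def q_values(self, state_index, noisy=True):
        # One-hot encoding of a discrete state (assumed encoding).
        x = np.zeros(self.W1.shape[0])
        x[state_index] = 1.0
        h = np.maximum(0.0, x @ self.W1 + self.b1)  # ReLU hidden activation
        W2 = self.W2_mu
        if noisy:
            # Fresh Gaussian perturbation of the output weights on every call.
            W2 = W2 + self.W2_sigma * rng.standard_normal(W2.shape)
        return h @ W2 + self.b2

    def act(self, state_index):
        # Greedy action on noisy Q-values: the noise itself drives exploration.
        return int(np.argmax(self.q_values(state_index, noisy=True)))


def progressive_reward(prev_dist, dist, collided, goal_radius=0.2):
    """Hypothetical staged reward: penalty on collision, bonus at the goal,
    otherwise a reward proportional to the progress made toward the target."""
    if collided:
        return -100.0
    if dist <= goal_radius:
        return 100.0
    return 10.0 * (prev_dist - dist)
```

In this sketch exploration comes from the perturbed output weights rather than from an epsilon-greedy rule, and calling `q_values(..., noisy=False)` gives the deterministic estimate that would be used for evaluation. Training of the mean and noise-scale parameters is omitted.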
Authors
李杨
闫冬梅
刘磊
LI Yang; YAN Dongmei; LIU Lei (College of Science, Hohai University, Nanjing 211100, P.R. China; School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 211100, P.R. China)
Source
《应用数学和力学》
CSCD
PKU Core (Peking University Core Journals)
2023, No. 4, pp. 450-460 (11 pages)
Applied Mathematics and Mechanics
Funding
National Natural Science Foundation of China (General Program) (61773152).