Abstract
Conventional cooperative adaptive cruise control (CACC) algorithms respond slowly and cannot react quickly and accurately to sudden dangerous road conditions. To address this, a CACC framework based on deep reinforcement learning is designed, and a deep deterministic policy gradient (DDPG) algorithm with dual experience pools and optimized evaluation is proposed. Building on the conventional algorithm, two new experience pools containing vehicle state information are created (a priority value experience pool and a Sapler experience pool), and training samples are drawn from the two pools in proportion. The critic module uses a multi-dimensional vector to evaluate the output pedal-opening policy precisely. The results show that, under normal driving conditions and sudden dangerous conditions respectively, the average car-following spacing error decreases by 1.8 m and 1.5 m, and the car-following adjustment time decreases by 30% and 25%. The algorithm can therefore improve control accuracy and the system's emergency response capability.
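The proportional sampling from two experience pools can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how transitions are assigned to each pool or what sampling ratio is used, so the class name `DualReplayBuffer`, the reward-threshold routing rule, and the parameters `priority_ratio` and `reward_threshold` are all assumptions made for the example.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two experience pools sampled in a fixed proportion (illustrative only)."""

    def __init__(self, capacity=10000, priority_ratio=0.5,
                 reward_threshold=0.0, seed=None):
        self.priority_pool = deque(maxlen=capacity)  # high-value transitions
        self.second_pool = deque(maxlen=capacity)    # all remaining transitions
        self.priority_ratio = priority_ratio         # assumed share of each batch
        self.reward_threshold = reward_threshold     # assumed routing criterion
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        transition = (state, action, reward, next_state)
        # Assumption: a transition counts as "high value" when its reward
        # exceeds a threshold; the paper's actual criterion may differ.
        if reward > self.reward_threshold:
            self.priority_pool.append(transition)
        else:
            self.second_pool.append(transition)

    def sample(self, batch_size):
        # Draw the configured share from the priority pool, the rest from
        # the second pool, capped by how many transitions each pool holds.
        n_priority = min(round(batch_size * self.priority_ratio),
                         len(self.priority_pool))
        n_second = min(batch_size - n_priority, len(self.second_pool))
        batch = self.rng.sample(list(self.priority_pool), n_priority)
        batch += self.rng.sample(list(self.second_pool), n_second)
        self.rng.shuffle(batch)  # avoid ordering the batch by pool
        return batch
```

In a DDPG training loop, `sample` would stand in for the single uniform replay draw; the ratio controls how strongly the high-value transitions are over-represented in each update.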
Authors
王文飒
梁军
陈龙
陈小波
朱宁
华国栋
WANG Wensa; LIANG Jun; CHEN Long; CHEN Xiaobo; ZHU Ning; HUA Guodong (School of Automotive, Jiangsu University, Zhenjiang 212013, Jiangsu, China; Department of Mechanical, Shizuoka Institute of Science and Technology, Fukuroi, Shizuoka 437-0032, Japan; Jiangsu Zhixing Future Automobile Research Institute, Nanjing 210000, China)
Source
《交通信息与安全》
CSCD
Peking University Core Journal
2019, No. 3, pp. 93-100 (8 pages)
Journal of Transport Information and Safety
Funding
National Key R&D Program of China (2017YFB0102503)
National Natural Science Foundation of China (U1564201, 61773184, 61806086)
Keywords
intelligent driving
automatic control
cooperative adaptive cruise control
deep reinforcement learning
deep deterministic policy gradient