Abstract
To improve the lane-changing efficiency of intelligent connected vehicles (ICVs) under different connectivity ranges, this paper combines deep reinforcement learning with molecular dynamics theory and proposes a double deep Q-network lane-changing decision model that integrates a masking mechanism and an attention mechanism (MaskAttention-DDQN, MAQ). First, the driving state information of ICVs and human-driven vehicles (HDVs) within the connectivity range is collected in the Simulation of Urban Mobility (SUMO) environment. Second, the MAQ model is built; the masking and attention mechanisms give the model a fixed input size and permutation invariance. Third, to quantify the degree of influence between vehicles, molecular dynamics theory is used to assign weights to the HDV information within the connectivity range, with the relative speed and relative position between vehicles as parameters. Finally, different lane-changing decision models and weighting methods are compared in simulation environments with different traffic densities, and the lane-changing decisions of the ICV are tested under different connectivity ranges (80 to 330 m, at 50 m intervals). The simulation results show that, taking 40 HDVs and a 100 m connectivity range as an example, the MAQ model improves fitting accuracy by 90.2% over the DeepSet-Q model; compared with the linear weighting method, the molecular dynamics weighting method increases the total reward by 5.5% and the average ICV speed by 4.8%; and as the connectivity range expands, the average ICV speed first increases, then decreases, and finally levels off.
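The abstract states that masking and attention together give a fixed input size and permutation invariance. The sketch below illustrates that general idea only and is not the authors' MAQ network: the cap `MAX_VEHICLES`, the two-feature vehicle vectors, the fixed `query` vector, and both function names are assumptions made for illustration. Observations are padded to a fixed number of rows, a 0/1 mask marks the real rows, and a masked softmax pools them so that the order of the vehicles does not affect the result.

```python
import math

MAX_VEHICLES = 4  # hypothetical cap on vehicles in the padded input


def pad_observations(vehicles, max_n=MAX_VEHICLES, feat_dim=2):
    """Pad a variable-length list of feature vectors to max_n rows.

    Returns the padded rows and a 0/1 mask marking which rows are real.
    """
    vehicles = vehicles[:max_n]
    n_pad = max_n - len(vehicles)
    padded = [list(v) for v in vehicles] + [[0.0] * feat_dim for _ in range(n_pad)]
    mask = [1] * len(vehicles) + [0] * n_pad
    return padded, mask


def masked_attention_pool(query, padded, mask):
    """Dot-product attention over real rows only; padded rows get zero weight."""
    scores = [sum(q * x for q, x in zip(query, row)) for row in padded]
    valid = [s for s, m in zip(scores, mask) if m]
    if not valid:
        # No vehicle in range: return an all-zero summary vector.
        return [0.0] * len(padded[0])
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum over rows does not depend on row order.
    return [sum(w * row[d] for w, row in zip(weights, padded))
            for d in range(len(padded[0]))]


# Three vehicles, each described by (relative position, relative speed).
obs = [[1.0, -2.0], [0.5, 3.0], [-1.0, 0.2]]
query = [0.3, 0.1]
summary = masked_attention_pool(query, *pad_observations(obs))
shuffled = masked_attention_pool(query, *pad_observations(obs[::-1]))
```

Here `summary` and `shuffled` agree up to floating-point rounding, and the padded input always has `MAX_VEHICLES` rows regardless of how many vehicles are in range. In a learned model the query and the feature projections would be trainable parameters rather than fixed values.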
Authors
赵建东
贺晓宇
余智鑫
韩明敏
ZHAO Jian-dong; HE Xiao-yu; YU Zhi-xin; HAN Ming-min (School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China; Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China; Hebei Provincial Communications Planning, Design and Research Institute Co., Ltd., Shijiazhuang 050000, China)
Source
《交通运输系统工程与信息》
EI
CSCD
PKU Core Journal
2023, No. 1, pp. 77-85 (9 pages)
Journal of Transportation Systems Engineering and Information Technology
Funding
National Key Research and Development Program of China (2019YFB1600200)
National Natural Science Foundation of China (71931002, 71871011).
Keywords
intelligent transportation
intelligent connected vehicles
lane change decision
deep reinforcement learning
molecular dynamics
connectivity range