Abstract
The organic Rankine cycle is regarded as the current mainstream technology for internal combustion engine waste heat recovery because of its relatively high thermal efficiency and mature components. However, the complexity and variability of real road conditions make safe and efficient control of waste heat recovery systems under transient conditions very challenging. To address this, the paper proposes a deep reinforcement learning-based control method that combines offline learning of an optimized control policy with online decision-making. An experimentally validated dynamic simulation model of a transcritical organic Rankine cycle is built as the training environment, from which a safe and optimized control strategy is learned. Simulation results show that, compared with a conventional constant-temperature PID controller (which holds the working fluid temperature at the expander inlet at a fixed value), the deep reinforcement learning controller consistently keeps the system in a safer and more efficient state. Under transient fluctuating heat source conditions not seen during training, the deep reinforcement learning controller also shows good extrapolation and generalization performance. The results demonstrate that deep reinforcement learning has considerable potential for the optimal control of thermodynamic cycles under transient conditions.
Authors
王轩
陈嘉宝
舒歌群
田华
蔡金文
王瑞
WANG Xuan; CHEN Jiabao; SHU Gequn; TIAN Hua; CAI Jinwen; WANG Rui (State Key Laboratory of Engines (Tianjin University), Jinnan District, Tianjin 300350, China; University of Science and Technology of China, Hefei 230026, Anhui Province, China)
Source
《中国电机工程学报》
EI
CSCD
PKU Core Journals
2023, Issue 11, pp. 4169-4177 (9 pages in total)
Proceedings of the CSEE
Funding
National Key Research and Development Program of China (2022YFE0100100).
Keywords
deep reinforcement learning
optimal control
organic Rankine cycle
waste heat recovery
internal combustion engine