Abstract
To address the curse of dimensionality in long-term stochastic reservoir operation, a long-term stochastic optimal operation model based on reinforcement learning theory is proposed, built on a description of the stochastic inflow process. The model is solved with the model-based SARSA algorithm from machine learning and accounts for inflow uncertainty by treating the random inflow as a simple Markov process. Through greedy decision-making and approximate value iteration, with the learning parameters tuned along the way, a near-optimal decision sequence is obtained. A case study shows that, compared with the stochastic dynamic programming (SDP) method, the SARSA algorithm obtains high-quality solutions while reducing computation time by approximately 41%. Its efficient solution capability and short computation time offer a new approach to long-term stochastic reservoir operation.
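The SARSA update described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy reservoir (five storage levels, three inflow classes, twelve periods), the transition matrix `P`, the reward function, and the helper names `feasible_actions`, `step`, `choose`, `train`, and `greedy_action`. The sketch only shows how an epsilon-greedy policy and the on-policy update Q(s,a) ← Q(s,a) + α[r + γQ(s',a') − Q(s,a)] fit together when the inflow class follows a first-order Markov chain.

```python
import random
from collections import defaultdict

# All values below are hypothetical, chosen only to make the sketch runnable.
N_PERIODS = 12               # operating periods per episode (e.g. months)
N_STORAGE = 5                # discretized storage levels 0..4
N_INFLOW = 3                 # inflow classes: low, medium, high
INFLOW_VOLUME = [1, 2, 3]    # inflow per class, in storage-level units
MAX_TURBINE = 3              # release that can pass through the turbines
# P[i][j] = Pr(next inflow class j | current class i): the Markov inflow model
P = [[0.60, 0.30, 0.10],
     [0.25, 0.50, 0.25],
     [0.10, 0.30, 0.60]]


def feasible_actions(s, k):
    """Actions are target end-of-period storage levels reachable from (s, k)."""
    return list(range(min(N_STORAGE - 1, s + INFLOW_VOLUME[k]) + 1))


def step(s, k, a, rng):
    """Apply action a; return (reward, next storage, next inflow class)."""
    release = s + INFLOW_VOLUME[k] - a           # water balance
    head = 1.0 + 0.1 * (s + a) / 2               # toy head from mean storage
    reward = min(release, MAX_TURBINE) * head    # toy energy; excess is spilled
    k2 = rng.choices(range(N_INFLOW), weights=P[k])[0]
    return reward, a, k2


def choose(Q, t, s, k, eps, rng):
    """Epsilon-greedy selection over the feasible actions."""
    acts = feasible_actions(s, k)
    if rng.random() < eps:
        return rng.choice(acts)
    return max(acts, key=lambda a: Q[(t, s, k, a)])


def train(episodes=5000, alpha=0.1, gamma=1.0, eps=0.1, seed=0):
    """Tabular SARSA over the finite horizon; returns the learned Q-table."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        s, k = rng.randrange(N_STORAGE), rng.randrange(N_INFLOW)
        a = choose(Q, 0, s, k, eps, rng)
        for t in range(N_PERIODS):
            r, s2, k2 = step(s, k, a, rng)
            if t + 1 < N_PERIODS:
                # On-policy: bootstrap from the action actually chosen next
                a2 = choose(Q, t + 1, s2, k2, eps, rng)
                target = r + gamma * Q[(t + 1, s2, k2, a2)]
            else:
                a2, target = None, r             # last period: no bootstrap
            Q[(t, s, k, a)] += alpha * (target - Q[(t, s, k, a)])
            s, k, a = s2, k2, a2
    return Q


def greedy_action(Q, t, s, k):
    """Read the learned decision for one state out of the Q-table."""
    return max(feasible_actions(s, k), key=lambda a: Q[(t, s, k, a)])
```

Calling `Q = train()` and then `greedy_action(Q, t, s, k)` for each period yields a decision sequence for a given storage and inflow trajectory. In this sketch the learning rate `alpha` and exploration rate `eps` are the tunable learning parameters; how the paper sets or schedules them is not stated in the abstract.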
Authors
李文武
张雪映
Daniel Eliote Mbanze
吴巍
LI Wen-wu 1,2, ZHANG Xue-ying 1,2, DANIEL Eliote Mbanze 1,2, WU Wei 1,2 (1. Hubei Key Laboratory of Cascaded Hydropower Stations Operation & Control; 2. College of Electrical Engineering & New Energy, China Three Gorges University, Yichang 443002, China)
Source
《水电能源科学》
Peking University Core Journal (北大核心)
2018, Issue 9, pp. 72-75 (4 pages)
Water Resources and Power
Funding
Hubei Province Technology Innovation Special Project (Key Project) (2017AAA132)