Abstract
Reinforcement learning has become a research hotspot in artificial intelligence and has been applied successfully in many fields. It treats many operations optimization problems as sequential decision problems, models them as Markov decision processes, and solves them, which gives it a clear advantage on complex, dynamic, and stochastic problems. This paper surveys the application of reinforcement learning to operations optimization. It first introduces the basic principles of reinforcement learning and the research framework for applying it to operations optimization. It then reviews and summarizes research results on inventory control, route optimization, bin packing and loading, and job shop scheduling, and compares recent deep reinforcement learning approaches with traditional methods in operations research to highlight the advantages of deep reinforcement learning. Finally, it proposes several research directions that merit further study, in the hope of providing a reference for research on reinforcement learning in operations optimization.
Authors
徐翔斌
李志鹏
XU Xiang-bin; LI Zhi-peng (School of Transportation and Logistics, East China Jiaotong University, Nanchang 330013, China)
Source
《运筹与管理》
CSSCI
CSCD
Peking University Core Journal
2020, No. 5, pp. 227-239 (13 pages)
Operations Research and Management Science
Funding
National Natural Science Foundation of China (71761013)
Natural Science Foundation of Jiangxi Province, General Program (20181BAB201010).
Keywords
reinforcement learning
operations optimization
sequential decision
Markov decision process
deep reinforcement learning