Path Selection in Disaster Response Management Based on Q-learning (cited by: 3)


Abstract: Selecting a suitable rescue path is vital to saving lives and reducing disaster losses, and has been a key issue in the field of disaster response management. In this paper, we present a path selection algorithm based on Q-learning for disaster response applications. We treat a rescue team as an agent that operates in a dynamic and dangerous environment and needs to find a safe and short path in the least time. We first propose a path selection model for disaster response management and deduce that path selection under this model is a Markov decision process. We then introduce Q-learning and design strategies for action selection and for avoiding cyclic paths. Finally, experimental results show that our algorithm can find a safe and short path in a dynamic and dangerous environment, which can provide a specific and significant reference for practical management in disaster response applications.
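The abstract names three ingredients: Q-learning, an action selection strategy, and a way to avoid cyclic paths. The following is a minimal sketch of how such ingredients can fit together, not the authors' model. The road network `GRAPH`, the reward shape, the constants `DANGER_WEIGHT` and `GOAL_REWARD`, the epsilon-greedy rule, and the visited-set cycle check are all assumptions made for illustration.

```python
import random

# Hypothetical road network: node -> {neighbor: (travel_time, danger in [0, 1])}.
GRAPH = {
    "A": {"B": (1.0, 0.0), "C": (1.0, 0.9)},
    "B": {"A": (1.0, 0.0), "D": (1.0, 0.0)},
    "C": {"A": (1.0, 0.9), "E": (1.0, 0.0)},
    "D": {"B": (1.0, 0.0), "E": (1.0, 0.0)},
    "E": {},
}
DANGER_WEIGHT = 5.0  # assumed trade-off between safety and travel time
GOAL_REWARD = 10.0   # assumed bonus for reaching the destination


def allowed_actions(node, visited):
    """Neighbors not yet visited in this episode (rules out cyclic paths)."""
    return [n for n in GRAPH[node] if n not in visited]


def train(start, goal, episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in GRAPH for a in GRAPH[s]}
    for _ in range(episodes):
        node, visited = start, {start}
        while node != goal:
            actions = allowed_actions(node, visited)
            if not actions:  # dead end: abandon this episode
                break
            if rng.random() < epsilon:
                nxt = rng.choice(actions)  # explore
            else:
                nxt = max(actions, key=lambda a: q[(node, a)])  # exploit
            time_cost, danger = GRAPH[node][nxt]
            reward = -(time_cost + DANGER_WEIGHT * danger)
            visited.add(nxt)
            if nxt == goal:
                target = reward + GOAL_REWARD
            else:
                future = [q[(nxt, a)] for a in allowed_actions(nxt, visited)]
                target = reward + gamma * max(future, default=0.0)
            # Standard Q-learning update toward the bootstrapped target.
            q[(node, nxt)] += alpha * (target - q[(node, nxt)])
            node = nxt
    return q


def greedy_path(q, start, goal):
    """Follow the highest-valued unvisited neighbor until the goal is reached."""
    path, visited, node = [start], {start}, start
    while node != goal:
        actions = allowed_actions(node, visited)
        if not actions:
            return None
        node = max(actions, key=lambda a: q[(node, a)])
        visited.add(node)
        path.append(node)
    return path


q_table = train("A", "E")
print(greedy_path(q_table, "A", "E"))
```

Under these assumed weights the learned policy should prefer the longer route through B and D over the shorter but dangerous edge from A to C; lowering `DANGER_WEIGHT` shifts the balance back toward the shorter path.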

Authors: Zhao-Pin Su, Jian-Guo Jiang, Chang-Yong Liang, Guo-Fu Zhang

Affiliations: Key Laboratory of Special Display Technology (Hefei University of Technology); School of Computer and Information; Postdoctoral Research Station for Management Science and Engineering; Engineering Research Center of Safety Critical Industrial Measurement and Control Technology

Source: International Journal of Automation and Computing (English edition), EI-indexed, 2011, Issue 1, pp. 100-106 (7 pages)

Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2009CB326203); National Natural Science Foundation of China (No. 61004103); National Research Foundation for the Doctoral Program of Higher Education of China (No. 20100111110005); China Postdoctoral Science Foundation (No. 20090460742); National Engineering Research Center of Special Display Technology (No. 2008HGXJ0350); Natural Science Foundation of Anhui Province (No. 090412058, No. 070412035); Natural Science Foundation of Anhui Province of China (No. 11040606Q44, No. 090412058); Specialized Research Fund for Doctoral Scholars of Hefei University of Technology (No. GDBJ2009-003, No. GDBJ2009-067)

Keywords: disaster response management; path selection; agent; self-organizing; Markov decision process; Q-learning

Classification: TP [Automation and Computer Technology]

