Reinforcement Learning Behavioral Control for Nonlinear Autonomous System (Cited by: 2)

Abstract: Behavior-based autonomous systems rely on human intelligence to resolve multi-mission conflicts by designing mission priority rules and nonlinear controllers. In this work, a novel two-layer reinforcement learning behavioral control (RLBC) method is proposed to reduce such dependence by trial-and-error learning. Specifically, in the upper layer, a reinforcement learning mission supervisor (RLMS) is designed to learn the optimal mission priority. Compared with existing mission supervisors, the RLMS improves the dynamic performance of mission priority adjustment by maximizing cumulative rewards and reducing hardware storage demand when using neural networks. In the lower layer, a reinforcement learning controller (RLC) is designed to learn the optimal control policy. Compared with existing behavioral controllers, the RLC reduces the control cost of mission priority adjustment by balancing control performance and consumption. All error signals are proved to be semi-globally uniformly ultimately bounded (SGUUB). Simulation results show that the number of mission priority adjustments and the control cost are significantly reduced compared with some existing mission supervisors and behavioral controllers, respectively.
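As a rough illustration of the upper-layer idea in the abstract (a supervisor that learns a mission priority by trial and error while maximizing cumulative reward), the following is a minimal tabular Q-learning sketch. It is not the authors' RLMS, which the abstract says uses neural networks. The two missions, the two-state world, the reward values, the transition probability, and every function name here are hypothetical and chosen only to keep the example self-contained.

```python
import random

# Hypothetical toy setup (not from the paper): two missions, two priority orders.
PRIORITY_ORDERS = ["goal_first", "avoid_first"]  # actions available to the supervisor
STATES = ["clear", "obstacle_near"]              # coarse, discretized state


def toy_reward(state: str, action: str) -> float:
    """Assumed reward: penalize the risky order near obstacles and the slow order otherwise."""
    if state == "obstacle_near":
        return 1.0 if action == "avoid_first" else -1.0
    return 1.0 if action == "goal_first" else -0.2


def toy_transition(rng: random.Random) -> str:
    """Assumed dynamics: an obstacle is near 30% of the time, independent of the action."""
    return "obstacle_near" if rng.random() < 0.3 else "clear"


def train_supervisor(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table over (state, priority order) pairs with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in PRIORITY_ORDERS}
    state = "clear"
    for _ in range(steps):
        # Trial-and-error choice of the priority order.
        if rng.random() < epsilon:
            action = rng.choice(PRIORITY_ORDERS)
        else:
            action = max(PRIORITY_ORDERS, key=lambda a: q[(state, a)])
        reward = toy_reward(state, action)
        next_state = toy_transition(rng)
        # Standard Q-learning update toward the discounted cumulative reward.
        best_next = max(q[(next_state, a)] for a in PRIORITY_ORDERS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued priority order in each state."""
    return {s: max(PRIORITY_ORDERS, key=lambda a: q[(s, a)]) for s in STATES}


policy = greedy_policy(train_supervisor())
print(policy)
```

In this toy setting the learned priority order depends only on the assumed reward table; the lower-layer controller and the SGUUB stability analysis mentioned in the abstract are outside the scope of the sketch.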

Authors: Zhenyi Zhang, Zhibin Mo, Yutao Chen, Jie Huang

Affiliations: the College of Electrical Engineering and Automation; the Key Laboratory of Industrial Automation Control Technology and Information Processing; G+Industrial Internet Institute

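Source: IEEE/CAA Journal of Automatica Sinica (indexed in SCIE, EI, CSCD), 2022, No. 9, pp. 1561-1573 (13 pages). Chinese journal title: 自动化学报（英文版） (Acta Automatica Sinica, English edition)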

Funding: the National Natural Science Foundation of China (61603094).

Keywords: behavioral control; mission supervisor; nonlinear autonomous system; reinforcement learning

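Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP273 [Automation and Computer Technology: Detection Technology and Automation Devices]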

