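Abstract
During certain on-orbit missions, a spacecraft's mass parameters can change in unknown ways, and traditional control methods perform poorly under such conditions. This paper proposes a reinforcement-learning-based design method for spacecraft attitude controllers. Training the attitude controller requires no dynamic model of the spacecraft and does not depend on its mass parameters. When the mass parameters undergo large unknown changes, the trained controller still maintains good control performance. Simulation tests show that controllers trained with the reinforcement learning method are indeed robust. In addition, because the design of the reward function markedly affects the training of the attitude controller, different reward function designs are also studied.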
Owing to the growing complexity of space missions, classical control methods cannot meet the increasingly high requirements for the robustness and adaptiveness of the satellite attitude control system. In this paper, a design method for the satellite attitude control system is proposed based on the reinforcement learning (RL) method. With the proposed method, it is not necessary to establish a dynamic model of the spacecraft when training the attitude controller, and the resulting attitude control system does not depend on the spacecraft mass parameters. Moreover, when the mass parameters change, the trained controller still maintains good control performance. Test results show that the control system trained by the RL method has strong adaptive capability. In addition, since the design of the reward function significantly affects training, different reward function designs are also studied.
Authors
ZHANG Ruiqing (张瑞卿); ZHONG Rui (钟睿); XU Yi (徐毅)
School of Astronautics, Beihang University, Beijing 102206, China; Shanghai Institute of Satellite Engineering, Shanghai 201109, China
Source
Aerospace Shanghai (Chinese & English) 《上海航天(中英文)》
CSCD
2023, Issue 1, pp. 80-85 (6 pages)
Funding
National Natural Science Foundation of China (11772023)
Shanghai Aerospace Science and Technology Innovation Fund (SAST2019-040)
Keywords
spacecraft attitude control
robustness
reinforcement learning
neural network
reward function