Funding: National Natural Science Foundation of China (No. 61773374) and National Key Research and Development Program of China (No. 2017YFB1300104).
Abstract: Endowing quadruped robots with the ability to jump forward helps them overcome obstacles and traverse complex terrain. This paper presents a model-free control architecture for quadruped robot jumping that combines target-guided policy optimization with deep reinforcement learning (DRL). First, the jump is divided into a take-off phase and a flight-landing phase, and an optimal strategy is constructed for each phase with soft actor-critic (SAC). Second, policy learning is designed to include expectations, penalties over the whole jumping process, and extrinsic excitations. Corresponding policies and constraints are provided for successful take-off, excellent flight attitude, and stable standing after landing. To avoid the low efficiency of random exploration, a curiosity module is introduced as an extrinsic reward. Additionally, the target-guided module encourages the robot to explore progressively closer to the desired jumping target. Simulation results indicate that the quadruped robot can achieve complete forward jumping locomotion with good horizontal and vertical distances, as well as excellent motion attitudes.
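The abstract does not give the reward formulas, so the following is only a minimal sketch of how a task reward, a penalty, a target-guided term, and a curiosity bonus might be summed into one training signal. Every function name, parameter, and weight here (`target_guided_term`, `curiosity_bonus`, `total_reward`, `scale`, `eta`) is hypothetical. The curiosity bonus is modeled as a forward-model prediction error, which is a common choice and an assumption here, not a detail confirmed by the paper.

```python
import math


def target_guided_term(position, target, scale=1.0):
    """Hypothetical shaping term: less negative as the robot nears the target."""
    return -scale * math.dist(position, target)


def curiosity_bonus(predicted_next_state, actual_next_state, eta=0.5):
    """Assumed curiosity signal: scaled squared error of a forward model's
    prediction, so poorly predicted (novel) transitions earn a larger bonus."""
    squared_error = sum(
        (p - a) ** 2 for p, a in zip(predicted_next_state, actual_next_state)
    )
    return 0.5 * eta * squared_error


def total_reward(task_reward, penalty, position, target,
                 predicted_next_state, actual_next_state):
    """Sum the illustrative terms; the weights are placeholders, not tuned values."""
    return (
        task_reward
        - penalty
        + target_guided_term(position, target)
        + curiosity_bonus(predicted_next_state, actual_next_state)
    )
```

In a two-phase setup such as the one described, a function like `total_reward` could be evaluated separately for the take-off and flight-landing policies, each with its own task reward and penalty terms.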