Abstract
To improve jamming efficiency and accuracy against a non-cooperative intelligent radar whose working mode is unknown in a complex electromagnetic environment, a jamming decision method based on the partially observable Markov decision process (POMDP) is proposed. First, a POMDP model of the intelligent radar countermeasure system is constructed according to the operating characteristics of intelligent radar; a nonparametric, sample-based belief distribution represents the agent's knowledge of the environment, and a Bayesian filter updates that belief. Then, with information entropy as the evaluation criterion, the jammer repeatedly selects and tries the jamming style with the largest information entropy. Finally, simulation experiments compare the jamming decision performance of the proposed method with that of the traditional Q-learning method and the empirical decision-making method to verify its superiority. The results show that the proposed method can dynamically select the optimal jamming style as the unknown radar state changes, and that it reaches a jamming decision against the intelligent radar faster.
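The belief update and entropy-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The radar states, jamming styles, observations, and every probability in `OBS_MODEL` are hypothetical values chosen for illustration. The abstract does not say which distribution the entropy is computed over; this sketch assumes it is the predicted observation distribution under the current belief. Only the measurement-update step of the Bayesian filter is shown, so the radar state is assumed not to change between updates.

```python
import math
import random
from collections import Counter

# Hypothetical toy problem (not from the paper).
STATES = ["search", "track", "guidance"]
ACTIONS = ["noise", "deception"]
OBSERVATIONS = ["threat_down", "threat_same"]

# Assumed probability of observing "threat_down" given (state, action).
P_DOWN = {
    ("search", "noise"): 0.9, ("track", "noise"): 0.5, ("guidance", "noise"): 0.1,
    ("search", "deception"): 0.2, ("track", "deception"): 0.6, ("guidance", "deception"): 0.9,
}
# Full observation model P(o | s, a) built from the table above.
OBS_MODEL = {
    key: {"threat_down": p, "threat_same": 1.0 - p} for key, p in P_DOWN.items()
}


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def predicted_observation_dist(particles, action):
    """Average P(o | s, a) over the state samples that make up the belief."""
    dist = {o: 0.0 for o in OBSERVATIONS}
    for state in particles:
        for obs, p in OBS_MODEL[(state, action)].items():
            dist[obs] += p / len(particles)
    return dist


def select_action(particles):
    """Pick the jamming style whose predicted observation entropy is largest."""
    return max(
        ACTIONS,
        key=lambda a: entropy(predicted_observation_dist(particles, a)),
    )


def update_belief(particles, action, observation, rng):
    """Measurement update: weight samples by likelihood, then resample."""
    weights = [OBS_MODEL[(s, action)][observation] for s in particles]
    if sum(weights) == 0:
        return list(particles)  # observation impossible under model; keep belief
    return rng.choices(particles, weights=weights, k=len(particles))


if __name__ == "__main__":
    rng = random.Random(0)
    belief = STATES * 100  # uniform sample-based belief over radar states
    action = select_action(belief)
    belief = update_belief(belief, action, "threat_down", rng)
    print(action, Counter(belief))
```

Representing the belief as a list of state samples keeps it nonparametric: the update needs only a likelihood function, and the share of samples in each state approximates the posterior after resampling.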
Authors
FENG Luwei (冯路为)
LIU Songtao (刘松涛)
XU Huazhi (徐华志)
Department of Information System, Dalian Naval Academy, Dalian 116018, China
Source
Systems Engineering and Electronics (《系统工程与电子技术》)
EI
CSCD
PKU Core (北大核心)
2023, No. 9, pp. 2755-2760 (6 pages)
Funding
Supported by the China Postdoctoral Science Foundation (2015M572694, 2016T90979).
Keywords
intelligent radar
reinforcement learning
partially observable Markov decision process (POMDP) model
Bayesian filtering