The automatic stealth task of military time-sensitive targets plays a crucial role in maintaining national military security and mastering battlefield dynamics in military applications.We propose a novel Military Time...The automatic stealth task of military time-sensitive targets plays a crucial role in maintaining national military security and mastering battlefield dynamics in military applications.We propose a novel Military Time-sensitive Targets Stealth Network via Real-time Mask Generation(MTTSNet).According to our knowledge,this is the first technology to automatically remove military targets in real-time from videos.The critical steps of MTTSNet are as follows:First,we designed a real-time mask generation network based on the encoder-decoder framework,combined with the domain expansion structure,to effectively extract mask images.Specifically,the ASPP structure in the encoder could achieve advanced semantic feature fusion.The decoder stacked high-dimensional information with low-dimensional information to obtain an effective mask layer.Subsequently,the domain expansion module guided the adaptive expansion of mask images.Second,a context adversarial generation network based on gated convolution was constructed to achieve background restoration of mask positions in the original image.In addition,our method worked in an end-to-end manner.A particular semantic segmentation dataset for military time-sensitive targets has been constructed,called the Military Time-sensitive Target Masking Dataset(MTMD).The MTMD dataset experiment successfully demonstrated that this method could create a mask that completely occludes the target and that the target could be hidden in real time using this mask.We demonstrated the concealment performance of our proposed method by comparing it to a number of well-known and highly optimized baselines.展开更多
In recent years,many visual positioning algorithms have been proposed based on computer vision and they have achieved good results.However,these algorithms have a single function,cannot perceive the environment,and ha...In recent years,many visual positioning algorithms have been proposed based on computer vision and they have achieved good results.However,these algorithms have a single function,cannot perceive the environment,and have poor versatility,and there is a certain mismatch phenomenon,which affects the positioning accuracy.Therefore,this paper proposes a location algorithm that combines a target recognition algorithm with a depth feature matching algorithm to solve the problem of unmanned aerial vehicle(UAV)environment perception and multi-modal image-matching fusion location.This algorithm was based on the single-shot object detector based on multi-level feature pyramid network(M2Det)algorithm and replaced the original visual geometry group(VGG)feature extraction network with the ResNet-101 network to improve the feature extraction capability of the network model.By introducing a depth feature matching algorithm,the algorithm shares neural network weights and realizes the design of UAV target recognition and a multi-modal image-matching fusion positioning algorithm.When the reference image and the real-time image were mismatched,the dynamic adaptive proportional constraint and the random sample consensus consistency algorithm(DAPC-RANSAC)were used to optimize the matching results to improve the correct matching efficiency of the target.Using the multi-modal registration data set,the proposed algorithm was compared and analyzed to verify its superiority and feasibility.The results show that the algorithm proposed in this paper can effectively deal with the matching between multi-modal images(visible image–infrared image,infrared image–satellite image,visible image–satellite image),and the contrast,scale,brightness,ambiguity deformation,and other changes had good stability and robustness.Finally,the effectiveness and practicability of the algorithm proposed in this paper were verified in an aerial test scene of an S1000 sixrotor UAV.展开更多
基金supported in part by the National Natural Science Foundation of China(Grant No.62276274)Shaanxi Natural Science Foundation(Grant No.2023-JC-YB-528)Chinese aeronautical establishment(Grant No.201851U8012)。
文摘The automatic stealth task of military time-sensitive targets plays a crucial role in maintaining national military security and mastering battlefield dynamics in military applications.We propose a novel Military Time-sensitive Targets Stealth Network via Real-time Mask Generation(MTTSNet).According to our knowledge,this is the first technology to automatically remove military targets in real-time from videos.The critical steps of MTTSNet are as follows:First,we designed a real-time mask generation network based on the encoder-decoder framework,combined with the domain expansion structure,to effectively extract mask images.Specifically,the ASPP structure in the encoder could achieve advanced semantic feature fusion.The decoder stacked high-dimensional information with low-dimensional information to obtain an effective mask layer.Subsequently,the domain expansion module guided the adaptive expansion of mask images.Second,a context adversarial generation network based on gated convolution was constructed to achieve background restoration of mask positions in the original image.In addition,our method worked in an end-to-end manner.A particular semantic segmentation dataset for military time-sensitive targets has been constructed,called the Military Time-sensitive Target Masking Dataset(MTMD).The MTMD dataset experiment successfully demonstrated that this method could create a mask that completely occludes the target and that the target could be hidden in real time using this mask.We demonstrated the concealment performance of our proposed method by comparing it to a number of well-known and highly optimized baselines.
基金supported in part by the National Natural Science Foundation of China under Grant 62276274in part by the Natural Science Foundation of Shaanxi Province under Grant 2020JM-537,and in part by the Aeronautical Science Fund under Grant 201851U8012(corresponding author:Xiaogang Yang).
文摘In recent years,many visual positioning algorithms have been proposed based on computer vision and they have achieved good results.However,these algorithms have a single function,cannot perceive the environment,and have poor versatility,and there is a certain mismatch phenomenon,which affects the positioning accuracy.Therefore,this paper proposes a location algorithm that combines a target recognition algorithm with a depth feature matching algorithm to solve the problem of unmanned aerial vehicle(UAV)environment perception and multi-modal image-matching fusion location.This algorithm was based on the single-shot object detector based on multi-level feature pyramid network(M2Det)algorithm and replaced the original visual geometry group(VGG)feature extraction network with the ResNet-101 network to improve the feature extraction capability of the network model.By introducing a depth feature matching algorithm,the algorithm shares neural network weights and realizes the design of UAV target recognition and a multi-modal image-matching fusion positioning algorithm.When the reference image and the real-time image were mismatched,the dynamic adaptive proportional constraint and the random sample consensus consistency algorithm(DAPC-RANSAC)were used to optimize the matching results to improve the correct matching efficiency of the target.Using the multi-modal registration data set,the proposed algorithm was compared and analyzed to verify its superiority and feasibility.The results show that the algorithm proposed in this paper can effectively deal with the matching between multi-modal images(visible image–infrared image,infrared image–satellite image,visible image–satellite image),and the contrast,scale,brightness,ambiguity deformation,and other changes had good stability and robustness.Finally,the effectiveness and practicability of the algorithm proposed in this paper were verified in an aerial test scene of an S1000 sixrotor UAV.