Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducin...Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducing the accuracy of the image classifiers.In this paper,we propose a novel defense method which based on perceptual hash.Our main goal is to destroy the process of perturbations generation by comparing the similarities of images thus achieve the purpose of defense.To verify our idea,we defended against two main attack methods(a white-box attack and a black-box attack)in different DNN-based image classifiers and show that,after using our defense method,the attack-success-rate for all DNN-based image classifiers decreases significantly.More specifically,for the white-box attack,the attack-success-rate is reduced by an average of 36.3%.For the black-box attack,the average attack-success-rate of targeted attack and non-targeted attack has been reduced by 72.8%and 76.7%respectively.The proposed method is a simple and effective defense method and provides a new way to defend against adversarial samples.展开更多
Website fingerprinting,also known asWF,is a traffic analysis attack that enables local eavesdroppers to infer a user’s browsing destination,even when using the Tor anonymity network.While advanced attacks based on de...Website fingerprinting,also known asWF,is a traffic analysis attack that enables local eavesdroppers to infer a user’s browsing destination,even when using the Tor anonymity network.While advanced attacks based on deep neural network(DNN)can performfeature engineering and attain accuracy rates of over 98%,research has demonstrated thatDNNis vulnerable to adversarial samples.As a result,many researchers have explored using adversarial samples as a defense mechanism against DNN-based WF attacks and have achieved considerable success.However,these methods suffer from high bandwidth overhead or require access to the target model,which is unrealistic.This paper proposes CMAES-WFD,a black-box WF defense based on adversarial samples.The process of generating adversarial examples is transformed into a constrained optimization problem solved by utilizing the Covariance Matrix Adaptation Evolution Strategy(CMAES)optimization algorithm.Perturbations are injected into the local parts of the original traffic to control bandwidth overhead.According to the experiment results,CMAES-WFD was able to significantly decrease the accuracy of Deep Fingerprinting(DF)and VarCnn to below 8.3%and the bandwidth overhead to a maximum of only 14.6%and 20.5%,respectively.Specially,for Automated Website Fingerprinting(AWF)with simple structure,CMAES-WFD reduced the classification accuracy to only 6.7%and the bandwidth overhead to less than 7.4%.Moreover,it was demonstrated that CMAES-WFD was robust against adversarial training to a certain extent.展开更多
Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclu...Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclusion of a large number of data patterns.However,these data-driven models are vulnerable to adversarial attacks;thus,small perturbations on the samples can cause the models to provide incorrect fault predictions.Several recent studies have demonstrated the vulnerability of machine learning methods and the existence of adversarial samples.This paper proposes a black-box attack method with an extreme constraint for a safe-critical industrial fault classification system:Only one variable can be perturbed to craft adversarial samples.Moreover,to hide the adversarial samples in the visualization space,a Jacobian matrix is used to guide the perturbed variable selection,making the adversarial samples in the dimensional reduction space invisible to the human eye.Using the one-variable attack(OVA)method,we explore the vulnerability of industrial variables and fault types,which can help understand the geometric characteristics of fault classification systems.Based on the attack method,a corresponding adversarial training defense method is also proposed,which efficiently defends against an OVA and improves the prediction accuracy of the classifiers.In experiments,the proposed method was tested on two datasets from the Tennessee–Eastman process(TEP)and steel plates(SP).We explore the vulnerability and correlation within variables and faults and verify the effectiveness of OVAs and defenses for various classifiers and datasets.For industrial fault classification systems,the attack success rate of our method is close to(on TEP)or even higher than(on SP)the current most effective first-order white-box attack method,which requires perturbation of all variables.展开更多
Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and e...Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms against adversarial attacks. Most of the existing methods are just stopgaps for specific adversarial samples. The main obstacle is that how adversarial samples fool the deep learning models is still unclear. The underlying working mechanism of adversarial samples has not been well explored, and it is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples. The self-attention/transformer is adopted as a powerful tool in this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, the working mechanism of adversarial samples is revealed, and instructive analysis is provided. Then, we propose simple and effective adversarial sample detection and recognition methods according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments are conducted to demonstrate the effectiveness of the proposed methods. Our methods outperform the state-of-the-art defense methods under various adversarial attacks.展开更多
基金The work is supported by the National Key Research Development Program of China(2016QY01W0200)the National Natural Science Foundation of China NSFC(U1636101,U1736211,U1636219).
文摘Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducing the accuracy of the image classifiers.In this paper,we propose a novel defense method which based on perceptual hash.Our main goal is to destroy the process of perturbations generation by comparing the similarities of images thus achieve the purpose of defense.To verify our idea,we defended against two main attack methods(a white-box attack and a black-box attack)in different DNN-based image classifiers and show that,after using our defense method,the attack-success-rate for all DNN-based image classifiers decreases significantly.More specifically,for the white-box attack,the attack-success-rate is reduced by an average of 36.3%.For the black-box attack,the average attack-success-rate of targeted attack and non-targeted attack has been reduced by 72.8%and 76.7%respectively.The proposed method is a simple and effective defense method and provides a new way to defend against adversarial samples.
基金the Key JCJQ Program of China:2020-JCJQ-ZD-021-00 and 2020-JCJQ-ZD-024-12.
文摘Website fingerprinting,also known asWF,is a traffic analysis attack that enables local eavesdroppers to infer a user’s browsing destination,even when using the Tor anonymity network.While advanced attacks based on deep neural network(DNN)can performfeature engineering and attain accuracy rates of over 98%,research has demonstrated thatDNNis vulnerable to adversarial samples.As a result,many researchers have explored using adversarial samples as a defense mechanism against DNN-based WF attacks and have achieved considerable success.However,these methods suffer from high bandwidth overhead or require access to the target model,which is unrealistic.This paper proposes CMAES-WFD,a black-box WF defense based on adversarial samples.The process of generating adversarial examples is transformed into a constrained optimization problem solved by utilizing the Covariance Matrix Adaptation Evolution Strategy(CMAES)optimization algorithm.Perturbations are injected into the local parts of the original traffic to control bandwidth overhead.According to the experiment results,CMAES-WFD was able to significantly decrease the accuracy of Deep Fingerprinting(DF)and VarCnn to below 8.3%and the bandwidth overhead to a maximum of only 14.6%and 20.5%,respectively.Specially,for Automated Website Fingerprinting(AWF)with simple structure,CMAES-WFD reduced the classification accuracy to only 6.7%and the bandwidth overhead to less than 7.4%.Moreover,it was demonstrated that CMAES-WFD was robust against adversarial training to a certain extent.
基金This work was supported in part by the National Natural Science Foundation of China(NSFC)(92167106,62103362,and 61833014)the Natural Science Foundation of Zhejiang Province(LR18F030001).
文摘Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclusion of a large number of data patterns.However,these data-driven models are vulnerable to adversarial attacks;thus,small perturbations on the samples can cause the models to provide incorrect fault predictions.Several recent studies have demonstrated the vulnerability of machine learning methods and the existence of adversarial samples.This paper proposes a black-box attack method with an extreme constraint for a safe-critical industrial fault classification system:Only one variable can be perturbed to craft adversarial samples.Moreover,to hide the adversarial samples in the visualization space,a Jacobian matrix is used to guide the perturbed variable selection,making the adversarial samples in the dimensional reduction space invisible to the human eye.Using the one-variable attack(OVA)method,we explore the vulnerability of industrial variables and fault types,which can help understand the geometric characteristics of fault classification systems.Based on the attack method,a corresponding adversarial training defense method is also proposed,which efficiently defends against an OVA and improves the prediction accuracy of the classifiers.In experiments,the proposed method was tested on two datasets from the Tennessee–Eastman process(TEP)and steel plates(SP).We explore the vulnerability and correlation within variables and faults and verify the effectiveness of OVAs and defenses for various classifiers and datasets.For industrial fault classification systems,the attack success rate of our method is close to(on TEP)or even higher than(on SP)the current most effective first-order white-box attack method,which requires perturbation of all variables.
基金supported by National Key Research and Development Program of China(No.2020AAA0140002)Natural Science Foundation of China(Nos.U1836217,62076240,62006225,61906199,62071468,62176025 and U21B200389)the CAAI-Huawei Mind-spore Open Fund.
文摘Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms against adversarial attacks. Most of the existing methods are just stopgaps for specific adversarial samples. The main obstacle is that how adversarial samples fool the deep learning models is still unclear. The underlying working mechanism of adversarial samples has not been well explored, and it is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples. The self-attention/transformer is adopted as a powerful tool in this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, the working mechanism of adversarial samples is revealed, and instructive analysis is provided. Then, we propose simple and effective adversarial sample detection and recognition methods according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments are conducted to demonstrate the effectiveness of the proposed methods. Our methods outperform the state-of-the-art defense methods under various adversarial attacks.