Figure uploaded by Jianbo Chen
Top: Performance of detection methods trained with C&W-MIX and tested on C&W-LC, C&W-HC and C&W-MIX. Bottom: Performance of detection methods trained with L∞-PGD-MIX and tested on L∞-PGD-LC, L∞-PGD-HC and L∞-PGD-MIX.
Source publication
Deep neural networks obtain state-of-the-art performance on a series of tasks. However, they are easily fooled by adding a small adversarial perturbation to the input. The perturbation is often imperceptible to humans on image data. We observe a significant difference in feature attributions of adversarially crafted examples from those of original ones. Ba...
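The abstract is truncated before the method details, but the difference it observes rests on leave-one-out style feature attribution: the change in a model's score when each feature is individually masked. A minimal, illustrative sketch (the function names and the toy linear model are assumptions, not the paper's implementation):

```python
import numpy as np

def loo_attribution(f, x, baseline=0.0):
    """Leave-one-out attribution: score change when each feature
    is individually reset to a baseline value."""
    base_score = f(x)
    attr = np.empty_like(x, dtype=float)
    for i in range(x.size):
        x_masked = x.copy()
        x_masked.flat[i] = baseline   # mask one feature at a time
        attr.flat[i] = base_score - f(x_masked)
    return attr

# Toy linear "model": score is a weighted sum of features.
w = np.array([1.0, -2.0, 3.0])
f = lambda x: float(x @ w)
x = np.array([1.0, 1.0, 1.0])
print(loo_attribution(f, x))  # for a linear model, attribution_i = w_i * x_i
```

For a linear model the attribution of feature i reduces exactly to w_i * x_i, which makes the sketch easy to sanity-check; a detector would compare the distribution of such attributions on original versus adversarial inputs.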
Contexts in source publication
Context 1
... test the detection methods on a different set of original and adversarial images generated from three versions: the low-confidence C&W attack (c = 0), the high-confidence C&W attack (c = 50), and the mixed-confidence C&W attack. Table 2 (Top) and Figure 4 (Left) show TPRs at different FPR thresholds, AUC, and the ROC curve. Mahalanobis, LID, and KD+BU fail to detect mixed-confidence adversarial examples effectively, while our method performs consistently better across the three settings. ...
Context 2
... corresponding original images are different from the training images. Table 2 (Bottom) and Figure 4 (Right) show TPRs at different FPR thresholds, AUC, and the ROC curve. Mahalanobis, LID, and KD+BU fail to detect mixed-confidence adversarial examples effectively, while our method performs significantly better across the three settings. ...
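The contexts above report TPR at fixed FPR thresholds and AUC. As a minimal sketch of how these metrics are derived from raw detector scores (pure NumPy, illustrative scores, not the paper's detector):

```python
import numpy as np

def roc_curve(scores_orig, scores_adv):
    """Sweep a threshold over all detector scores; a higher score
    means 'more likely adversarial'. Returns (FPR, TPR) arrays."""
    thresholds = np.sort(np.concatenate([scores_orig, scores_adv]))[::-1]
    fpr = np.array([(scores_orig >= t).mean() for t in thresholds])
    tpr = np.array([(scores_adv >= t).mean() for t in thresholds])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    order = np.argsort(fpr, kind="stable")
    f, t = fpr[order], tpr[order]
    return float(np.sum(np.diff(f) * (t[1:] + t[:-1]) / 2.0))

def tpr_at_fpr(fpr, tpr, max_fpr):
    """Best TPR achievable while keeping FPR <= max_fpr."""
    ok = fpr <= max_fpr
    return float(tpr[ok].max()) if ok.any() else 0.0

# Illustrative scores for a perfectly separated detector.
orig = np.array([0.10, 0.20, 0.30, 0.40])
adv = np.array([0.60, 0.70, 0.80, 0.90])
fpr, tpr = roc_curve(orig, adv)
print(auc(fpr, tpr), tpr_at_fpr(fpr, tpr, 0.05))  # → 1.0 1.0
```

Reporting TPR at a small fixed FPR (as in Table 2) captures how useful a detector is when false alarms on clean images must stay rare, which a single AUC number can hide.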