Conference Paper

Emphasizing Similar Feature Representations to Defend against Adversarial Attacks

Authors:

Abstract

Deep Neural Networks (DNNs) are vulnerable to adversarial perturbations of the input data. For DNNs to be deployed in critical applications, they must be made robust to such perturbations. In this work, we test an existing strategy and propose a new strategy, both based on autoencoders, to defend DNNs against adversarial attacks. The first strategy is based on the Contractive AutoEncoder (CAE); the second is a Siamese-network-based AutoEncoder (SAE). Both CAE and SAE emphasize similar feature representations for an original sample and its adversarial counterpart. The CAE reconstructs the original sample from the adversarial input by penalizing the norm of the Jacobian of the latent representation with respect to the input. The SAE processes the pair of original sample and adversarial input simultaneously, enforcing similarity between their representations at layer-level granularity. We demonstrate the strong performance of both strategies against three popular attacks: the Fast Gradient Sign Method (FGSM), DeepFool (DF), and Universal Adversarial Perturbations (UAP). In particular, the proposed methods reduce the effectiveness of adversarial attacks, raising accuracy from 7% to 97% under the powerful DeepFool attack on the MNIST dataset and from 18% to 67% under Universal Adversarial Perturbations on the Fashion-MNIST dataset. We also show that our methods outperform a recent defence based on the Denoising AutoEncoder (DAE).
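
The abstract describes the two defences only at a high level; the sketch below (PyTorch) is one plausible way to express the losses it mentions, not the authors' implementation. The single sigmoid hidden layer, the layer sizes, the weighting terms lam and alpha, and the restriction of the Siamese similarity term to the latent layer are all illustrative assumptions.

# Minimal sketch (assumed, not the authors' code) of the two losses the
# abstract describes: a contractive (Jacobian-penalized) autoencoder and a
# Siamese-style similarity term between clean and adversarial representations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContractiveAE(nn.Module):
    """Single-hidden-layer autoencoder with a contractive (Jacobian) penalty."""

    def __init__(self, in_dim=784, hid_dim=128):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid_dim)
        self.dec = nn.Linear(hid_dim, in_dim)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))        # latent representation
        x_hat = torch.sigmoid(self.dec(h))    # reconstruction
        return x_hat, h

    def loss(self, x_adv, x_clean, lam=1e-4):
        # Reconstruct the clean sample from the (possibly adversarial) input.
        x_hat, h = self.forward(x_adv)
        rec = F.mse_loss(x_hat, x_clean)
        # For a sigmoid encoder the squared Frobenius norm of dh/dx is
        # sum_j (h_j * (1 - h_j))^2 * ||W_j||^2, computed analytically here.
        W = self.enc.weight                   # shape (hid_dim, in_dim)
        dh = (h * (1 - h)) ** 2               # shape (batch, hid_dim)
        contractive = (dh @ (W ** 2).sum(dim=1)).mean()
        return rec + lam * contractive


def siamese_ae_loss(model, x_clean, x_adv, alpha=1.0):
    """Siamese idea in simplified form: reconstruct the clean input from both
    branches while pulling the clean and adversarial latent codes together."""
    x_hat_c, h_c = model(x_clean)
    x_hat_a, h_a = model(x_adv)
    rec = F.mse_loss(x_hat_c, x_clean) + F.mse_loss(x_hat_a, x_clean)
    sim = F.mse_loss(h_a, h_c)                # representation-similarity term
    return rec + alpha * sim

In the paper the SAE reportedly enforces similarity at every layer; the sketch keeps only the latent-layer term for brevity, and either loss would be minimized over (clean, adversarial) training pairs before the autoencoder is placed in front of the classifier.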

