Article

A Local Perturbation Generation Method for GAN-Generated Face Anti-Forensics

Abstract

Although the current generative adversarial network (GAN)-generated face forensic detectors based on deep neural networks (DNNs) have achieved considerable performance, they are vulnerable to adversarial attacks. In this paper, an effective local perturbation generation method is proposed to expose the vulnerability of state-of-the-art forensic detectors. The main idea is to mine the areas of fake faces that multiple detectors commonly attend to in their decision-making, and then generate local anti-forensic perturbations with GANs in these areas to enhance the visual quality and transferability of anti-forensic faces. Meanwhile, in order to improve the anti-forensic effect, a double-mask (soft mask and hard mask) strategy and a three-part loss (the GAN training loss, the adversarial loss consisting of ensemble classification loss and ensemble feature loss, and the regularization loss) are designed for the training of the generator. Experiments conducted on fake faces generated by StyleGAN demonstrate the proposed method’s advantage over the state-of-the-art methods in terms of anti-forensic success rate, imperceptibility, and transferability. The source code is available at https://github.com/imagecbj/A-Local-Perturbation-Generation-Method-for-GAN-generated-Face-Anti-forensics.
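
As a rough illustration of how the double-mask constraint and the three-part loss described above could fit together, the following PyTorch-style sketch is provided; the module interfaces, the convention that detectors output class 0 for "real", and the loss weights are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def masked_perturbation(x_fake, generator, soft_mask, hard_mask):
    """Constrain the generated perturbation with the double mask.

    soft_mask : continuous weights in [0, 1] over the commonly attended regions
    hard_mask : binary mask selecting the local attack area
    """
    delta = generator(x_fake) * soft_mask * hard_mask
    x_adv = torch.clamp(x_fake + delta, -1.0, 1.0)
    return x_adv, delta

def three_part_loss(x_adv, x_real, delta, discriminator,
                    detectors, feat_extractors,
                    w_gan=1.0, w_adv=10.0, w_reg=1.0):
    # (1) GAN training loss: the attacked face should still look realistic.
    d_out = discriminator(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))

    # (2) Adversarial loss over an ensemble of forensic detectors:
    #     the classification term pushes every detector toward the "real"
    #     label (assumed class 0); the feature term pulls deep features
    #     toward those of genuinely real faces.
    real_label = torch.zeros(x_adv.size(0), dtype=torch.long, device=x_adv.device)
    loss_cls = sum(F.cross_entropy(det(x_adv), real_label)
                   for det in detectors) / len(detectors)
    loss_feat = sum(F.mse_loss(fe(x_adv), fe(x_real).detach())
                    for fe in feat_extractors) / len(feat_extractors)
    loss_adv = loss_cls + loss_feat

    # (3) Regularization loss: keep the local perturbation small.
    loss_reg = delta.pow(2).mean()

    return w_gan * loss_gan + w_adv * loss_adv + w_reg * loss_reg
```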

... Ding et al. [47][48][49][50][51][53] propose anti-forensic tools and methods to bypass "DeepFake" detection in videos. On the other hand, Zhao, X. et al. [52,54] apply anti-forensic techniques to "DeepFake" detection in GAN-generated images. ...
... Studies [47][48][49][50][51][52][53][54] propose different strategies and models for generating DeepFakes that can evade forensic detection. Paper [47] proposes GAN models with additional features and loss functions designed to improve visual quality and model efficiency. ...
... Paper [47] proposes GAN models with additional features and loss functions designed to improve visual quality and model efficiency. Article [48] employs local perturbations to expose the vulnerability of forensic detectors, while article [49] uses a bidirectional conversion between computer-generated and natural face images. Articles [50,51] focus on adversarial attacks against DeepFake detectors, reducing detection accuracy and highlighting the need to consider visual perception when generating these attacks. ...
Article
Full-text available
The main purpose of anti-forensic computer techniques, in the broadest sense, is to hinder the investigation of a computer attack by eliminating traces and preventing the collection of data contained in a computer system. Nowadays, cyber-attacks are becoming more and more frequent and sophisticated, so it is necessary to understand the techniques used by hackers to be able to carry out a correct forensic analysis leading to the identification of the perpetrators. Despite its importance, this is a poorly represented area in the scientific literature. The disparity of the existing works, together with the small number of articles, makes it challenging to find one’s way around the vast world of computer forensics. This article presents a comprehensive review of the existing scientific literature on anti-forensic techniques, mainly DFIR (digital forensics incident response), organizing the studies according to their subject matter and orientation. It also presents key ideas that contribute to the understanding of this field of forensic science and details the shortcomings identified after reviewing the state of the art.
... In recent years, with the introduction and advancement of GANs, they have become a focal point of research for an increasing number of scholars. GANs have exhibited remarkable performance in the field of image processing, in areas such as image recognition [19][20][21], image restoration [22], image generation [21], and image super-resolution [23]. ...
Article
Full-text available
When capturing distant targets, the video sequence images are affected by atmospheric turbulence, resulting in distortion and blur. In order to restore the degraded images due to atmospheric turbulence in video sequences, this article combines lucky imaging with generative adversarial networks for the first time. The idea of lucky imaging is employed to eliminate geometric distortions, followed by the use of generative adversarial networks to address the blur issue. Additionally, an adaptive restoration method targeting turbulence intensity is proposed to improve the computational efficiency of the proposed approach. Experimental results demonstrate that the combined restoration method of lucky imaging and generative adversarial networks outperforms classical lucky imaging. Specifically, compared to classical lucky imaging, the Brenner gradient function, Laplacian gradient function, Spatial Median Difference (SMD), Entropy, Energy gradient function, PIQE, and Brisque indicators improve by 7.7%, 13.1%, 3.6%, 4.1%, 2.1%, 26.6% and 21.54% (all evaluation indicators in the above improvement rates have undergone logarithmic transformation), respectively. Meanwhile, the proposed adaptive restoration method can improve efficiency by 28%, with greater efficiency gains observed with larger datasets.
... As cyberspace activities become increasingly frequent, cyberspace portraits raise many security issues. Therefore, it is crucial to protect the privacy of portraits [1,2]. Image-to-image (I2I) translation has become a popular research topic in recent years, and it aims to learn image-projecting functions from source to target domains. ...
Article
Full-text available
Image-to-image (I2I) translation has emerged as a valuable tool for privacy protection in the digital age, offering effective ways to safeguard portrait rights in cyberspace. In addition, I2I translation is applied in real-world tasks such as image synthesis, super-resolution, virtual fitting, and virtual live streaming. Traditional I2I translation models demonstrate strong performance when handling similar datasets. However, when the domain distance between two datasets is large, translation quality may degrade significantly due to notable differences in image shape and edges. To address this issue, we propose Long-Domain Search GAN (LDSGAN), an unsupervised I2I translation network that employs a GAN structure as its backbone, incorporating a novel Real-Time Routing Search (RTRS) module and Sketch Loss. Specifically, RTRS aids in expanding the search space within the target domain, aligning feature projection with images closest to the optimization target. Additionally, Sketch Loss retains human visual similarity during long-domain distance translation. Experimental results indicate that LDSGAN surpasses existing I2I translation models in both image quality and semantic similarity between input and generated images, as reflected by its mean FID and LPIPS scores of 31.509 and 0.581, respectively.
... For instance, [35] proposed Face X-ray, a method predicting blending boundaries in fake video frames. [74] employed adversarial learning to add perturbations to regions of interest, revealing vulnerabilities in current image-based forgery detection. The works in [69] and [7] demonstrated good generalization abilities through intra-class consistency and inter-class diversity. ...
Article
Full-text available
Deep-fake videos, generated through AI face-swapping techniques, have garnered considerable attention due to their potential for impactful impersonation attacks. While existing research primarily distinguishes real from fake videos, attributing a deep-fake to its specific generation model or encoder is crucial for forensic investigation, enabling precise source tracing and tailored countermeasures. This approach not only enhances detection accuracy by leveraging unique model-specific artifacts but also provides insights essential for developing proactive defenses against evolving deep-fake techniques. Addressing this gap, this paper investigates the model attribution problem for deep-fake videos using two datasets: Deepfakes from Different Models (DFDM) and GANGen-Detection, which comprise deep-fake videos and images generated by GAN models. We select only fake images from the GANGen-Detection dataset to align with the DFDM dataset, which specifies the goal of this study, focusing on model attribution rather than real/fake classification. This study formulates deep-fake model attribution as a multiclass classification task, introducing a novel Capsule-Spatial-Temporal (CapST) model that effectively integrates a modified VGG19 (utilizing only the first 26 out of 52 layers) for feature extraction, combined with Capsule Networks and a Spatio-Temporal attention mechanism. The Capsule module captures intricate feature hierarchies, enabling robust identification of deep-fake attributes, while a video-level fusion technique leverages temporal attention mechanisms to process concatenated feature vectors and capture temporal dependencies in deep-fake videos. By aggregating insights across frames, our model achieves a comprehensive understanding of video content, resulting in more precise predictions. Experimental results on the DFDM and GANGen-Detection datasets demonstrate the efficacy of CapST, achieving substantial improvements in accurately categorizing deep-fake videos over baseline models, all while demanding fewer computational resources.
... Deepfake has a wide range of applications in the field of face images [4,5], typically for localized manipulation such as facial editing, replication, and replacement in several key areas. On the basis of StarGAN [22], Zhang et al. [23] used local perturbation generation to make fake faces, and the generated images were more deceptive. Facial editing methods such as StarGAN [22], attGAN [24] and STGAN [25] allow the addition, deletion, or modification of certain attributes of human faces, such as hair and age. Facial replication allows for altering facial expressions while maintaining a sense of facial authenticity. ...
Article
Full-text available
The forensic examination of AIGC (Artificial Intelligence Generated Content) faces poses a contemporary challenge within the realm of color image forensics. A myriad of artificially generated faces by AIGC encompasses both global and local manipulations. While there has been noteworthy progress in the forensic scrutiny of fake faces, current research primarily focuses on the isolated detection of globally and locally manipulated fake faces, thus lacking a universally effective detection methodology. To address this limitation, we propose a sophisticated forensic model that incorporates a dual-stream framework comprising quaternion RGB and PRNU (Photo Response Non-Uniformity). The PRNU stream extracts the “camera fingerprint” feature by discerning the non-uniform response of the image sensor under varying lighting conditions, thereby encapsulating the overall distribution characteristics of globally manipulated faces. The quaternion RGB stream leverages the inherent nonlinear properties of quaternions and their informative representation capabilities to accurately describe changes in image color, background, and spatial structure, facilitating the meticulous capture of nuanced local distinctions between locally manipulated faces and real faces. Ultimately, we integrate the two streams to establish the exchange of feature information between the PRNU and quaternion RGB streams. This strategic integration fully exploits the complementarity between the two streams to amalgamate local and global features effectively. Experimental results obtained from diverse datasets underscore the advantages of our method in terms of accuracy, achieving a detection accuracy of 96.81%.
... While data augmentation and dataset expansion have been employed to enhance generalization [11], they often yield limited effectiveness due to their predictability. Furthermore, the robustness of GAN-based detectors is often compromised in the face of adversarial perturbations, as evidenced by the localized attacks conducted by Zhang et al. [19], which revealed vulnerabilities in a range of detection models. As for diffusion-generated images, effective detection techniques remain underexplored, and those available tend to be overly complex, lacking comprehensive image feature analysis, thus hindering their practical applicability. ...
Article
Full-text available
Within the domain of Artificial Intelligence Generated Content (AIGC), technological strides in image generation have been marked, resulting in the proliferation of deepfake images that pose substantial security threats. The current landscape of deepfake detection technologies is marred by limited generalization across diverse generative models and a subpar detection rate for images generated through diffusion processes. In response to these challenges, this paper introduces a novel detection model designed for high generalizability, leveraging multiscale frequency and spatial domain features. Our model harnesses an array of specialized filters to extract frequency-domain characteristics, which are then integrated with spatial-domain features captured by a Feature Pyramid Network (FPN). The integration of the Attentional Feature Fusion (AFF) mechanism within the feature fusion module allows for the optimal utilization of the extracted features, thereby enhancing detection capabilities. We curated an extensive dataset encompassing deepfake images from a variety of GANs and diffusion models for rigorous evaluation. The experimental findings reveal that our proposed model achieves superior accuracy and generalization compared to existing baseline models when confronted with deepfake images from multiple generative sources. Notably, in cross-model detection scenarios, our model outperforms the next best model by a significant margin of 29.1% for diffusion-generated images and 15.1% for GAN-generated images. This accomplishment presents a viable solution to the pressing issues of generalization and adaptability in the field of deepfake detection.
... The rapid proliferation of deepfake technologies, which use deep learning techniques to generate highly realistic yet fabricated media, poses a significant threat to the integrity and credibility of digital content [1,2]. The mainstream methods of deepfakes are shown in Figure 1. ...
Article
Full-text available
Detecting deepfake media remains an ongoing challenge, particularly as forgery techniques rapidly evolve and become increasingly diverse. Existing face forgery detection models typically attempt to discriminate fake images by identifying either spatial artifacts (e.g., generative distortions and blending inconsistencies) or predominantly frequency-based artifacts (e.g., GAN fingerprints). However, a singular focus on a single type of forgery cue can lead to limited model performance. In this work, we propose a novel cross-domain approach that leverages a combination of both spatial and frequency-aware cues to enhance deepfake detection. First, we extract wavelet features using wavelet transformation and residual features using a specialized frequency domain filter. These complementary feature representations are then concatenated to obtain a composite frequency domain feature set. Furthermore, we introduce an adaptive feature fusion module that integrates the RGB color features of the image with the composite frequency domain features, resulting in a rich, multifaceted set of classification features. Extensive experiments conducted on benchmark deepfake detection datasets demonstrate the effectiveness of our method. Notably, the accuracy of our method on the challenging FF++ dataset is mostly above 98%, showcasing its strong performance in reliably identifying deepfake images across diverse forgery techniques.
... While GAN-generated face forensic detectors based on deep neural networks (DNNs) have achieved remarkable performance, they are vulnerable to adversarial attacks. This study [14] proposes an effective local perturbation generation method to expose the shortcomings of state-of-the-art forensic detectors. The primary objective is to identify the regions of fake faces that multiple detectors commonly attend to during decision-making. ...
Article
Full-text available
Fake face identity is a serious, potentially fatal issue that affects every industry, from banking and finance to the military and mission-critical applications. This is where the proposed system offers artificial intelligence (AI)-based fake face detection. The models were trained on an extensive dataset of real and fake face images, incorporating steps like sampling, preprocessing, pooling, normalization, vectorization, batch processing, model training, testing, and classification via output activation. The proposed work performs a comparative analysis of three fusion models that can be integrated with Generative Adversarial Networks (GANs), based on performance evaluation. Model-3, which combines DenseNet-201+ResNet-102+Xception, offers the highest accuracy of 0.9797, and Model-2, which combines DenseNet-201+ResNet-50+Inception V3, offers the lowest loss value of 0.1146; both are suitable for GAN integration. Additionally, Model-1 performs admirably, with an accuracy of 0.9542 and a loss value of 0.1416. A second dataset was also tested, on which the proposed Model-3 provided a maximum accuracy of 86.42% with a minimum loss of 0.4054.
... The tremendous performance of deep learning models has led to rampant application of these systems in practice. However, these models can be manipulated by introducing minor perturbations [1]-[5]. This process is known as an adversarial attack. ...
Preprint
Full-text available
Adversarial attacks have been recently investigated in person re-identification. These attacks perform well under the cross-dataset or cross-model setting. However, the challenges present in the cross-dataset cross-model scenario do not allow these models to achieve similar accuracy. To this end, we propose our method with the goal of achieving better transferability against different models and across datasets. We generate a mask to obtain better performance across models and use meta learning to boost generalizability in the challenging cross-dataset cross-model setting. Experiments on Market-1501, DukeMTMC-reID and MSMT-17 demonstrate favorable results compared to other attacks.
Article
Steganography is a critical information‐hiding technique widely used for the covert transmission of secret information on social media. In contrast, steganalysis plays a key role in ensuring information security. Although various effective steganalysis algorithms have been proposed, existing studies typically treat color images as three independent channels and do not fully consider robust features suitable for JPEG images. To address this limitation, we propose a robust steganalysis algorithm based on high‐dimensional mapping. By analyzing the changes in color images during the JPEG compression and decompression processes, we observe that the embedding of secret information causes shifts in the JPEG coefficients, which subsequently affects feature representation during decompression. Based on this observation, our method captures steganographic traces by utilizing the transformation errors produced during decompression. Additionally, due to the imbalance between luminance and chrominance, the feature weights of each channel are uneven. To ensure balanced analysis across the three channels, we adjust the distribution differences of each channel through high‐dimensional mapping, thereby reducing intraclass feature variations. Experimental results demonstrate that the proposed method outperforms existing approaches in most cases.
Article
Deep neural networks (DNNs) have demonstrated excellent performance across various domains. However, recent studies have shown that deep neural networks are vulnerable to adversarial examples, including DNN-based video action recognition models. While much of the existing research on adversarial attacks against video models focuses on perturbation-based attacks, there is limited research on patch-based black-box attacks. Existing patch-based attack algorithms suffer from the problem of a large search space of optimization algorithms and use patches with simple content, leading to suboptimal attack performance or requiring a large number of queries. To address these challenges, we propose the “Diffusion Patch Attack (DPA) with Spatial-Temporal Cross-Evolution (STCE) for Video Recognition”, a novel approach that integrates the excellent properties of the diffusion model into video black-box adversarial attacks for the first time. This integration significantly narrows the parameter search space while enhancing the adversarial content of patches. Moreover, we introduce the spatial-temporal cross-evolutionary algorithm to adapt to the narrowed search space. Specifically, we separate the spatial and temporal parameters and then employ an alternate evolutionary strategy for each parameter type. Extensive experiments conducted on three widely used video action recognition models (C3D, NL, and TPN) and two benchmark datasets (UCF-101 and HMDB-51) demonstrate the superior performance of our approach compared to other state-of-the-art black-box patch attack algorithms.
Article
Most existing text-driven face image generation and manipulation methods are based on StyleGAN2, which is inherently limited to aligned faces and therefore causes these methods to fail to preserve highly variable face placement. Additionally, these methods directly leverage a pairwise loss to learn the correspondence between the image and text, which cannot handle complex text descriptions, e.g., text with multiple captions describing multiple facial attributes. To address these issues, we explore the feasibility of applying the more advanced StyleGAN3 to generate and manipulate face images in an Open-World setup, e.g., the target face image is not required to be aligned and the text description contains multiple captions. To this end, we first design an improved iterative refinement strategy that adaptively predicts the generator weight offsets rather than residuals for the inverted latent code via a hyper-network, which efficiently finds a desired generator with no image-specific optimization. We further analyze the disentanglement of different StyleGAN3 latent spaces and demonstrate that the S space learns a more semantically disentangled representation. To enable complex edits mentioned by the multi-caption text, we propose a cross-modal feature filtration module with a probability adaptation strategy to capture the image-text correspondences. Finally, we incorporate a channel-wise attention mechanism to obtain a global latent manipulation direction, which learns to assign importance weights to different channels. Extensive experiments demonstrate the superior performance of our proposed method compared against the state-of-the-art methods.
Article
Recently, the rapid advancement of generative models has led to their exploitation by malicious actors who employ them to fabricate fake synthetic images. Meanwhile, these deceptive images are often disseminated on social network platforms, thereby undermining public trust. Although reliable forensic tools have emerged to detect generative fake images, existing supervised detectors rely excessively on correctly-labeled training samples, leading to overwhelming outsourced annotation costs and the potential risk of suffering from label flipping attacks. In light of the aforementioned limitations, we propose an unsupervised detector fighting against generative fake images. In particular, we assign noisy labels to the training samples. Then, based on the pre-clustered samples with noisy labels, a pre-training and re-training strategy helps us train the feature extractor used to extract discriminative features. Finally, the extracted features guide us to cluster pristine and fake images separately; the fake images are effectively filtered by employing cosine similarity. Extensive experimental results highlight that our unsupervised detector rivals the baseline supervised methods; moreover, it has better capability of defending against label flipping attacks.
Chapter
Recently, researchers have developed new image splicing detection and localization algorithms by leveraging deep learning and contrastive learning techniques. These algorithms utilize Siamese neural networks to detect and localize spliced contents by identifying inconsistencies in an image’s forensic traces. At the same time, deep learning has also enabled researchers to develop new types of anti-forensic attacks that can fool forensic algorithms. Among these attacks, the generative adversarial network (GAN)-based approach has demonstrated superior performance in attacking forensic algorithms, including Siamese neural network based forensic algorithms. However, GAN-based attacks can sometimes still fail to fool these forensic algorithms due to the lack of direct control over the distribution of forensic traces in spliced images. In this chapter, we propose a new attack that refines the GAN-based attack on individual spliced images. Our attack achieves this by directly reducing the difference of forensic traces within the spliced images. Through a series of experiments, we demonstrate the capability of our attack to convincingly deceive Siamese neural network based image splicing detection and localization algorithms. The proposed attack significantly improves attack performance and successfully fools the targeted splicing detection and localization algorithms when the GAN-based attack fails.
Article
Active defense is an important approach to counter speech deepfakes that threaten individuals' privacy, property, and reputation. However, existing works in this field suffer from issues such as time-consuming generation and mediocre defense effectiveness. This letter proposes a Generative Adversarial Network (GAN) framework for adversarial attacks as a defense against malicious voice conversion. The proposed method uses a generator to produce adversarial perturbations and adds them to the mel-spectrogram of the target audio to craft adversarial examples. In addition, in order to enhance the defense effectiveness, a spectrogram waveform conversion simulation module (SWCSM) is designed to simulate the process of reconstructing the waveform from the adversarial mel-spectrogram example and re-extracting the mel-spectrogram from the reconstructed waveform. Experiments on four state-of-the-art voice conversion models show that our method achieves the overall best performance among five compared methods in both white-box and black-box scenarios in terms of defense effectiveness and generation time. The source code is available on GitHub at https://github.com/imagecbj/Initiative-Defense-against-Voice-Conversion-through-Generative-Adversarial-Network.
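
To make the role of the SWCSM concrete, the sketch below approximates the spectrogram-to-waveform round trip offline with librosa's Griffin-Lim inversion; the letter's module simulates this process inside training, so the parameter values and interface here are illustrative assumptions only.

```python
import librosa

def swcsm_round_trip(mel_adv, sr=22050, n_fft=1024, hop_length=256):
    """Reconstruct a waveform from an adversarial (power) mel-spectrogram via
    Griffin-Lim, then re-extract the mel-spectrogram, mimicking what a
    downstream voice-conversion pipeline would actually receive."""
    wav = librosa.feature.inverse.mel_to_audio(
        mel_adv, sr=sr, n_fft=n_fft, hop_length=hop_length)
    mel_back = librosa.feature.melspectrogram(
        y=wav, sr=sr, n_fft=n_fft, hop_length=hop_length,
        n_mels=mel_adv.shape[0])
    return wav, mel_back
```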
Article
In recent times, digital image forensics is gaining increased attention in multimedia forensics owing to the widespread scam alertness. Several forensic methods have been studied to establish the integrity of digital images by disclosing manipulation fingerprints. Anti-forensic (AF) attacks on manipulated images, particularly deep learning-based adversarial attacks using generative adversarial network (GAN), have been successfully applied to delude forensic methods. Consequently, an efficacious, efficient, and robust counter-AF (CAF) method is required to secure the integrity of digital images. In this study, we propose a robust open-set multi-instance learning approach for exposing GAN-based AF on manipulated images by introducing additional GAN-based operations. First, we generate multiple real instances from real images using multiple additional generators. Then we train an embedding network collaboratively with multiple real instances in an open-set fashion. During training, the embedding network learns only real images and has no prior knowledge regarding AF images. In the testing phase, real and AF images are processed for detection. The proposed open-set CAF method can effectively detect AF images and is more robust against transferable updating.
Article
In recent years, with the rapid development of face editing and generation, more and more fake videos are circulating on social media, which has caused extreme public concern. Existing face forgery detection methods based on the frequency domain find that GAN-forged images have obvious grid-like visual artifacts in the frequency spectrum. But for synthesized videos, these methods are confined to a single frame and pay little attention to the most discriminative part and temporal frequency clues among different frames. To take full advantage of the rich information in video sequences, this paper performs video forgery detection in both spatial and temporal frequency domains and proposes a Discrete Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spectrum spatial-temporal feature representation. FCAN-DCT consists of a backbone network and two branches: a Compact Feature Extraction (CFE) module and a Frequency Temporal Attention (FTA) module. We conduct thorough experimental assessments on three visible light (VIS) based datasets (i.e., FaceForensics++, Celeb-DF (v2), Wild-Deepfake) and our self-built video forgery dataset DeepfakeNIR, which is the first video forgery dataset in the near-infrared (NIR) modality. The experimental results demonstrate the effectiveness and robustness of our method for detecting forgery videos in both VIS and NIR scenarios. DeepfakeNIR and code are available at https://github.com/AEP-WYK/DeepfakeNIR.
Article
Face forgery technology has developed rapidly, causing severe security issues in society. Recently, with the continuous emergence of forgery techniques and types, most forensics methods suffer from the generalization problem. In particular, it is difficult for existing generalized methods to detect fake faces with unseen fake types. The reason is that the distribution gaps among cross-forgery types are too large. In this paper, we propose a novel generalized framework to narrow large gaps based on bridging cross-domain alignment to solve this problem. Specifically, our framework consists of three key steps: preventing, bridging and aligning distribution gaps. Firstly, in the feature mining stage, taking advantage of the ability of Instance Normalization (IN) to better tolerate domain gaps, we design Adaptive Batch and Instance Normalization (ABIN) to replace the commonly used BN to adaptively extract features to preliminarily prevent domain gaps. Secondly, we propose to generate bridging samples distributed among the inter-domains to fill large gaps based on progressive linear interpolation operation. Finally, with the help of bridging samples, the cross-domain alignment is performed to better narrow distribution gaps to refine data distribution, which helps to learn a more generalized framework. Extensive experiments show that our proposed framework achieves the state-of-the-art generalized performance.
Article
Deepfake techniques can synthesize realistic images, audios, and videos, facilitating the thriving of the entertainment, education, healthcare, and other industries. However, their abuse may pose potential threats to personal privacy, social stability, and even national security. Therefore, the development of deepfake detection methods is attracting more and more attention. Existing works mainly focus on the detection of common videos for entertainment purposes. In contrast, fake videos maliciously synthesized for a Person of Interest (PoI, i.e., someone who is in an authoritative position and has broad public influence) are much more harmful to society because of celebrity endorsement. However, there is no particular benchmark for driving related research in the community. Motivated by this observation, we present the first large-scale benchmark dataset, named FakePoI, to enable research on fake PoI detection. It contains numerous fake videos of important people from all walks of life, e.g., police chiefs, city mayors, famous artists, and well-known Internet bloggers. In summary, our FakePoI includes 11,092 synthesized videos where only a few clips, rather than the entire video, are fake. Previous fake detection algorithms deteriorate heavily or even fail on our FakePoI due to two main challenges. On the one hand, the rich diversity of our fake videos makes it pretty difficult to find universally applicable patterns for detection. On the other hand, the high credibility contributed by the presence of real frames easily confuses a common detector. To tackle these challenges, we present an amplifier framework, highlighting the feature gap between real and generated video frames. Specifically, we present a quadruplet loss to narrow the distance of all real PoIs and meanwhile push away each real and fake PoI in embedding space. We implement our framework and conduct extensive experiments on the proposed benchmark. The quantitative results demonstrate that our approach outperforms existing methods significantly, setting a strong baseline on FakePoI. The qualitative analysis also shows its superiority. We will release our dataset and code at https://github.com/cslltian/deepfake-detection to encourage future research on this valuable area.
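
The quadruplet loss mentioned above is not specified in detail here; the following is a hedged, generic quadruplet-style margin loss sketch in PyTorch (margins, pair sampling, and distance choice are assumptions, not the paper's exact formulation).

```python
import torch.nn.functional as F

def quadruplet_style_loss(emb_real_a, emb_real_b, emb_fake_a, emb_fake_b,
                          m1=0.3, m2=0.3):
    """Pull real-PoI embeddings together and push real/fake pairs apart."""
    d_rr = F.pairwise_distance(emb_real_a, emb_real_b)   # real vs. real
    d_rf = F.pairwise_distance(emb_real_a, emb_fake_a)   # real vs. its fake
    d_ff = F.pairwise_distance(emb_fake_a, emb_fake_b)   # an extra negative pair
    loss = F.relu(d_rr - d_rf + m1) + F.relu(d_rr - d_ff + m2)
    return loss.mean()
```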
Article
Full-text available
Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision, where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation, and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are as follows: (1) the generation of high quality images, (2) diversity of image generation, and (3) stabilizing training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state-of-the-art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress toward addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant and loss-variant GANs for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress toward critical computer vision application requirements. As we do this, we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success, along with some suggestions for future research directions. Code related to the GAN variants studied in this work is summarized at https://github.com/sheqi/GAN_Review.
Preprint
Full-text available
The availability of large-scale facial databases, together with the remarkable progress of deep learning technologies, in particular Generative Adversarial Networks (GANs), has led to the generation of extremely realistic fake facial content, raising obvious concerns about the potential for misuse. Such concerns have fostered research on manipulation detection methods that, contrary to humans, have already achieved astonishing results in various scenarios. In this study, we focus on the synthesis of entire facial images, which is a specific type of facial manipulation. The main contributions of this study are four-fold: i) a novel strategy to remove GAN “fingerprints” from synthetic fake images based on autoencoders is described, in order to spoof facial manipulation detection systems while keeping the visual quality of the resulting images; ii) an in-depth analysis of the recent literature in facial manipulation detection; iii) a complete experimental assessment of this type of facial manipulation, considering the state-of-the-art fake detection systems (based on holistic deep networks, steganalysis, and local artifacts), highlighting how challenging this task is in unconstrained scenarios; and finally iv) we announce a novel public database, named iFakeFaceDB, resulting from the application of our proposed GAN-fingerprint Removal approach (GANprintR) to already very realistic synthetic fake images. The results obtained in our empirical evaluation show that additional efforts are required to develop robust facial manipulation detection systems against unseen conditions and spoof techniques, such as the one proposed in this study.
Article
Full-text available
We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say ‘dog’ in a classification network or a sequence of words in a captioning network) flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), (3) CNNs used in tasks with multi-modal inputs (e.g. visual question answering) or reinforcement learning, all without architectural changes or re-training. We combine Grad-CAM with existing fine-grained visualizations to create a high-resolution class-discriminative visualization, Guided Grad-CAM, and apply it to image classification, image captioning, and visual question answering (VQA) models, including ResNet-based architectures. In the context of image classification models, our visualizations (a) lend insights into failure modes of these models (showing that seemingly unreasonable predictions have reasonable explanations), (b) outperform previous methods on the ILSVRC-15 weakly-supervised localization task, (c) are robust to adversarial perturbations, (d) are more faithful to the underlying model, and (e) help achieve model generalization by identifying dataset bias. For image captioning and VQA, our visualizations show that even non-attention based models learn to localize discriminative regions of the input image. We devise a way to identify important neurons through Grad-CAM and combine it with neuron names (Bau et al. in Computer vision and pattern recognition, 2017) to provide textual explanations for model decisions. Finally, we design and conduct human studies to measure if Grad-CAM explanations help users establish appropriate trust in predictions from deep networks and show that Grad-CAM helps untrained users successfully discern a ‘stronger’ deep network from a ‘weaker’ one even when both make identical predictions. Our code is available at https://github.com/ramprs/grad-cam/, along with a demo on CloudCV (Agrawal et al., in: Mobile cloud visual media computing, pp 265–290. Springer, 2015) (http://gradcam.cloudcv.org) and a video at http://youtu.be/COjUB9Izk6E.
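
A compact sketch of the Grad-CAM computation described above, written with PyTorch hooks, is given below for orientation; the official implementation is at the linked repository, and this hook-based structure is only an illustrative reimplementation.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_layer, class_idx=None):
    """image: (1, 3, H, W) tensor; target_layer: the last conv layer module."""
    feats, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))

    logits = model(image)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()
    h1.remove()
    h2.remove()

    # Global-average-pool the gradients to get per-channel importance weights.
    weights = grads[0].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam, class_idx
```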
Conference Paper
Full-text available
Neural networks are vulnerable to adversarial attacks: small, visually imperceptible crafted noise which, when added to the input, drastically changes the output. The most effective method of defending against adversarial attacks is to use the methodology of adversarial training. We analyze adversarially trained robust models to study their vulnerability against adversarial attacks at the level of the latent layers. Our analysis reveals that, contrary to the input layer, which is robust to adversarial attack, the latent layers of these robust models are highly susceptible to adversarial perturbations of small magnitude. Leveraging this information, we introduce a new technique, Latent Adversarial Training (LAT), which comprises fine-tuning the adversarially trained models to ensure robustness at the feature layers. We also propose Latent Attack (LA), a novel algorithm for constructing adversarial examples. LAT results in a minor improvement in test accuracy and leads to state-of-the-art adversarial accuracy against the universal first-order adversarial PGD attack, which is shown for the MNIST, CIFAR-10, CIFAR-100, SVHN and Restricted ImageNet datasets.
Conference Paper
Full-text available
Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
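
The following sketch illustrates the flavor of the AdvGAN generator objective (GAN loss plus attack loss plus a perturbation hinge); it substitutes a simple cross-entropy surrogate for the paper's attack loss and assumes an untargeted attack, so it should be read as an approximation rather than the authors' code.

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(x, y_true, G, D, target_model,
                          c=0.1, w_gan=1.0, w_adv=10.0, w_hinge=1.0):
    delta = G(x)
    x_adv = torch.clamp(x + delta, 0.0, 1.0)

    # GAN loss: the discriminator should judge x_adv as an unperturbed sample.
    d_out = D(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))

    # Attack loss (untargeted surrogate): reduce the target model's confidence
    # in the true class by maximizing its cross-entropy.
    loss_adv = -F.cross_entropy(target_model(x_adv), y_true)

    # Hinge loss bounding the L2 norm of the perturbation by a budget c.
    loss_hinge = torch.clamp(delta.flatten(1).norm(p=2, dim=1) - c, min=0).mean()

    return w_gan * loss_gan + w_adv * loss_adv + w_hinge * loss_hinge
```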
Article
Full-text available
Much research effort has been devoted to better understanding adversarial examples, which are specially crafted inputs to machine-learning models that are perceptually similar to benign inputs, but are classified differently (i.e., misclassified). Both algorithms that create adversarial examples and strategies for defending against them typically use $L_p$-norms to measure the perceptual similarity between an adversarial input and its benign original. Prior work has already shown, however, that two images need not be close to each other as measured by an $L_p$-norm to be perceptually similar. In this work, we show that nearness according to an $L_p$-norm is not just unnecessary for perceptual similarity, but is also insufficient. Specifically, focusing on datasets (CIFAR10 and MNIST), $L_p$-norms, and thresholds used in prior work, we show through 299-participant online user studies that "adversarial examples" that are closer to their benign counterparts than required by commonly used $L_p$-norm thresholds can nevertheless be perceptually different to humans from the corresponding benign examples. Namely, the perceptual distance between two images that are "near" each other according to an $L_p$-norm can be high enough that participants frequently classify the two images as representing different objects or digits. Combined with prior work, we thus demonstrate that nearness of inputs as measured by $L_p$-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples.
Article
Full-text available
Recent research has revealed that the output of deep neural networks (DNNs) is not continuous and is very sensitive to tiny perturbations on the input vectors, and accordingly several methods have been proposed for crafting effective perturbations against the networks. In this paper, we propose a novel method for optically calculating extremely small adversarial perturbations (few-pixels attack), based on differential evolution. It requires much less adversarial information and works with a broader class of DNN models. The results show that 73.8% of the test images can be crafted into adversarial images with modification of just one pixel, with 98.7% confidence on average. In addition, it is known that investigating the robustness problem of DNNs can bring critical clues for understanding the geometrical features of the DNN decision map in high dimensional input space. The results of conducting the few-pixels attack contribute quantitative measurements and analysis to this geometrical understanding from a different perspective compared to previous works.
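
A minimal sketch of a differential-evolution-driven few-pixel attack in the spirit described above is shown below, using SciPy's optimizer; the predict_proba interface, population size, and single-pixel encoding are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(img, true_label, predict_proba, maxiter=75, popsize=40):
    """img: HxWx3 float image in [0, 1]; predict_proba: callable returning a
    class-probability vector for a single image (assumed interface)."""
    h, w, _ = img.shape

    def apply(z, image):
        out = image.copy()
        x, y = int(round(z[0])) % h, int(round(z[1])) % w
        out[x, y] = np.clip(z[2:5], 0.0, 1.0)   # overwrite one pixel's RGB value
        return out

    def objective(z):
        # Minimize the classifier's confidence in the true class.
        return predict_proba(apply(z, img))[true_label]

    bounds = [(0, h - 1), (0, w - 1), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
    result = differential_evolution(objective, bounds, maxiter=maxiter,
                                    popsize=popsize, recombination=1.0,
                                    tol=1e-5, seed=0)
    return apply(result.x, img), result
```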
Chapter
Visually realistic GAN-generated images have recently emerged as an important misinformation threat. Research has shown that these synthetic images contain forensic traces that are readily identifiable by forensic detectors. Unfortunately, these detectors are built upon neural networks, which are vulnerable to recently developed adversarial attacks. In this paper, we propose a new anti-forensic attack capable of fooling GAN-generated image detectors. Our attack uses an adversarially trained generator to synthesize traces that these detectors associate with real images. Furthermore, we propose a technique to train our attack so that it can achieve transferability, i.e. it can fool unknown CNNs that it was not explicitly trained against. We evaluate our attack through an extensive set of experiments, where we show that our attack can fool eight state-of-the-art detection CNNs with synthetic images created using seven different GANs, and outperform other alternative attacks. Keywords: GAN-based attacks; Forensic detectors; Anti-forensics
Article
The vulnerability of deep neural networks (DNNs) to adversarial examples has attracted more attention. Many algorithms have been proposed to craft powerful adversarial examples. However, most of these algorithms modified the global or local region of pixels without taking network explanations into account. Hence, the perturbations are redundant, which are easily detected by human eyes. In this paper, we propose a novel method to generate local region perturbations. The main idea is to find a contributing feature region (CFR) of an image by simulating the human attention mechanism and then add perturbations to CFR. Furthermore, a soft mask matrix is designed on the basis of an activation map to finely represent the contributions of each pixel in CFR. With this soft mask, we develop a new loss function with inverse temperature to search for optimal perturbations in CFR. Due to the network explanations, the perturbations added to CFR are more effective than those added to other regions. Extensive experiments conducted on CIFAR-10 and ILSVRC2012 demonstrate the effectiveness of the proposed method, including attack success rate, imperceptibility, and transferability.
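
To illustrate the idea of confining perturbations to a contributing feature region with a soft mask, a small PyTorch sketch follows; the thresholding rule and keep ratio are assumptions, and the paper's loss with inverse temperature is not reproduced.

```python
import torch
import torch.nn.functional as F

def soft_mask_from_cam(cam, image_size, keep_ratio=0.3):
    """cam: (1, 1, h, w) activation map; returns a soft mask and a hard CFR mask."""
    cam = F.interpolate(cam, size=image_size, mode="bilinear", align_corners=False)
    soft = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    thresh = torch.quantile(soft.flatten(), 1.0 - keep_ratio)
    hard = (soft >= thresh).float()          # top keep_ratio pixels form the CFR
    return soft, hard

def perturb_in_cfr(x, delta, soft, hard):
    # Apply the perturbation only inside the contributing feature region,
    # weighted by the per-pixel soft-mask contributions.
    return torch.clamp(x + delta * soft * hard, 0.0, 1.0)
```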
Article
Aiming at degrading the capability of existing forensic methods in discriminating computer-generated and natural facial images, a bidirectional conversion between computer-generated and natural facial images based on a generative adversarial network (BDC-GAN) is proposed for anti-forensics in this paper. The generator of BDC-GAN is composed of noise encoding and content encoding. In the noise encoding, three high-pass filters are first utilized to extract the sensor pattern noise of the image, and then stacked convolution layers are combined to continue encoding. In the content encoding, VGG-19 is truncated and fine-tuned to encode the content of the image. Stacked convolution layers and an adaptive instance normalization layer are used in the decoder, and a multi-scale image discriminator is adopted. Furthermore, the content loss and noise loss are well designed, and hyperparameters are reasonably set to accomplish the bidirectional conversion between the two image domains while retaining the original facial contour. Experimental results and analysis demonstrate that the proposed anti-forensic method can achieve better visual quality and stronger deception ability compared with existing unidirectional CG facial image anti-forensic methods and bidirectional domain adaptive methods, and its effectiveness is verified by tests on 9 existing forensic methods. This reveals that existing forensic techniques can be bypassed using adversarial learning, which will ultimately push forward the performance of discriminating computer-generated and natural facial images.
Article
Lots of Deepfake videos are circulating on the Internet, which not only damages the personal rights of the forged individual, but also pollutes the web environment. What's worse, it may trigger public opinion and endanger national security. Therefore, it is urgent to fight deep forgery. Most current forgery detection algorithms are based on convolutional neural networks that learn the feature differences between forged and real frames from big data. In this paper, from the perspective of image generation, we simulate the forgery process based on image generation and explore possible traces of forgery. We propose a multi-scale self-texture attention Generative Network (MSTA-Net) to track the potential texture trace in the image generation process and eliminate the interference of deep forgery post-processing. Firstly, a generator with an encoder-decoder disassembles images and performs trace generation; then we merge the generated trace image with the original image, which is input into a classifier with ResNet as the backbone. Secondly, the self-texture attention mechanism (STA) is proposed as the skip connection between the encoder and the decoder, which significantly enhances the texture characteristics in the image disassembly process and assists the generation of the texture trace. Finally, we propose a loss function called Prob-tuple loss, restricted by classification probability, to amend the generation of the forgery trace directly. To verify the performance of MSTA-Net, we design different experiments to verify the feasibility and advancement of the method. Experimental results show that the proposed method performs well on deep forgery databases represented by FaceForensics++, Celeb-DF, Deeperforensics and DFDC, and some results reach the state of the art.
Article
In recent years, generative adversarial networks (GANs) have been widely used to generate realistic fake face images, which can easily deceive human beings. To detect these images, some methods have been proposed. However, their detection performance will be degraded greatly when the testing samples are post-processed. In this paper, some experimental studies on detecting post-processed GAN-generated face images find that (a) both the luminance component and chrominance components play an important role, and (b) the RGB and YCbCr color spaces achieve better performance than the HSV and Lab color spaces. Therefore, to enhance the robustness, both the luminance component and chrominance components of dual-color spaces (RGB and YCbCr) are considered to utilize color information effectively. In addition, the convolutional block attention module and multilayer feature aggregation module are introduced into the Xception model to enhance its feature representation power and aggregate multilayer features, respectively. Finally, a robust dual-stream network is designed by integrating dual-color spaces RGB and YCbCr and using an improved Xception model. Experimental results demonstrate that our method outperforms some existing methods, especially in its robustness against different types of post-processing operations, such as JPEG compression, Gaussian blurring, gamma correction, and median filtering.
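
A short sketch of the dual-color-space input preparation described above (RGB plus YCbCr streams) follows; the tensor layout and the fusion head are assumptions, and the paper's attention and multilayer aggregation modules are not reproduced.

```python
import torch

def rgb_to_ycbcr(x):
    """x: (N, 3, H, W) RGB tensor in [0, 1]; full-range YCbCr conversion."""
    r, g, b = x[:, 0:1], x[:, 1:2], x[:, 2:3]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.cat([y, cb, cr], dim=1)

def dual_stream_forward(x_rgb, stream_rgb, stream_ycbcr, fusion_head):
    # Each stream sees the same face in a different color space; the fusion
    # head combines their feature vectors for the final real/fake decision.
    feats_rgb = stream_rgb(x_rgb)
    feats_ycbcr = stream_ycbcr(rgb_to_ycbcr(x_rgb))
    return fusion_head(torch.cat([feats_rgb, feats_ycbcr], dim=1))
```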
Article
Generating falsified faces by artificial intelligence, widely known as DeepFake, has attracted attention worldwide since 2017. Given the potential threat brought by this novel technique, forensics researchers have dedicated themselves to detecting video forgery. Beyond exposing falsified faces, there are extended research directions for DeepFake, such as anti-forensics, which can disclose the vulnerability of current DeepFake forensics methods. It could also enable DeepFake videos to serve as tactical weapons if the falsified faces are harder to detect. In this paper, we propose a GAN model to behave as an anti-forensics tool. It features a novel architecture with additional supervising modules for enhancing image visual quality. Besides, a loss function is designed to improve the efficiency of the proposed model. Evaluated by experiments, we show that DeepFake forensics detectors are susceptible to attacks launched by the proposed method. Moreover, the proposed method can efficiently produce anti-forensics videos of satisfying visual quality without noticeable artifacts. Compared with other anti-forensics approaches, this is tremendous progress for DeepFake anti-forensics. The attack launched by our proposed method can truly be regarded as DeepFake anti-forensics, as it can fool detection algorithms and human eyes simultaneously.
Article
It has become a research hotspot to detect whether a face is natural or GAN-generated. However, all the existing works focus on whole GAN-generated faces. So, an improved Xception model is proposed for locally GAN-generated face detection. To the best of our knowledge, our work is the first one to address this issue. Some improvements over Xception are as follows: (1) Four residual blocks are removed to avoid the overfitting problem as much as possible; (2) An Inception block with dilated convolution is used to replace the common convolution layer in the pre-processing module of Xception to obtain multi-scale features; (3) A feature pyramid network is utilized to obtain multi-level features for the final decision. The first locally GAN-based generated face (LGGF) dataset is constructed by the pluralistic image completion method on the basis of the FFHQ dataset. It has a total of 952,000 images with generated regions of different shapes and sizes. Experimental results demonstrate the superiority of the proposed model, which outperforms some existing models, especially for faces having small generated regions.
Article
In this letter, we propose a general digital image operation anti-forensic framework based on generative adversarial nets (GANs), called dual-domain generative adversarial network (DDGAN). To tackle the issue of image operation detection, the proposed framework incorporates both operation-specific forensic features and machine-learned knowledge to ensure that the generated images exhibit better undetectability performance against various detectors. The DDGAN consists of a generator and two discriminators working on different domains, i.e., the operation-specific feature domain, which helps to conceal the artifacts from the perspective of forensic analysis for the target task, and the spatial domain, which facilitates taking advantage of machine-learned features from scratch as a supplement. Through experiments on median filtering and JPEG compression anti-forensics, we show the superior performance of the proposed DDGAN compared with state-of-the-art anti-forensic methods in terms of undetectability and visual quality.
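
The dual-domain idea can be sketched as a generator judged by two discriminators, one on the image and one on an operation-specific feature; in the sketch below a fixed high-pass residual stands in for that feature extractor, which is an illustrative simplification rather than the DDGAN design.

```python
import torch
import torch.nn.functional as F

def highpass_residual(x):
    """A fixed Laplacian high-pass filter standing in for an operation-specific
    forensic feature extractor (illustrative only)."""
    k = torch.tensor([[0., -1., 0.],
                      [-1., 4., -1.],
                      [0., -1., 0.]], device=x.device, dtype=x.dtype)
    k = k.view(1, 1, 3, 3).repeat(x.size(1), 1, 1, 1)
    return F.conv2d(x, k, padding=1, groups=x.size(1))

def dual_domain_generator_loss(x_in, G, D_spatial, D_feature, w_sp=1.0, w_ft=1.0):
    x_out = G(x_in)                               # anti-forensically processed image
    d_sp = D_spatial(x_out)                       # spatial-domain discriminator
    d_ft = D_feature(highpass_residual(x_out))    # feature-domain discriminator
    loss_sp = F.binary_cross_entropy_with_logits(d_sp, torch.ones_like(d_sp))
    loss_ft = F.binary_cross_entropy_with_logits(d_ft, torch.ones_like(d_ft))
    return w_sp * loss_sp + w_ft * loss_ft
```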
Article
Recently, generative adversarial networks (GANs) have become able to generate photo-realistic fake facial images that are perceptually indistinguishable from real face photos, promoting research on fake face detection. Although fake face forensics can achieve high detection accuracy, their anti-forensic counterparts are less investigated. Here we explore more imperceptible and transferable anti-forensics for fake face imagery detection based on adversarial attacks. Since facial and background regions are often smooth, even a small perturbation can cause noticeable perceptual impairment in fake face images, which makes existing transfer-based adversarial attacks ineffective as an anti-forensic method. Our perturbation analysis reveals the intuitive reason for this perceptual degradation when directly applying such attacks. We then propose a novel adversarial attack method, better suited to image anti-forensics, that operates in a transformed color domain to account for visual perception. Conceptually simple yet effective, the proposed method can fool both deep learning and non-deep-learning based forensic detectors, achieving higher adversarial transferability and significantly improved visual quality. Specifically, when adversaries consider imperceptibility as a constraint, the proposed anti-forensic method achieves state-of-the-art attacking performance in the transfer-based black-box setting (i.e., around 30% higher attack transferability than baseline attacks). More imperceptible and more transferable, the proposed method raises new security concerns for fake face imagery detection. We have released our code for public use, and we hope the proposed method can be further explored in related forensic applications as an anti-forensic benchmark.
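The abstract does not specify the exact transform, but the general idea of perturbing in a perceptually motivated color domain can be sketched as follows, assuming a YCbCr decomposition and a perturbation restricted to the chroma channels purely for illustration.

```python
# Hedged sketch: add an adversarial perturbation only to the chroma (Cb/Cr)
# channels of an RGB image, leaving luminance untouched.
import torch

def rgb_to_ycbcr(x):
    r, g, b = x[:, 0], x[:, 1], x[:, 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b
    return torch.stack([y, cb, cr], dim=1)

def ycbcr_to_rgb(x):
    y, cb, cr = x[:, 0], x[:, 1], x[:, 2]
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return torch.stack([r, g, b], dim=1).clamp(0, 1)

def perturb_chroma(img_rgb, delta_cbcr):
    """img_rgb: (N, 3, H, W) in [0, 1]; delta_cbcr: (N, 2, H, W) perturbation."""
    ycc = rgb_to_ycbcr(img_rgb)
    ycc[:, 1:] = ycc[:, 1:] + delta_cbcr
    return ycbcr_to_rgb(ycc)
```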
Article
With the development of video and image processing technology, the field of video tampering forensics faces enormous challenges. In particular, as a fundamental basis of judicial forensics, passive forensics for object-removal video forgery is essential. To extract tampering traces from video more thoroughly, the authors propose a spatiotemporal trident network based on the spatial rich model (SRM) and 3D convolution (C3D), which provides three branches and can theoretically improve the detection and localization accuracy of tampered regions. Based on the spatiotemporal trident network, a temporal detector and a spatial locator were designed to detect and locate the tampered regions in the temporal and spatial domains of videos. For the temporal detector, 3D CNNs were employed in the three branches as the encoders and a bidirectional long short-term memory (BiLSTM) network as the decoder. For the spatial locator, a backbone network named C3D-ResNet12 was designed as the encoder of the three branches, and region proposal networks (RPNs) were employed as the decoders. In addition, the loss functions of the two algorithms were optimized based on focal loss and GIoU loss. The experimental results show the effectiveness of the spatiotemporal detection and localization algorithms: for temporal forgery detection, the frame classification accuracy exceeded 99%; for spatial forgery localization, the successful localization rate of tampered regions in forged frames exceeded 96%, and the mean intersection over union between the located and real tampered regions exceeded 62%.
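Of the two loss terms mentioned above, GIoU admits a compact reference implementation; the sketch below shows the standard generalized IoU loss for axis-aligned boxes and is not the authors' training code.

```python
# Standard GIoU loss for boxes given as (x1, y1, x2, y2).
import torch

def giou_loss(pred, target, eps=1e-7):
    # Intersection of predicted and ground-truth boxes.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    # Union.
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)
    # Smallest enclosing box, used to penalize non-overlapping predictions.
    cx1 = torch.min(pred[:, 0], target[:, 0])
    cy1 = torch.min(pred[:, 1], target[:, 1])
    cx2 = torch.max(pred[:, 2], target[:, 2])
    cy2 = torch.max(pred[:, 3], target[:, 3])
    enclose = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (enclose - union) / (enclose + eps)
    return (1 - giou).mean()
```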
Article
Deep neural networks are vulnerable to adversarial attacks. More importantly, some adversarial examples crafted against an ensemble of source models transfer to other target models and thus pose a security threat to black-box applications (where attackers have no access to the target models). Current transfer-based ensemble attacks, however, consider only a limited number of source models when crafting an adversarial example and thus obtain poor transferability. Moreover, recent query-based black-box attacks, which require numerous queries to the target model, not only arouse the target model's suspicion but also incur expensive query costs. In this article, we propose a novel transfer-based black-box attack, dubbed serial-minigroup-ensemble-attack (SMGEA). Concretely, SMGEA first divides a large number of pretrained white-box source models into several "minigroups." For each minigroup, we design three new ensemble strategies to improve the intragroup transferability. Moreover, we propose a new algorithm that recursively accumulates the "long-term" gradient memories of the previous minigroup into the subsequent minigroup. In this way, the learned adversarial information is preserved and the intergroup transferability is improved. Experiments indicate that SMGEA not only achieves state-of-the-art black-box attack ability over several data sets but also deceives two online black-box saliency prediction systems in the real world, i.e., DeepGaze-II ( https://deepgaze.bethgelab.org/ ) and SALICON ( http://salicon.net/demo/ ). Finally, we contribute a new code repository to promote research on adversarial attack and defense over ubiquitous pixel-to-pixel computer vision tasks. We share our code together with the pretrained substitute model zoo at https://github.com/CZHQuality/AAA-Pix2pix .
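A hedged sketch of the serial-minigroup idea follows: source models are processed in small groups, and a long-term gradient memory accumulated over earlier groups drives the update for later ones. The normalization, projection, and step size are illustrative choices, not the published SMGEA algorithm.

```python
# Hedged sketch of a serial minigroup ensemble attack with long-term
# gradient memory carried across minigroups.
import torch

def minigroup_ensemble_attack(x, minigroups, loss_fn, steps=10,
                              alpha=1 / 255, eps=8 / 255):
    """x: clean input batch; minigroups: list of lists of white-box models;
    loss_fn: maps model outputs to a scalar adversarial objective."""
    x_adv, memory = x.clone(), torch.zeros_like(x)
    for _ in range(steps):
        for group in minigroups:  # serial processing of minigroups
            x_adv = x_adv.detach().requires_grad_(True)
            # Intra-group ensemble: average the objective over the group.
            loss = sum(loss_fn(model(x_adv)) for model in group) / len(group)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Accumulate the normalized gradient into the long-term memory
            # that is passed on to subsequent minigroups.
            memory = memory + grad / grad.abs().mean().clamp(min=1e-12)
            x_adv = x_adv + alpha * memory.sign()
            # Project back into the eps-ball around the clean input.
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
```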
Article
In this paper, a computer-generated (CG) facial image regeneration scheme for anti-forensics based on a generative adversarial network (CGR-GAN) is proposed. The generator of CGR-GAN uses a deep U-Net structure, and its discriminator consists of stacked convolution layers. In addition, a content loss and a style loss are designed to guarantee that the regenerated CG facial images (CGR) retain both the facial profile of the original CG image and the characteristics of natural images (NI). Experimental results and analysis demonstrate that the CG facial images regenerated by the proposed anti-forensics scheme achieve better visual quality than those of existing CG facial image anti-forensics and domain adaptation methods, and that the scheme strikes a good balance between visual quality and deception ability.
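The combination of content and style losses described above is commonly implemented with deep-feature matching and Gram-matrix statistics; the sketch below illustrates that generic formulation under assumed layer choices and weights, not the exact CGR-GAN losses.

```python
# Generic content + style loss sketch: content loss matches deep features of
# the original CG face, style loss matches Gram-matrix statistics of
# natural-image features.
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """feat: (N, C, H, W) feature map from some fixed network layer."""
    n, c, h, w = feat.shape
    f = feat.view(n, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def content_style_loss(feat_gen, feat_content, feat_style, style_weight=1e3):
    """feat_gen / feat_content / feat_style: features of the regenerated image,
    the original CG image, and a natural reference image, respectively."""
    loss_content = F.mse_loss(feat_gen, feat_content)
    loss_style = F.mse_loss(gram_matrix(feat_gen), gram_matrix(feat_style))
    return loss_content + style_weight * loss_style
```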
Article
Making computer-generated (CG) images more difficult to detect is an interesting problem in computer graphics and security. While most approaches focus on the image rendering phase, this paper presents a method based on increasing the naturalness of CG facial images from the perspective of spoofing detectors. The proposed method is implemented using a convolutional neural network (CNN) comprising two autoencoders and a transformer and is trained using a black-box discriminator without gradient information. Over 50% of the transformed CG images were not detected by three state-of-the-art spoofing detectors. This capability raises an alarm regarding the reliability of facial authentication systems, which are becoming widely used in daily life.
Article
Most works on adversarial examples for deep-learning-based image classifiers use noise that, while small, covers the entire image. We explore the case where the noise is allowed to be visible but confined to a small, localized patch of the image, without covering any of the main object(s) in the image. We show that it is possible to generate localized adversarial noise that covers only 2% of the pixels in the image, none of them over the main object, that is transferable across images and locations, and that successfully fools a state-of-the-art Inception v3 model with very high success rates.
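Confining the perturbation to a localized patch amounts to optimizing only the pixels inside a small region. A minimal sketch, with an assumed patch covering roughly 2% of a 299x299 Inception-sized input, is shown below.

```python
# Minimal sketch of pasting an optimizable patch into a fixed image location;
# only the patch pixels are updated by the attack.
import torch

def apply_local_patch(image, patch, top, left):
    """Paste `patch` into `image` at (top, left); gradients flow to the patch."""
    out = image.clone()
    h, w = patch.shape[-2:]
    out[..., top:top + h, left:left + w] = patch
    return out

# Example: a 42x42 patch on a 299x299 image is roughly 2% of the pixels.
image = torch.rand(1, 3, 299, 299)
patch = torch.rand(1, 3, 42, 42, requires_grad=True)   # optimized by the attack
adv = apply_local_patch(image, patch.clamp(0, 1), top=5, left=5)
```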
Article
We present a method to create universal, robust, targeted adversarial image patches in the real world. The patches are universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class. These adversarial patches can be printed, added to any scene, photographed, and presented to image classifiers; even when the patches are small, they cause the classifiers to ignore the other items in the scene and report a chosen target class.
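A hedged sketch of how such a universal, targeted patch could be trained is given below: the patch is pasted at random positions into training scenes and updated to maximize the classifier's probability of the chosen target class. The transformation sampling is greatly simplified relative to the paper.

```python
# Hedged sketch of one training step for a universal, targeted patch.
import random
import torch
import torch.nn.functional as F

def patch_training_step(patch, images, classifier, target_class, lr=0.01):
    """patch: (3, h, w) with requires_grad=True; images: list of (3, H, W) scenes."""
    batch = []
    for img in images:
        # Paste the patch at a random location in each scene.
        top = random.randint(0, img.shape[-2] - patch.shape[-2])
        left = random.randint(0, img.shape[-1] - patch.shape[-1])
        patched = img.clone()
        patched[..., top:top + patch.shape[-2], left:left + patch.shape[-1]] = patch
        batch.append(patched)
    logits = classifier(torch.stack(batch))
    target = torch.full((len(batch),), target_class, dtype=torch.long)
    # Minimizing cross-entropy to the target class maximizes its probability.
    loss = F.cross_entropy(logits, target)
    grad = torch.autograd.grad(loss, patch)[0]
    return (patch.detach() - lr * grad.sign()).clamp(0, 1).requires_grad_(True)
```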
Article
This paper investigates strategies that defend against adversarial-example attacks on image-classification systems by transforming the inputs before feeding them to the system. Specifically, we study applying image transformations such as bit-depth reduction, JPEG compression, total variance minimization, and image quilting before feeding the image to a convolutional network classifier. Our experiments on ImageNet show that total variance minimization and image quilting are very effective defenses in practice, in particular when the network is trained on transformed images. The strength of those defenses lies in their non-differentiable nature and their inherent randomness, which makes it difficult for an adversary to circumvent them. Our best defense eliminates 60% of strong white-box and 90% of strong black-box attacks by a variety of major attack methods.
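Two of these transformations, bit-depth reduction and JPEG re-compression, are straightforward to reproduce; the sketch below uses illustrative parameters (3 bits, JPEG quality 75) rather than the paper's exact settings.

```python
# Minimal sketch of two input-transformation defenses applied before
# classification: bit-depth reduction and JPEG round-tripping.
import io
import numpy as np
from PIL import Image

def bit_depth_reduce(img_uint8, bits=3):
    """Quantize each channel of a uint8 image to 2**bits levels."""
    shift = 8 - bits
    return (img_uint8 >> shift) << shift

def jpeg_compress(img_uint8, quality=75):
    """Round-trip the image through JPEG to wash out high-frequency noise."""
    buf = io.BytesIO()
    Image.fromarray(img_uint8).save(buf, format="JPEG", quality=quality)
    return np.array(Image.open(buf))

img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)
defended = jpeg_compress(bit_depth_reduce(img))  # fed to the classifier afterwards
```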
Article
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024×1024. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset.
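The progressive-growing schedule relies on smoothly fading in each new resolution block; a minimal sketch of that fade-in, with the blending weight alpha ramping from 0 to 1 during training, is shown below (an illustration of the idea, not the authors' implementation).

```python
# Minimal sketch of the fade-in used when a new higher-resolution block is
# added: blend the upsampled old output with the new block's output.
import torch
import torch.nn.functional as F

def fade_in(low_res_rgb, new_block_rgb, alpha):
    """low_res_rgb: output of the previous (lower) resolution; new_block_rgb:
    output of the freshly added block; alpha ramps from 0 to 1."""
    upsampled = F.interpolate(low_res_rgb, scale_factor=2, mode="nearest")
    return (1 - alpha) * upsampled + alpha * new_block_rgb
```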
Conference Paper
Deep learning has shown impressive performance on hard perceptual problems. However, researchers found deep learning systems to be vulnerable to small, specially crafted perturbations that are imperceptible to humans. Such perturbations cause deep learning systems to misclassify adversarial examples, with potentially disastrous consequences where safety or security is crucial. Prior defenses against adversarial examples either targeted specific attacks or were shown to be ineffective. We propose MagNet, a framework for defending neural network classifiers against adversarial examples. MagNet neither modifies the protected classifier nor requires knowledge of the process for generating adversarial examples. MagNet includes one or more separate detector networks and a reformer network. The detector networks learn to differentiate between normal and adversarial examples by approximating the manifold of normal examples. Since they assume no specific process for generating adversarial examples, they generalize well. The reformer network moves adversarial examples towards the manifold of normal examples, which is effective for correctly classifying adversarial examples with small perturbation. We discuss the intrinsic difficulties in defending against white-box attacks and propose a mechanism to defend against gray-box attacks. Inspired by the use of randomness in cryptography, we use diversity to strengthen MagNet. We show empirically that MagNet is effective against the most advanced state-of-the-art attacks in black-box and gray-box scenarios without sacrificing the false positive rate on normal examples.
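The detector/reformer pair can be sketched with a single autoencoder trained on normal examples: a large reconstruction error flags an input as adversarial, and the reconstruction itself is passed on to the classifier. The threshold and autoencoder below are placeholders, not the MagNet release.

```python
# Hedged sketch of the detector/reformer idea with one autoencoder.
import torch

def magnet_filter(x, autoencoder, threshold):
    """x: input batch; autoencoder: trained on normal examples only."""
    recon = autoencoder(x)
    # Detector: per-example reconstruction error serves as an adversarial score.
    error = (x - recon).flatten(1).norm(p=1, dim=1) / x[0].numel()
    is_adversarial = error > threshold
    # Reformer: pass the reconstruction (projected toward the data manifold)
    # to the downstream classifier instead of the raw input.
    return recon, is_adversarial
```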
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
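The minimax game described above is usually written as the following value function, with G the generator, D the discriminator, p_data the data distribution, and p_z the prior on the latent noise:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr] +
  \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```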
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
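The described architecture can be sketched as follows; kernel sizes and strides follow the original paper, while the 227x227 input resolution is the commonly used value that makes the dimensions work out. This is an illustrative reconstruction, not the authors' original code.

```python
# Sketch of a five-convolution / three-fully-connected network with dropout
# and a final 1000-way output, as described in the abstract.
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(inplace=True), nn.MaxPool2d(3, 2),
            nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(inplace=True), nn.MaxPool2d(3, 2),
            nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(3, 2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):  # x: (N, 3, 227, 227)
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```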