Figure - available from: Chinese Journal of Electronics
Source publication
With the development of face image synthesis and generation technology based on generative adversarial networks (GANs), it has become a research hotspot to determine whether a given face image is natural or generated. However, the generalization capability of the existing algorithms is still to be improved. Therefore, this paper proposes a general...
Similar publications
Collider searches face the challenge of defining a representation of high-dimensional data such that physical symmetries are manifest, the discriminating features are retained, and the choice of representation is new-physics agnostic. We introduce JetCLR to solve the mapping from low-level data to optimized observables through self-supervised contr...
Human faces provide demographic information, such as gender and ethnicity. Different modalities of human faces, e.g., range and intensity, provide different cues for gender and ethnicity identifications. In this paper we exploit the range information of human faces for ethnicity identification using a support vector machine. An integration scheme i...
Remote sensing image scene classification has drawn extensive attention for its wide application in various scenarios. Scene classification in many practical cases faces the challenge of few-shot conditions. The major difficulty of few-shot remote sensing image scene classification is how to extract effective features from insufficient labeled data...
In recent years, Generative Adversarial Networks (GANs) have become a hot topic among researchers and engineers that work with deep learning. It has been a ground-breaking technique which can generate new pieces of content of data in a consistent way. The topic of GANs has exploded in popularity due to its applicability in fields like image generat...
Password generation is a very common security issue. Users are faced with the problem of choosing good passwords and trying to make sure that the password they choose is as robust as possible. In this project, we train a generative adversarial network [?] with a generator that tries to generate guesses to break a password scheme and a discriminator...
Citations
... For instance, McCloskey et al. [6] leveraged red-green bivariate histograms and abnormal pixel exposure ratios for detection, while Agarwal et al. [7] exploited high-frequency artifacts stemming from GANs' upsampling processes. Chen et al. [8] integrated both global and local image features, utilizing metric learning to enhance the overall detection performance of the model. In terms of network architectures, Convolutional Neural Networks (CNNs) remain a predominant choice for deepfake detection tasks due to their capacity to extract semantic, color, and texture information [9], as seen in Fu et al.'s [10] dual-channel CNN architecture, which is capable of concurrently processing both high- and low-frequency image components. ...
Within the domain of Artificial Intelligence Generated Content (AIGC), technological strides in image generation have been marked, resulting in the proliferation of deepfake images that pose substantial security threats. The current landscape of deepfake detection technologies is marred by limited generalization across diverse generative models and a subpar detection rate for images generated through diffusion processes. In response to these challenges, this paper introduces a novel detection model designed for high generalizability, leveraging multiscale frequency and spatial domain features. Our model harnesses an array of specialized filters to extract frequency-domain characteristics, which are then integrated with spatial-domain features captured by a Feature Pyramid Network (FPN). The integration of the Attentional Feature Fusion (AFF) mechanism within the feature fusion module allows for the optimal utilization of the extracted features, thereby enhancing detection capabilities. We curated an extensive dataset encompassing deepfake images from a variety of GANs and diffusion models for rigorous evaluation. The experimental findings reveal that our proposed model achieves superior accuracy and generalization compared to existing baseline models when confronted with deepfake images from multiple generative sources. Notably, in cross-model detection scenarios, our model outperforms the next best model by a significant margin of 29.1% for diffusion-generated images and 15.1% for GAN-generated images. This accomplishment presents a viable solution to the pressing issues of generalization and adaptability in the field of deepfake detection.
... However, when confronted with fake face images where only local areas are generated, searching for feature differences directly on the entire generated face image may lead to detection failure. Therefore, [4,5,8,38] also combine local information such as artifacts to assist in detection. Although the methods described above achieve relatively high detection accuracy, they suffer from poor generalization and a lack of interpretability [21]. ...
... Besides a cosine scheduler for warm start. Training stops when the learning ... (8) It can be seen that the output feature $A_1^{\mathrm{Cross}}$ contains global information of $f_1$ for each pixel, and $A_2^{\mathrm{Cross}}$ is the same. CCA promotes the generation of more discriminative features for semantically similar regions between support and query images, allowing the network to adjust its "focus" on the images during testing. ...
The rapid development of the Generative Adversarial Network (GAN) makes generated face images more and more visually indistinguishable, and the detection performance of previous methods will degrade seriously when the testing samples are out-of-sample datasets or have been post-processed. To address the above problems, we propose a new relational embedding network (RENet) based on “what to observe” and “where to attend” from a relational perspective for the task of generated face detection. In addition, we designed two attention modules to effectively utilize global and local features. Specifically, the dual-self attention module selectively enhances the representation of local features through both image space and channel dimensions. The cross-correlation attention module computes similarity between images to capture the global information of the output in the image. We conducted extensive experiments to validate our method, and the proposed algorithm can effectively extract the correlations between features and achieve satisfactory generalization and robustness in generated face detection. In addition, we also explored the design of the model structure and the detection performance on more categories of generated images (not limited to faces). The results show that RENet also has good detection performance on datasets other than faces.
... Most deepfake detectors can achieve good performance when applied to intra-domain scenarios, but their performance may encounter dramatic degradation when subject to cross-domain settings [1][2][3][4][5][6][7][8][9]. ...
Most existing deepfake detection methods often fail to maintain their performance when confronting new test domains. To address this issue, we propose a generalizable deepfake detection system to implement style diversification by alternately learning the domain generalization (DG)-based detector and the stylized fake face synthesizer (SFFS). For the DG-based detector, we first adopt instance normalization- and batch normalization-based structures to extract the local and global image statistics as the style and content features, which are then leveraged to obtain the more diverse feature space. Subsequently, contrastive learning is used to emphasize common style features while suppressing domain-specific ones, and adversarial learning is performed to obtain the domain-invariant features. These optimized features help the DG-based detector to learn generalized classification features and also encourage the SFFS to simulate possibly unseen domain data. In return, the samples generated by the SFFS would contribute to the detector's learning of more generalized features from augmented training data. Such a joint learning and training process enhances both the detector's and the synthesizer's feature representation capability for generalizable deepfake detection. Experimental results demonstrate that our method outperforms the state-of-the-art competitors not only in intra-domain tests but particularly in cross-domain tests.
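The abstract above describes using instance-normalization-based structures to extract local image statistics as style features. As a rough illustration only (not the authors' implementation), the sketch below computes the per-channel mean and standard deviation that instance normalization removes from a single feature map; the function names and the toy feature values are assumptions made for this example.

```python
import math

def instance_stats(feature_map):
    """Per-channel mean and standard deviation over spatial positions.

    feature_map: list of channels, each a list of rows of floats (C x H x W).
    These per-instance statistics are what is treated here as "style".
    """
    means, stds = [], []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var))
    return means, stds

def instance_normalize(feature_map, eps=1e-5):
    """Subtract the channel mean and divide by sqrt(variance + eps)."""
    means, stds = instance_stats(feature_map)
    return [
        [[(v - m) / math.sqrt(s * s + eps) for v in row] for row in channel]
        for channel, m, s in zip(feature_map, means, stds)
    ]

# Toy 2-channel, 2x2 feature map (made-up numbers for illustration).
features = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
style_means, style_stds = instance_stats(features)
normalized = instance_normalize(features)
```

Here `style_means` and `style_stds` play the role of per-image style statistics, while `normalized` is the feature map with those statistics removed. Batch-level statistics, contrastive learning, and adversarial training from the abstract are outside the scope of this sketch.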
... Existing detection methods can be roughly divided into three categories. Deep learning-based detectors [6][7][8][9] directly learn features from the raw image, alleviating the burden of constructing handcrafted features. For example, Barni et al. compute the co-occurrence of images to train a deep neural network for identifying synthesized faces [7], and Wang et al. observe that the neurons in the network 'react' differently when processing authentic and generated images [8]. ...
Advances in Generative adversarial networks (GAN) have significantly improved the quality of synthetic facial images, posing threats to many vital areas. Thus, identifying whether a presented facial image is synthesized is of forensic importance. Our fundamental discovery is the lack of capillaries in the sclera of the GAN-generated faces, which is caused by the lack of physical/physiological constraints in the GAN model. Because there are more or fewer capillaries in people’s eyes, one can distinguish real faces from GAN-generated ones by carefully examining the sclera area. Following this idea, we first extract the sclera area from a probe image, then feed it into a residual attention network to distinguish GAN-generated faces from real ones. The proposed method is validated on the Flickr-Faces-HQ and StyleGAN2/StyleGAN3-generated face datasets. Experiments demonstrate that the capillary in the sclera is a very effective feature for identifying GAN-generated faces. Our code is available at: https://github.com/10961020/Deepfake-detector-based-on-blood-vessels.
... The work [43] developed a framework for evaluating detection methods under cross-model, cross-data, and post-processing evaluations, to examine features produced from commonly-used image pre-processing methods. More recently, many variants of feature-based models have been studied [110,30,68,16]. However, the detection results from all these feature-based methods are not explainable, so it is unclear why the decision was given to any input face. ...
Generative Adversarial Networks (GAN) have led to the generation of very realistic face images, which have been used in fake social media accounts and other disinformation matters that can generate profound impacts. Therefore, the corresponding GAN-face detection techniques are under active development that can examine and expose such fake faces. In this work, we aim to provide a comprehensive review of recent progress in GAN-face detection. We focus on methods that can detect face images that are generated or synthesized from GAN models. We classify the existing detection works into four categories: (1) deep learning-based, (2) physical-based, (3) physiological-based methods, and (4) evaluation and comparison against human visual performance. For each category, we summarize the key ideas and connect them with method implementations. We also discuss open problems and suggest future research directions.
... On both natural and GAN-generated datasets, the proposed approach achieves a high detection accuracy of over 0.99 by combining global and local features, increasing learning on significant face regions with key points, and using metric learning for feature extraction [32]. With straightforward yet successful qualitative and quantitative evaluations, another approach uses discrepancies in corneal specular highlights between the eyes in GAN-synthesized faces to discriminate between genuine and synthetic faces [33]. ...
In court, criminal investigations and identity management tools, like check-in and payment logins, face videos, and photos, are used as evidence more frequently. Although deeply falsified information may be found using deep learning classifiers, black-box decision-making makes forensic investigation in criminal trials more challenging. Therefore, the research suggests a three-step classification technique to classify the deceptive deepfake image content. The research examines the visual assessments of an EfficientNet and Shifted Window Transformer (SWinT) hybrid model based on Convolutional Neural Network (CNN) and Transformer architectures. The classifier generality is improved in the first stage using a different augmentation. Then, the hybrid model is developed in the second step by combining the EfficientNet and Shifted Window Transformer architectures. Next, the GradCAM approach for assessing human understanding demonstrates deepfake visual interpretation. In 14,204 images for the validation set, there are 7,096 fake photos and 7,108 real images. In contrast to focusing only on a few discrete face parts, the research shows that the entire deepfake image should be investigated. On a custom dataset of real, Generative Adversarial Networks (GAN)-generated, and human-altered web photos, the proposed method achieves an accuracy of 98.45%, a recall of 99.12%, and a loss of 0.11125. The proposed method successfully distinguishes between real and manipulated images. Moreover, the presented approach can assist investigators in clarifying the composition of the artificially produced material.
... The results of the experiments reveal that the proposed technique achieves a higher generalization accuracy than current methods. In addition, the proposed method is robust against attacks such as the insertion of Gaussian noise and blur [17]. Other systems cannot employ all the deepfake models. ...
... In recent years, some approaches [18][19][20] have synthesized the local and global features to detect forgeries. The above methods focus on the point that GAN-generated faces are more likely to produce traces in local regions, so they strengthen the forgery detection in the local area and use it to supply the global detection results. ...
Media content forgery is widely spread over the Internet and has raised severe societal concerns. With the development of deep learning, new technologies such as generative adversarial networks (GANs) and media forgery technology have already been utilized for politicians and celebrity forgery, which has a terrible impact on society. Existing GAN-generated face detection approaches rely on detecting image artifacts and the generated traces. However, these methods are model-specific, and the performance is deteriorated when faced with more complicated methods. What’s more, it is challenging to identify forgery images with perturbations such as JPEG compression, gamma correction, and other disturbances. In this paper, we propose a global–local facial fusion network, namely GLFNet, to fully exploit the local physiological and global receptive features. Specifically, GLFNet consists of two branches, i.e., the local region detection branch and the global detection branch. The former branch detects the forged traces from the facial parts, such as the iris and pupils. The latter branch adopts a residual connection to distinguish real images from fake ones. GLFNet obtains forged traces through various ways by combining physiological characteristics with deep learning. The method is stable with physiological properties when learning the deep learning features. As a result, it is more robust than the single-class detection methods. Experimental results on two benchmarks have demonstrated superiority and generalization compared with other methods.
... In particular, points in our extreme sectors seem to lie outside of the generative range of StyleGAN, or to be severely underrepresented (Fig. 13). The problem is possibly also related to the well-known fact that faces generated by StyleGAN (and other generative networks) can be easily distinguished from reals [78][79][80]. ...
Different encodings of datapoints in the latent space of latent-vector generative models may result in more or less effective and disentangled characterizations of the different explanatory factors of variation behind the data. Many works have been recently devoted to the exploration of the latent space of specific models, mostly focused on the study of how features are disentangled and of how trajectories producing desired alterations of data in the visible space can be found. In this work we address the more general problem of comparing the latent spaces of different models, looking for transformations between them. We confined the investigation to the familiar and largely investigated case of generative models for the data manifold of human faces. The surprising, preliminary result reported in this article is that (provided models have not been taught or explicitly conceived to act differently) a simple linear mapping is enough to pass from a latent space to another while preserving most of the information. This is full of consequences for representation learning, potentially paving the way to the transformation of editing trajectories from one space to another, or the adaptation of disentanglement techniques between different generative domains.
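The abstract above reports that a simple linear mapping is enough to pass from one latent space to another. The abstract does not say how the mapping is fitted; the sketch below assumes an ordinary least-squares fit on paired latent codes, solved through the normal equations in plain Python. All names (`fit_linear_map`, `latents_a`, `latents_b`) and the toy data are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b for x by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: samples do not span the space")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_linear_map(source, target):
    """Least-squares W minimizing ||source @ W - target||^2 (normal equations)."""
    st = transpose(source)
    return solve(matmul(st, source), matmul(st, target))

# Hypothetical paired latent codes: each row of latents_a encodes the same
# image as the matching row of latents_b, in two different models.
true_map = [[2.0, 0.0], [1.0, -1.0]]
latents_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
latents_b = matmul(latents_a, true_map)

estimated = fit_linear_map(latents_a, latents_b)
```

In this noise-free toy case `estimated` recovers `true_map` up to floating-point error. With real latent codes the fit would only be approximate, and the residual `source @ W - target` would indicate how much information the linear map fails to preserve.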
... In particular, points in our extreme sectors seem to lie outside of the generative range of StyleGAN, or to be severely underrepresented (Figure 13). The problem is possibly also related to the well-known fact that faces generated by StyleGAN (and other generative networks) can be easily distinguished from reals [80][81][82]. Figure 12: Gradient ascent technique for StyleGAN on data in the Support Set. The original is in the first row, and the image generated through gradient ascent in the second. ...
Different encodings of datapoints in the latent space of latent-vector generative models may result in more or less effective and disentangled characterizations of the different explanatory factors of variation behind the data. Many works have been recently devoted to the exploration of the latent space of specific models, mostly focused on the study of how features are disentangled and of how trajectories producing desired alterations of data in the visible space can be found. In this work we address the more general problem of comparing the latent spaces of different models, looking for transformations between them. We confined the investigation to the familiar and largely investigated case of generative models for the data manifold of human faces. The surprising, preliminary result reported in this article is that (provided models have not been taught or explicitly conceived to act differently) a simple linear mapping is enough to pass from a latent space to another while preserving most of the information.