Xufei Zheng’s research while affiliated with Southwest University and other places


Publications (19)


Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks
  • Preprint
  • File available

April 2025 · 58 Reads

Xiaomei Zhang · Zhaoxi Zhang · [...]
Textual adversarial examples pose serious threats to the reliability of natural language processing systems. Recent studies suggest that adversarial examples tend to deviate from the underlying manifold of normal texts, whereas pre-trained masked language models can approximate the manifold of normal data. These findings inspire the exploration of masked language models for detecting textual adversarial attacks. We first introduce Masked Language Model-based Detection (MLMD), leveraging the mask and unmask operations of the masked language modeling (MLM) objective to induce the difference in manifold changes between normal and adversarial texts. Although MLMD achieves competitive detection performance, its exhaustive one-by-one masking strategy introduces significant computational overhead. Our posterior analysis reveals that a significant number of non-keywords in the input are not important for detection but consume resources. Building on this, we introduce Gradient-guided MLMD (GradMLMD), which leverages gradient information to identify and skip non-keywords during detection, significantly reducing resource consumption without compromising detection performance.
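The gradient-guided filtering step described above can be sketched in a few lines. This is purely illustrative, not the authors' implementation: `select_keywords` is a hypothetical name, and the saliency values stand in for per-token gradient magnitudes of the victim model's loss. The point is that the expensive mask/unmask check then runs only on the top-k positions.

```python
# Illustrative sketch of gradient-guided keyword selection (hypothetical
# names; saliency stands in for per-token gradient magnitudes).

def select_keywords(tokens, saliency, k):
    """Return indices of the k highest-saliency tokens, in sentence order."""
    ranked = sorted(range(len(tokens)), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:k])

tokens = ["the", "film", "was", "terrrible", "overall"]
saliency = [0.05, 0.30, 0.05, 0.90, 0.10]  # assumed gradient magnitudes
print(select_keywords(tokens, saliency, 2))  # → [1, 3]
```

With k = 2, only "film" and the perturbed "terrrible" would be masked and checked, instead of all five tokens.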



Masked Language Model Based Textual Adversarial Example Detection

April 2023 · 22 Reads

Adversarial attacks are a serious threat to the reliable deployment of machine learning models in safety-critical applications. They can misguide current models into incorrect predictions by slightly modifying the inputs. Recently, substantial work has shown that adversarial examples tend to deviate from the underlying data manifold of normal examples, whereas pre-trained masked language models can fit the manifold of normal NLP data. To explore how to use the masked language model in adversarial detection, we propose a novel textual adversarial example detection method, namely Masked Language Model-based Detection (MLMD), which produces clearly distinguishable signals between normal examples and adversarial examples by exploring the changes in manifolds induced by the masked language model. MLMD features plug-and-play usage (i.e., no need to retrain the victim model) for adversarial defense and is agnostic to classification tasks, victim model architectures, and to-be-defended attack methods. We evaluate MLMD on various benchmark textual datasets, widely studied machine learning models, and state-of-the-art (SOTA) adversarial attacks (in total 3 × 4 × 4 = 48 settings). Experimental results show that MLMD achieves strong performance, with detection accuracy up to 0.984, 0.967, and 0.901 on the AG-NEWS, IMDB, and SST-2 datasets, respectively. Additionally, MLMD is superior, or at least comparable, to the SOTA detection defenses in detection accuracy and F1 score. Among the many defenses based on the off-manifold assumption of adversarial examples, this work offers a new angle for capturing the manifold change. The code for this work is openly accessible at https://github.com/mlmddetection/MLMDdetection.
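The exhaustive mask-and-refill idea can be sketched as follows. Everything here is a toy stand-in, not the paper's code: the "MLM" always refills a mask with one in-distribution word, and the "victim classifier" is fooled by a character-level misspelling. The sketch only shows the shape of the detection signal: a masking that flips the prediction suggests the original token was off-manifold.

```python
# Toy sketch of one-by-one masked-refill detection (stand-in MLM and
# classifier; not the authors' implementation).

def mlmd_score(tokens, mlm_refill, classify):
    """Fraction of one-token maskings whose MLM refill flips the prediction."""
    base = classify(tokens)
    flips = 0
    for i in range(len(tokens)):
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        if classify(mlm_refill(masked, i)) != base:
            flips += 1
    return flips / len(tokens)  # higher → more likely adversarial

def mlm_refill(masked, i):
    # Toy MLM: always predicts the in-distribution word "terrible" at the mask.
    out = list(masked)
    out[i] = "terrible"
    return out

def classify(tokens):
    # Toy victim classifier fooled by the misspelling "terrrible".
    return "neg" if "terrrible" in tokens else "pos"

adv    = ["the", "film", "was", "terrrible", "overall"]  # perturbed input
benign = ["the", "film", "was", "terrible", "overall"]
print(mlmd_score(adv, mlm_refill, classify))     # → 0.2 (one masking flips neg→pos)
print(mlmd_score(benign, mlm_refill, classify))  # → 0.0 (no masking flips)
```

The gap between the two scores is the distinguishable signal; a real detector would threshold it (or feed richer statistics to a small classifier).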



Evaluating Membership Inference Through Adversarial Robustness

October 2022 · 10 Reads · 16 Citations · The Computer Journal

The usage of deep learning is being escalated in many applications. Due to its outstanding performance, it is being used in a variety of security and privacy-sensitive areas in addition to conventional applications. One of the key aspects of deep learning efficacy is to have abundant data. This trait leads to the usage of data which can be highly sensitive and private, which in turn causes wariness with regard to deep learning in the general public. Membership inference attacks are considered lethal as they can be used to figure out whether a piece of data belongs to the training dataset or not. This can be problematic with regard to leakage of training data information and its characteristics. To highlight the significance of these types of attacks, we propose an enhanced methodology for membership inference attacks based on adversarial robustness, by adjusting the directions of adversarial perturbations through label smoothing under a white-box setting. We evaluate our proposed method on three datasets: Fashion-MNIST, CIFAR-10 and CIFAR-100. Our experimental results reveal that the performance of our method surpasses that of the existing adversarial robustness-based method when attacking normally trained models. Additionally, through comparing our technique with the state-of-the-art metric-based membership inference methods, our proposed method also shows better performance when attacking adversarially trained models. The code for reproducing the results of this work is available at https://github.com/plll4zzx/Evaluating-Membership-Inference-Through-Adversarial-Robustness.
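The core intuition, stripped of the paper's label-smoothing refinement, can be sketched with a linear stand-in model: training members tend to be fit more confidently, so the minimal adversarial perturbation needed to flip them (their distance to the decision boundary) tends to be larger. All names and numbers below are illustrative assumptions, not the paper's method.

```python
# Illustrative sketch: membership inference from adversarial robustness,
# using distance to a linear decision boundary as a stand-in robustness
# measure (the paper's label-smoothing-adjusted perturbations are not shown).

def boundary_distance(w, b, x):
    """Distance of point x to the hyperplane w·x + b = 0 (toy linear model)."""
    num = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    den = sum(wi * wi for wi in w) ** 0.5
    return num / den

def infer_member(w, b, x, threshold):
    """Predict 'member' when adversarial robustness (distance) is high."""
    return boundary_distance(w, b, x) >= threshold

w, b = [1.0, 1.0], -1.0
print(infer_member(w, b, [2.0, 2.0], 1.0))  # → True  (far from boundary)
print(infer_member(w, b, [0.6, 0.6], 1.0))  # → False (near boundary)
```

In practice the "distance" is estimated by running an attack (e.g. PGD) until the prediction flips, and the threshold is calibrated on shadow data.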


Figures: average distance for normally trained and PGD-AT trained models on the member and non-member datasets of Fashion and CIFAR-100.

Evaluating Membership Inference Through Adversarial Robustness

May 2022 · 44 Reads · 1 Citation

The usage of deep learning is being escalated in many applications. Due to its outstanding performance, it is being used in a variety of security and privacy-sensitive areas in addition to conventional applications. One of the key aspects of deep learning efficacy is to have abundant data. This trait leads to the usage of data which can be highly sensitive and private, which in turn causes wariness with regard to deep learning in the general public. Membership inference attacks are considered lethal as they can be used to figure out whether a piece of data belongs to the training dataset or not. This can be problematic with regard to leakage of training data information and its characteristics. To highlight the significance of these types of attacks, we propose an enhanced methodology for membership inference attacks based on adversarial robustness, by adjusting the directions of adversarial perturbations through label smoothing under a white-box setting. We evaluate our proposed method on three datasets: Fashion-MNIST, CIFAR-10, and CIFAR-100. Our experimental results reveal that the performance of our method surpasses that of the existing adversarial robustness-based method when attacking normally trained models. Additionally, through comparing our technique with the state-of-the-art metric-based membership inference methods, our proposed method also shows better performance when attacking adversarially trained models. The code for reproducing the results of this work is available at https://github.com/plll4zzx/Evaluating-Membership-Inference-Through-Adversarial-Robustness.


Figures: overview of DRR; detailed settings for the experiment.
Self-Supervised Adversarial Example Detection by Disentangled Representation

May 2021 · 82 Reads

Deep learning models are known to be vulnerable to adversarial examples that are elaborately designed for malicious purposes and are imperceptible to the human perceptual system. An autoencoder trained solely on benign examples has been widely used for (self-supervised) adversarial detection, based on the assumption that adversarial examples yield larger reconstruction errors. However, because adversarial examples are absent from its training and the autoencoder generalizes too strongly, this assumption does not always hold in practice. To alleviate this problem, we explore detecting adversarial examples through disentangled representations of images under the autoencoder structure. By disentangling input images into class features and semantic features, we train an autoencoder, assisted by a discriminator network, on both correctly and incorrectly paired class/semantic features to reconstruct benign examples and counterexamples. This mimics the behavior of adversarial examples and reduces the unnecessary generalization ability of the autoencoder. We compare our method with the state-of-the-art self-supervised detection methods under different adversarial attacks (FGSM, BIM, PGD, DeepFool, and CW), different datasets (MNIST, Fashion-MNIST, and CIFAR-10), and different victim models (8-layer CNN and 16-layer VGG), 30 attack settings in total; it exhibits better performance in various measurements (AUC, FPR, TPR) for most attack settings. Ideally, AUC is 1, and our method achieves 0.99+ on CIFAR-10 for all attacks. Notably, unlike other autoencoder-based detectors, our method can provide resistance to the adaptive adversary.
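The baseline this paper builds on, reconstruction-error detection, can be sketched with a toy "autoencoder" that projects inputs onto an assumed benign manifold (the line y = x here); the paper's actual contribution, disentangling class and semantic features to curb over-generalization, is not shown. All names are illustrative.

```python
# Toy sketch of reconstruction-error adversarial detection: the benign
# manifold is assumed to be the line y = x, and the "autoencoder" is a
# projection onto it (a real detector learns this manifold from data).

def reconstruct(p):
    """Project the point p = (x, y) onto the assumed benign manifold y = x."""
    m = (p[0] + p[1]) / 2.0
    return (m, m)

def recon_error(p):
    r = reconstruct(p)
    return ((p[0] - r[0]) ** 2 + (p[1] - r[1]) ** 2) ** 0.5

def is_adversarial(p, threshold=0.5):
    """Flag inputs whose reconstruction error exceeds the threshold."""
    return recon_error(p) > threshold

print(is_adversarial((1.0, 1.1)))  # → False (near-manifold, benign-like)
print(is_adversarial((1.0, 3.0)))  # → True  (off-manifold, adversarial-like)
```

The failure mode the paper targets appears when the learned "projection" generalizes so well that off-manifold inputs are also reconstructed with small error.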


Classification of Motor Imagery EEG Based on Time-Domain and Frequency-Domain Dual-Stream Convolutional Neural Network

April 2021 · 58 Reads · 41 Citations · IRBM

Background and objective: An important task of a motor imagery brain-computer interface (BCI) is to extract effective time-domain, frequency-domain, or time-frequency-domain features from the raw electroencephalogram (EEG) signals for classification of motor imagery. However, choosing an appropriate method to combine time-domain and frequency-domain features to improve the performance of motor imagery recognition is still a research hotspot. Methods: To fully extract and utilize the time-domain and frequency-domain features of EEG in classification tasks, this paper proposes a novel dual-stream convolutional neural network (DCNN) that takes the time-domain signal and the frequency-domain signal as inputs; the extracted time-domain and frequency-domain features are fused by linear weighting for classification training, and the weight is learned by the DCNN automatically. Results: Experiments on BCI Competition II dataset III and BCI Competition IV dataset 2a showed that the proposed model performs better than other conventional methods. The model that used time-frequency signals as inputs performed better than models that used only time-domain or only frequency-domain signals, and classification accuracy improved for each subject compared with the single-input models. Conclusions: Further analysis showed that the fusion weights are subject-specific; adjusting the weight coefficient automatically helps improve classification accuracy.
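The linear-weighted fusion with a learnable weight can be sketched outside any deep-learning framework. In the paper the weight is learned end-to-end inside the DCNN; here, as a hedged illustration with made-up numbers, a scalar w is fit by plain gradient descent on a squared-error target.

```python
# Illustrative sketch of learnable linear-weighted feature fusion:
# fused = w * time_features + (1 - w) * freq_features, with w fit by
# gradient descent on a toy target (the DCNN learns w end-to-end instead).

def fuse(time_feat, freq_feat, w):
    """Elementwise linear-weighted fusion of two feature vectors."""
    return [w * t + (1 - w) * f for t, f in zip(time_feat, freq_feat)]

t, f = [1.0, 0.0], [0.0, 1.0]
target = [0.7, 0.3]          # chosen so the optimal weight is w = 0.7
w = 0.5
for _ in range(100):
    fused = fuse(t, f, w)
    # d/dw of the squared error sum((fused - target)^2)
    grad = sum(2 * (fu - ta) * (ti - fi)
               for fu, ta, ti, fi in zip(fused, target, t, f))
    w -= 0.1 * grad
print(round(w, 3))  # → 0.7
```

The subject-specific fusion weights reported in the paper correspond to this w taking different converged values for different subjects.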


The dual negative selection algorithm and its application for network anomaly detection

January 2017 · 15 Reads · 3 Citations · International Journal of Information and Communication Technology

The negative selection algorithm (NSA) is an important method for generating artificial immune detectors for network anomaly detection. In this paper, we put forward the dual negative selection algorithm (DNSA), which includes two negative selection processes. In the first negative selection process, every randomly generated candidate detector is checked against the mature detector set and becomes a semi-mature detector if it does not match any existing mature detector. In the second negative selection process, the semi-mature detector is checked against the self set and becomes a mature detector if it does not match any self element. The DNSA avoids the unnecessary and time-consuming self-tolerance of candidate detectors that fall within the coverage of existing mature detectors, thus greatly reducing the detector set size and significantly improving detector generation efficiency. Theoretical analysis and simulations show that the DNSA effectively improves detector generation efficiency and is more suitable for network anomaly detection than traditional NSAs.
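The two-stage generation loop described above can be sketched with the simplest possible detector model: points on [0, 1) that "match" anything within a fixed radius. This is a minimal illustration of the control flow, assuming interval detectors; it is not the paper's representation or matching rule.

```python
# Sketch of the DNSA generation loop with toy interval detectors on [0, 1):
# a detector matches a point when it lies within the given radius.
import random

def matches(center, radius, point):
    return abs(center - point) <= radius

def dnsa(self_set, radius, n_detectors, rng, max_tries=10000):
    """Generate mature detectors via two successive negative selections."""
    mature = []
    for _ in range(max_tries):
        if len(mature) >= n_detectors:
            break
        cand = rng.random()
        # 1st negative selection: discard candidates already covered by a
        # mature detector, skipping the costly self-tolerance pass for them.
        if any(matches(d, radius, cand) for d in mature):
            continue
        # 2nd negative selection: the semi-mature detector must not match self.
        if any(matches(cand, radius, s) for s in self_set):
            continue
        mature.append(cand)
    return mature

detectors = dnsa(self_set=[0.1, 0.15], radius=0.08, n_detectors=4,
                 rng=random.Random(0))
```

By construction, every returned detector avoids the self region and does not redundantly overlap an earlier detector's center, which is the source of the claimed efficiency gain.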



Citations (13)


... Among these, contrastive learning has gained prominence due to its stronger transferability and stable, fast convergence during the training process. However, these models may inadvertently expose private information contained within training data, such as gradient inversion [52], inference attack [4,33,39], adversarial example [47,49], and data poisoning [43]. They expose vulnerabilities in model training and deployment, raising widespread societal concerns and prompting extensive policy discussions. ...

Reference:

When Better Features Mean Greater Risks: The Performance-Privacy Trade-Off in Contrastive Learning
Masked Language Model Based Textual Adversarial Example Detection
  • Citing Conference Paper
  • July 2023

... Among these, contrastive learning has gained prominence due to its stronger transferability and stable, fast convergence during the training process. However, these models may inadvertently expose private information contained within training data, such as gradient inversion [52], inference attack [4,33,39], adversarial example [47,49], and data poisoning [43]. They expose vulnerabilities in model training and deployment, raising widespread societal concerns and prompting extensive policy discussions. ...

Self-Supervised Adversarial Example Detection by Disentangled Representation
  • Citing Conference Paper
  • December 2022

... This dual role-as both a privacy threat and a regulatory metric-underscores the critical importance of understanding and mitigating MIA risks, particularly in self-supervised learning (SSL) frameworks where traditional defenses may fall short. Most existing MIA studies focus on enhancing the efficiency of membership inference in classification tasks [2,29,41] or other specific tasks [16,18,40], whereas how to balance model performance and privacy protection is currently lacking [31,48]. This gap is particularly evident in self-supervised learning models, where traditional MIA strategies are often ineffective due to the models' unsupervised training objectives and frameworks. ...

Evaluating Membership Inference Through Adversarial Robustness
  • Citing Article
  • October 2022

The Computer Journal

... Meanwhile, the amplitude of the Mu and Beta rhythms increases in the ipsilateral motor-sensory area, a phenomenon referred to as ERS. The ERD/ERS phenomena during the left and right-hand MI exhibit lateralized activation patterns in the sensorimotor cortex, specifically localized to the electrode positions C3 (left hemisphere) and C4 (right hemisphere) (Huang et al 2022a). Based on this neurophysiological mechanism, we select the central dipoles closest to the C3, C4, and Cz (central region) electrodes of the brain. ...

Classification of Motor Imagery EEG Based on Time-Domain and Frequency-Domain Dual-Stream Convolutional Neural Network
  • Citing Article
  • April 2021

IRBM

... The EPAADPS performed better than the NSA in experiments. The subspace density technique was used to improve NSA in generating optimal detectors [27], and a dual NSA algorithm was proposed to produce potent and mature detectors for network anomaly detection [28]. The theory of Delaunay triangulation integrated with the negative selection algorithm (described as ASTC-RNSA) was proposed in Ref. [29]. ...

The dual negative selection algorithm and its application for network anomaly detection
  • Citing Article
  • January 2017

International Journal of Information and Communication Technology

... It is hard to analyze EEG with linear methods, such as time-domain and frequency-domain analysis, because of the irregularity caused by nonlinear and non-stationary factors. Therefore, non-linear analysis methods can better reveal the characteristics and mechanisms of EEG [2] [15]. ...

Feature extraction of time-amplitude-frequency analysis for classifying single [...]
  • Citing Article
  • June 2014

Journal of Fiber Bioengineering and Informatics

... Two methods, namely the derivative reconstruction method and the coordinate delay reconstruction method, have been utilized for phase space reconstruction. By reconstructing the phase space of EEG signals, the sensitivity of attractors can enhance the classification accuracy of Brain-Computer Interfaces (BCIs) based on phase space features [33], [34]. The choice of embedding dimension (m) and time delay (τ ) is crucial during the phase space reconstruction process, as they directly impact the accuracy of the invariant properties of the resulting strange attractors. ...
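The coordinate-delay reconstruction mentioned in this excerpt is compact enough to sketch directly: each scalar sample x[i] becomes an m-dimensional point [x[i], x[i+τ], ..., x[i+(m-1)τ]]. The function name is illustrative; choosing m and τ well is the hard part the excerpt highlights.

```python
# Minimal sketch of coordinate-delay phase-space reconstruction
# (delay embedding) of a scalar series with dimension m and delay tau.

def delay_embed(x, m, tau):
    """Return the delay-embedded vectors [x[i], x[i+tau], ..., x[i+(m-1)*tau]]."""
    n = len(x) - (m - 1) * tau
    return [[x[i + j * tau] for j in range(m)] for i in range(n)]

series = [0, 1, 2, 3, 4, 5, 6]
print(delay_embed(series, m=3, tau=2))
# → [[0, 2, 4], [1, 3, 5], [2, 4, 6]]
```

Phase-space features for BCI classification are then computed from the geometry of these embedded points (the attractor), rather than from the raw series.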

Extracting features from phase space of EEG signals in brain—Computer interfaces
  • Citing Article
  • March 2015

Neurocomputing

... CB-RNSA is based on the hierarchical clustering of the self-set, the detection rate of CB-RNSA is higher than that of the classic NSA, V-detector algorithms, and the false alarm rate is lower than the same algorithms [70]. PRR-2NSA is a dual NSA based on pattern recognition receptors theory [71]. The real NSA based on the grid file of feature space (GF-RNSA) aims to improve the exponential worst-case complexity of existing NSA algorithms [72]. ...

The Dual Negative Selection Algorithm Based on Pattern Recognition Receptor Theory and Its Application in Two-class Data Classification
  • Citing Article
  • August 2013

Journal of Computers

... A virus detection approach is introduced based on Immune Concentration Algorithm in [85]. The authors of [86] formulate a multi-layered immune network intrusion detection defense model (MINID) as a pattern recognition task based on the theory of Pattern Recognition Receptors (PRR). ...

A Novel Multi-layered Immune Network Intrusion Detection Defense Model: MINID
  • Citing Article
  • March 2013

Journal of Networks

... Based on the generation process, [22] incorporated further training strategies into the training phase to generate self-detectors covering self-regions. Meanwhile, Zheng et al. [23] added a negative selection when the detector was generated to avoid redundant coverage between mature detectors. In terms of incorporating other algorithms, Aydin et al. [24] applied chaotic mapping for parameter selection, which obtained a better coverage. ...

Dual negative selection algorithm
  • Citing Article
  • April 2013

Scientia Sinica Informationis