Yichao Wu’s research while affiliated with Lee's Pharmaceutical (Hong Kong) Limited and other places


Publications (39)


A Renaissance of Explicit Motion Information Mining from Transformers for Action Recognition
Preprint · January 2024 · 9 Reads

Peiqin Zhuang · Yichao Wu · [...] · Wanli Ouyang



Improving Robust Fariness via Balance Adversarial Training

June 2023 · 5 Reads · 5 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce severe disparity of accuracy and robustness between different classes, known as the robust fairness problem. Previously proposed Fair Robust Learning (FRL) adaptively reweights different classes to improve fairness. However, the performance of the better-performing classes decreases, leading to a strong performance drop. In this paper, we observe two unfair phenomena during adversarial training: different difficulties in generating adversarial examples from each class (source-class fairness) and disparate target class tendencies when generating adversarial examples (target-class fairness). From these observations, we propose Balance Adversarial Training (BAT) to address the robust fairness problem. Regarding source-class fairness, we adjust the attack strength and difficulties of each class to generate samples near the decision boundary for easier and fairer model learning; considering target-class fairness, by introducing a uniform distribution constraint, we encourage the adversarial example generation process for each class to have a fair tendency. Extensive experiments conducted on multiple datasets (CIFAR-10, CIFAR-100, and ImageNette) demonstrate that BAT can significantly outperform other baselines in mitigating the robust fairness problem (+5-10% on the worst-class accuracy). Code is available at https://github.com/silvercherry/Improving-Robust-Fairness-via-Balance-Adversarial-Training.
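One way to picture the uniform distribution constraint for target-class fairness is as a penalty on how unevenly a model spreads its belief over the wrong classes. The sketch below is only an illustration under that assumption: the function name `target_uniformity_penalty` and the choice of a KL divergence to the uniform distribution are hypothetical and are not taken from the BAT paper or its released code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def target_uniformity_penalty(logits, true_class):
    """KL divergence from the non-true-class distribution to uniform.

    Assumption (not from the paper): the constraint is modelled as
    KL(q || uniform), where q is the softmax over the logits of all
    classes except the true one. The penalty is 0 when no wrong class
    is favoured and grows as one wrong class dominates.
    """
    other_logits = [x for i, x in enumerate(logits) if i != true_class]
    q = softmax(other_logits)
    uniform = 1.0 / len(q)
    return sum(p * math.log(p / uniform) for p in q if p > 0.0)


# Hypothetical logits for a 4-class problem where class 0 is the true class.
balanced = target_uniformity_penalty([5.0, 1.0, 1.0, 1.0], true_class=0)
skewed = target_uniformity_penalty([5.0, 4.0, 0.0, 0.0], true_class=0)
print(balanced, skewed)
```

In this toy setting the second example, whose adversarial tendency points almost entirely at class 1, receives the larger penalty.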


Latent Distribution Adjusting for Face Anti-Spoofing

May 2023 · 11 Reads

With the development of deep learning, the field of face anti-spoofing (FAS) has witnessed great progress. FAS is usually considered a classification problem, where each class is assumed to contain a single cluster optimized by softmax loss. In practical deployment, one class can contain several local clusters, and a single center is insufficient to capture the inherent structure of the FAS data. However, few approaches consider large distribution discrepancies in the field of FAS. In this work, we propose a unified framework called Latent Distribution Adjusting (LDA), which has four properties (latent, discriminative, adaptive, and generic) and improves the robustness of the FAS model by adjusting complex data distribution with multiple prototypes. 1) Latent. LDA attempts to model the data of each class as a Gaussian mixture distribution, and implicitly acquires a flexible number of centers for each class in the last fully connected layer. 2) Discriminative. To enhance the intra-class compactness and inter-class discrepancy, we propose a margin-based loss that provides distribution constraints for prototype learning. 3) Adaptive. To make LDA more efficient and decrease redundant parameters, we propose Adaptive Prototype Selection (APS), which adaptively selects the appropriate number of centers according to different distributions. 4) Generic. Furthermore, LDA can adapt to unseen distributions by utilizing very few training data without re-training. Extensive experiments demonstrate that our framework can 1) make the final representation space both intra-class compact and inter-class separable, and 2) outperform the state-of-the-art methods on multiple standard FAS benchmarks.
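The contrast between a single center and multiple prototypes per class can be made concrete with a small sketch. It assumes, purely for illustration, that a class score is the maximum cosine similarity over that class's prototypes; the names `class_scores` and `predict`, the toy vectors, and the max rule are hypothetical, and the margin-based loss and Adaptive Prototype Selection are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_scores(feature, prototypes):
    """Score each class by its best-matching prototype.

    prototypes maps a class label to a list of prototype vectors.
    Assumption: taking the max over prototypes stands in for the
    multi-center last layer described in the abstract.
    """
    return {
        label: max(cosine(feature, p) for p in protos)
        for label, protos in prototypes.items()
    }


def predict(feature, prototypes):
    """Return the label with the highest class score."""
    scores = class_scores(feature, prototypes)
    return max(scores, key=scores.get)


# Toy 2-D setup: the spoof class has two local clusters.
multi = {"live": [[1.0, 0.0]], "spoof": [[0.0, 1.0], [-1.0, 0.0]]}
# Single-center variant: the spoof center is the mean of both clusters.
single = {"live": [[1.0, 0.0]], "spoof": [[-0.5, 0.5]]}

sample = [0.6, 0.8]  # lies close to the first spoof cluster
print(predict(sample, multi), predict(sample, single))
```

In this toy example the averaged center falls between the two spoof clusters, so the single-center variant assigns the sample to the live class while the multi-prototype variant does not.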


Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework

May 2023 · 11 Reads

Denoising diffusion models have shown remarkable potential in various generation tasks. The open-source large-scale text-to-image model, Stable Diffusion, has become prevalent as it can generate realistic artistic or facial images with personalization through fine-tuning on a limited number of new samples. However, this has raised privacy concerns as adversaries can acquire facial images online and fine-tune text-to-image models for malicious editing, leading to baseless scandals, defamation, and disruption to victims' lives. Prior research efforts have focused on deriving adversarial loss from conventional training processes for facial privacy protection through adversarial perturbations. However, existing algorithms face two issues: 1) they neglect the image-text fusion module, which is the vital module of text-to-image diffusion models, and 2) their defensive performance is unstable against different attacker prompts. In this paper, we propose the Adversarial Decoupling Augmentation Framework (ADAF), addressing these issues by targeting the image-text fusion module to enhance the defensive performance of facial privacy protection algorithms. ADAF introduces multi-level text-related augmentations for defense stability against various attacker prompts. Concretely, considering the vision, text, and common unit space, we propose Vision-Adversarial Loss, Prompt-Robust Augmentation, and Attention-Decoupling Loss. Extensive experiments on CelebA-HQ and VGGFace2 demonstrate ADAF's promising performance, surpassing existing algorithms.


PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection

November 2022 · 5 Reads · 92 Citations

Lecture Notes in Computer Science

In this paper, we delve into two key techniques in Semi-Supervised Object Detection (SSOD), namely pseudo labeling and consistency training. We observe that these two techniques currently neglect some important properties of object detection, hindering efficient learning on unlabeled data. Specifically, for pseudo labeling, existing works only focus on the classification score yet fail to guarantee the localization precision of pseudo boxes; for consistency training, the widely adopted random-resize training only considers the label-level consistency but misses the feature-level one, which also plays an important role in ensuring the scale invariance. To address the problems incurred by noisy pseudo boxes, we design Noisy Pseudo box Learning (NPL) that includes Prediction-guided Label Assignment (PLA) and Positive-proposal Consistency Voting (PCV). PLA relies on model predictions to assign labels, which makes it robust to even coarse pseudo boxes, while PCV leverages the regression consistency of positive proposals to reflect the localization quality of pseudo boxes. Furthermore, in consistency training, we propose Multi-view Scale-invariant Learning (MSL) that includes mechanisms of both label- and feature-level consistency, where feature consistency is achieved by aligning shifted feature pyramids between two images with identical content but varied scales. On the COCO benchmark, our method, termed PSEudo labeling and COnsistency training (PseCo), outperforms the SOTA (Soft Teacher) by 2.0, 1.8, and 2.0 points under 1%, 5%, and 10% labelling ratios, respectively. It also significantly improves the learning efficiency for SSOD, e.g., PseCo halves the training time of the SOTA approach but achieves even better performance. Code is available at https://github.com/ligang-cs/PseCo. Keywords: Semi-supervised learning, Object detection


OneFace: One Threshold for All

October 2022 · 42 Reads · 6 Citations

Lecture Notes in Computer Science

Face recognition (FR) has witnessed remarkable progress with the surge of deep learning. Current FR evaluation protocols usually adopt different thresholds to calculate the True Accept Rate (TAR) under a pre-defined False Accept Rate (FAR) for different datasets. However, in practice, when the FR model is deployed on industry systems (e.g., hardware devices), only one fixed threshold is adopted for all scenarios to distinguish whether a face image pair belongs to the same identity. Therefore, current evaluation protocols using different thresholds for different datasets are not fully compatible with the practical evaluation scenarios with one fixed threshold, and it is critical to measure the performance of FR models by using one threshold for all datasets. In this paper, we rethink the limitations of existing evaluation protocols for FR and propose to evaluate the performance of FR models from a new perspective. Specifically, in our OneFace, we first propose the One-Threshold-for-All (OTA) evaluation protocol for FR, which utilizes one fixed threshold, called the Calibration Threshold, to measure the performance on different datasets. Then, to improve the performance of FR models under the OTA protocol, we propose the Threshold Consistency Penalty (TCP) to improve the consistency of the thresholds among multiple domains, which includes Implicit Domain Division (IDD) as well as Calibration and Domain Thresholds Estimation (CDTE). Extensive experimental results demonstrate the effectiveness of our method for FR. Keywords: Face recognition, Loss function, Fairness
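The gap between per-dataset thresholds and one fixed threshold can be shown with a few lines of arithmetic. The sketch below is an illustration only: the helper names, the toy similarity scores, and in particular the choice to derive the shared threshold from pooled impostor pairs are assumptions, since the abstract does not state how the Calibration Threshold is estimated.

```python
def threshold_at_far(impostor_scores, far):
    """Threshold such that at most `far` of impostor pairs score above it."""
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs that are allowed to be (falsely) accepted.
    allowed = min(int(far * len(ranked)), len(ranked) - 1)
    # A pair is accepted only if its score is strictly above the threshold.
    return ranked[allowed]


def tar_at_threshold(genuine_scores, threshold):
    """Fraction of genuine pairs accepted at the given threshold."""
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return accepted / len(genuine_scores)


# Hypothetical similarity scores for two datasets with different score ranges.
genuine_a, impostor_a = [0.9, 0.8, 0.7, 0.6], [0.5, 0.3, 0.2, 0.1]
genuine_b, impostor_b = [0.7, 0.6, 0.3, 0.25], [0.45, 0.2, 0.1, 0.05]
far = 0.25

# Conventional protocol: each dataset gets its own threshold.
own_a = threshold_at_far(impostor_a, far)
own_b = threshold_at_far(impostor_b, far)

# One threshold for all: here estimated from the pooled impostor pairs
# (an assumption made for this sketch).
shared = threshold_at_far(impostor_a + impostor_b, far)

for name, genuine, own in (("A", genuine_a, own_a), ("B", genuine_b, own_b)):
    print(name, tar_at_threshold(genuine, own), tar_at_threshold(genuine, shared))
```

With these toy scores, dataset B looks perfect under its own threshold but loses half of its genuine pairs once the shared threshold is applied, which is the kind of discrepancy a single-threshold protocol is meant to expose.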


CoupleFace: Relation Matters for Face Recognition Distillation

October 2022 · 21 Reads · 21 Citations

Lecture Notes in Computer Science

Knowledge distillation is an effective method to improve the performance of a lightweight neural network (i.e., student model) by transferring the knowledge of a well-performing neural network (i.e., teacher model), and it has been widely applied in many computer vision tasks, including face recognition (FR). Nevertheless, the current FR distillation methods usually apply Feature Consistency Distillation (FCD) (e.g., L2 distance) to the learned embeddings extracted by the teacher and student models for each sample, which is not able to fully transfer the knowledge from the teacher to the student for FR. In this work, we observe that mutual relation knowledge between samples is also important to improve the discriminative ability of the learned representation of the student model, and propose an effective FR distillation method called CoupleFace by additionally introducing Mutual Relation Distillation (MRD) into the existing distillation framework. Specifically, in MRD, we first propose to mine the informative mutual relations, and then introduce the Relation-Aware Distillation (RAD) loss to transfer the mutual relation knowledge of the teacher model to the student model. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our proposed CoupleFace for FR. Moreover, based on our proposed CoupleFace, we won first place in the ICCV21 Masked Face Recognition Challenge (MS1M track). Keywords: Face recognition, Knowledge distillation, Loss function
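The difference between per-sample feature consistency and relation knowledge between samples can be sketched with toy embeddings. This is a generic illustration, not the paper's RAD loss or its relation mining step: the function names and the use of pairwise cosine similarities as the "relation" are assumptions, and teacher and student embeddings are assumed to share the same dimension.

```python
import math


def normalize(vector):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def feature_consistency_loss(teacher, student):
    """Mean squared L2 distance between matching normalized embeddings."""
    total = 0.0
    for t, s in zip(teacher, student):
        t, s = normalize(t), normalize(s)
        total += sum((a - b) ** 2 for a, b in zip(t, s))
    return total / len(teacher)


def cosine_matrix(embeddings):
    """Pairwise cosine similarities between all samples in a batch."""
    unit = [normalize(e) for e in embeddings]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def relation_loss(teacher, student):
    """Mean squared gap between teacher and student pairwise similarities."""
    rel_t, rel_s = cosine_matrix(teacher), cosine_matrix(student)
    n = len(teacher)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    return sum((rel_t[i][j] - rel_s[i][j]) ** 2 for i, j in pairs) / len(pairs)


# Toy batch: the student space is the teacher space rotated by 90 degrees.
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[0.0, 1.0], [-1.0, 0.0]]
print(feature_consistency_loss(teacher, student), relation_loss(teacher, student))
```

In this toy batch the per-sample loss is large even though the student preserves every pairwise relation, which shows that the two terms measure different things and can be combined.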


Improving Robust Fairness via Balance Adversarial Training

September 2022 · 16 Reads

Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce severe disparity of accuracy and robustness between different classes, known as the robust fairness problem. Previously proposed Fair Robust Learning (FRL) adaptively reweights different classes to improve fairness. However, the performance of the better-performing classes decreases, leading to a strong performance drop. In this paper, we observe two unfair phenomena during adversarial training: different difficulties in generating adversarial examples from each class (source-class fairness) and disparate target class tendencies when generating adversarial examples (target-class fairness). From these observations, we propose Balance Adversarial Training (BAT) to address the robust fairness problem. Regarding source-class fairness, we adjust the attack strength and difficulties of each class to generate samples near the decision boundary for easier and fairer model learning; considering target-class fairness, by introducing a uniform distribution constraint, we encourage the adversarial example generation process for each class to have a fair tendency. Extensive experiments conducted on multiple datasets (CIFAR-10, CIFAR-100, and ImageNette) demonstrate that our method can significantly outperform other baselines in mitigating the robust fairness problem (+5-10% on the worst-class accuracy).


Citations (19)


... First, the training methods for the WISE system drew inspiration from the FaceNet training protocol. Investigating methods similar to those demonstrated in other works [33][34][35] for improving embryo identification holds the potential for valuable advancements in the field. Furthermore, we intend to conduct prospective validation in the future and explore the integration of WISE with the existing systems and standard operating procedures (SOPs) of IVF labs to accelerate the embryo identification process. ...

Reference:

WISE: whole-scenario embryo identification using self-supervised learning encoder in IVF
ICD-Face: Intra-class Compactness Distillation for Face Recognition
  • Citing Conference Paper
  • October 2023

... Furthermore, Carlini [2] demonstrated that diffusion models can memorize individual images from the training data during generation and regenerate these images during testing when given specific prompts. These findings suggest that diffusion models may pose greater privacy risks [5,8,9,13,26] than earlier generative models, such as GANs [10]. Additionally, research by Somepalli [58] indicates that diffusion models may directly replicate content from their training sets during image generation, often without the user's awareness, raising concerns about data ownership and copyright. ...

Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks
  • Citing Conference Paper
  • October 2023

... Xu et al. [26] empirically find the serious deficiency of AT in fairness and attempt to mitigate this problem using the proposed Fair-Robust-Learning framework. More recently, Sun et al. [19] propose Balance Adversarial Training (BAT) to improve robust fairness by balancing the source-class and target-class fairness. Ma et al. [10] theoretically study the trade-offs between adversarial robustness and class-wise fairness, and a fairly adversarial training method is proposed to mitigate the unfairness problem. ...

Improving Robust Fariness via Balance Adversarial Training
  • Citing Article
  • June 2023

Proceedings of the AAAI Conference on Artificial Intelligence

... Active Teacher [54] evaluates unlabeled samples based on three key criteria, maximizing the utilization of limited label information. PseCo [55] introduces multi-view scale-invariant learning for object detection. Additionally, Zhang et al. [56] and PCL [57] aim to refine pseudo labels for more reliable training. ...

PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
  • Citing Chapter
  • November 2022

Lecture Notes in Computer Science

... However, this transferring may be limited and insufficient, which cannot preserve inter-sample relations well. Relation-level knowledge transfer attempts to transfer structural relations between samples [9], where relation matters a lot in recognition tasks [10]. In this way, high-order knowledge can be mined and transferred. ...

CoupleFace: Relation Matters for Face Recognition Distillation
  • Citing Chapter
  • October 2022

Lecture Notes in Computer Science

... Evaluation bias is also related to the decisions made at this stage of the face recognition pipeline, including pairing selection, threshold optimisation, distance and normalisation functions. For example, the selected threshold can vary across datasets, and final model performance is often susceptible to the changes in these thresholds [105]. Studies have found that a single fixed threshold often causes higher variance across demographic groups than an adaptive per-group threshold [105]. ...

OneFace: One Threshold for All
  • Citing Chapter
  • October 2022

Lecture Notes in Computer Science

... This subsection introduces the evaluation metrics used in the experiments, namely, the equal error rate (EER) and TAR@FAR = 0.01 [17, 38, 39, 40], to comprehensively assess the performance of the proposed framework. Additionally, to provide a more intuitive comparison of algorithm performance, three statistical metrics were used. ...

AnchorFace: Boosting TAR@FAR for Practical Face Recognition
  • Citing Article
  • June 2022

Proceedings of the AAAI Conference on Artificial Intelligence

... To address this challenge, we first comprehensively study multistage deep learning methods in computer vision and find that the use of knowledge distillation (KD) can deplete the negative impact of pseudo-labels and considerably enhance model performance in both supervised and unsupervised settings [2], [8], [34], [37], [64]. In this case, a teacher network iteratively optimizes the student network by replacing pseudo-labels with the KD results, i.e., the implicit feature distribution (dark knowledge [27]), so that the student can comprehensively distinguish similarities and differences among samples and extract significant features (see Fig. 1). ...

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-Guided Feature Imitation
  • Citing Article
  • June 2022

Proceedings of the AAAI Conference on Artificial Intelligence

... The VGG-19A, VGG-19B, VGG-19C, and VGG-19D were trained using RMSProp gradient based optimization algorithm, with an initial learning rate of 0.00002 [25]. The neural networks were trained for a total of 89 to 100 epochs [26]. The TensorFlow optimal learning rate finder was used after 10 and 15 epochs [27]. ...

Robust Face Anti-Spoofing with Dual Probabilistic Modeling
  • Citing Preprint
  • File available
  • April 2022

... Kim et al. noted that significant resolution degradation can make images unidentifiable, causing models to rely on other features, such as hairstyle and clothing, which can bias model training (Kim et al. [2022]). Recent studies have addressed this by imposing different weights based on the image quality (Chang et al. [2020]; Kim et al. [2022]; Liu et al. [2021]; Tran et al. [2017]). For example, AdaFace demonstrated that the feature norm is positively correlated with the image quality and designed an adaptive margin based on the feature norm to emphasize high-quality hard samples and de-emphasize low-quality ones (Kim et al. [2022]). ...

DAM: Discrepancy Alignment Metric for Face Recognition
  • Citing Conference Paper
  • October 2021