Article

MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition


Abstract

Although significant progress has been made in face recognition, demographic bias still exists in face recognition systems. For instance, the face recognition performance for a certain demographic group is often lower than for the others. In this paper, we propose the MixFairFace framework to improve the fairness of face recognition models. First of all, we argue that the commonly used attribute-based fairness metric is not appropriate for face recognition. A face recognition system can only be considered fair when every person achieves similar performance. Hence, we propose a new evaluation protocol to fairly evaluate the fairness performance of different approaches. Different from previous approaches that require sensitive attribute labels such as race and gender for reducing the demographic bias, we aim at addressing the identity bias in face representation, i.e., the performance inconsistency between different identities, without the need for sensitive attribute labels. To this end, we propose the MixFair Adapter to determine and reduce the identity bias of training samples. Our extensive experiments demonstrate that our MixFairFace approach achieves state-of-the-art fairness performance on all benchmark datasets.
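The protocol argued for here is identity-level rather than attribute-level: fairness is judged by how uniformly the system performs across individual people. As a rough illustration of what such an evaluation could look like (a minimal sketch; the function name and the reported statistics are assumptions, not the paper's exact protocol):

```python
import numpy as np

def per_identity_fairness(scores, labels, identities, threshold):
    """Summarize verification performance identity by identity.

    scores:     (N,) similarity scores for verification pairs
    labels:     (N,) 1 for genuine pairs, 0 for impostor pairs
    identities: (N,) identity id associated with each pair
    threshold:  decision threshold (e.g., fixed for a global FPR)
    """
    tprs = []
    for ident in np.unique(identities):
        genuine = (identities == ident) & (labels == 1)
        if genuine.sum() == 0:
            continue
        # True positive rate on this identity's genuine pairs.
        tprs.append((scores[genuine] >= threshold).mean())
    tprs = np.asarray(tprs)
    # A fair system keeps the spread small and the worst case high.
    return {"mean_tpr": tprs.mean(),
            "std_tpr": tprs.std(),
            "worst_tpr": tprs.min()}
```

Under this view, two systems with identical mean accuracy can differ sharply in the spread and worst-case terms, which is exactly what an attribute-level metric can miss.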

... In the dataset construction stage, efforts include creating datasets with balanced racial proportions [20,21], sampling or augmenting data to minimize disparities in recognition accuracy between attributes [13,16], and proposing data augmentation methods [11]. In the model construction stage, strategies mitigate performance bias through score normalization between attributes [18] and through dynamic, attribute-dependent adjustment of hyperparameters [20,22,19], under the assumption that the training datasets are racially biased. ...
... Xu et al. [22] proposed an approach that dynamically balances the false positive rate (FPR) across training samples without the need for demographic labels. Wang et al. [19] proposed the MixFair Adapter to estimate and reduce identity bias, i.e., the performance inconsistency between different identities, by reducing the differences in feature discriminability. ...
... Our proposed LabellessFace is inspired by the techniques of Xu et al. [22] and Wang et al. [19], particularly because it does not require demographic labels. Unlike their approaches, we consider the degree of favoritism towards specific identities across the entire dataset. ...
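The contexts above only name the mechanism. As a loose sketch of the feature-mixing intuition behind the MixFair Adapter of Wang et al. [19] (the formulation below is illustrative, not the authors' exact method): if the embedding of an equal mixture of two identities sits measurably closer to one of them, that identity's features are more discriminative, and the gap can serve as an identity-bias estimate.

```python
import torch
import torch.nn.functional as F

def mixing_bias(feat_i, feat_j):
    """Estimate the relative discriminability of two identity embeddings.

    feat_i, feat_j: (D,) face embeddings of two different identities.
    Returns ~0 when the identities are balanced; the sign of the result
    says which identity dominates the mixed representation.
    """
    feat_i = F.normalize(feat_i, dim=0)
    feat_j = F.normalize(feat_j, dim=0)
    mixed = F.normalize((feat_i + feat_j) / 2, dim=0)
    sim_i = torch.dot(mixed, feat_i)  # cosine similarity to identity i
    sim_j = torch.dot(mixed, feat_j)  # cosine similarity to identity j
    return sim_i - sim_j
```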
Preprint
Demographic bias is one of the major challenges for face recognition systems. The majority of existing studies on demographic bias depend heavily on specific demographic groups or on a demographic classifier, making it difficult to address performance for unrecognised groups. This paper introduces "LabellessFace", a novel framework that mitigates demographic bias in face recognition without requiring the demographic group labels typically needed for fairness considerations. We propose a novel fairness enhancement metric called the class favoritism level, which assesses the extent of favoritism towards specific classes across the dataset. Leveraging this metric, we introduce the fair class margin penalty, an extension of existing margin-based metric learning. This method dynamically adjusts learning parameters based on class favoritism levels, promoting fairness across all attributes. By treating each class as an individual in face recognition systems, we facilitate learning that minimizes biases in authentication accuracy among individuals. Comprehensive experiments demonstrate that the proposed method enhances fairness while maintaining authentication accuracy.
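As a hedged sketch of how such a per-class margin could be realized on top of an ArcFace-style head (the definition of the class favoritism level and the margin schedule below are assumptions; the preprint specifies its own):

```python
import torch
import torch.nn.functional as F

class FairMarginHead(torch.nn.Module):
    """ArcFace-style classification head with a per-class margin driven
    by a (hypothetical) favoritism score per class."""

    def __init__(self, dim, n_classes, scale=64.0, base_margin=0.5):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(n_classes, dim))
        self.scale = scale
        self.base_margin = base_margin
        # Running per-class favoritism estimate in [-1, 1], updated elsewhere.
        self.register_buffer("favoritism", torch.zeros(n_classes))

    def forward(self, embeddings, labels):
        cos = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # Favored classes get a larger margin, i.e., a harder objective,
        # which is one plausible way to narrow per-class performance gaps.
        margin = self.base_margin * (1.0 + self.favoritism[labels])
        onehot = torch.zeros_like(cos).scatter_(1, labels.view(-1, 1), 1.0)
        logits = torch.cos(theta + onehot * margin.view(-1, 1))
        return self.scale * logits
```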
Article
Full-text available
Systems incorporating biometric technologies have become ubiquitous in personal, commercial, and governmental identity management applications. Both cooperative (e.g. access control) and non-cooperative (e.g. surveillance and forensics) systems have benefited from biometrics. Such systems rely on the uniqueness of certain biological or behavioural characteristics of human beings, which enable individuals to be reliably recognised using automated algorithms. Recently, however, there has been a wave of public and academic concern regarding the existence of systemic bias in automated decision systems (including biometrics). Most prominently, face recognition algorithms have often been labelled as "racist" or "biased" by the media, non-governmental organisations, and researchers alike. The main contributions of this article are: (1) an overview of the topic of algorithmic bias in the context of biometrics, (2) a comprehensive survey of the existing literature on biometric bias estimation and mitigation, (3) a discussion of the pertinent technical and social matters, and (4) an outline of the remaining challenges and future work items, from both technological and social points of view.
Conference Paper
Full-text available
Deep neural networks excel at learning the training data, but often produce incorrect yet confident predictions when evaluated on slightly different test examples, including distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as an additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn flatter class representations, that is, representations with fewer directions of variance. We provide theory for why this flattening happens under ideal conditions, validate it empirically in practical situations, and connect it to previous work on information theory and generalization. Despite incurring no significant computational overhead and being implemented in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
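The regularizer is indeed only a few lines; a minimal, self-contained sketch of one Manifold Mixup training step (the module structure and names are illustrative):

```python
import torch
import torch.nn.functional as F

def manifold_mixup_step(blocks, head, x, y, alpha=2.0):
    """One Manifold Mixup training step.

    blocks: list of nn.Modules applied in sequence (feature extractor)
    head:   final classifier module
    x, y:   input batch and integer class labels
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    k = torch.randint(len(blocks) + 1, (1,)).item()  # mixing layer; 0 = input
    perm = torch.randperm(x.size(0))

    h = x
    for i, block in enumerate(blocks):
        if i == k:
            h = lam * h + (1 - lam) * h[perm]  # interpolate hidden states
        h = block(h)
    if k == len(blocks):
        h = lam * h + (1 - lam) * h[perm]

    logits = head(h)
    # Mix the targets with the same coefficient as the features.
    return lam * F.cross_entropy(logits, y) + \
           (1 - lam) * F.cross_entropy(logits, y[perm])
```

Setting the mixing layer to the input (k = 0) recovers ordinary input-space mixup as a special case.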
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing with both large and limited training data.
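At the core of the architecture is scaled dot-product attention; a compact sketch of that single operation (multi-head projections and masking omitted):

```python
import math
import torch

def attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    q: (batch, n_queries, d_k); k: (batch, n_keys, d_k); v: (batch, n_keys, d_v)
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, n_q, n_k)
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                                 # (batch, n_q, d_v)
```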
Conference Paper
In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, using all the face images of each individual that can be collected from the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis. Associated with this task, we design and provide a concrete measurement set, evaluation protocol, and training data. We also present our experimental setup in detail and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer vision. To the best of our knowledge, our training dataset, which contains 10M images in version 1, is the largest publicly available one in the world.
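Benchmarks in this family typically report how much of the test set can be answered at a fixed precision; a small sketch of that kind of measurement (the paper's exact protocol may differ):

```python
import numpy as np

def coverage_at_precision(confidences, correct, target_precision=0.95):
    """Fraction of test images answerable at a target precision.

    confidences: (N,) confidence of each test image's top prediction
    correct:     (N,) 1 if the top prediction matches the ground truth
    Predictions are issued most-confident first; coverage is the largest
    prefix whose cumulative precision still meets the target.
    """
    order = np.argsort(-confidences)
    hits = np.asarray(correct, dtype=float)[order]
    precision = np.cumsum(hits) / (np.arange(len(hits)) + 1)
    valid = np.where(precision >= target_precision)[0]
    return 0.0 if len(valid) == 0 else (valid[-1] + 1) / len(hits)
```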
Grother, P. J.; Ngan, M. L.; Hanaoka, K. K.; et al. 2019. Face Recognition Vendor Test Part 3: Demographic Effects.
Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems (NeurIPS).
Salvador, T.; Cairns, S.; Voleti, V.; Marshall, N.; and Oberman, A. 2022. FairCal: Fairness Calibration for Face Verification. In International Conference on Learning Representations (ICLR).
Zhang, H.; Cisse, M.; Dauphin, Y. N.; and Lopez-Paz, D. 2018. mixup: Beyond Empirical Risk Minimization. In International Conference on Learning Representations (ICLR).