Table 1. Information on CASIA-WebFace and a comparison with other large-scale face datasets.

Source publication
Article
Full-text available
Pushed by big data and deep convolutional neural networks (CNNs), the performance of face recognition is becoming comparable to that of humans. Using private large-scale training datasets, several groups have achieved very high performance on LFW, i.e., 97% to 99%. While there are many open-source implementations of CNNs, no large-scale face dataset is publicly...

Contexts in source publication

Context 1
... statistics of the proposed CASIA-WebFace dataset are shown in Table 1. Except for Facebook's SFC dataset, CASIA-WebFace is the largest. ...
Context 2
... they do not overlap with the proposed CASIA-WebFace, so it is very reasonable to report performance on LFW and YTF. Besides that, the trained deep network is also evaluated according to a more challenging and practical protocol, BLUFR, which can reflect the performance of face recognition in real applications.
[Table fragment spilled into this excerpt: accuracy on LFW. DeepFace, 1 net, 95.92 ± 0.29%, unsupervised; DeepFace, 1 net, 97.00 ± 0.28%, restricted; DeepFace, 3 ...] ...
Context 3
... the representation by PCA on YTF, the accuracy improves remarkably to 90.60%. When tuning the representation ...
[Table 5 fragment: the performance of our methods and DeepFace on YouTube Faces (YTF). DeepFace, 1 net, 91.4 ± 1.1%, supervised; Ours A, 1 net, 88.00 ± 1.50%, unsupervised; Ours D, 1 net, 90.60 ± 1.24%, unsupervised; Ours E, 1 net, 92.24 ± 1.28%, supervised.] ...
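To make the unsupervised evaluation step concrete, here is a minimal sketch (our illustration, not the paper's code) of adapting face representations with PCA on the target domain and scoring verification pairs by cosine similarity; the component count is an assumption.

```python
# Minimal sketch (not the paper's code): PCA-adapted embeddings scored
# by cosine similarity, as in unsupervised verification on LFW/YTF.
import numpy as np
from sklearn.decomposition import PCA

def pca_cosine_scores(train_emb, emb_a, emb_b, n_components=128):
    """Fit PCA on in-domain embeddings, project pairs, return cosine scores."""
    pca = PCA(n_components=n_components, whiten=True)  # 128 is an assumption
    pca.fit(train_emb)                       # adapt to the target domain
    a = pca.transform(emb_a)
    b = pca.transform(emb_b)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)             # cosine similarity per pair

# Accuracy at the best threshold would then be computed over the
# protocol's folds and reported as mean ± standard deviation.
```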

Similar publications

Chapter
Full-text available
In this chapter, we describe the concepts of traditional machine learning. In particular, we introduce the key features of supervised learning, heuristic learning, discriminative learning, single-task learning and random data partitioning. We also identify general issues of traditional machine learning, and discuss how traditional learning approach...

Citations

... This is due to the insufficient number of identities in suitable (ICAO-compliant [26]) and publicly available datasets. Compared to face recognition datasets [52], and even face presentation attack (spoofing) databases [56], existing MAD datasets are hundreds of times smaller. Besides, only a few datasets (detailed information on eight MAD datasets can be found in Section 4.1) are available for MAD development research. ...
... As discussed in Section 2, the insufficient number of identities in the existing MAD datasets is one possible reason for the poor generalization of MAD performance. To address this issue, we utilize the in-the-wild CASIA-WebFace dataset [52] as our assumed "mostly normal" samples, considering its diversity in capture environment, sensor, and identity. Specifically, the CASIA-WebFace dataset [52] consists of 494,414 images across 10,575 identities collected from the web and is used for face verification and identification tasks. Then, we use an additional MAD dataset, namely SMDD [9], together with CASIA-WebFace to explore reconstruction behavior. ...
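For orientation, a minimal sketch of loading a CASIA-WebFace-style dataset, assuming the common one-folder-per-identity directory layout; the path and input size are placeholders, not taken from the snippet above.

```python
# Minimal sketch, assuming the common one-folder-per-identity layout
# of CASIA-WebFace (494,414 images across 10,575 identity folders).
import torchvision.datasets as datasets
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((112, 112)),   # typical FR input size (assumption)
    transforms.ToTensor(),
])

# Each subdirectory name becomes a class label (identity).
webface = datasets.ImageFolder("CASIA-WebFace/", transform=transform)
print(len(webface), "images,", len(webface.classes), "identities")
```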
Preprint
Full-text available
Supervised-learning-based morphing attack detection (MAD) solutions achieve outstanding success in dealing with attacks from known morphing techniques and known data sources. However, given variations in morphing attacks, the performance of supervised MAD solutions drops significantly due to the insufficient diversity and quantity of the existing MAD datasets. To address this concern, we propose a completely unsupervised MAD solution via self-paced anomaly detection (SPL-MAD), leveraging existing large-scale face recognition (FR) datasets and the unsupervised nature of convolutional autoencoders. Training an autoencoder on general FR datasets, which might unintentionally contain unlabeled manipulated samples, can lead to diverse reconstruction behavior for attack and bona fide samples. We analyze this behavior empirically to provide a solid theoretical ground for designing our unsupervised MAD solution. Building on this, we integrate an adapted self-paced learning paradigm to enhance the separability of reconstruction errors between bona fide and attack samples in a completely unsupervised manner. Our experimental results on a diverse set of MAD evaluation datasets show that the proposed unsupervised SPL-MAD solution outperforms the overall performance of a wide range of supervised MAD solutions and generalizes better to unknown attacks.
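The mechanism the abstract builds on, scoring samples by autoencoder reconstruction error, can be sketched as follows; the architecture and sizes here are illustrative assumptions, not the SPL-MAD configuration.

```python
# Illustrative sketch only: score samples by convolutional-autoencoder
# reconstruction error; higher error = more anomalous (attack-like).
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 112 -> 56
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 56 -> 28
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_scores(model, images):
    """Per-image mean squared reconstruction error."""
    with torch.no_grad():
        recon = model(images)
        return ((recon - images) ** 2).mean(dim=(1, 2, 3))
```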
... The extended VGGFace2 [26] dataset consists of about three million images of 9,131 identities, with facial images varying in pose, background, age, and illumination, yet all images are of relatively high resolution. CASIA-WebFace [27] features about half a million images from roughly 10,000 identities. It is also often used for face verification and face identification. ...
Preprint
Full-text available
Automatic face recognition is a research area with high popularity. Many different face recognition algorithms have been proposed in the last thirty years of intensive research in the field. With the popularity of deep learning and its capability to solve a huge variety of different problems, face recognition researchers have concentrated their efforts on creating better models under this paradigm. Since 2015, state-of-the-art face recognition has been rooted in deep learning models. Despite the availability of large-scale and diverse datasets for evaluating the performance of face recognition algorithms, many of the modern datasets simply combine multiple factors that influence face recognition, such as face pose, occlusion, illumination, facial expression, and image quality. When algorithms produce errors on these datasets, it is not clear which of the factors caused the error, and hence there is no guidance on the direction in which more research is required. This work is a follow-up to our previous works, developed in 2014 and eventually published in 2016, showing the impact of various facial aspects on face recognition algorithms. By comparing the current state of the art with the best systems from the past, we demonstrate that faces under strong occlusions, some types of illumination, and strong expressions are problems mastered by deep learning algorithms, whereas recognition with low-resolution images, extreme pose variations, and open-set recognition is still an open problem. To show this, we run a sequence of experiments using six different datasets and five different face recognition algorithms in an open-source and reproducible manner. We provide the source code to run all of our experiments, which is easily extensible so that utilizing your own deep network in our evaluation is just a few minutes away.
... And the numbers of samples included are 18,171, 36,575, and 68,195, respectively. We use the model trained on a random subset of the CASIA-WebFace [38] dataset to test clustering. ...
Article
Full-text available
In recent research, supervised image clustering based on Graph Neural Network (GNN) connectivity prediction has demonstrated considerable improvements over traditional clustering algorithms. However, existing supervised image clustering algorithms are usually time-consuming, which limits their applications. To infer the connectivity between image instances, they usually create a subgraph for each image instance; because a large number of subgraphs must be created and processed as GNN input, the computational overhead is enormous. To address the high computational overhead of GNN connectivity prediction, we present a time-efficient and effective GNN-based supervised clustering framework based on density division, named DDC-GNN. DDC-GNN divides all image instances into high-density and low-density parts, and only performs GNN subgraph connectivity prediction on the low-density parts, resulting in a significant reduction in redundant calculations. We test two typical models in the GNN connectivity prediction module of the DDC-GNN framework: a graph convolutional network (GCN)-based model and a graph auto-encoder (GAE)-based model. Meanwhile, adaptive subgraphs are generated instead of fixed-size subgraphs to ensure sufficient contextual information extraction for the low-density parts. According to experiments on different datasets, DDC-GNN achieves higher accuracy and is almost five times faster than models without the density division strategy.
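A rough sketch of the density-division idea as we read it (not the authors' implementation): estimate a per-instance density from k-nearest-neighbor distances, then route only the low-density instances to the expensive GNN connectivity prediction. The value of k and the split quantile are assumptions.

```python
# Sketch of density division: high-density instances can be linked
# cheaply; low-density instances go to GNN subgraph prediction.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def split_by_density(embeddings, k=10, quantile=0.5):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    dists, _ = nn.kneighbors(embeddings)       # column 0 is the point itself
    density = 1.0 / (dists[:, 1:].mean(axis=1) + 1e-12)
    threshold = np.quantile(density, quantile)
    high = np.where(density >= threshold)[0]   # handled without the GNN
    low = np.where(density < threshold)[0]     # sent to GNN subgraphs
    return high, low
```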
... As a source wild dataset we use VGGFace2 [7] (∼3M images, ∼9k classes, ∼360 samples per class, licence CC BY-SA 4.0) due to its large average number of samples per identity (in comparison to other popular wild face datasets like CASIA-WebFace [65], MS-Celeb-1M [20,25], Glint360K [2], and WebFace260M [69]), in order to have enough samples per identity after filtering. ...
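The filtering criterion mentioned in the snippet can be illustrated with a toy helper; the function name and minimum count below are hypothetical.

```python
# Hypothetical helper: keep only identities that retain enough samples.
from collections import Counter

def filter_identities(labels, min_samples=30):
    """Return the set of identity labels with at least min_samples images."""
    counts = Counter(labels)
    return {ident for ident, n in counts.items() if n >= min_samples}
```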
Preprint
Full-text available
Face morphing attack detection (MAD) is one of the most challenging tasks in the field of face recognition today. In this work, we introduce a novel deep learning strategy for single-image face morphing detection, which discriminates morphed face images while performing a sophisticated face recognition task in a complex classification scheme. It is directed at learning deep facial features that carry information about their authenticity. Our work also makes several additional contributions: a public and easy-to-use face morphing detection benchmark and the results of our wild-dataset filtering strategy. Our method, which we call MorDeephy, achieves state-of-the-art performance and demonstrates a prominent ability to generalize morphing detection to unseen scenarios.
... Most of these datasets are targeted at adults or celebrities. In addition, most of the databases do not contain newly collected data; rather, they are composed by merging or modifying existing data such as LFW [6], CFP [7], and CASIA-WebFace [8]. ...
Article
Full-text available
Most face datasets target adults, who can make their own decisions. In the case of children, consent from parents or guardians is necessary to collect biometric information, which makes collection very difficult. As a result, the amount of data on children is quite small and inevitably private. In this work, we built a database by collecting face data from 74 children aged 2–7 years in daycare facilities. In addition, we conducted an experiment to determine the best location for performing face recognition on children by installing cameras in various locations. This study presents the points and methods to be considered when building a children's face dataset and also studies optimal camera installation setups for the face recognition of children.
... We have added "information-theoretic analyses" as one of the comparison items to emphasize the specific contribution of the proposed RD-Stego. To verify our claims, we use the following datasets: FaceScrub [14], CASIA-WebFace [15], and CelebA-HQ/CelebA [16] to train the proposed model, and use ImageNet [17] to evaluate and test cross-domain performance. Experimental results show that the proposed approach can generate photo-realistic stego-images without sacrificing embedded information capacity compared with all related methods. ...
... Table 2 summarizes the characteristics of our experimental environments, including the hardware specifications and software environment settings. We use the following datasets: FaceScrub [14], CASIA-WebFace [15], and CelebA-HQ/CelebA [16] to train RD-Stego, and use ImageNet [17] to investigate cross-domain performance. FaceScrub comprises 106,863 face images of 530 male and female celebrities, with about 200 images per person. ...
Article
Full-text available
Steganography is one of the most crucial methods for information hiding; it embeds secret data in an ordinary file or a cover message to avoid detection. We designed a novel rate-distortion-based large-capacity secure steganographic system, called rate-distortion-based Stego (RD-Stego), to meet the above requirements effectively. The considered effectiveness of our system design includes embedding capacity, adaptability to chosen-cover attacks, and the stability of the trained model. The proposed stego scheme can hide multiple three-channel color images and QR codes within another three-channel color image with low visual distortion. Empirically, with a certain degree of robustness against chosen-cover attacks, the system offers up to 192+ bits-per-pixel (bpp) of payload embedding and leaks no secret-related information. Moreover, to provide theoretical foundations for our cost-function design, we include a mutual-information-based explanation of the choices of regulation processes. Finally, we justify our system's claimed advantages through a series of experiments with publicly available benchmark datasets.
... To address the above limitations, we first propose to label object image quality with pairwise ranks, since labeling which of two images is sharper is much easier than labeling blur levels. Therefore, we construct a new object-BA dataset (named FIB) by hiring crowdsourcing annotators to judge the blur ranking of all image pairs from two popular face datasets [16,17]. In this way, we can obtain the total ranking of all images, as shown in Fig. 1(b). ...
... In this paper, we focus on face image blurriness assessment. To enrich the dataset, we randomly sample images of different blur levels from LFW [16] and CASIA-WebFace [17] to construct our supervised training dataset, FIB. The detailed labeling steps are as follows. ...
... To fully evaluate the effectiveness of the quality assessment algorithms, we construct three test datasets for evaluating them under different settings. The first is intra-dataset testing, in which test data are sampled exclusively from LFW [16] and CASIA-WebFace [17] as the labeled data. The second is half-inter-dataset testing, in which test data are sampled exclusively from WiderFace [18] as the unlabeled data. ...
Preprint
Assessing the blurriness of an object image is fundamentally important for improving the performance of object recognition and retrieval. The main challenge lies in the lack of abundant images with reliable labels and effective learning strategies. Current datasets are labeled with limited and confusing quality levels. To overcome this limitation, we propose to label the rank relationships between pairwise images rather than their quality levels, since ranks are much easier for humans to label, and we establish a large-scale realistic face image blur assessment dataset with reliable labels. Based on this dataset, we propose a method to obtain blur scores using only the pairwise rank labels as supervision. Moreover, to further improve performance, we propose a self-supervised method based on quadruplet ranking consistency to leverage unlabeled data more effectively. The supervised and self-supervised methods constitute a final semi-supervised learning framework, which can be trained end to end. Experimental results demonstrate the effectiveness of our method.
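Learning a scalar score from such pairwise rank labels is typically done with a margin ranking loss; below is a minimal sketch of one training step (our own, with an assumed margin and an assumed scalar-output model), not the paper's implementation.

```python
# Minimal sketch: push the score of the sharper image above the score
# of the blurrier image by a margin, using only pairwise rank labels.
import torch
import torch.nn as nn

rank_loss = nn.MarginRankingLoss(margin=0.1)  # margin value is an assumption

def pairwise_rank_step(model, img_sharp, img_blurry):
    """One loss computation; model is any CNN with a single-score head."""
    s1 = model(img_sharp).squeeze(1)   # predicted quality scores, shape (N,)
    s2 = model(img_blurry).squeeze(1)
    target = torch.ones_like(s1)       # +1 means s1 should rank higher
    return rank_loss(s1, s2, target)
```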
... where α is a weighting parameter used to balance the two training losses. ...
[Fig. 2 caption: Examples from the authentic (CASIA-WebFace [45]) and SFace datasets. The synthetic images also show a large variance in appearance and pose.]
... We used CASIA-WebFace [45] to train StyleGAN2-ADA and the base (pre-trained) FR model P; this data is denoted as the "authentic" data. The dataset contains 494,414 images of 10,575 different identities. ...
... We train five instances of ResNet-50. The first instance is trained with the CosFace loss [43] on the authentic CASIA-WebFace dataset [45] and is used for KT (model P). Following the parameter selection in [43], we set the CosFace loss margin and scale to 0.35 and 64, respectively. ...
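For reference, a compact sketch of the CosFace (large-margin cosine) loss with the margin 0.35 and scale 64 quoted above; the implementation details are our own reading of the standard formulation, not taken from the cited code.

```python
# Sketch of the CosFace loss: subtract a margin from the target-class
# cosine, scale, and apply softmax cross-entropy.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosFaceLoss(nn.Module):
    def __init__(self, embed_dim, n_classes, s=64.0, m=0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.s, self.m = s, m

    def forward(self, embeddings, labels):
        # Cosine similarity between L2-normalized features and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        margin = torch.zeros_like(cosine)
        margin.scatter_(1, labels.view(-1, 1), self.m)  # margin on true class
        return F.cross_entropy(self.s * (cosine - margin), labels)
```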
Preprint
Full-text available
Recent deep face recognition models proposed in the literature utilize large-scale public datasets such as MS-Celeb-1M and VGGFace2 to train very deep neural networks, achieving state-of-the-art performance on mainstream benchmarks. Recently, many of these datasets, e.g., MS-Celeb-1M and VGGFace2, have been retracted due to credible privacy and ethical concerns. This motivates us to propose and investigate the feasibility of using a privacy-friendly synthetically generated face dataset to train face recognition models. Towards this end, we utilize a class-conditional generative adversarial network to generate class-labeled synthetic face images, namely SFace. To address the privacy aspect of using such data to train a face recognition model, we provide extensive evaluation experiments on the identity relation between the synthetic dataset and the original authentic dataset used to train the generative model. Our reported evaluation proves that associating an identity of the authentic dataset with one with the same class label in the synthetic dataset is hardly possible. We also propose to train face recognition models on our privacy-friendly dataset, SFace, using three different learning strategies: multi-class classification, label-free knowledge transfer, and combined learning of multi-class classification and knowledge transfer. The reported evaluation results on five authentic face benchmarks demonstrate that the privacy-friendly synthetic dataset has high potential for training face recognition models, achieving, for example, a verification accuracy of 91.87% on LFW using multi-class classification and 99.13% using the combined learning strategy.
... This need for data when quantization is applied to FR networks mirrors deep FR models' reliance on large-scale training datasets [3], [8], [16] such as MS1M [25], VGGFace2 [5], and CASIA-WebFace [26]. Existing efficient FR solutions are no different, as they also require face image databases, whether for conventional training and/or knowledge distillation (KD) from teacher networks [8]-[11], [16], [17]. ...
... very challenging when a database is widely distributed, which puts the privacy rights of individuals in jeopardy. Following such concerns, datasets such as VGGFace2 [5] and CASIA-WebFace [26] are no longer publicly accessible in many countries. Companies like Facebook have announced that they will shut down their FR systems due to such privacy concerns [33]. ...
Preprint
Full-text available
Deep learning-based face recognition models follow the common trend in deep neural networks of utilizing full-precision floating-point networks with high computational costs. Deploying such networks in use cases constrained by computational requirements is often infeasible due to the large memory required by the full-precision model. Previous compact face recognition approaches proposed designing special compact architectures and training them from scratch using real training data, which may not be available in real-world scenarios due to privacy concerns. We present in this work the QuantFace solution, based on low-bit-precision model quantization. QuantFace reduces the computational cost of existing face recognition models without the need to design a particular architecture or access real training data. QuantFace introduces privacy-friendly synthetic face data into the quantization process to mitigate potential privacy concerns and issues related to the accessibility of real training data. Through extensive evaluation experiments on seven benchmarks and four network architectures, we demonstrate that QuantFace can reduce the model size by up to 5x while largely maintaining the verification performance of the full-precision model without accessing real training datasets.
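As a generic illustration of low-bit quantization (QuantFace uses its own scheme with synthetic calibration data, so this is only a sketch of the idea), PyTorch's post-training dynamic quantization stores the weights of linear layers in int8:

```python
# Generic sketch of post-training dynamic quantization; the backbone
# below is a stand-in, not the QuantFace model. For conv-heavy FR
# backbones, static quantization with calibration data would be used.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # placeholder FR backbone
model.eval()

# nn.Linear weights are stored in int8; activations are quantized
# on the fly, shrinking model size and memory.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```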
... To train the Light CNN network, we choose the CASIA-WebFace [34] and MS-Celeb-1M [8] visible-spectrum face databases as the pretraining data, randomly choosing one sample from each identity as the validation set and the remaining face images as our training set. Dropout is applied to some fully connected (FC) layers with a ratio of 0.7. ...
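The Max-Feature-Map (MFM) activation used by Light CNN has a standard formulation: split the channel dimension in half and keep the elementwise maximum, so it acts as both activation and feature selector. A minimal sketch follows; the layer sizes are illustrative, not the cited network's.

```python
# Standard MFM formulation: max over two halves of the channel dimension.
import torch
import torch.nn as nn

class MFM(nn.Module):
    def forward(self, x):
        a, b = torch.chunk(x, 2, dim=1)   # split channels into two halves
        return torch.max(a, b)            # halves the channel/feature count

# Conv block: 96 conv channels become 48 after MFM (illustrative sizes).
conv_block = nn.Sequential(nn.Conv2d(3, 96, 5, padding=2), MFM())

# FC block with the dropout ratio (0.7) quoted in the snippet above.
fc_block = nn.Sequential(nn.Dropout(p=0.7), nn.Linear(512, 512), MFM())
```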
Article
Full-text available
Hyperspectral imaging, which captures discriminative information across a series of spectral bands, enables building robust face recognition systems. Motivated by the success of deep convolutional networks and transfer learning, this paper proposes an end-to-end hyperspectral face recognition model based on a light Convolutional Neural Network (CNN) and transfer learning. To boost the performance of hyperspectral face recognition, the Max-Feature-Map (MFM) activation function and fine-tuning (structure regularization (L2SP) or attention feature distillation (AFD)) are introduced to optimize the deep network, which learns fine feature representations across different bands. In particular, this method incorporates the cooperation of regularization and AFD into the transfer learning strategy on visible face data. By feeding hyperspectral images to the pretrained light CNN network, we can design an end-to-end model that leverages its generalization ability for hyperspectral face images. Finally, the entire model is trained and verified on the PolyU-HSFD, CMU, and UWA hyperspectral face datasets using the associated standard evaluation protocols. Experimental results demonstrate that the improved light CNN network can obtain good representations of hyperspectral face features, and that joint training with a combination of L2SP and AFD exhibits better recognition performance than state-of-the-art methods based on other deep networks.
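The L2SP regularizer mentioned above penalizes deviation from the pretrained starting point rather than from zero during fine-tuning. A minimal sketch, with a placeholder coefficient and without the separate penalty L2SP applies to newly added layers:

```python
# Sketch of an L2SP-style penalty: pull fine-tuned weights toward their
# pretrained values instead of toward zero. alpha is a placeholder.
import torch

def l2sp_penalty(model, pretrained_state, alpha=0.01):
    """Sum of squared deviations from the pretrained weights."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in pretrained_state:
            ref = pretrained_state[name].to(param.device)
            penalty = penalty + ((param - ref) ** 2).sum()
    return 0.5 * alpha * penalty

# Usage during fine-tuning:
#   loss = task_loss + l2sp_penalty(model, pretrained_state)
```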