Pattern Recognition Letters
journal homepage: www.elsevier.com
Cross-Resolution Face Recognition Adversarial Attacks
Fabio Valerio Massoli a,∗∗, Fabrizio Falchi a, Giuseppe Amato a
a ISTI-CNR, via G. Moruzzi 1, 56124 Pisa, Italy
ABSTRACT
Face Recognition is among the best examples of computer vision problems where the supremacy of deep learning techniques compared to standard ones is undeniable. Unfortunately, it has been shown that they are vulnerable to adversarial examples - input images to which a human-imperceptible perturbation is added to lead a learning model to output a wrong prediction. Moreover, in applications such as biometric systems and forensics, cross-resolution scenarios are easily met, with a non-negligible impact on the recognition performance and on the adversary's success. Although the existence of such vulnerabilities sets a harsh limit to the spread of deep learning-based face recognition systems to real-world applications, a comprehensive analysis of their behavior when threatened in a cross-resolution setting is missing in the literature. In this context, we posit our study, where we harness several of the strongest adversarial attacks against deep learning-based face recognition systems considering the cross-resolution domain. To craft adversarial instances, we exploit attacks based on three different metrics, i.e., L1, L2, and L∞, and we study the resilience of the models across resolutions. We then evaluate the performance of the systems against the face identification protocol, in both the open- and closed-set settings. In our study, we find that deep representation attacks represent a far more dangerous menace to a face recognition system than the ones based on the classification output, independently of the metric used. Furthermore, we notice that the input image's resolution has a non-negligible impact on an adversary's success in deceiving a learning model. Finally, by comparing the performance of the threatened networks under analysis, we show how they can benefit from a cross-resolution training approach in terms of resilience to adversarial attacks.
© 2020 Elsevier Ltd. All rights reserved.
1. Introduction

Face Recognition (FR) (Wang and Deng, 2018; Deng et al., 2019) represents one of the most astonishing applications of Neural Networks (NNs), especially Deep Convolutional Neural Networks (DCNNs), which ultimately overcame standard computer vision techniques such as Gabor-Fisher (Liu and Wechsler, 2002) and local binary patterns (Ahonen et al., 2006). The study of this problem began in the early 90s, when Turk and Pentland (1991) proposed the Eigenfaces approach, and it took only two decades for Deep Learning (DL) approaches to start dominating the field, reaching recognition performance up to 99.80% (Wang and Deng, 2018) and thus surpassing human ability. DL-based FR systems do not exploit the output of a classifier directly. Instead, they leverage the representation power (LeCun et al., 2015) of the learning models to extract face descriptors, i.e., multidimensional vectors, also called deep features or deep representations, to fulfill the recognition task.

∗∗Corresponding author
e-mail: fabio.massoli@isti.cnr.it (Fabio Valerio Massoli)
Although FR systems obtain very high performance when trained with datasets comprising images acquired under controlled conditions, e.g., high resolution, they suffer a drastic drop in reliability when tested against cross-resolution (CR) scenarios (Massoli et al., 2019) that naturally arise, for example, in surveillance applications (Zou and Yuen, 2011; Amato et al., 2019; Cheng et al., 2018). To counteract such a weakness, Ekenel and Sankur (2005) and Luo et al. (2019) proposed approaches that were not based on NNs; only recently has this problem been tackled in the DL field (Massoli et al., 2020; Zhang et al., 2018).
To make the situation even worse, Szegedy et al. (2013) and Biggio et al. (2013) recently showed that DL models are vulnerable to the so-called adversarial examples - images to which a specific amount of noise, undetectable to humans, is added to induce a NN to output a wrong prediction. Unfortunately, the ability of an insightful adversary to jeopardize these learning models, considering both the digital (Dong et al., 2019; Song et al., 2018; Qiu et al., 2019; Kakizaki and Yoshida, 2019; Goswami et al., 2018) and physical (Sharif et al., 2016; Kurakin et al., 2016) domains, represents a significant concern in security-related applications such as DL-based biometric systems (Sundararajan and Woodard, 2018) and forensics (Spaun, 2011), thus limiting their adoption in these fields.
In this context, we posit our contribution, which we summarize as follows: i) we threaten two DCNNs by exploiting adversarial attacks based on three different metrics, i.e., L1, L2, and L∞; ii) we generate attacks not only towards a classification objective but also against a similarity one. Indeed, FR systems typically do not exploit a DCNN classification output. Instead, they leverage the ability of NNs to generate discriminative deep representations among which a similarity criterion is evaluated to fulfill the recognition task; iii) we conduct the attacks in a cross-resolution domain, thus emulating a real-world scenario for an FR system; iv) we analyze the success rates of the various attacks across resolutions, studying whether a DL model can benefit from a cross-resolution training procedure in terms of robustness to adversarial attacks; v) we analyze the robustness of the models through the face identification protocol (Grother et al., 2019), considering both the open- and closed-set settings.
The rest of the paper is structured as follows. In Section 2, we briefly present some related works, while in Section 3, we describe the attack algorithms we use. Subsequently, in Section 4, we explain our experimental procedure and the dataset we use, while in Section 5, we present the results from the experimental campaign. Finally, in Section 6, we report our conclusions.
2. Related Works

To the best of our knowledge, this is the first work that tackles the problem of adversarial attacks against FR systems in a CR scenario. For this reason, in what follows, we briefly cite a few articles related to the topics of cross-resolution FR and adversarial attacks against an FR system.
2.1. Cross-Resolution Face Recognition

CR scenarios are met whenever images at different resolutions have to be matched. Such a situation typically happens, for example, in biometric and forensics applications. Super-Resolution (SR) techniques are among the most studied solutions to this problem: Singh et al. (2018) proposed to synthesize high-resolution faces from low-resolution ones by employing a multi-level sparse representation of the given inputs. Zangeneh et al. (2020) formulated a mapping of the low- and the high-resolution images to a common space by leveraging a DL architecture made of two distinct branches, one for each image. Luo et al. (2019) exploited the dictionary learning approach based on learning multiple dictionaries, each associated with a resolution. The most comprehensive study and widely tested method to improve an FR system's performance in a CR scenario was recently proposed by Massoli et al. (2020). In their work, the authors formulated a training procedure to fine-tune a state-of-the-art model to the CR domain. They tested their models on several benchmark datasets, showing superior performance compared to the results available in the literature.
2.2. Face Recognition Adversarial Attacks

As we mentioned at the beginning of this section, we are the first to study adversarial attacks in a cross-resolution domain. Due to the lack of papers that can be directly compared to our study, in what follows we only briefly cite a few articles concerning adversarial attacks against FR systems. Sharif et al. (2016) demonstrated the feasibility and effectiveness of physical attacks by impersonating other identities using eyeglass frames with a malicious texture. Zhong and Deng (2020) observed the superior transferability properties of feature-based attacks compared to label-based ones. Moreover, they proposed a drop-out method for DCNNs to further enhance the transferability of the attacks. Song et al. (2018) proposed a three-player GAN architecture that leveraged a face recognition network as the third player in the competition between generator and discriminator. Dong et al. (2019) successfully performed black-box attacks on FR models and demonstrated their effectiveness on a real-world deployed system.
3. Adversarial Attacks

3.1. Carlini and Wagner

Carlini and Wagner (Carlini and Wagner, 2017) (CW) formulated one of the strongest currently available attacks. The CW-L2 attack is formalized as:

$$\min_{w}\; c \cdot f\!\left(\tfrac{1}{2}(\tanh(w)+1)\right) + \left\lVert \tfrac{1}{2}(\tanh(w)+1) - x \right\rVert_2^2,$$

where f(·) is the objective function, x is the input image, w is the adversarial example in the tanh space, and c is a positive constant whose value is set by exploiting a binary search procedure.
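As an illustration of how this objective can be evaluated in practice, the following minimal PyTorch sketch computes the CW-L2 loss for a batch of inputs. It is not the authors' implementation (the attacks are run through the foolbox library, see Section 4.2); the margin parameter kappa and the assumption that pixel values live in [0, 1] are choices made here only for clarity.

```python
import torch

def cw_l2_loss(model, w, x, target, c, kappa=0.0):
    # Map the unconstrained variable w back to a valid image in [0, 1]
    # via the tanh change of variables used by the attack.
    x_adv = 0.5 * (torch.tanh(w) + 1.0)
    logits = model(x_adv)

    # f(.) as defined by Carlini and Wagner: the target logit must exceed
    # the largest non-target logit by at least kappa.
    one_hot = torch.nn.functional.one_hot(target, logits.shape[-1]).bool()
    target_logit = logits[one_hot]
    best_other = logits.masked_fill(one_hot, float("-inf")).max(dim=-1).values
    f = torch.clamp(best_other - target_logit + kappa, min=0.0)

    # Squared L2 distance between the adversarial and the original image.
    l2 = (x_adv - x).flatten(1).pow(2).sum(dim=1)
    return (c * f + l2).sum()
```

Minimizing this quantity over w with a standard optimizer, and repeating the search over c by bisection, yields the attack.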
3.2. Elastic Net Attack to DNNs

The Elastic Net Attack (Chen et al., 2018) (EAD) leverages elastic-net regularization, a well-known technique for solving high-dimensional feature selection problems (Zou and Hastie, 2005). It is based on the objective proposed in Carlini and Wagner (2017), and it includes the CW-L2 attack as a special case. EAD is formulated as:

$$\min_{x}\; c \cdot f(x, t) + \beta \lVert x - x_0 \rVert_1 + \lVert x - x_0 \rVert_2^2,$$

where f(·) is the objective as in the CW-L2 attack, t is the target label, x0 is the input image, x is the adversarial instance, c is a parameter found by binary search, and β represents the weight of the L1 penalty term.
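A minimal sketch of the elastic-net objective follows; it mirrors the CW-L2 snippet above, adding the β-weighted L1 term. In practice EAD handles the non-smooth L1 penalty with an iterative shrinkage-thresholding (ISTA/FISTA) update rather than plain gradient descent, a detail omitted here; the margin kappa is again an assumption of the sketch.

```python
import torch

def ead_loss(model, x_adv, x0, target, c, beta, kappa=0.0):
    logits = model(x_adv)

    # Same margin-style objective f(x, t) as in the CW-L2 attack.
    one_hot = torch.nn.functional.one_hot(target, logits.shape[-1]).bool()
    target_logit = logits[one_hot]
    best_other = logits.masked_fill(one_hot, float("-inf")).max(dim=-1).values
    f = torch.clamp(best_other - target_logit + kappa, min=0.0)

    # Elastic-net regularization: weighted L1 plus squared L2 distortion.
    diff = (x_adv - x0).flatten(1)
    return (c * f + beta * diff.abs().sum(dim=1) + diff.pow(2).sum(dim=1)).sum()
```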
3.3. Jacobian Saliency Map Attack

The Jacobian Saliency Map Attack (Papernot et al., 2016) (JSMA) exploits an "input-perturbation-to-output" mapping. Differently from the backpropagation-based attacks, JSMA leverages the derivative of the classification output with respect to the input rather than the derivative of the loss function. The attack is formalized as:

$$\arg\min_{\delta_x}\; \lVert \delta_x \rVert \quad \text{s.t.} \quad F(X + \delta_x) = Y,$$

where F is the function learned by the DNN, X and Y are the input and output of the model, respectively, and δx is the adversarial perturbation defined upon the evaluation of the model input saliency map.
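The sketch below computes a simplified, single-image version of the saliency map that drives JSMA: the forward derivative of the class scores with respect to each input pixel. The full attack perturbs pairs of pixels and iterates until the target class is reached or the distortion budget is exhausted; those steps are omitted here, so this should be read as an illustration rather than the complete algorithm.

```python
import torch

def jsma_saliency(model, x, target):
    # Forward derivative: gradients of the class scores w.r.t. the input pixels
    # (not the gradient of a loss, as in backpropagation-based attacks).
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)

    grad_target = torch.autograd.grad(logits[0, target], x, retain_graph=True)[0]
    grad_others = torch.autograd.grad(logits[0].sum() - logits[0, target], x)[0]

    # A pixel is salient when increasing it raises the target score while
    # lowering the combined score of all the other classes.
    positive = (grad_target > 0) & (grad_others < 0)
    return torch.where(positive, grad_target * grad_others.abs(),
                       torch.zeros_like(grad_target))
```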
3.4. Deep Representations Attacks

Differently from the previously mentioned attacks, the Deep Representations (Sabour et al., 2015) (DR) attack focuses on the manipulation of image features. It is formulated as an optimization problem which aims at finding the perturbed image, closest to the original one, whose descriptor is as close as possible to that of a target image named the "guide image". Specifically, the adversarial crafting procedure is the following:

$$I_\alpha = \arg\min_{I}\; \lVert \phi_k(I) - \phi_k(I_g) \rVert_2^2 \quad \text{subject to} \quad \lVert I - I_s \rVert_\infty < \delta,$$

where φk(·) is the descriptor extracted at layer k of the threatened model, Is and Ig are the source and guide images, respectively, Iα is the adversarial example, and δ is the maximum allowed perturbation in terms of the L∞ norm.
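For clarity, the following sketch performs one step of this optimization with plain projected gradient descent: it reduces the feature-space distance to the guide image and then projects the result back onto the L∞ ball of radius δ around the source image. Note that our actual implementation is built on an L-BFGS-based procedure guided by a k-NN classifier, as detailed in Section 4.2; the step size is an assumption of this sketch.

```python
import torch

def dr_step(feature_extractor, x_adv, x_src, x_guide, step_size, delta):
    # Distance between the descriptor of the perturbed image and the guide descriptor.
    x_adv = x_adv.clone().detach().requires_grad_(True)
    guide_feat = feature_extractor(x_guide).detach()
    loss = (feature_extractor(x_adv) - guide_feat).pow(2).sum()
    loss.backward()

    with torch.no_grad():
        # Gradient step on the feature distance, then projection onto the
        # L-infinity ball of radius delta centered on the source image I_s.
        x_adv = x_adv - step_size * x_adv.grad
        x_adv = torch.max(torch.min(x_adv, x_src + delta), x_src - delta)
    return x_adv.detach()
```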
4. Experimental Approach

4.1. Dataset and Models

In our experiments, we use the 2.9M images shared among the 8631 identities contained in the training set of the VGGFace2 (Cao et al., 2018) dataset. To construct the gallery and the queries, we divide the training set into two splits. Concerning the gallery, we compute a single template for each identity as the average of the feature vectors of all the corresponding face images. Regarding the queries, we randomly select 100 identities, and for each of them, we randomly pick ten correctly classified images, ending up with 1000 queries.
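A minimal sketch of the gallery template construction is shown below. The descriptors are L2-normalized before and after averaging, which is a common practice with deep face descriptors but an assumption here, since the text only specifies the averaging step.

```python
import numpy as np

def build_gallery_templates(features_by_identity):
    # One template per identity: the average of all its face descriptors.
    templates = {}
    for identity, feats in features_by_identity.items():
        feats = np.stack(feats)                                 # (num_images, dim)
        feats /= np.linalg.norm(feats, axis=1, keepdims=True)   # assumed normalization
        template = feats.mean(axis=0)
        templates[identity] = template / np.linalg.norm(template)
    return templates
```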
Concerning the learning models, we analyze the performance of two DCNNs: the face classifier from Cao et al. (2018) and the CR-trained one from Massoli et al. (2020). They share the same structure, i.e., a ResNet-50 (He et al., 2016) architecture equipped with Squeeze-and-Excitation (Hu et al., 2017) blocks. For both models, we adopt the same preprocessing steps for the images. First, following the same procedure as in Massoli et al. (2020), we synthesize different resolution versions of the input, which allow us to evaluate the performance of the models in a cross-resolution scenario. Specifically, in our analysis, we consider images at 16, 24, 64, and 256 pixels (shortest side). Next, each image is resized to have a shortest side of 256 pixels, and then it is cropped to a square picture of size 224x224 pixels. Finally, we subtract the channel mean from each pixel.
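The preprocessing pipeline can be sketched with torchvision as below. The exact per-channel mean used by the two models is not reported here, so a placeholder value is used; the interpolation details of the down- and up-sampling steps are likewise assumptions of the sketch.

```python
import torch
import torchvision.transforms.functional as TF

CHANNEL_MEAN = torch.tensor([0.5, 0.5, 0.5])  # placeholder: the models' actual mean differs

def preprocess(img, low_res):
    # 1) Synthesize the low-resolution version: shortest side -> {16, 24, 64, 256} pixels.
    img = TF.resize(img, low_res)
    # 2) Bring it back to the network input scale: shortest side -> 256 pixels.
    img = TF.resize(img, 256)
    # 3) Crop the central 224x224 square and subtract the per-channel mean.
    img = TF.center_crop(img, 224)
    x = TF.to_tensor(img)                     # (3, 224, 224), values in [0, 1]
    return x - CHANNEL_MEAN.view(3, 1, 1)
```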
4.2. Adversarial Attacks

Concerning the generation of the adversarial instances, we exploit the five algorithms we described in Section 3. We use the implementations available in the foolbox library (https://foolbox.readthedocs.io/en/stable/), with the only exception of the DR one, which we build on top of the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) (Szegedy et al., 2013) optimization procedure. More precisely, the L-BFGS algorithm requires a function to optimize. To this aim, we implement such a function by employing a k-NN algorithm as guidance in the adversarial search. We fit the classifier to the gallery templates we mentioned at the beginning of this section. Then, we start the crafting procedure and stop it as soon as the k-NN classifies the malicious image as belonging to the targeted identity. In Figure 1, we report a schematic view of the procedure we just described; a minimal sketch of the stopping criterion follows the figure caption.
Figure 1: Schematic representation of our approach to crafting DR attacks. The colored regions are the k-NN decision boundaries for ten different identity templates (white triangles). The initial location of the green star represents a correctly classified feature vector. The adversarial feature vector's final position is represented by the red encircled star.
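As an illustration, the k-NN-guided stopping criterion can be sketched as follows; the use of scikit-learn and of a single nearest neighbor (k = 1) are choices made here only for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def make_stop_criterion(gallery_templates, gallery_labels, target_label, k=1):
    # Fit the k-NN classifier on the gallery templates (one template per identity).
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(gallery_templates, gallery_labels)

    def should_stop(adv_descriptor):
        # Stop the crafting loop as soon as the adversarial descriptor falls
        # inside the decision region of the targeted identity.
        prediction = knn.predict(adv_descriptor.reshape(1, -1))[0]
        return prediction == target_label

    return should_stop
```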
4.3. Face Identification Metrics

FR systems typically deal with sensitive scenarios such as biometric and forensics applications. Hence, different error types have distinct relevance when evaluating system performance, and a simple accuracy measure is not enough to properly evaluate and compare the performance of FR systems. Instead, as mentioned in Section 1, we focus our study on the face identification protocol. Specifically, we consider both the closed- and open-set settings.

Concerning the closed-set setting, we evaluate the Cumulative Match Characteristic (CMC), a metric that represents a summarized accuracy evaluated on mated searches only, i.e., considering queries that correspond to identities already available in the gallery. The CMC value at rank one is usually named the "hit rate," and it is the most typical summary indicator of an algorithm's efficacy. As we mentioned above, we select 100 identities to construct the queries. Thus, we end up with a gallery containing 8631 identities that comprise a hundred mated ones and 8531 un-mated ones acting as "distractors".
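As an illustration, the rank-R CMC value over mated searches can be computed as in the sketch below; cosine similarity between L2-normalized descriptors is assumed as the matching score, which is not specified in the text.

```python
import numpy as np

def cmc_at_rank(query_feats, query_labels, gallery_feats, gallery_labels, rank=1):
    # Cosine similarity between every query and every gallery template.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T

    # A mated search is a hit if its identity appears among the top-`rank` candidates.
    top = np.argsort(-sims, axis=1)[:, :rank]
    hits = [query_labels[i] in gallery_labels[top[i]] for i in range(len(query_labels))]
    return float(np.mean(hits))   # rank one gives the "hit rate"
```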
In the open-set setting, differently from the closed-set one, we consider both mated and un-mated queries. To this aim, we remove half of the query identities from the gallery, ending up with 50 mated and 50 un-mated persons and a gallery containing 8581 templates. With this setup, there are two different types of errors that are usually evaluated, i.e., the False Positive Identification Rate (FPIR) and the False Negative Identification Rate (FNIR), or "miss rate". The former represents the fraction of un-mated queries that return a positive match at or above a specific similarity threshold. On the other hand, the FNIR represents the fraction of mated searches that return candidates with a similarity score below the threshold or outside the top R ranks.

The FNIR and FPIR, parametrized by the similarity threshold, can be combined to construct the Detection Error Trade-off (DET) curve, which is typically used to report the trade-off between the two types of error. We use the DET to evaluate the performance of the learning models in the experiments.
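A simplified sketch of the two error rates at a given similarity threshold is shown below; it omits the top-R rank condition on the FNIR for brevity. Here unmated_sims holds the similarity of each un-mated query to every gallery template, while mated_true_sims holds the similarity of each mated query to its correct template. Sweeping the threshold and plotting the resulting (FPIR, FNIR) pairs yields the DET curve.

```python
import numpy as np

def fpir_fnir(unmated_sims, mated_true_sims, threshold):
    # FPIR: fraction of un-mated queries whose best gallery match scores
    # at or above the threshold (a false positive identification).
    fpir = float(np.mean(unmated_sims.max(axis=1) >= threshold))

    # FNIR ("miss rate"): fraction of mated queries whose similarity to the
    # correct gallery template falls below the threshold.
    fnir = float(np.mean(mated_true_sims < threshold))
    return fpir, fnir
```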
5. Experimental Results

We dedicate this section to reporting the results of our experimental campaigns. As we mentioned in Section 1, we aim to study the behavior of DL-based FR systems when threatened by adversarial attacks in a CR domain. Concerning the FR, as backbone feature extractors, we consider the well-known DCNN from Cao et al. (2018) that set the state of the art on the NIST datasets (Klare et al., 2015; Whitelam et al., 2017; Maze et al., 2018) and the CR model from Massoli et al. (2020) that set the state of the art in the cross-resolution domain.

To craft adversarial examples, we harness the algorithms we described in Section 3. Moreover, being interested in the CR scenario, we consider input faces at 16, 24, 64, and 256 pixels (shortest side). Concerning the FR task, we keep the gallery at the original resolution.

As mentioned in Section 2, to our knowledge, we are the first to conduct this type of study. Thus, a direct comparison with previously published works is not possible. Hence, in what follows, we only report our results. We hope that our study will stimulate further research in this direction. Throughout this section, we refer to the model from Cao et al. (2018) as the "Base" model and to the one from Massoli et al. (2020) as the "Cross-Resolution" model.
5.1. Threatening the Classification

We report the results of the attacks against the classification in Table 1. Concerning the attacks, we use the following configurations. For JSMA, we consider 1000 iterations, a perturbation per pixel equal to 0.1, 0.3, and 0.5 (percentage of the allowed pixel range), and a maximum number of times each pixel can be modified of 10. For CW-L2, we consider 10 binary search steps and 10 and 100 iterations. Concerning EAD, we use the same parameters as for the CW-L2 attack and a value for the weight of the L1 penalty term equal to 0.1 and 1. Furthermore, since the DR (Sabour et al., 2015) attack is the least time demanding compared to the others, we enlarge the set of hyperparameters for it. Thus, we dedicate Figure 2 to reporting its results.
From Table 1, we notice that there is no clear signature of which model is more robust against adversarial attacks. On the other hand, we see that, on average, an adversary's success rate decreases as the resolution increases while keeping the attack configuration fixed. Let us now turn our attention to a single attack, for example, CW-L2. It is interesting to notice the impact of a different choice of hyperparameters. Indeed, even though from the (10-10) configuration the "Base" model seems to be more resilient than the "Cross-Resolution" one, this is not the case: by just increasing the strength of the attack, i.e., using the (10-100) configuration, for which we grow the number of steps, we reach a 100% attack success rate for both models.

From Figure 2 we observe that the deep features extracted by the "Cross-Resolution" model are undeniably much more robust than those extracted by the "Base" NN, confirming our previous assertion about the benefit of CR training. From the first plot of Figure 2, we see that the success rate of the attack is almost 0% for the "Base" model. Instead, in the second plot, it looks like both models have the same resilience. This is not in contrast with our previous conclusions. Indeed, as shown in appendix 1 of Massoli et al. (2020), the "Base" model is not able to generate meaningful deep representations at very low resolutions. Thus, it is almost impossible to craft targeted attacks based on deep features. To further support our assertion, we ran a test with untargeted DR attacks in which we easily reach a success rate of 100% for the "Base" model.
Table 1: Attack success rate against classification for “Base” and “Cross-Resolution” models. The first column reports the specific configuration used for each
attack. The four values reported in the second and third main columns represent the success rate at a resolution of 16, 24, 64, and 256 pixels, respectively.
Attack Success Rate (%)
Attack Configuration Base Model Cross-Resolution Model
16 24 64 256 16 24 64 256
JSMA (1000-0.1-1.0) 76.1 61.8 25.5 11.5 65.5 62.8 17.1 6.9
JSMA (1000-0.3-1.0) 96.6 92.5 75.7 61.2 96.0 94.7 70.0 50.1
JSMA (1000-0.5-1.0) 98.5 95.8 86.4 76.6 97.6 97.0 100. 69.6
CW-L2(10-10) 82.9 72.9 45.9 32.7 86.4 83.3 52.8 37.4
CW-L2(10-100) 100. 100. 100. 100. 100. 100. 100. 100.
EAD (10-0.1-10) 95.7 98.2 94.5 87.0 96.7 99.6 98.8 98.5
EAD (10-0.1-100) 100. 100. 100. 100. 100. 100. 100. 100.
EAD (10-1.0-10) 83.4 85.1 50.2 27.9 72.6 94.4 86.9 73.8
EAD (10-1.0-100) 98.5 99.8 98.7 91.0 97.5 99.8 100. 99.6
Figure 2: DR (Sabour et al., 2015) attack success rate as a function of the maximum allowed perturbation δ, considering 100 and 1000 iteration steps. Each plot represents a different input resolution.
Finally, we notice from our results that there is no clear evidence in favor of a specific metric, since with the proper hyperparameters we reached high success rates with L1, L2, and L∞.
5.2. Threatening the Face Recognition

We now turn our attention to DL-based FR systems. We begin our analysis by considering the face identification protocol in the closed-set scenario, and we then move to the open-set one. We refer the reader to Section 4 for a detailed description of the metrics we use to assess the performance of the systems under analysis.
5.2.1. Closed-set

As mentioned in Section 4, we use the CMC to evaluate the performance of the threatened models in the closed-set scenario. Specifically, we summarize our results in Table 2 by reporting the hit rate, i.e., the CMC value at rank one, with the exception of the DR (Sabour et al., 2015) attack, to which we dedicate Figure 3. From a defensive point of view, the more resilient a model, the lower the hit rate, while from an attacker's perspective, it is the other way round.

By looking at Table 2 and Figure 3, we can assert that the DR attack is much more effective in fooling a DL-based FR system than the classification-based ones, with respect to any type of metric. From the attacker's point of view, this is a fundamental result. Indeed, by comparing the results from Table 1 and Table 2, we see that even though the attacks fool the classification, it is not guaranteed that they can evade a similarity-based system. Thus, deep representation attacks might be a better choice to attack an FR system. Moreover, we see that the "Cross-Resolution"-based system exhibits higher robustness than the one based on the "Base" model. Thus, again, we find that DCNNs benefit from a CR training approach (Massoli et al., 2020) in terms of resilience to adversarial attacks. Indeed, it is undeniable that the "Cross-Resolution"-based system is much more resilient against adversarial attacks than the "Base"-based one across all resolutions.
5.2.2. Open-set

To report the results for the face identification protocol in the open-set setting, we exploit the DET. Two fundamental aspects differentiate the DET from the CMC. Indeed, the former applies a threshold on the similarity of the features, and it comprises queries of identities that are not present in the gallery. Instead, the latter does not use any threshold, i.e., it does not discern between "weak" and "strong" similarity scores, and it requires queries related to already known identities.
Table 2: Attack hit rates. The first column reports the configuration for each attack. The four values reported in the second and third main columns are the results at a resolution of 16, 24, 64, and 256 pixels, respectively. As a reference, we report in the first row the hit rate for the authentic images.
Hit Rate (%)
Attack Configuration Base Model Cross-Resolution Model
16 24 64 256 16 24 64 256
Auth 79.5 95.3 99.8 99.9 96.7 98.8 99.4 99.7
JSMA (1000-0.1-1.0) 12.1 10.7 12.9 12.2 11.9 9.8 9.4 13.0
JSMA (1000-0.3-1.0) 14.0 9.3 10.7 10.6 9.8 10.0 7.4 8.9
JSMA (1000-0.5-1.0) 13.6 10.6 10.0 10.3 10.0 10.2 3.0 6.8
CW-L2(10-10) 10.9 6.5 6.1 3.7 10.8 9.3 5.5 5.1
CW-L2(10-100) 7.6 4.1 6.1 2.3 9.2 9.3 3.6 4.6
EAD (10-0.1-10) 31.8 32.6 27.8 25.1 19.2 16.8 19.4 19.7
EAD (10-0.1-100) 17.5 9.7 6.3 6.2 13.8 11.6 6.8 5.3
EAD (10-1.0-10) 44.8 38.0 26.7 25.5 20.8 25.7 20.1 21.7
EAD (10-1.0-100) 34.8 30.3 20.7 16.8 17.3 16.5 17.4 17.2
Figure 3: DR (Sabour et al., 2015) hit rate as a function of the maximum allowed perturbation δ, considering 100 and 1000 attack steps. Each plot represents a different input resolution.
As we mentioned in Section 4, the DET represents the error trade-off between the FNIR and the FPIR. To summarize the performance of the FR systems, we report the FPIR at a reference FNIR value of 1.e-2. Compared to the closed-set setting, the adversary's goal is to lower the curve as much as possible, while from a defensive point of view, a higher curve represents a more resilient model. The results are reported in Table 3, with the exception of DR (Sabour et al., 2015), to which we dedicate Figure 4.

Analyzing the results reported in Table 3 and Figure 4, we reach the same conclusions reported for the closed-set setting. Specifically, by comparing the results from Table 3 to the ones in Figure 4, we see that the DR attack is much more effective in fooling the FR system than the others, and that the "Cross-Resolution"-based system is much more resilient than the "Base"-based one against adversarial attacks.
6. Conclusions

DCNN-based FR systems leverage the representation power of learning models. Unfortunately, they also share their weaknesses. Indeed, it has been recently shown that these systems suffer a drastic drop in performance when tested in a cross-resolution domain. The situation becomes even worse when an adversary comes into play. Indeed, an FR system can be deceived by adversarial examples. These weaknesses pose a severe limit to the spread of these systems to sensitive real-world applications such as biometric systems and forensics.

In such a context, we proposed our analysis, in which we compared the resilience to adversarial attacks of FR systems based on the deep features extracted by NNs in a CR scenario. We studied two different DCNN models: the former trained only on high-resolution images and the latter trained on a cross-resolution domain. To generate adversarial instances, we harnessed several algorithms based on different metrics and objectives, and we crafted malicious samples considering input images at a resolution of 16, 24, 64, and 256 pixels. Concerning the measurement of the performance of the FR systems, we adopted the face identification protocol. Specifically, we considered the closed- and open-set settings, for which we evaluated the CMC and DET.

From our analysis, we notice that, given a specific configuration, the attack success rate is higher at lower resolutions, for example, at 16 and 24 pixels, than at higher ones, such as 64 and 256 pixels. Such behavior was somehow expected since, at a very low resolution, part of the face information can be lost, thus simplifying the effort of an adversary.

By looking at the results from the FR systems, it is evident that a DCNN benefits from a CR training procedure since it empowers the learning model to extract more robust deep representations. Moreover, we observed that DR attacks represent a much greater menace to an FR system than the ones based on the classification output of the threatened models, for each of the considered metrics, i.e., L1, L2, and L∞. Such a result held for the closed- as well as for the open-set settings.
Table 3: FPIR@FNIR=1.e-2. The first column reports the configuration for each attack. The four values reported in the second and third main columns are the results at a resolution of 16, 24, 64, and 256 pixels, respectively. As a reference, we report in the first row the results for the authentic images.
FPIR@FNIR=1.e-2
Attack Configuration Base Model Cross-Resolution Model
16 24 64 256 16 24 64 256
Auth 75.0 40.8 0.8 1.0 38.6 20.2 3.6 3.2
JSMA (1000-0.1-1.0) 99.3 99.1 100. 95.1 99.1 98.4 100. 98.1
JSMA (1000-0.3-1.0) 99.0 99.1 97.2 99.7 97.8 98.6 99.0 100.
JSMA (1000-0.5-1.0) 98.0 98.1 98.2 97.0 99.4 98.6 99.0 98.7
CW-L2(10-10) 99.5 98.1 99.5 97.4 99.0 98.1 98.9 98.9
CW-L2(10-100) 100. 99.0 99.5 99.4 99.6 98.1 99.6 99.2
EAD (10-0.1-10) 95.3 93.2 98.7 99.5 98.4 98.8 96.0 97.6
EAD (10-0.1-100) 98.0 99.4 99.4 99.0 100. 98.8 98.6 99.2
EAD (10-1.0-10) 95.6 96.3 98.3 95.3 96.3 98.1 96.7 96.7
EAD (10-1.0-100) 98.8 97.9 97.1 98.6 98.6 98.1 99.0 97.7
Figure 4: FPIR@FNIR=1.e-2 for the DR (Sabour et al., 2015) attack as a function of the maximum allowed perturbation δ, considering 100 and 1000 attack steps. Each plot represents a different input resolution.
References

Ahonen, T., Hadid, A., Pietikainen, M., 2006. Face description with local binary patterns: Application to face recognition. IEEE TPAMI, 2037–2041.
Amato, G., Falchi, F., Gennaro, C., Massoli, F.V., Passalis, N., Tefas, A., Trivilini, A., Vairo, C., 2019. Face verification and recognition for digital forensics and information security, in: ISDFS, IEEE. pp. 1–6.
Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F., 2013. Evasion attacks against machine learning at test time, in: ECML PKDD, Springer. pp. 387–402.
Cao, Q., Shen, L., Xie, W., Parkhi, O.M., Zisserman, A., 2018. Vggface2: A dataset for recognising faces across pose and age, in: International Conference on Automatic Face & Gesture Recognition, IEEE. pp. 67–74.
Carlini, N., Wagner, D., 2017. Towards evaluating the robustness of neural networks, in: Symposium on Security and Privacy, IEEE. pp. 39–57.
Chen, P.Y., Sharma, Y., Zhang, H., Yi, J., Hsieh, C.J., 2018. Ead: elastic-net attacks to deep neural networks via adversarial examples, in: Thirty-Second AAAI Conference on Artificial Intelligence.
Cheng, Z., Zhu, X., Gong, S., 2018. Surveillance face recognition challenge. arXiv preprint arXiv:1804.09691.
Deng, J., Guo, J., Xue, N., Zafeiriou, S., 2019. Arcface: Additive angular margin loss for deep face recognition, in: CVPR, IEEE. pp. 4690–4699.
Dong, Y., Su, H., Wu, B., Li, Z., Liu, W., Zhang, T., Zhu, J., 2019. Efficient decision-based black-box adversarial attacks on face recognition, in: CVPR, IEEE. pp. 7714–7722.
Ekenel, H.K., Sankur, B., 2005. Multiresolution face recognition. Image and Vision Computing 23, 469–477.
Goswami, G., Ratha, N., Agarwal, A., Singh, R., Vatsa, M., 2018. Unravelling robustness of deep learning based face recognition against adversarial attacks, in: Thirty-Second AAAI Conference on Artificial Intelligence.
Grother, P., Grother, P., Ngan, M., Hanaoka, K., 2019. Face Recognition Vendor Test (FRVT) Part 2: Identification. US Department of Commerce, National Institute of Standards and Technology.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, in: CVPR, IEEE. pp. 770–778.
Hu, J., Shen, L., Sun, G., 2017. Squeeze-and-excitation networks. arXiv.
Kakizaki, K., Yoshida, K., 2019. Adversarial image translation: Unrestricted adversarial examples in face recognition systems. arXiv:1905.03421.
Klare, B.F., Klein, B., Taborsky, E., Blanton, A., Cheney, J., Allen, K., Grother, P., Mah, A., Jain, A.K., 2015. Pushing the frontiers of unconstrained face detection and recognition: Iarpa janus benchmark a, in: CVPR, IEEE. pp. 1931–1939.
Kurakin, A., Goodfellow, I., Bengio, S., 2016. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444.
Liu, C., Wechsler, H., 2002. Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Transactions on Image Processing 11, 467–476.
Luo, X., Xu, Y., Yang, J., 2019. Multi-resolution dictionary learning for face recognition. Pattern Recognition 93, 283–292.
Massoli, F.V., Amato, G., Falchi, F., 2020. Cross-resolution learning for face recognition. Image and Vision Computing, 103927.
Massoli, F.V., Amato, G., Falchi, F., Gennaro, C., Vairo, C., 2019. Improving multi-scale face recognition using vggface2, in: International Conference on Image Analysis and Processing, Springer. pp. 21–29.
Maze, B., Adams, J., Duncan, J.A., Kalka, N., Miller, T., Otto, C., Jain, A.K., Niggel, W.T., Anderson, J., Cheney, J., et al., 2018. Iarpa janus benchmark-c: Face dataset and protocol, in: ICB, IEEE. pp. 158–165.
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A., 2016. The limitations of deep learning in adversarial settings, in: 2016 IEEE European Symposium on Security and Privacy, IEEE. pp. 372–387.
Qiu, H., Xiao, C., Yang, L., Yan, X., Lee, H., Li, B., 2019. Semanticadv: Generating adversarial examples via attribute-conditional image editing. arXiv preprint arXiv:1906.07927.
Sabour, S., Cao, Y., Faghri, F., Fleet, D.J., 2015. Adversarial manipulation of deep representations. arXiv preprint arXiv:1511.05122.
Sharif, M., Bhagavatula, S., Bauer, L., Reiter, M.K., 2016. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition, in: SIGSAC CCS, ACM. pp. 1528–1540.
Singh, M., Nagpal, S., Singh, R., Vatsa, M., Majumdar, A., 2018. Magnifyme: Aiding cross resolution face recognition via identity aware synthesis. arXiv preprint arXiv:1802.08057.
Song, Q., Wu, Y., Yang, L., 2018. Attacks on state-of-the-art face recognition using attentional adversarial attack generative network. arXiv preprint arXiv:1811.12026.
Spaun, N.A., 2011. Face recognition in forensic science, in: Handbook of Face Recognition. Springer, pp. 655–670.
Sundararajan, K., Woodard, D.L., 2018. Deep learning for biometrics: a survey. ACM Computing Surveys (CSUR) 51, 65.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R., 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
Turk, M.A., Pentland, A.P., 1991. Face recognition using eigenfaces, in: Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE. pp. 586–591.
Wang, M., Deng, W., 2018. Deep face recognition: A survey. arXiv preprint arXiv:1804.06655.
Whitelam, C., Taborsky, E., Blanton, A., Maze, B., Adams, J., Miller, T., Kalka, N., Jain, A.K., Duncan, J.A., Allen, K., et al., 2017. Iarpa janus benchmark-b face dataset, in: CVPR Workshops, IEEE. pp. 90–98.
Zangeneh, E., Rahmati, M., Mohsenzadeh, Y., 2020. Low resolution face recognition using a two-branch deep convolutional neural network architecture. Expert Systems with Applications 139, 112854.
Zhang, K., Zhang, Z., Cheng, C.W., Hsu, W.H., Qiao, Y., Liu, W., Zhang, T., 2018. Super-identity convolutional neural network for face hallucination, in: European Conference on Computer Vision (ECCV), pp. 183–198.
Zhong, Y., Deng, W., 2020. Towards transferable adversarial attack against deep face recognition. arXiv preprint arXiv:2004.05790.
Zou, H., Hastie, T., 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B 67, 301–320.
Zou, W.W., Yuen, P.C., 2011. Very low resolution face recognition problem. IEEE Transactions on Image Processing 21, 327–340.