MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo1,2, Cheng Gong1,2, Xi Lin1,2, Fei Liu1,2, Zhichao Lu1,2, Qingfu Zhang1,2*, Zhenkun Wang3
1City University of Hong Kong; 2CityU Shenzhen Research Institute; 3Southern University of Science and Technology
*Corresponding author: qingfu.zhang@cityu.edu.hk
arXiv:2501.07251v1 [cs.LG] 13 Jan 2025
Abstract
Crafting adversarial examples is crucial for evaluating
and enhancing the robustness of Deep Neural Networks
(DNNs), presenting a challenge equivalent to maximizing a
non-differentiable 0-1 loss function. However, existing single-objective methods, namely adversarial attacks that focus on a single surrogate loss function, do not fully harness the benefits of engaging multiple loss functions, owing to an insufficient understanding of their synergistic and conflicting nature. To overcome these limitations, we propose the Multi-
Objective Set-based Attack (MOS Attack), a novel adversar-
ial attack framework leveraging multiple loss functions and
automatically uncovering their interrelations. The MOS At-
tack adopts a set-based multi-objective optimization strat-
egy, enabling the incorporation of numerous loss functions
without additional parameters. It also automatically mines
synergistic patterns among various losses, facilitating the
generation of potent adversarial attacks with fewer objec-
tives. Extensive experiments have shown that our MOS At-
tack outperforms single-objective attacks. Furthermore, by
harnessing the identified synergistic patterns, MOS Attack
continues to show superior results with a reduced number
of loss functions.
1. Introduction
Deep neural network (DNN) models have significantly ad-
vanced the field of computer vision [15,26,28,38], yet
they are vulnerable to adversarial examples [20,43]. Such
examples are inputs that have been subtly modified to cause
misclassification, potentially leading to catastrophic conse-
quences in real-world scenarios [7,16,19]. Consequently,
the development of sophisticated adversarial attack algo-
rithms is crucial for evaluating and enhancing the robust-
ness of these models [11,34,52]. However, devising these
algorithms presents inherent challenges due to the non-
differentiable nature of the original optimization problem,
necessitating the use of surrogate loss functions [21] to facilitate gradient-based adversarial attacks [11,20,34].
The metric for measuring misclassification is the non-
differentiable 0-1 loss function, which surrogate loss func-
tions endeavor to approximate [30]. Adversarial attacks are
designed to generate a perturbation δ that causes the misclassification of an input x with its corresponding label y. This can be formulated as [21,34]:

max_{δ∈B} L_{0-1}(h_θ(x + δ), y),    (1)

where h_θ represents the DNN model parameterized by θ, L_{0-1} denotes the 0-1 loss function, and B is the set of allowable perturbations.
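For concreteness, a minimal PyTorch sketch of evaluating the 0-1 loss of Equation (1) is given below; the `model` callable stands in for h_θ and is a hypothetical placeholder, not part of the original paper.

```python
import torch

def zero_one_loss(model, x, delta, y):
    """Non-differentiable 0-1 loss: 1 if x + delta is misclassified, else 0."""
    with torch.no_grad():
        pred = model(x + delta).argmax(dim=-1)   # predicted class of the perturbed input
    return (pred != y).float()                   # cannot be optimized directly by gradients
```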
Considering the computational intractability of the prob-
lem in Equation (1) [3], contemporary research commonly
employs a differentiable surrogate loss function in place of
the 0-1 loss function. This approach enables the utiliza-
tion of gradient-based optimization techniques to address
the resultant surrogate optimization problem. It has pro-
pelled considerable progress in gradient-based algorithms,
including the Fast Gradient Sign Method (FGSM) [20],
Projected Gradient Descent (PGD) [34], and Carlini &
Wagner (C&W) attack [8]. Notably, the versatility of the
PGD attack has increased with the adoption of diverse sur-
rogate loss functions (APGD-CE, APGD-DLR) [11] and
the integration of sophisticated optimization techniques
(ACG) [51]. These developments have given rise to more
sophisticated adversarial methods.
While single-objective attacks have attracted consider-
able attention, there is an emerging trend towards integrat-
ing multiple loss functions to bolster the attack’s efficacy.
Some early endeavors include using multiple targeted loss
functions to guide untargeted attacks [21] and the strategic alternation of loss functions in the attack process [2]. Fur-
thermore, the adoption of diverse surrogate loss functions
such as GAMA [41], BCE [46], and DLR [11] has been
instrumental in advancing adversarial attacks.
Despite the potential advantages of incorporating multiple loss functions, direct optimization with a vast array of adversarial examples is inefficient. Moreover, a methodology for selecting suitable loss functions to mount effective adversarial attacks is lacking. Therefore, it is imperative
to develop a scalable framework that can efficiently coor-
dinate multiple surrogate loss functions, concentrate on a
limited subset, and reduce the number of adversarial exam-
ples needed for optimization.
To bridge this knowledge gap, we introduce the Multi-
Objective Set-based Attack (MOS Attack), a novel frame-
work for conducting multi-objective adversarial attacks and
investigating the interactions among various surrogate loss
functions. Our framework notably offers: 1) a scalable,
parameter-free template for executing multi-objective ad-
versarial attacks, and 2) an automated method for the dis-
covery of synergistic loss patterns. The MOS Attack em-
ploys a suite of surrogate loss functions and initiates an
adaptable number of adversarial examples, thereby defin-
ing a smooth set-based optimization problem. Subse-
quently, single-objective gradient-based optimization tech-
niques, which require only minimal adjustments, can effi-
ciently address this problem. Following the optimization
phase, an automated analysis identifies synergistic patterns
within the adversarial examples. These patterns enable the
construction of powerful multi-objective adversarial attacks
that require fewer objectives, allowing a more efficient allo-
cation of computational resources to each objective.
We have implemented our approach using four widely
recognized surrogate loss functions as outlined in previ-
ous research [41,46], as well as four additional functions
identified through extensive loss function searches [30,50].
The resulting MOS-8 Attack¹ has proven highly effective
through extensive experimentation on the CIFAR-10 [28]
and ImageNet [15] datasets, outperforming state-of-the-art
methods that leverage advanced gradient-based optimiza-
tion or eight distinct single-objective attacks for each sur-
rogate loss. Moreover, by examining the synergistic pat-
terns uncovered by MOS-8 Attack, we have developed a
tri-objective attack, MOS-3*, which has also shown supe-
rior performance.
Our contributions can be summarized as follows:
• We introduce the first multi-objective adversarial attack
framework, the MOS Attack, which tackles the challenge
of generating adversarial examples with multiple loss
functions. This framework is parameter-free and readily
extensible with new loss functions.
• Our framework also offers an automated method for iden-
tifying synergistic patterns among loss functions, which
can be used to construct powerful multi-objective attacks
with fewer objectives, facilitating a more efficient alloca-
tion of computational resources.
• We have implemented our framework with 8 loss func-
tions to form the MOS-8 Attack, which has been exten-
sively tested on CIFAR-10 and ImageNet datasets. Additionally, synergistic analysis over these 8 loss functions has been conducted to provide insights regarding their interactions and has led to the development of a powerful tri-objective attack, MOS-3*.
¹We use MOS-8 Attack to denote the MOS Attack implemented with 8 loss functions, and MOS-3* to denote the attack implemented with three selected loss functions.
2. Background
Adversarial attacks encompass methods that create adver-
sarial examples, which are used to assess and enhance
model robustness [11,34]. A white-box threat model is of-
ten considered for evaluating adversarial robustness, where
the adversary has full access to the model’s architecture, pa-
rameters, and gradients. While existing white-box strategies mainly focus on one surrogate loss function [1,18,25,
51], a recent trend is the integration of multiple loss func-
tions into the attack paradigm [5,14,33,42,44].
Single-Objective Attacks. White-box attack methodolo-
gies typically employ a singular surrogate loss function,
focusing on optimization to craft adversarial examples.
Established strategies include the FGSM [20], C&W at-
tack [8], and PGD attack [34]. Croce et al. proposed a novel
parameter-free approach, Auto-PGD (APGD) attack, utiliz-
ing both Cross Entropy (CE) and the Difference of Logits
Ratio (DLR) loss functions. These were subsequently incor-
porated into the AutoAttack framework as APGD-CE and
APGD-DLR [11]. Expanding upon this, Yamamura et al.
enhanced APGD with conjugate gradient techniques, result-
ing in the creation of the powerful Auto Conjugate Gradient
(ACG) attack [51].
Multi-Objective Attacks. Recent advancements in adver-
sarial research have involved the integration of multiple sur-
rogate loss functions into the attack framework. Gowal et
al. introduced multiple targeted losses to enhance untar-
geted PGD attacks [21]. Further, work by Antoniou et al. established that the strategic variation of surrogate loss functions considerably improves adversarial attack performance [2]. However, these studies typically lack a system-
atic approach and a solid theoretical underpinning for man-
aging multiple losses.
Concurrently, researchers have expanded the adversarial
attack framework by introducing other types of objectives.
Williams et al. investigated the inclusion of additional norm
constraints [48], while Guo et al. and Liu et al. have investi-
gated the trade-off between perturbation intensity and con-
fidence measures [24,33]. These efforts have contributed to
the development of more diversified attack methodologies.
Our approach represents the first attempt to systemati-
cally incorporate multiple loss functions into adversarial at-
tacks and optimize the corresponding multi-objective op-
timization problem using a minimal set of examples via
smooth set-based optimization techniques.
[Figure 1 shows three schematic panels in the objective space (f_1(δ), f_2(δ)), each depicting the Pareto front, weight vectors, and contour lines: (a) Decomposition-based Optimization (legend: Solution, Weight Vector, Contour Line); (b) Set-based Optimization (legend additionally: Virtual Solution, Opt. Traj.); (c) Smooth Set-based Optimization (MOS) (same legend as (b)).]
Figure 1. Comparison of different optimization methods for conducting multi-objective adversarial attacks.
3. Multi-Objective Set-based Attack
In this section, we propose the problem formulation of the
smooth set-based approach for multi-objective adversarial
attacks. We begin by defining the multi-objective adversar-
ial attack, which employs multiple surrogate loss functions,
as a multi-objective optimization problem. Subsequently,
we introduce the decomposed subproblems and identify
three optimization challenges. Finally, we propose the for-
mulation of the smooth set-based optimization problem as a
solution to the challenges posed by the multi-objective na-
ture of adversarial attacks.
3.1. Multi-Objective Adversarial Attack
This study seeks to simultaneously optimize multiple sur-
rogate loss functions, rather than relying on a singular loss
function, to craft adversarial examples. Given m loss functions L_1, . . . , L_m, we define a multi-objective optimization
problem as follows:
max_{δ∈B} f(δ) = (f_1(δ), . . . , f_m(δ)),
f_i(δ) = L_i(h_θ(x + δ), y),  i ∈ {1, . . . , m}.    (2)

The notation remains consistent with the single-objective scenario as depicted in Equation (1). Furthermore, this paper adopts the extensively utilized ℓ∞-ball as the constraint set for perturbations, denoted by B = {δ : ‖δ‖∞ ≤ ϵ}.
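As an illustration, a minimal sketch of evaluating the objective vector f(δ) of Equation (2) is shown below; the per-loss call signature (logits, y) is an assumed interface for illustration only.

```python
import torch

def objective_vector(model, surrogate_losses, x, delta, y):
    """f(delta) = (f_1(delta), ..., f_m(delta)) for one perturbation delta.

    `surrogate_losses` is a list of m callables mapping (logits, y) to a
    per-sample loss value (hypothetical interface; see Table 1 for the losses).
    """
    logits = model(x + delta)
    return torch.stack([loss(logits, y) for loss in surrogate_losses])  # shape (m, batch)
```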
Examples with higher values across multiple surrogate
loss functions are more susceptible to misclassification by
the model. In existing literature, this statement is supported
by frequent misclassifications of adversarial examples with
high values on singular loss functions [11,34,51].
Nevertheless, since there is often no single solution that
maximizes all the loss functions simultaneously, a set of
best trade-off solutions becomes necessary. This set of so-
lutions is called the Pareto set, and the corresponding values
of the objective functions are called the Pareto front. A for-
mal description of Pareto optimality is delineated below:
Definition 3.1 (Pareto Optimal). A solution δ* is Pareto optimal if there is no other solution δ such that f_i(δ) ≥ f_i(δ*) for all i ∈ {1, . . . , m} and f_i(δ) > f_i(δ*) for at least one i ∈ {1, . . . , m}.
Definition 3.2 (Pareto Set and Pareto Front).The Pareto set
is the set of all Pareto optimal solutions, and the Pareto front
is the set of all the values of the objective functions at the
Pareto optimal solutions.
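A short sketch of the dominance test underlying Definitions 3.1 and 3.2 (maximization convention; objective vectors are given as tensors):

```python
import torch

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b under maximization:
    no worse in every objective and strictly better in at least one."""
    return bool((f_a >= f_b).all() and (f_a > f_b).any())
```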
3.2. Decomposition-Based Optimization
In this research, we employ the Tchebycheff decomposi-
tion to transform the multi-objective problem into a suite of
single-objective subproblems. Contrary to the linear scalar-
ization method, the Tchebycheff approach is capable of
targeting any location on the Pareto front. This is well-
recognized in the discipline of multi-objective optimiza-
tion [13,35,53]. By adopting this method, given K weight vectors w_1, . . . , w_K, the k-th decomposed subproblem is defined as follows:

max_{δ∈B} g_k(δ | w_k) = min_i w_{ki} |f_i(δ) − z*_i|,    (3)

where w_{ki} is the i-th element of the k-th weight vector, and z*_i denotes the ideal value for the i-th objective. Upon solv-
ing these subproblems, a set of solutions correlated with the
weight vectors is obtained. This set can approximate the
Pareto set, as illustrated in Figure 1a.
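For reference, a minimal sketch of evaluating one Tchebycheff subproblem g_k(δ | w_k) of Equation (3); the ideal values z* are assumed to be given.

```python
import torch

def tchebycheff_value(f_vals, w_k, z_star):
    """g_k(delta | w_k) = min_i w_ki * |f_i(delta) - z*_i| from Eq. (3).

    f_vals, w_k, z_star: tensors of shape (m,).
    """
    return (w_k * (f_vals - z_star).abs()).min()
```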
The Tchebycheff method is instrumental in identifying
both convex and non-convex parts of the Pareto front with
vertical contour lines [45], as demonstrated in Figure 1a.
Nonetheless, it presents three challenges in optimization:
• Complexity: Accurate approximation of the Pareto front necessitates multiple points, exceeding the number of objectives (> m).
• Ambiguity: The selection of appropriate weight vectors is challenging.
• Non-differentiability: The function g_k contains non-differentiable points.
3.3. Smooth Set-based Optimization
To address the challenges associated with Tchebycheff de-
composition, we propose a formulation that leverages a
smooth set-based optimization approach. The primary is-
sue is the number of adversarial examples needed to max-
imize the surrogate loss functions. We tackle this by de-
liberately selecting a set of K adversarial examples, with K < m. Additionally, we default the weight vector to an
all-ones configuration to eliminate the ambiguity in select-
ing the weight vector. Lastly, we smooth the optimization
problem to circumvent non-differentiability issues.
Set-based Optimization. Suppose we have a set of K adversarial examples Δ = {δ_1, . . . , δ_K} to accommodate multiple objectives and one weight vector w for specifying the contour lines. The set-based optimization problem can be formulated as:

max_{Δ} g(Δ | w) = min_i w_i |max_k f_i(δ_k) − z*_i|,    (4)

where w_i is the i-th element of the weight vector, and z*_i is the optimal value of the i-th objective function.
A Geometric Interpretation. The inner maximization prob-
lem as formulated in Equation (4) allows each perturbation
vector, δ, to impart its dimensionality upon the objec-
tive function. We conceive a ’virtual adversarial example’
as a combination of the most advantageous dimensional at-
tributes of adversarial examples. The essence of the set-
based optimization procedure lies in pushing this virtual ad-
versarial example towards extreme points along the contour
lines, as depicted in Figure 1b.
We investigate the relationship between the number of
adversarial examples K and the number of loss functions m. Specifically:
• K < m: A smaller number of adversarial examples is utilized to address a multitude of objectives. This approach enables optimization of the functions using reduced resources. In the extreme scenario where K = 1, a single solution must fulfill all objectives.
• K = m: There exists a theoretical optimal solution comprising the individual optimal adversarial examples for each objective function. Through proper optimization, this ideal state may be achieved.
So far, the first two challenges of decomposition-based
optimization have been addressed by the set-based opti-
mization problem. However, the third challenge remains
unresolved. This can lead to oscillation in the optimization
process, as illustrated in Figure 1b. Therefore, we need to
design a smooth approximation of the set-based optimiza-
tion problem.
Smooth Set-based Optimization. To smooth the above optimization problem, we take advantage of the smooth max and smooth min operators [6,31,32]:

max{x_1, . . . , x_m} ≈ µ log( Σ_{i=1}^{m} e^{x_i/µ} ),
min{x_1, . . . , x_m} ≈ −µ log( Σ_{i=1}^{m} e^{−x_i/µ} ),    (5)

where µ is a smoothing parameter. A proof of the above approximation can be found in [6].
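These two operators translate directly into numerically stable log-sum-exp calls; a minimal sketch, assuming the standard smoothing form with the 1/µ scaling inside the exponent:

```python
import torch

def smooth_max(x, mu=1.0):
    """mu * log(sum_i exp(x_i / mu)): smooth approximation of max(x)."""
    return mu * torch.logsumexp(x / mu, dim=-1)

def smooth_min(x, mu=1.0):
    """-mu * log(sum_i exp(-x_i / mu)): smooth approximation of min(x)."""
    return -mu * torch.logsumexp(-x / mu, dim=-1)
```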
Using the above operators, the objective function in Equation (4) can be approximated as:

g(Δ | w) = min_i w_i |max_k f_i(δ_k) − z*_i|
         ≈ −µ log( Σ_{i=1}^{m} exp( −w_i |µ log(Σ_{k=1}^{K} e^{f_i(δ_k)/µ}) − z*_i| / µ ) ).    (6)

Furthermore, if we consider z*_i = 0 and a uniform weight vector with w_i = w_j, ∀i, j, we obtain our final optimization problem as:

max_{Δ} g(Δ) = −µ log( Σ_{i=1}^{m} ( Σ_{k=1}^{K} e^{f_i(δ_k)/µ} )^{−1} ).    (7)
The above formulation smooths the objective and avoids oscillations in the optimization process, which is
analyzed in the multi-objective literature [32]. We illustrate
the smoothed set-based optimization problem along with a
possible optimization trajectory in Figure 1c.
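Under the reconstruction above, Equation (7) is a smooth min (over losses) of a smooth max (over the K examples), which can be written as two nested log-sum-exp operations. A minimal sketch, assuming a loss matrix F with F[i, k] = f_i(δ_k):

```python
import torch

def mos_objective(F, mu=1.0):
    """Smooth set-based objective g(Delta) of Eq. (7); F has shape (m, K)."""
    smooth_max_over_k = mu * torch.logsumexp(F / mu, dim=1)       # one value per loss
    return -mu * torch.logsumexp(-smooth_max_over_k / mu, dim=0)  # smooth min over losses
```

Maximizing this single scalar with gradient ascent pushes up the weakest objective while remaining differentiable everywhere, avoiding the oscillation sketched in Figure 1b.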
4. Methodology
4.1. MOS Attack: Implementation by APGD
By formulating a smooth set-based optimization problem,
we can now apply gradient-based optimization algorithms
for its efficient resolution. Within the domain of adversarial
attacks, our framework incorporates the well-known APGD
algorithm [11]. In this section, we provide a detailed expla-
nation of our attack, as outlined in Algorithm 1.
Initialization. The initialization process involves specify-
ing the input parameters and determining the initial adver-
sarial examples. We follow a procedure similar to that used
in APGD. However, our approach takes as input 1) a set of adversarial examples Δ, and 2) the objective function g(Δ) defined in Equation (7).
Momentum-based Update Rule. We adopt the same
momentum-based update rule as in APGD, which is consid-
ered to be stable and efficient. The details are delineated in
lines 9 and 10 of Algorithm 1. Our modification addresses
the optimization of a set of adversarial examples rather than
a single example. Therefore, we have adjusted the update
rule to: 1) optimize X and Δ concurrently, and 2) implement a set-based projection operator.
Optimization Representation. Considering that our function g incorporates Δ and the subsequent projection requires the range of X, concurrent optimization of both is essential. Notably, the update in line 9 consistently applies because ∇_X g(Δ) = ∇_Δ g(Δ), with X = Δ + x.
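In practice this means the gradient with respect to Δ can be obtained directly by automatic differentiation through g; a minimal sketch under the same assumed per-loss interface (logits, y) → shape-(K,) values as above:

```python
import torch

def set_gradient(model, surrogate_losses, x, delta, y, mu=1.0):
    """Return g(Delta) and its gradient w.r.t. the K stacked perturbations.

    delta: tensor of shape (K, *x.shape). Since X = x + Delta with x fixed,
    grad_X g(Delta) equals grad_Delta g(Delta).
    """
    delta = delta.detach().requires_grad_(True)
    logits = model(x.unsqueeze(0) + delta)                           # (K, num_classes)
    F = torch.stack([loss(logits, y) for loss in surrogate_losses])  # (m, K)
    smooth_max_over_k = mu * torch.logsumexp(F / mu, dim=1)
    g = -mu * torch.logsumexp(-smooth_max_over_k / mu, dim=0)
    g.backward()
    return g.detach(), delta.grad
```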
Table 1. The loss functions utilized for implementing our attack.

| ID | Loss Function | Formula |
| 0 | Cross-Entropy Loss [8,20,29,34,43] | −h_y(x) + log(Σ_i e^{h_i(x)}) |
| 1 | Marginal Loss [8,10,11,21,41] | −h_y(x) + max_{j≠y} h_j(x) |
| 2 | Difference of Logits Ratio [10] | (−h_y(x) + max_{j≠y} h_j(x)) / (h_{π_1}(x) − h_{π_3}(x)) |
| 3 | Boosted Cross-Entropy Loss [46] | −log p_y(x) − log(1 − max_{j≠y} p_j(x)) |
| 4 | Searched Loss 1 [50] | Σ_i exp(10p / max_j p_j) |
| 5 | Searched Loss 2 [50] | exp(max(softmax(h + 2 softmax(5h)))) |
| 6 | Searched Loss 3 [50] | softmax(softmax(2 exp(h) h)) · (softmax(2h) + 2 y_one-hot) |
| 7 | Searched Loss 4 [50] | (softmax(softmax(2h) + h · y_one-hot) − y_one-hot)² |
Table 2. The notations used in the loss functions.

| Notation | Description |
| x | The adversarial example. |
| h | The vector of logits. |
| h_y | The logit corresponding to the true class. |
| h_j | The logit corresponding to the j-th class. |
| h_{π_i} | The i-th highest logit. |
| p_y | The probability corresponding to the true class. |
| p_j | The probability corresponding to the j-th class. |
| y_one-hot | The one-hot vector corresponding to the true class. |
Algorithm 1 MOS Attack
1: Input: g, B, Δ^(0), η, N_iter, W = {w_0, . . . , w_n}
2: Output: Δ_adv
3: X^(0) ← x + Δ^(0)
4: X^(1) ← P_B(X^(0) + η ∇g(Δ^(0)))
5: Δ^(1) ← X^(1) − x
6: g_max ← max{g(Δ^(0)), g(Δ^(1))}
7: X_max ← X^(0) if g_max ≡ g(Δ^(0)) else X_max ← X^(1)
8: for k = 1 to N_iter − 1 do
9:    Z^(k+1) ← P_B(X^(k) + η ∇g(Δ^(k)))
10:   X^(k+1) ← P_B(X^(k) + α(Z^(k+1) − X^(k)) + (1 − α)(X^(k) − X^(k−1)))
11:   Δ^(k+1) ← X^(k+1) − x
12:   if g(Δ^(k+1)) > g_max then
13:      X_max ← X^(k+1) and g_max ← g(Δ^(k+1))
14:   end if
15:   if k ∈ W then
16:      if Condition 1 or Condition 2 then
17:         η ← η/2 and X^(k+1) ← X_max and Δ^(k+1) ← X_max − x
18:      end if
19:   end if
20: end for
Set-based Projection. A pivotal component in gradient-
based adversarial methodologies is the projection operator,
which constrains the adversarial examples within the de-
fined perturbation bounds. In our context, the challenge en-
tails projecting an ensemble of adversarial examples. This
is executed by individually projecting each example within
the allowable perturbation boundary.
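A minimal sketch of this set-based projection for the ℓ∞ threat model; clipping the perturbed images to the [0, 1] pixel range is an added assumption about the input domain.

```python
import torch

def project_set(delta, x, eps):
    """Project every perturbation in the set onto the l_inf ball of radius eps.

    delta: tensor of shape (K, *x.shape); each element is projected independently.
    """
    delta = delta.clamp(-eps, eps)             # per-example l_inf constraint
    return (x + delta).clamp(0.0, 1.0) - x     # keep x + delta a valid image
```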
Step Size Adjustment. We use the same step size control
method in APGD. The initial step size η is set to 2ϵ, where ϵ is the perturbation budget. When a checkpoint w_j is reached, the following two conditions are checked:
1. N_inc < ρ · (w_j − w_{j−1}),
2. η^{(w_{j−1})} = η^{(w_j)} and g_max^{(w_{j−1})} = g_max^{(w_j)},
where N_inc = #{i = w_{j−1}, . . . , w_j − 1 | g(Δ^{(i+1)}) > g(Δ^{(i)})} and g_max^{(k)} = max{g(Δ^{(i)}) | i = 1, . . . , k}.
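A small sketch of this checkpoint test, mirroring the APGD step-size rule (variable names are placeholders):

```python
def should_halve(num_increases, rho, w_prev, w_curr,
                 eta_prev, eta_curr, gmax_prev, gmax_curr):
    """Condition 1: too few objective increases since the last checkpoint.
    Condition 2: neither the step size nor the best objective value changed."""
    cond1 = num_increases < rho * (w_curr - w_prev)
    cond2 = (eta_prev == eta_curr) and (gmax_prev == gmax_curr)
    return cond1 or cond2
```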
4.2. Automated Synergistic Pattern Mining
In smooth set-based optimization, a few solutions automatically emerge that maximize different groups of loss functions [31]. To mine these loss synergistic patterns, we propose an automated mining method. This method includes two steps: 1) determining the dominant examples that contribute to the loss maximization, and 2) determining the synergistic pattern of these dominant examples.
Determining Dominant Examples. With a set of K perturbations Δ = {δ_1, . . . , δ_K} from the MOS Attack, we aim to identify the dominant perturbations that maximize the loss functions. Formally, we want to find an index vector β = [β_1, . . . , β_K], β_i ∈ {0, 1}, ∀i, for specifying a subset of perturbations Δ_β = {δ_i | β_i = 1} that still maximizes the loss functions.
We first perform min-max normalization on the loss functions f_i(δ_k), ∀i, and then the above formulation can be rewritten as a bi-objective optimization problem:

min_β ( Σ_{i=1}^{m} [ max_{k=1,...,K} f̄_i(δ_k) − max_{k=1,...,K} β_k f̄_i(δ_k) ],  ‖β‖_0 ),    (8)

where f̄_i(δ_k) is the normalized loss function. The first term serves to minimize the optimization gap. The ℓ_0 norm, which is the number of non-zero elements in a vector, aims to minimize the number of selected examples.
Smooth Relaxation. Since the above problem is an NP-hard combinatorial optimization problem, we relax it by introducing a smooth relaxation. Specifically, we relax the first objective by incorporating the smooth operators in Equation (5) and the second objective by replacing the ℓ_0 norm with the ℓ_1 norm. The relaxed problem is then:

min_β Σ_{i=1}^{m} µ log( Σ_{k=1}^{K} e^{f̄_i(δ_k)/µ} / Σ_{k=1}^{K} e^{β_k f̄_i(δ_k)/µ} ) + λ‖β‖_1,
s.t. β ∈ [0, 1]^K,    (9)

where λ controls the sparsity.
The above problem is smooth and fully differentiable, and we can solve it using gradient-based methods. After it is solved, we obtain the dominant example index β. Here, we set a threshold T to further binarize β.
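A minimal sketch of solving the relaxed problem (9) by projected gradient descent on β; the hyper-parameter defaults and the use of Adam are illustrative assumptions, not the paper's prescribed solver.

```python
import torch

def select_dominant(F_norm, lam=1.0, mu=1.0, threshold=0.85, steps=200, lr=0.1):
    """Return a binary mask over the K perturbations (Eq. (9), relaxed form).

    F_norm: tensor of shape (m, K) holding the min-max normalized losses.
    """
    beta = torch.full((F_norm.shape[1],), 0.5, requires_grad=True)
    optimizer = torch.optim.Adam([beta], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        gap = mu * (torch.logsumexp(F_norm / mu, dim=1)
                    - torch.logsumexp(beta * F_norm / mu, dim=1)).sum()
        objective = gap + lam * beta.abs().sum()        # optimization gap + l1 sparsity
        objective.backward()
        optimizer.step()
        with torch.no_grad():
            beta.clamp_(0.0, 1.0)                       # keep beta inside [0, 1]^K
    return beta.detach() > threshold                    # binarize with threshold T
```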
Table 3. Overall Results. A comparative analysis of attack success rate comparing the MOS-8 Attack with APGD-CE, ACG-CW, and APGD-All. For the MOS-8 Attack, the number in parentheses records its K value, while for the others it denotes the number of restarts. Notably, for APGD-All, we have documented the index of the surrogate loss function corresponding to the highest attack success rate. The optimal outcome is highlighted in bold and marked with a grey background. The second-best performance is underscored for emphasis.

Attack success rate (%). Single-objective columns: APGD (1), APGD (5), ACG† (5). Multi-objective columns: All (1)×8, MOS-8 (1), MOS-8 (5).

CIFAR-10 (ϵ = 8/255)
| ID | Paper | Architecture | APGD (1) | APGD (5) | ACG† (5) | All (1)×8 | MOS-8 (1) | MOS-8 (5) | Diff. (5) MOS|CE |
| 0 | Rade et al. (2022) [36] (ddpm) | PreActResNet-18 | 39.17 | 39.28 | 42.65 | 42.78 (6) | 42.59 | 42.77 | +3.49 |
| 1 | Rade et al. (2022) [36] (extra) | PreActResNet-18 | 38.55 | 38.72 | 42.12 | 42.21 (6) | 42.03 | 42.23 | +3.51 |
| 2 | Sehwag et al. (2022) [40] | ResNet-18 | 41.57 | 41.76 | 44.79 | 44.16 (6) | 43.79 | 44.18 | +2.42 |
| 3 | Chen et al. (2020) [9] | ResNet-50 | 45.80 | 45.95 | 48.28 | 48.04 (4) | 48.09 | 48.14 | +3.49 |
| 4 | Gowal et al. (2020) [22] | WideResNet-28-10 | 34.31 | 34.46 | 36.90 | 36.96 (6) | 36.77 | 36.95 | +2.19 |
| 5 | Wang et al. (2023) [47] | WideResNet-28-10 | 29.72 | 29.91 | N/A | 32.44 (6) | 32.25 | 32.49 | +2.58 |
| 6 | Rebuffi et al. (2021) [37] | WideResNet-28-10 | 35.97 | 36.15 | 38.80 | 39.05 (6) | 38.91 | 39.14 | +2.99 |
| 7 | Sehwag et al. (2022) [40] | WideResNet-34-10 | 36.85 | 36.96 | 40.18 | 39.36 (5) | 38.97 | 39.38 | +2.43 |
| 8 | Rade et al. (2022) [36] | WideResNet-34-10 | 34.29 | 34.45 | 36.83 | 36.97 (6) | 36.69 | 36.94 | +2.49 |
| 9 | Gowal et al. (2021) [23] | WideResNet-70-16 | 31.43 | 31.62 | 33.04 | 33.50 (5) | 33.33 | 33.51 | +1.89 |
| 10 | Gowal et al. (2020) [22] | WideResNet-70-16 | 31.89 | 32.07 | 33.70 | 33.94 (5) | 33.72 | 33.92 | +1.85 |
| 11 | Rebuffi et al. (2021) [37] | WideResNet-70-16 | 30.45 | 30.72 | 32.75 | 33.06 (6) | 32.79 | 33.10 | +2.38 |
| Average Rank | | | 5.92 | 4.92 | 2.79 | 2.00 | 3.50 | 1.58 | |

ImageNet (ϵ = 4/255)
| 12 | Salman et al. (2020) [39] | ResNet-18 | 70.60 | 70.74 | 73.72 | 74.38 (5) | 74.24 | 74.52 | +3.87 |
| 13 | Salman et al. (2020) [39] | ResNet-50 | 61.38 | 61.58 | 63.70 | 64.92 (7) | 64.50 | 64.94 | +3.36 |
| 14 | Wong et al. (2020) [49] | ResNet-50 | 70.28 | 70.46 | 71.94 | 73.20 (5) | 72.96 | 73.10 | +2.64 |
| 15 | Engstrom et al. (2019) [17] | ResNet-50 | 67.62 | 67.82 | 68.60 | 70.12 (5) | 69.86 | 69.92 | +2.10 |
| 16 | Salman et al. (2020) [39] | WideResNet-50-2 | 59.02 | 59.12 | 59.92 | 61.26 (5) | 60.76 | 61.14 | +2.02 |
| Average Rank | | | 6.00 | 5.00 | 4.00 | 1.40 | 3.00 | 1.60 | |

†: reported results from the paper [51], with attack step N_iter = 100.
Determining Loss Synergistic Patterns. For every dominant perturbation δ, we check its contribution to the loss functions. In particular, for each perturbation δ, if its i-th loss value satisfies f̄_i(δ) > C · max_{δ′∈Δ} f̄_i(δ′), we consider it a contribution to the i-th loss function. Thus, for every dominant perturbation, we can obtain a contribution combination, which we call a loss synergistic pattern. We record the loss synergistic pattern of each dominant perturbation across the dataset to facilitate the analysis of coupling effects between loss functions.
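A minimal sketch of this pattern-extraction step, given the normalized loss matrix and the dominant-example mask from the previous step:

```python
import torch

def synergy_patterns(F_norm, dominant_mask, C=0.75):
    """Return one loss-index tuple per dominant perturbation.

    F_norm:        (m, K) min-max normalized losses.
    dominant_mask: boolean mask of shape (K,) selecting dominant perturbations.
    """
    patterns = []
    best_per_loss = F_norm.max(dim=1).values                 # max over the whole set
    for k in torch.nonzero(dominant_mask).flatten():
        contributes = F_norm[:, k] > C * best_per_loss       # contribution test per loss
        patterns.append(tuple(torch.nonzero(contributes).flatten().tolist()))
    return patterns
```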
4.3. Implementation: Loss Functions
The final step of implementing our attack is to specify mul-
tiple surrogate loss functions. We incorporate a selection of
significant loss functions that are well-documented in exist-
ing literature [10,34,46], along with innovative loss func-
tions that have been identified through rigorous exploration
in the domain of loss search [30,50]. Details of these loss
functions can be found in Table 1 and Table 2.
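As a reference, minimal PyTorch sketches of the four standard surrogate losses in Table 1 (IDs 0-3) are given below; the four searched losses from [50] are omitted because their exact forms are harder to reconstruct from the table. Logits have shape (batch, classes) and y holds integer labels.

```python
import torch
import torch.nn.functional as F

def ce_loss(logits, y):
    """ID 0, cross-entropy as in Table 1: -h_y(x) + log(sum_i exp(h_i(x)))."""
    return -logits.gather(1, y[:, None]).squeeze(1) + torch.logsumexp(logits, dim=1)

def margin_loss(logits, y):
    """ID 1, marginal (CW) loss: -h_y(x) + max_{j != y} h_j(x)."""
    masked = logits.clone()
    masked.scatter_(1, y[:, None], float('-inf'))            # exclude the true class
    return -logits.gather(1, y[:, None]).squeeze(1) + masked.max(dim=1).values

def dlr_loss(logits, y):
    """ID 2, Difference of Logits Ratio: margin / (h_pi1(x) - h_pi3(x))."""
    top = logits.sort(dim=1, descending=True).values
    return margin_loss(logits, y) / (top[:, 0] - top[:, 2] + 1e-12)

def bce_loss(logits, y):
    """ID 3, boosted cross-entropy: -log p_y(x) - log(1 - max_{j != y} p_j(x))."""
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, y[:, None]).squeeze(1)
    others = probs.clone()
    others.scatter_(1, y[:, None], 0.0)                       # zero out the true class
    return -torch.log(p_y + 1e-12) - torch.log(1.0 - others.max(dim=1).values + 1e-12)
```

Any of these callables can serve as one of the f_i in the set-based objective sketched earlier.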
5. Experiment
5.1. Experiment Setup
Datasets and Models. We employed 17 distinct models from Ro-
bustBench [12], which includes 12 models [4,27,36,37,
40,47] trained on the CIFAR-10 [28] dataset and 5 mod-
els [39,49] based on ImageNet [15] dataset. For perfor-
mance evaluation, we used all 10,000 test images from the
CIFAR-10 validation dataset and 5,000 images from Ima-
geNet validation dataset. To enable direct comparison with
the reported accuracy of the ACG attack, we preserved the
same image indexing for the ImageNet dataset as [51].
Comparative Attacks. For comparative purposes, we
incorporate the widely recognized APGD-CE attack, the
state-of-the-art ACG-CW, and the comprehensive APGD-
All attack. The latter aggregates optimal outcomes from an
ensemble of eight distinct APGD attacks, each employing
unique loss functions from Table 1.
Attack Parameters. Notably, the number of iterations for
our implemented attacks, including MOS Attack, APGD-
CE, and APGD-All, is uniformly set to 50. This choice
ensures thorough and rigorous testing of all methods. Ad-
[Figure 2 shows pie charts of the occurrences of loss synergistic patterns. (a) CIFAR-10: 0+1+2+3+4+5+6+7 (61.3%), 0+1+2+3+6+7 (16.6%), Other (9.67%), 4+5 (6.56%), 4+5+6 (1.94%), 0+1+2+3+5+6+7 (1.5%), 0+1+2+3+4+5+7 (1.37%), 6 (1.1%). (b) ImageNet: 0+1+2+3+4+5+6+7 (57.8%), 0+1+2+3+6+7 (17.2%), Other (13.5%), 4+5 (4.58%), 4+5+6 (4.03%), 0+1+2+3+7 (1.64%), 1+4+5+6 (1.29%).]
Figure 2. Occurrences of different loss synergistic patterns across the CIFAR-10 and ImageNet datasets. We only retain the top patterns that account for more than 1% of the adversarial examples.
ditionally, the remaining attack parameters follow the same
configuration as outlined in APGD [11].
5.2. Overall Results
This section presents the comparative results of our pro-
posed MOS-8 attack alongside other competing algorithms,
delineating them in terms of Attack Success Rate (ASR).
Detailed outcomes are provided in Table 3.
Single-objective vs. Multi-objective. The results demon-
strate that multi-objective approaches outperform single-
objective approaches. The most effective single-objective
approach is the ACG-CW attack, utilizing 5 restarts and 100
attack steps; however, despite a considerably higher number
of attack steps Niter = 100, it only achieved the best ASR in
3 out of 17 instances, with a rate of 3 out of 12 for CIFAR-
10 and failing to succeed in any of the 5 cases for ImageNet.
MOS-8 vs. APGD-All. The MOS-8 Attack demonstrates
a slight superiority over APGD-All. Notably, the MOS-8
Attack achieved comparable or better results with only five
adversarial examples, whereas APGD-All utilized eight.
MOS-8 Attack achieved an average rank of 1.58 on CIFAR-
10 and 1.60 on ImageNet, while APGD-All attained an av-
erage rank of 2.00 on CIFAR-10 and 1.40 on ImageNet.
Loss Functions. APGD-All's findings underscored the superiority of losses 4-7 in Table 1, as attacks using them consistently achieved the highest ASR out of the 8 attacks across all models on both CIFAR-10 and ImageNet. This obser-
vation reveals the importance of selecting appropriate loss
functions for adversarial attacks.
Model Robustness. As the complexity of the model escalates, mirrored by the sophistication of the architecture, the performance disparity between MOS-8 Attack and APGD-CE narrows. This indicates an increasing trend in model robustness, making these models more challenging to attack.
Table 4. The marginal discrepancy from the theoretical upper bound of set-based optimization, as estimated by comprehensive attacks.
| ID | Architecture | MOS-8 (1) | MOS-8 (8) | Upper Bound | Diff. |
| 0 | R-18 | 42.59 | 42.84 | 42.92 | -0.33 / -0.08 |
| 1 | R-18 | 42.03 | 42.21 | 42.37 | -0.34 / -0.16 |
| 2 | R-18 | 43.79 | 44.18 | 44.40 | -0.61 / -0.22 |
| 3 | R-50 | 48.09 | 48.22 | 48.36 | -0.27 / -0.14 |
| 4 | WR-28-10 | 36.77 | 36.96 | 37.17 | -0.40 / -0.21 |
| 5 | WR-28-10 | 32.25 | 32.47 | 32.67 | -0.42 / -0.30 |
| 6 | WR-28-10 | 38.91 | 39.12 | 39.26 | -0.35 / -0.14 |
| 7 | WR-34-10 | 38.97 | 39.39 | 39.73 | -0.76 / -0.34 |
| 8 | WR-34-10 | 36.69 | 36.95 | 37.16 | -0.47 / -0.21 |
| 9 | WR-70-16 | 33.33 | 33.52 | 33.82 | -0.49 / -0.30 |
| 10 | WR-70-16 | 33.72 | 33.95 | 34.12 | -0.40 / -0.17 |
| 11 | WR-70-16 | 32.79 | 33.08 | 33.32 | -0.53 / -0.24 |
5.3. MOS Attack Upper Bound
To evaluate the gap between the performance of our adver-
sarial examples and the hypothetical optimal set delineated
in Section 3.3, we conducted an array of APGD attacks on
the CIFAR-10 dataset. Specifically, we implemented 8 separate
APGD attacks, each employing a unique loss function and
accompanied by five restarts. For each image in the dataset,
we identified the single most effective adversarial example
out of the 40 (8 attacks x 5 restarts) created. The ASR was
then calculated based on these examples to serve as an indi-
cator of the maximum achievable performance.
Results. The comparison between the MOS-8 Attack with
K = 1, K = 8, and the upper bound is presented in Table 4.
Generally, the discrepancy is minimal. Even when a single
adversarial example is tailored to address all loss functions
in MOS-8 Attack, near-optimal outcomes are achieved. Ad-
ditionally, leveraging eight adversarial examples brings the
results within a negligible difference from the upper bound,
with less than a 0.35% gap in ASR.
[Figure 3 shows stacked bar charts (percentage per model architecture) of the loss synergistic pattern distributions after removing the pattern containing all losses. (a) CIFAR-10 over R-18, R-50, WR-28-10, WR-34-10, WR-70-16; dominant combinations include 0+1+2+3+6+7, 4+5, and 4+5+6. (b) ImageNet over R-18, R-50, WR-50-2; dominant combinations include 0+1+2+3+6+7, 4+5, and 4+5+6.]
Figure 3. Detailed distribution of loss synergistic patterns across different model architectures. We only retain the top patterns that account for more than 1% of the adversarial examples.
5.4. MOS Attack Analysis.
In this section, we employ our framework to conduct an
automated analysis of the relationships among various loss
functions. The solutions used for analysis is obtained from
MOS-8 Attack with K= 8 for both CIFAR-10 and Im-
ageNet datasets. The parameters selected were a sparsity
coefficient of λ= 1, a binary threshold of T= 0.85, and a
contribution threshold of C= 0.75.
We start by identifying the synergistic patterns among
loss functions for all model architectures within each
dataset. Subsequently, informed by these patterns, we de-
sign the MOS-3* attack, utilizing three selected surrogate
loss functions.
5.4.1 Loss Synergistic Pattern
Figure 2 depicts the synergistic loss patterns for CIFAR-
10 and ImageNet. A significant portion of the adversar-
ial examples—61.3% for CIFAR-10 and 57.8% for Ima-
geNet—contribute to all loss functions, indicating that the
majority of solutions optimize them concurrently. This ob-
servation suggests a low level of conflict among the loss
functions and helps explain why employing a single loss
function (K = 1) can yield near-optimal results.
Table 5. The comparative results of the MOS-3* Attack and the MOS-3 Attack, with reference results from the MOS-8 Attack.

CIFAR-10 (ϵ = 8/255)
| ID | All (1)×8 | MOS-8 (5) | MOS-3 (1) | MOS-3 (3) | MOS-3* (1) | MOS-3* (3) |
| 9 | 33.50 (5) | 33.51 | 31.19 | 31.47 | 33.51 | 33.60 |
| 10 | 33.94 (5) | 33.92 | 31.63 | 31.83 | 33.91 | 33.93 |
| 11 | 33.06 (6) | 33.10 | 30.23 | 30.43 | 33.03 | 33.07 |

ImageNet (ϵ = 4/255)
| 16 | 61.26 (5) | 61.14 | 58.82 | 59.24 | 60.86 | 61.08 |
Transferability of Synergistic Patterns. We extended our
analysis to the transferability of these patterns across differ-
ent model architectures. We removed the common pattern
containing all the losses and plotted the pattern distributions
for each model architecture. As depicted in Figure 3, the
patterns demonstrate stability across datasets and models,
with a minor exception observed in ResNet-50’s patterns for
the CIFAR-10 dataset, which exhibited some unique, less
common patterns.
5.4.2 MOS-3* Attack
The predominant patterns are 0+1+2+3+6+7 and 4+5, as
they ranked first and second in both datasets, as shown in
Figure 3. We subsequently constructed a compact version of the MOS Attack, termed the MOS-3* Attack, using losses 5, 6, and 7. To validate the effectiveness of the MOS-3* Attack, we compared it against the MOS-3 Attack, which is constructed utilizing the first three loss functions.
Results. As illustrated in Table 5, the MOS-3* Attack outperforms the MOS-3 Attack. The MOS-3* Attack achieves better performance across all models with K = 1 adversarial example, surpassing that of the MOS-3 Attack with K = 3 adversarial examples. Moreover, the MOS-3* Attack's performance is comparable to that of the MOS-8 Attack. The above
outcomes confirm the value of leveraging loss synergistic
patterns to design more efficient yet effective attacks.
6. Conclusion
Our work has introduced the MOS Attack, a novel multi-
objective adversarial attack framework that effectively
combines multiple surrogate loss functions to generate
adversarial examples. The MOS-8 Attack, utilizing
eight such functions, has shown superior performance on
CIFAR-10 and ImageNet datasets compared to existing
state-of-the-art methods. The framework’s automated
method for identifying synergistic patterns among loss
functions has led to the development of the efficient MOS-3* tri-objective attack. Our contributions offer a scalable
and extensible approach to adversarial machine learning,
highlighting the potential for more resource-efficient
and potent adversarial attack strategies in the future.
References
[1] Maksym Andriushchenko, Francesco Croce, Nicolas Flam-
marion, and Matthias Hein. Square attack: A query-efficient
black-box adversarial attack via random search. In Computer
Vision - ECCV 2020 - 16th European Conference. Springer,
2020. 2
[2] Nikolaos Antoniou, Efthymios Georgiou, and Alexandros
Potamianos. Alternating objectives generates stronger pgd-
based adversarial attacks. CoRR, 2022. 1,2
[3] Sanjeev Arora, László Babai, Jacques Stern, and Z.
Sweedyk. The hardness of approximate optima in lattices,
codes, and systems of linear equations. J. Comput. Syst. Sci.,
1997. 1
[4] Maximilian Augustin, Alexander Meinke, and Matthias
Hein. Adversarial robustness on in- and out-distribution im-
proves explainability. In Computer Vision - ECCV 2020 -
16th European Conference. Springer, 2020. 6
[5] Alina Elena Baia, Gabriele Di Bari, and Valentina Pog-
gioni. Effective universal unrestricted adversarial attacks us-
ing a MOE approach. In Applications of Evolutionary Com-
putation - 24th International Conference, EvoApplications.
Springer, 2021. 2
[6] Amir Beck and Marc Teboulle. Smoothing and first order
methods: A unified framework. SIAM J. Optim., 2012. 4
[7] Yulong Cao, Chaowei Xiao, Benjamin Cyr, Yimeng Zhou,
Won Park, Sara Rampazzi, Qi Alfred Chen, Kevin Fu, and
Z. Morley Mao. Adversarial sensor attack on lidar-based
perception in autonomous driving. In Proceedings of the
2019 ACM SIGSAC Conference on Computer and Commu-
nications Security, (CCS). ACM, 2019. 1
[8] Nicholas Carlini and David A. Wagner. Towards evaluating
the robustness of neural networks. In 2017 IEEE Symposium
on Security and Privacy, SP. IEEE Computer Society, 2017.
1,2,5
[9] Tianlong Chen, Sijia Liu, Shiyu Chang, Yu Cheng, Lisa
Amini, and Zhangyang Wang. Adversarial robustness:
From self-supervised pre-training to fine-tuning. In 2020
IEEE/CVF Conference on Computer Vision and Pattern
Recognition, CVPR 2020, Seattle, WA, USA, June 13-19,
2020. Computer Vision Foundation / IEEE, 2020. 6
[10] Francesco Croce and Matthias Hein. Minimally distorted
adversarial examples with a fast adaptive boundary attack.
In Proceedings of the 37th International Conference on Ma-
chine Learning, ICML, 2020. 5,6
[11] Francesco Croce and Matthias Hein. Reliable evalua-
tion of adversarial robustness with an ensemble of diverse
parameter-free attacks. In Proceedings of the 37th Inter-
national Conference on Machine Learning, ICML. PMLR,
2020. 1,2,3,4,5,7
[12] Francesco Croce, Maksym Andriushchenko, Vikash Se-
hwag, Edoardo Debenedetti, Nicolas Flammarion, Mung
Chiang, Prateek Mittal, and Matthias Hein. Robustbench: a
standardized adversarial robustness benchmark. In Proceed-
ings of the Neural Information Processing Systems Track on
Datasets and Benchmarks 1, NeurIPS Datasets and Bench-
marks, 2021. 6
[13] Kalyanmoy Deb, Samir Agrawal, Amrit Pratap, and T. Me-
yarivan. A fast and elitist multiobjective genetic algorithm:
NSGA-II. IEEE Trans. Evol. Comput., 2002. 3
[14] Timo M. Deist, Monika Grewal, Frank J. W. M. Dankers,
Tanja Alderliesten, and Peter A. N. Bosman. Multi-objective
learning using HV maximization. In Evolutionary Multi-
Criterion Optimization - 12th International Conference,
EMO. Springer, 2023. 2
[15] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li,
and Li Fei-Fei. Imagenet: A large-scale hierarchical im-
age database. In 2009 IEEE Computer Society Conference
on Computer Vision and Pattern Recognition CVPR. IEEE
Computer Society, 2009. 1,2,6
[16] Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu,
Tong Zhang, and Jun Zhu. Efficient decision-based black-
box adversarial attacks on face recognition. In IEEE Confer-
ence on Computer Vision and Pattern Recognition, (CVPR).
Computer Vision Foundation / IEEE, 2019. 1
[17] Logan Engstrom, Andrew Ilyas, Hadi Salman, Shibani San-
turkar, and Dimitris Tsipras. Robustness (python library),
2019. 6
[18] Qi-An Fu, Yinpeng Dong, Hang Su, Jun Zhu, and Chao
Zhang. Autoda: Automated decision-based iterative adver-
sarial attacks. In 31st USENIX Security Symposium, USENIX
Security. USENIX Association, 2022. 2
[19] Narmin Ghaffari Laleh, Daniel Truhn, Gregory Patrick Veld-
huizen, Tianyu Han, Marko van Treeck, Roman D Buelow,
Rupert Langer, Bastian Dislich, Peter Boor, Volkmar Schulz,
et al. Adversarial attacks and adversarial robustness in com-
putational pathology. Nature communications, 2022. 1
[20] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy.
Explaining and harnessing adversarial examples. In 3rd In-
ternational Conference on Learning Representations, ICLR,
2015. 1,2,5
[21] Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang,
Timothy A. Mann, and Pushmeet Kohli. An alternative sur-
rogate loss for pgd-based adversarial testing. CoRR, 2019. 1,
2,5
[22] Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy A.
Mann, and Pushmeet Kohli. Uncovering the limits of adver-
sarial training against norm-bounded adversarial examples.
CoRR, abs/2010.03593, 2020. 6
[23] Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian
Stimberg, Dan Andrei Calian, and Timothy A. Mann. Im-
proving robustness using generated data. In Advances in
Neural Information Processing Systems 34: Annual Con-
ference on Neural Information Processing Systems 2021,
NeurIPS, 2021. 6
[24] Ping Guo, Cheng Gong, Xi Lin, Zhiyuan Yang, and Qingfu
Zhang. Exploring the adversarial frontier: Quantifying
robustness via adversarial hypervolume. arXiv preprint
arXiv:2403.05100, 2024. 2
[25] Ping Guo, Fei Liu, Xi Lin, Qingchuan Zhao, and Qingfu
Zhang. L-autoda: Large language models for automatically
evolving decision-based adversarial attacks. In Proceedings
of the Genetic and Evolutionary Computation Conference
Companion, GECCO. ACM, 2024. 2
[26] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Deep residual learning for image recognition. In IEEE Con-
ference on Computer Vision and Pattern Recognition, CVPR.
IEEE Computer Society, 2016. 1
[27] Hanxun Huang, Yisen Wang, Sarah M. Erfani, Quanquan
Gu, James Bailey, and Xingjun Ma. Exploring architectural
ingredients of adversarially robust deep neural networks. In
Advances in Neural Information Processing Systems 34: An-
nual Conference on Neural Information Processing Systems
2021, (NeurIPS), 2021. 6
[28] A. Krizhevsky. Learning Multiple Layers of Features from
Tiny Images. Technical report, Univ. Toronto, 2009. 1,2,6
[29] Alexey Kurakin, Ian J. Goodfellow, and Samy Bengio. Ad-
versarial examples in the physical world. In 5th International
Conference on Learning Representations, ICLR, 2017. 5
[30] Hao Li, Tianwen Fu, Jifeng Dai, Hongsheng Li, Gao Huang,
and Xizhou Zhu. Autoloss-zero: Searching loss functions
from scratch for generic tasks. In IEEE/CVF Conference
on Computer Vision and Pattern Recognition, CVPR. IEEE,
2022. 1,2,6
[31] Xi Lin, Yilu Liu, Xiaoyuan Zhang, Fei Liu, Zhenkun Wang,
and Qingfu Zhang. Few for many: Tchebycheff set scalar-
ization for many-objective optimization. CoRR, 2024. 4,5
[32] Xi Lin, Xiaoyuan Zhang, Zhiyuan Yang, Fei Liu, Zhenkun
Wang, and Qingfu Zhang. Smooth tchebycheff scalarization
for multi-objective optimization. In Forty-first International
Conference on Machine Learning, ICML, 2024. 4
[33] Shengcai Liu, Ning Lu, Wenjing Hong, Chao Qian, and Ke
Tang. Effective and imperceptible adversarial textual attack
via multi-objectivization. ACM Trans. Evol. Learn. Optim.,
2024. 2
[34] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt,
Dimitris Tsipras, and Adrian Vladu. Towards deep learning
models resistant to adversarial attacks. In 6th International
Conference on Learning Representations, ICLR, 2018. 1,2,
3,5,6
[35] Kaisa Miettinen. Nonlinear multiobjective optimization.
Springer Science & Business Media, 2012. 3
[36] Rahul Rade and Seyed-Mohsen Moosavi-Dezfooli. Reduc-
ing excessive margin to achieve a better accuracy vs. robust-
ness trade-off. In The Tenth International Conference on
Learning Representations, (ICLR). OpenReview.net, 2022.
6
[37] Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Flo-
rian Stimberg, Olivia Wiles, and Timothy A. Mann. Fixing
data augmentation to improve adversarial robustness. CoRR,
2021. 6
[38] Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. Dy-
namic routing between capsules. In Advances in Neural In-
formation Processing Systems 30: Annual Conference on
Neural Information Processing Systems, 2017. 1
[39] Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish
Kapoor, and Aleksander Madry. Do adversarially robust im-
agenet models transfer better? In Advances in Neural Infor-
mation Processing Systems 33: Annual Conference on Neu-
ral Information Processing Systems 2020, NeurIPS, 2020. 6
[40] Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui
Dai, Chong Xiang, Mung Chiang, and Prateek Mittal. Ro-
bust learning meets generative models: Can proxy distribu-
tions improve adversarial robustness? In The Tenth Inter-
national Conference on Learning Representations, (ICLR).
OpenReview.net, 2022. 6
[41] Gaurang Sriramanan, Sravanti Addepalli, Arya Baburaj, and
Venkatesh Babu R. Guided adversarial attack for evaluating
and enhancing adversarial defenses. In Advances in Neu-
ral Information Processing Systems 33: Annual Conference
on Neural Information Processing Systems 2020, NeurIPS,
2020. 1,2,5
[42] Takahiro Suzuki, Shingo Takeshita, and Satoshi Ono. Adver-
sarial example generation using evolutionary multi-objective
optimization. In IEEE Congress on Evolutionary Computa-
tion, CEC. IEEE, 2019. 2
[43] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan
Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus.
Intriguing properties of neural networks. In 2nd Interna-
tional Conference on Learning Representations, ICLR, 2014.
1,5
[44] Hanrui Wang, Shuo Wang, Cunjian Chen, Massimo
Tistarelli, and Zhe Jin. A multi-task adversarial attack
against face authentication. CoRR, 2024. 2
[45] Rui Wang, Qingfu Zhang, and Tao Zhang. Pareto adaptive
scalarising functions for decomposition based algorithms.
In Evolutionary Multi-Criterion Optimization - 8th Interna-
tional Conference, EMO. Springer, 2015. 3
[46] Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun
Ma, and Quanquan Gu. Improving adversarial robustness re-
quires revisiting misclassified examples. In 8th International
Conference on Learning Representations, ICLR. OpenRe-
view.net, 2020. 1,2,5,6
[47] Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu,
and Shuicheng Yan. Better diffusion models further improve
adversarial training. In International Conference on Machine
Learning, (ICML). PMLR, 2023. 6
[48] Phoenix Neale Williams and Ke Li. Black-box sparse adversarial attack via multi-objective optimisation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR. IEEE, 2023. 2
[49] Eric Wong, Leslie Rice, and J. Zico Kolter. Fast is better than
free: Revisiting adversarial training. In 8th International
Conference on Learning Representations, ICLR. OpenRe-
view.net, 2020. 6
[50] Pengfei Xia, Ziqiang Li, and Bin Li. Tightening the ap-
proximation error of adversarial risk with auto loss func-
tion search. In Proceedings of the Genetic and Evolutionary
Computation Conference Companion, GECCO 2024, Mel-
bourne, VIC, Australia, July 14-18, 2024, 2024. 2,5,6
[51] Keiichiro Yamamura, Haruki Sato, Nariaki Tateiwa, Nozomi
Hata, Toru Mitsutake, Issa Oe, Hiroki Ishikura, and Katsuki
Fujisawa. Diversified adversarial attacks based on conjugate
gradient method. In International Conference on Machine
Learning, ICML. PMLR, 2022. 1,2,3,6
[52] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing,
Laurent El Ghaoui, and Michael I. Jordan. Theoretically
principled trade-off between robustness and accuracy. In
Proceedings of the 36th International Conference on Ma-
chine Learning, ICML. PMLR, 2019. 1
[53] Qingfu Zhang and Hui Li. MOEA/D: A multiobjective evo-
lutionary algorithm based on decomposition. IEEE Trans.
Evol. Comput., 2007. 3