Appl. Sci. 2025, 15, 647 https://doi.org/10.3390/app15020647
Review
Privacy Auditing in Differential Private Machine Learning:
The Current Trends
Ivars Namatevs *, Kaspars Sudars, Arturs Nikulins and Kaspars Ozols
Institute of Electronics and Computer Science, 14 Dzerbenes St., LV-1006 Riga, Latvia;
kaspars.sudars@edi.lv (K.S.); arturs.nikulins@edi.lv (A.N.); kaspars.ozols@edi.lv (K.O.)
* Correspondence: ivars.namatevs@edi.lv
Abstract: Differential privacy has recently gained prominence, especially in the context of private machine learning. While the definition of differential privacy makes it possible to provably limit the amount of information leaked by an algorithm, practical implementations of differentially private algorithms often contain subtle vulnerabilities. Therefore, there is a need for effective methods that can audit differentially private algorithms before they are deployed in the real world. The article examines studies that recommend privacy guarantees for differential private machine learning. It covers a wide range of topics on the subject and provides comprehensive guidance for privacy auditing schemes based on privacy attacks to protect machine-learning models from privacy leakage. Our results contribute to the growing literature on differential privacy in the realm of privacy auditing and beyond and pave the way for future research in the field of privacy-preserving models.
Keywords: differential privacy; differential private machine learning; differential privacy auditing; privacy attacks
1. Introduction
In today's data-driven world, more and more researchers and data scientists are using machine learning to develop better models or more innovative solutions for a better future. These models often tend to use sensitive (e.g., health-related personal data and proprietary data) [1] or private data (e.g., personally identifiable information, such as age, name, and user input data), which can lead to privacy issues [2]. When using data containing sensitive information, the individual's right to privacy must be respected, both from an ethical and a legal perspective [3]. The functionality of privacy modeling for the privacy landscape ranges from descriptive queries to training large machine-learning (ML) models with millions of parameters [4]. Moreover, deep-learning algorithms, a subset of ML, can analyze and process large amounts of data collected from different users or devices to detect unusual patterns [5]. On the other hand, ML systems are exposed to several serious vulnerabilities. This logically leads to the consideration that trained ML models are vulnerable to privacy attacks. Therefore, it is crucial for the practical application of ML models and algorithms to protect the privacy of input datasets, training data, or data that must be kept secret during inference.
Numerous works have shown that data and parameters of ML models can leak sensitive information about their training, for example, in statistical modeling [6–9]. There
are several causes of data leakage, such as overfitting and influence [10], model architecture [11], or memorization [12]. If your personal or important data are used to train an ML model, you may want to ensure that an intruder cannot steal your data. To measure and reduce the likelihood of sensitive data leakage, there are various mitigation and protection strategies.
A robust framework for protecting sensitive data in statistical databases, especially through mechanisms such as noise addition and gradient clipping, is the proven mathematical framework called differential privacy (DP) [2,13]. The core idea of DP is to add noise to the data or model parameters to obscure an individual's influence on a data release [14], where the unit of privacy characterizes what you are trying to protect. More precisely, DP provably guarantees that an attacker is not able to reliably predict whether or not a particular individual is included in the dataset. Consequently, such an approach can provide a strong privacy guarantee for individuals. In this context, DP is a powerful privacy-preserving tool that prevents sensitive information about an individual from being revealed in a variety of ML models and allows privacy leakage from the training data to be analyzed.
The integration of DP methods into ML models makes them more robust to privacy attacks and paves the way for differential private machine learning (DPML) [15–18]. In this regard, ML models using DP algorithms can guarantee that each user's contribution to the dataset does not result in a significantly different model [19]. However, combining the accuracy of ML models with DP's strong privacy guarantees and ease of decentralization [20] comes at a price [21], especially when aiming for a low privacy parameter [22]. For example, models trained with the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm [23] show a significant decrease in accuracy compared to non-DP models [24,25]. The main reason for this could be that the privacy analysis of existing DPML methods and algorithms (e.g., DP-SGD) is overly cautious in real-world scenarios.
Ensuring privacy in DPML raises the following key questions: How can we guarantee the privacy of the model? Does our model reveal private information? What level of differential privacy does an algorithm satisfy? Answering these questions is crucial, because overestimating the privacy guarantee leads to a decrease in the accuracy of the model, while underestimating it leads to privacy leakage [26].
To prevent privacy leakage from ML models, we use a DP framework that adds a calculated amount of noise or randomness to hide each individual's contribution to the data, thus reducing the risk of privacy leakage from small changes in a dataset [27]. A common approach is to add noise to the data during the training process [28]. The process of determining how to add noise is called a mechanism in the context of DP and can be influenced by several factors, including the specific noise distribution (e.g., the Laplacian and Gaussian mechanisms), the desired level of privacy, and the type of query. DP can also facilitate effective data-partitioning strategies when sensitive information is distributed across multiple datasets or partitions. By ensuring that each portion adheres to DP standards, organizations can analyze aggregated data without compromising individual privacy. These strategies are used, for example, when data cannot be centralized due to privacy concerns (e.g., federated learning) [29]. There are other approaches where noise is added to the inputs, outputs, ground-truth labels, or even to the whole model [30]. As a result, the algorithm can still learn from the data and make accurate predictions. Adding noise provides a strong worst-case privacy guarantee for ML algorithms [31].
Moreover, gradient clipping is a crucial technique in the DP context that is used when training ML models. It helps to ensure that the contribution of individual data points to the model's gradients remains bounded, improving privacy guarantees while preserving the performance of the model. The purpose of gradient clipping is twofold. First, bounding the gradients reduces the sensitivity of the model output to individual training examples, which is essential for ensuring DP. Second, gradient clipping helps prevent overfitting by avoiding extreme updates that could lead to the memorization of specific data points. DP is a formalization stating that a query should not reveal whether an individual is present in the training dataset. It should be noted that there are recent approaches in which ML models are trained non-privately and their predictions are de-noised before being released to satisfy DP [32]. This means that DP gives the user a quantitative guarantee of how distinguishable an individual's information can be to a potential attacker.
Differential privacy [2,33] ensures that running the algorithm on two adjacent datasets, D and D′, which differ in one data point, results in two approximately equal output distributions. The privacy level is often characterized by the privacy parameters (also known as the privacy risk): ϵ, i.e., the privacy loss, and δ, i.e., the probability of deviation from the privacy guarantee. Together, these parameters form a mathematical framework for quantifying privacy and allow fine-tuning of the privacy level to balance data utility and privacy concerns. Choosing appropriate privacy parameters is challenging but crucial, as weak parameters can lead to excessive privacy leakage, while strong parameters can compromise the utility of the model [34]. A small ϵ ensures that an attacker cannot reliably distinguish whether the algorithm has processed D or D′; that is, it provides strong privacy but less accuracy. Meanwhile, a large ϵ provides weaker privacy guarantees [35,36]. This parameter controls the trade-off between privacy and utility. Since there are no guidelines on how to set the right amount of ϵ and δ in practice, this can be a challenging process. Even when implemented correctly, there are several known cases where published DP algorithms with miscalculated privacy guarantees incorrectly report a higher level of privacy [37,38]. In order to provide the expected privacy guarantees for a DPML model, privacy auditing must be used.
Privacy auditing—the process of testing privacy guarantees—relies on multiple model training runs in different privacy configurations to effectively detect privacy leakage [39,40]. There are many reasons why one would want to audit the privacy guarantees of a differentially private algorithm. First, if we audit and the audited value of ϵ is greater than the (claimed) upper bound, the privacy proof is false, and there is an error or bug in our algorithm [34,41]. Second, if we audit and the audited value of ϵ matches the claimed bound, then we can say that our privacy proof is a tight privacy estimate, or tight auditing, and our privacy model does not need to be improved [42]. Tight auditing refers to the process of empirically estimating the privacy level of a DP algorithm in a way that closely matches its theoretical privacy guarantees. The goal is to obtain an accurate estimate of the actual privacy provided by the algorithm when applied to real-world data. Existing auditing scenarios for DP suffer from the limitation that they provide narrow estimates under implausible worst-case assumptions and that they require thousands or millions of training runs to produce non-trivial statistical estimates of privacy leakage [43]. Third, if we are unable to rigorously prove how private our model is, then auditing provides a heuristic measure of how private it is [44].
In practice, differential privacy auditing [45–50] of ML modeling has been proposed to empirically measure and analyze the privacy leakage through the DPML algorithm. To investigate and audit the privacy of data and models, you must first apply a specific type of attack, called a privacy attack, to a DP algorithm and then perform an analysis, for example, a statistical calculation. To evaluate data leakage in ML, we categorize privacy attacks into membership inference attacks [51–53], data-poisoning attacks [54,55], model extraction attacks [56,57], model inversion attacks [58,59], and property inference attacks [60]. In addition, assumptions must be made about the attacker's knowledge and ability to access the model in either black-box or white-box settings. Finally, the attacker's success is converted into an estimate using an attack evaluation procedure. The privacy attack, together with the privacy assessment, forms an auditing scheme. For example, most auditing schemes [46,47,61,62] have been developed for centralized settings.
Motivation for the research. The aim of this review paper is to provide a comprehensive and clear overview of the privacy auditing schemes proposed in the context of differential private machine learning. The following aspects are considered:
• The implementation of differential privacy in consumer use cases makes greater privacy awareness necessary, thus raising both data-related and technical concerns. As a result, privacy auditors are looking for scalable, transparent, and powerful auditing methods that enable accurate privacy assessment under realistic conditions.
• Auditing methods and algorithms have been researched and proven effective for DPML models. In general, auditing methods can be categorized according to privacy attacks. However, the implementation of sophisticated privacy auditing requires a comprehensive privacy-auditing methodology.
• Existing privacy-auditing techniques are not yet well adapted to specific tasks and models, as there is no clear consensus on the privacy loss parameters to be chosen, such as ϵ, algorithmic vulnerabilities, and complexity issues. Therefore, there is an urgent need for effective auditing schemes that can provide empirical guarantees for privacy loss.
Contributions. This paper provides a comprehensive summary of privacy attacks and violations with practical auditing procedures for each attack or violation. The main contributions can be summarized as follows:
• We systematically present types and techniques of privacy attacks in the context of differential privacy machine-learning modeling. Recent research on privacy attacks for privacy auditing is categorized into five main categories: membership inference attacks, data-poisoning attacks, model inversion attacks, model extraction attacks, and property inference attacks.
• A structured literature review of existing approaches to privacy auditing in differential privacy is conducted with examples from influential research papers. The comprehensive process of proving auditing schemes is presented. An in-depth analysis of auditing schemes is provided, along with an abridged description of the papers.
The rest of this article is organized as follows: The following section provides an overview of the relevant background on the theoretical foundations of differential privacy, including its mathematical definitions and basic properties. Section 3 describes the types of privacy attacks on ML models before evaluating privacy leakage in the context of differential privacy. Section 4 presents various privacy auditing schemes based on privacy attacks and privacy violations, along with some influential paper examples. Section 5 discusses the findings and outlines future research trends.
2. Preliminaries
In this section, the brief theoretical and mathematical foundations of differential privacy are presented.
2.1. Differential Privacy Fundamentals
An individual's privacy is closely related to intuitive notions of privacy, such as the privacy unit of a data release and the privacy loss [63]. The privacy unit (e.g., a person) quantifies how much influence a person can have on the dataset. The privacy loss quantifies how recognizable the data release is. The formalization of differential privacy is defined in relation to the privacy unit and the privacy loss. The DP framework can ensure that the insertion or deletion of a record in a dataset has no significant effect on the query results, thus ensuring privacy.
To satisfy DP, a random function called a "mechanism" is used. Any function can be a mechanism as long as we can mathematically prove that the function satisfies the given definition of differential privacy. The relevant definitions, proofs, and theorems are presented below.
DP mechanism: DP relies on rigorous mathematical proofs to ensure privacy guarantees. These foundations help us to understand the behavior of DP models and determine the privacy loss [64,65]. DP is defined in terms of the privacy unit (input) and privacy loss (output) of a randomized function. The description of this function that satisfies DP is called the mechanism [66].
Adjacent datasets: Two datasets, D and D′, are adjacent if D′ differs from D by the change of a single individual's record. To determine whether your data analysis is a DP data analysis, you must describe the data transformations, i.e., each function that maps a dataset to a dataset. For example, if you are using functions to help you understand your data, the properties or statistics you are computing are statistical queries.
Unbounded and bounded DP [67,68]: If the dataset size is not known, you are operating under unbounded DP (i.e., the set of possible datasets can be of any size). In contrast, if the dataset size is known, you are operating under bounded DP (i.e., the set of possible datasets has a known size).
Pure DP [2]: In the original definition of DP, a mechanism M satisfies ϵ-DP if, for all pairs of adjacent datasets D and D′ differing by one individual, and for all possible sets of outputs S of the algorithm, the following holds:

\Pr[M(D) \in S] \leq e^{\epsilon} \Pr[M(D') \in S], (1)

where Pr denotes probability, ϵ is the privacy budget (also known as a privacy risk or a privacy loss parameter) representing the degree of privacy protection, and e^{\epsilon} bounds the amount of information leakage, i.e., the maximum difference between the outcomes of the two transformations.
Approximate DP: In approximate DP, a small failure probability, δ, is added to pure DP to relax the constraint:

\Pr[M(D) \in S] \leq e^{\epsilon} \Pr[M(D') \in S] + \delta. (2)

This makes it easier to design practical algorithms that retain meaningful privacy guarantees with higher utility, especially when the dataset is large. If δ = 0, we recover the stricter notion of pure ϵ-differential privacy.
The privacy loss [36,69]: Let M be a mechanism and D and D′ adjacent datasets; the privacy loss for a given output, o, is

\mathcal{L}_{M(D)\,\|\,M(D')}(o) = \ln\!\left(\frac{\Pr[M(D) = o]}{\Pr[M(D') = o]}\right). (3)

The privacy loss quantifies how sure a potential attacker can be, based on the odds ratio of the two possibilities. In this way, the distance between the output distributions for a given o can be measured. In other words, the pair of output distributions provides the distinguishability of the mechanisms. If the loss is zero, the probabilities match and the attacker has no advantage. If the loss is positive, the attacker chooses dataset D. If the loss is negative, the attacker chooses dataset D′. If the loss magnitude is large, there is a privacy violation.
Hypothesis test interpretation of DP [70]: DP can be interpreted as a hypothesis test with the null hypothesis that M was trained on D and the alternative hypothesis that M was trained on D′. A false-positive result (type-I error) is the probability of rejecting the null hypothesis when it is true, while a false-negative result (type-II error) is the probability of failing to reject the null hypothesis when the alternative hypothesis is true. For example, Kairouz et al. (2015) characterised (ϵ, δ)-DP in terms of the false-positive rate (FPR) and false-negative rate (FNR) that can be achieved by an acceptance region. This characterisation enables the estimation of the privacy parameter as follows:

\epsilon \geq \ln\!\left(\frac{1 - \delta - \mathrm{FPR}}{\mathrm{FNR}}\right), \quad \epsilon \geq \ln\!\left(\frac{1 - \delta - \mathrm{FNR}}{\mathrm{FPR}}\right). (4)
Furthermore, from the hypothesis testing perspective [66] (Balle et al., 2020), the attacker can be viewed as solving the following hypothesis testing problem given the output of either M(D) or M(D′):
H0: The underlying dataset is D.
H1: The underlying dataset is D′.
In other words, for a fixed type-I error, α, the attacker tries to find a rejection rule that minimises the type-II error, β.
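This hypothesis-testing view translates directly into an empirical privacy estimate: given an attack's observed FPR and FNR, Equation (4) yields a bound on ϵ. A minimal sketch, assuming point estimates of the error rates (no confidence intervals) and an illustrative δ, is shown below.

import numpy as np

def empirical_epsilon(fpr, fnr, delta=1e-5):
    """Empirical epsilon implied by an attacker's FPR/FNR, following the
    hypothesis-testing characterisation in Equation (4) (illustrative sketch)."""
    candidates = [0.0]  # epsilon can never be negative
    if fnr > 0 and (1 - delta - fpr) > 0:
        candidates.append(np.log((1 - delta - fpr) / fnr))
    if fpr > 0 and (1 - delta - fnr) > 0:
        candidates.append(np.log((1 - delta - fnr) / fpr))
    return max(candidates)

# Hypothetical attack performance: 5% false positives, 60% false negatives.
print(empirical_epsilon(fpr=0.05, fnr=0.60))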
Private prediction interface [71]: A prediction interface M is (ϵ, δ)-differentially private if, for any interactive query-generating algorithm Q, the output of the interaction of Q with M on a model θ is (ϵ, δ)-differentially private with respect to θ, where this output denotes the sequence of queries and responses generated in the interaction of Q and M on model θ.
Rényi DP (RDP) [72]: This notion extends the standard concept of DP by allowing for a continuum of privacy levels based on the Rényi divergence. In RDP, a randomized mechanism M satisfies (α, ϵ)-RDP if, for all neighbouring datasets D and D′, the Rényi divergence of order α between the distributions of the outputs of the algorithm on D and D′ is bounded by ϵ:

D_{\alpha}\big(M(D) \,\|\, M(D')\big) \leq \epsilon. (5)
The global sensitivity of a function [66]: The sensitivity, Δf, of a function f is the maximum absolute distance between the scalar outputs f(D) and f(D′) over all possible adjacent datasets D and D′:

\Delta f = \max_{D, D'} \lvert f(D) - f(D') \rvert. (6)

If the query dimension of the function is greater than one, the sensitivity of f is the maximum difference, measured in a suitable norm, between the values that f may take on a pair of adjacent datasets.
Differential privacy mechanisms. One way to achieve ϵ-DP and (ϵ, δ)-DP is to add noise sampled from the Laplace and Gaussian distributions, respectively, where the noise is proportional to the sensitivity of the mechanism. In general, there are three main mechanisms for adding noise to data used in DP, namely the Laplace mechanism [2,73,74], the Gaussian mechanism [66,75,76], and the exponential mechanism [13]. It should be noted that the Laplace mechanism provides ϵ-DP and focuses on tasks that return numeric results. The mechanism achieves privacy via output perturbation, i.e., modifying the output with Laplace noise. The Laplace distribution has two adjustable parameters, its centre and its width, b. The Gaussian mechanism yields (ϵ, δ)-DP. Considering the proximity to the original data, the Laplace mechanism would be a better choice than the Gaussian mechanism, which has a more relaxed definition. It should be noted that the exponential mechanism is usually used for non-numerical data and performs tasks with categorical outputs. When ϵ is small, the transformation tends to be private. The exponential mechanism is used to privately select the best-scoring response from a set of candidates. The mechanism associates a score with each candidate, r, via a scoring function, u(D, r).
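As an illustration of the Laplace mechanism, the sketch below adds Laplace noise with scale Δf/ϵ to a counting query; the true count, sensitivity, and ϵ are illustrative values.

import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon,
                      rng=np.random.default_rng()):
    """Release a noisy answer satisfying epsilon-DP by adding Laplace noise
    with scale sensitivity / epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# A counting query ("how many records satisfy a predicate?") has sensitivity 1.
noisy_count = laplace_mechanism(true_value=42, sensitivity=1.0, epsilon=0.5)
print(noisy_count)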
Upper bound and lower bound [77]: A DP algorithm is accompanied by a mathematical proof that gives an upper bound for the privacy parameters ϵ and δ. In contrast, a privacy audit provides a lower bound for the privacy parameters.
2.2. Differential Privacy Composition
Three core properties of differential privacy are defined for the development of suitable algorithms that preserve privacy and fulfil data protection guarantees. They play a central role in understanding the net privacy cost of a combination of DP algorithms. A crucial property of differential privacy is the composition of differentially private queries [50,78], which bounds the overall privacy guarantee.
Sequential composition [2,79] is the most fundamental property, in which a series of queries are computed and released in a single batch. It limits the total privacy cost of obtaining multiple results from DP mechanisms applied to the same input data. Suppose a sequence of randomized algorithms, M_1, ..., M_k, is performed on the same given dataset, D, where each M_i satisfies ϵ_i-DP; then the combined mechanism M(D) = (M_1(D), ..., M_k(D)) satisfies (ϵ_1 + ... + ϵ_k)-DP.
Parallel composition [13] is a special case of DP composition in which different queries are applied to disjoint subsets of the dataset. If M satisfies ϵ-DP and the dataset D is divided into disjoint parts D_1, ..., D_k, then the mechanism that releases all results, M(D_1), ..., M(D_k), satisfies ϵ-DP. In this case, the privacy loss is not the sum of all ϵ_i but rather max_i ϵ_i.
Postprocessing immunity [80] means that you can apply transformations (either deterministic or random) to the DP release and know that the result is still differentially private. If M satisfies ϵ-DP, then for any randomised or deterministic function g, the composition g(M(D)) satisfies ϵ-DP. Postprocessing immunity guarantees that the output of a DP mechanism can be used arbitrarily without additional privacy leakage [81]. Since postprocessing of DP outputs does not decrease privacy [61], we can choose a summary function that preserves as much information about M(D) as possible.
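These composition rules can be summarized in a few lines of budget-accounting code. The sketch below assumes pure ϵ-DP and ignores advanced composition theorems; the per-query budgets are illustrative.

def sequential_budget(epsilons):
    # Queries on the SAME dataset: the budgets add up.
    return sum(epsilons)

def parallel_budget(epsilons):
    # Queries on DISJOINT partitions: the largest budget dominates.
    return max(epsilons)

per_query = [0.2, 0.3, 0.5]
print("sequential:", sequential_budget(per_query))  # 1.0
print("parallel:  ", parallel_budget(per_query))    # 0.5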
2.3. Centralized and Local Models of Differential Privacy
The two main common models [82,83] for ensuring data privacy, each with different applications and mechanisms, are the central model and the local model. In differential privacy, the user data are noised either at the data center after the clients' data have been received or locally by each user of the data.
The classic centralized version of DP requires a trusted curator who is responsible for adding noise to the data before distributing or analyzing them. In centralized differential privacy, the data and model are collocated, and the noise is added to the original dataset after it has been aggregated in a trusted data center. A major problem with the centralized differential privacy setting is that users still need to trust a central authority, namely the administrator of the dataset, to maintain their privacy [84]. There is also the risk of a hostile curator [85].
In the local differential privacy model, the data are made differentially private before they come under the control of the curator of the dataset [86]. Noise is added directly to the user's data [87] before they are transmitted to the data center for further processing [88]. In the trade-off between privacy and accuracy, both the centralized and local paradigms of DP can reduce the overall accuracy of the converged model due to the randomization of information shared by users [85].
Another taxonomic approach is to distinguish between single-party learning and multi-party learning [89]. Single-party learning means that the training data of each data owner are stored in a central place. In multi-party learning, on the other hand, there are several data owners who each keep their own datasets locally and are often unable to exchange raw data for data protection reasons.
2.4. Noise Injection
In the ML pipeline, there are multiple stages at which we can insert noise (perturbation) to achieve DP: (1) on the training data (the input level), (2) during training, (3) on the trained model, or (4) on the model outputs [90].
Input perturbation. At the input level, we distinguish between central and local settings [91]. Input perturbation is the simplest method to ensure that an algorithm satisfies DP. It refers to the introduction of random noise into the data (into the input of the algorithm) itself. If a dataset is D = {x_1, ..., x_n} and each record x_i is a d-dimensional vector, then a differentially private record can be denoted as x̃_i = x_i + n_i, where n_i is a random d-dimensional noise vector.
Output perturbation. Another common approach is output perturbation [92], which obtains DP by adding random noise to the intermediate output or the final model output [50]. By intermediate output, we mean the middle layers of the neural network, while the final model output implies the optimal weights obtained by minimizing the loss function. A differentially private layer can be denoted as h̃ = h + n, where h represents the hidden layers in a neural network and n is the added noise.
Objective perturbation. In objective perturbation, random noise is introduced into the underlying objective function of the machine-learning algorithm [50]. As the gradient is dependent on the privacy-sensitive data, randomization is introduced at each step of the gradient descent. We can imagine that the utility of the model changes slightly when the noise is added to the objective function. A differentially private objective function can be represented as L̃(θ) = L(θ) + n⊤θ, where n is a random noise vector.
Gradient perturbation. In gradient perturbation [19], noise is introduced into the gradients during the training process while solving for the optimal model parameters using gradient descent methods. The differentially private gradient descent update can be written as θ_{t+1} = θ_t − η(∇L(θ_t) + λθ_t + n), where λ is a regularization parameter and η is the learning rate.
Each option provides privacy protection at a different stage of the ML development process, with privacy protection being weakest when DP is introduced at the prediction level and strongest when it is introduced at the input level. Keeping the input data private in this way means that any model trained on those data will also have DP guarantees. If you introduce DP during training, only that particular model will have DP guarantees. DP at the prediction level means that only the model's predictions are protected, but the model itself is not differentially private. Note that if perturbations (noise) are added to data to protect privacy, the magnitude of this noise is often controlled using norms (also called scaling perturbations with norms).
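Gradient perturbation is most commonly instantiated as DP-SGD-style training. The following minimal sketch shows one such update step with per-example clipping and Gaussian noise; the learning rate, clip norm, and noise multiplier are illustrative, and no privacy accounting is performed.

import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=np.random.default_rng()):
    """One gradient-perturbation step in the spirit of DP-SGD: clip each
    per-example gradient, sum, add Gaussian noise, and average."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean

params = np.zeros(3)
grads = [np.array([0.5, -2.0, 1.0]), np.array([3.0, 0.1, -0.4])]
params = dp_sgd_step(params, grads)
print(params)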
3. Privacy Attacks
In this section, privacy attacks relevant to differential privacy auditing are presented, and their classification is proposed for evaluating the privacy guarantees of DP mechanisms and algorithms.
3.1. Overview
Privacy attacks on ML models target sensitive assets: the training data, the model, its architecture, its parameters, and/or the hyperparameters. They can take place either in the training phase or in the inference phase. During model training, the attacker attempts to infer information or actively modify the training process or the model. While privacy attacks provide a qualitative assessment of DP, they do not provide quantitative privacy guarantees, nor do they detect exact differential privacy violations with respect to the desired ϵ [84].
Based on a comprehensive overview of the current state of the art in privacy-related attacks and the proposed threat models in DP, the different dimensions of attacks for DPML auditing typically include (1) black-box versus white-box attacks (also known as settings), (2) the type of attack, and (3) the centralized or local DP setting. There is an extensive body of literature tailoring attacks to specific ML problems [93–95].
3.2. White-Box vs. Black-Box Attacks
Depending on the capability and knowledge of the attacker and the analysis of potential information leaks in DPML models, attacks are generally divided into two main areas: black-box and white-box attacks (also known as settings) [84,95]. If the attacker has full access to the target model, including its architecture, training algorithm, model parameters, hyperparameters, gradients, and data distribution, as well as outputs and inputs, we speak of white-box attacks. On the other hand, if an attacker evaluates the privacy guarantees of differentially private mechanisms without accessing their internal workings and only has access to the output of the model for arbitrary inputs, it is a black-box attack [26,48]. In this type of attack, the attacker can only query the target model and obtain the outputs, typically confidence scores or class labels.
White-box privacy attacks [96,97], in the context of DP, involve scenarios where the attacker has full access to the model parameters, architecture, and training data. The attacker can exploit this detailed knowledge to create an attack model that predicts whether specific records were part of the training dataset based on internal model behavior. Usually, the attacker tries to identify the most vulnerable space (e.g., the feature space) of the target model by using the available information and modifying the inputs. They may also analyze gradients, loss values, or intermediate activations to derive insights about leaked information. Wu et al. [96], for example, focus on implementing a white-box scenario where the attacker has full access to the weights and outputs of the target model. If this is the case, we can speak of a strong capability of the attacker over the model. Steinke et al. [48] implement white-box auditing in their auditing framework, where the attacker has access to all intermediate values.
Black-box privacy attacks [84,95] involve scenarios in which the attacker has limited access to an ML model, typically only being able to observe and retrieve its output for specific inputs. This means that the attacker has no knowledge of the internals of the target model. In this scenario, the vulnerabilities of the model are identified using information about past input/output pairs. Most black-box attacks require the presence of a predictor [84]. In black-box environments, only the predicted confidence probabilities or hard labels are available [97]. In privacy auditing through black-box access, the attacker only sees the final model weights or can only query the model [96]. Black-box attacks are also used to detect privacy violations that exploit vulnerabilities in differential privacy algorithms [26].
Grey-box privacy attacks in DPML represent a middle ground between white-box and black-box attacks, where the attacker has limited access to the model's internals. This type of attack occurs when an attacker has some knowledge of the model, such as access to specific parameters or layers, but not to the complete internal workings.
Attacks on collaborative learning assume access to the model parameters or the gradients during training or deal with attacks during inference. In cases where the attacker has partial access, these are called grey-box attacks (partial white-box attacks) [98,99]. We consider attackers to be active if they interfere with the training in any way. On the other hand, if the attackers do not interfere with the training process and try to derive knowledge after the training, they are considered passive attackers. It is important to add here that most work assumes that the expected input is fully known, although some preprocessing may be required.
3.3. Type of Attacks
In the ML context, the attacker attempts to gain access to the model (e.g., to its parameters), intends to violate the privacy of the individuals in the training data, or performs an attack on the dataset used for inference and model evaluation. In this study, privacy attack techniques are categorized into five types: membership inference attacks, data-poisoning attacks, model inversion attacks, model extraction attacks, and property inference attacks [46,62]. The most general form corresponding to the assessment of a data leakage is membership inference: the inference of whether a particular data point was part of a model's training set or not [52]. Far more powerful attacks, such as model inversion (e.g., attribute inference [100] or data extraction [101,102]), aim to recover partial or even complete training samples by interacting with an ML model. In the ML context, inference attacks that aim to infer private information from data analysis tasks are a significant threat to privacy-preserving data analysis.
3.3.1. Membership Inference Attack
A membership inference attack (MIA; also known as a training data extraction attack) is used to determine whether a particular data point is part of the training dataset or not [10,51,52,103,104]. In other words, an MIA tries to measure how much information about the training data leaks through the model [34,90,105]. The success rate of these attacks is influenced by various factors, such as data properties, model characteristics, attack strategies, and the knowledge of the attacker [106]. This type of attack is based on the attacker's knowledge in both white-box and black-box settings [96]. Earlier works on MIAs use average-case success metrics, such as the attacker's accuracy in guessing the membership of each sample in the evaluation set [52]. MIAs typically consist of four main types:
• White-box membership inference.
• Black-box membership inference.
• Label-only membership inference.
• Transfer membership inference.
In white-box MIAs, the attacker (also known as a full-knowledge attacker) [84,107] has access to the internal parameters of the target model (e.g., gradients and weights), along with some additional information [10]. For example, the attacker has access to the internal weights of the model and thus to the activation values of each layer [108]. The goal of the attacker is to use the model parameters and gradients to identify differences in how the model processes training and non-training data. The main techniques are gradient-based approaches [41,109], which examine the gradients for target data points to infer whether a specific data point was part of the training dataset, since the gradients for training data often differ from those for unseen data; and activation analysis [61], which exploits the fact that activations for training data may differ from activations for non-training data in certain layers, which can be used as a signal to detect membership. In addition, when applying white-box MIAs to an ML model, it may also be possible to analyse internal gaps of the model and its use of features by exploiting the internal parameters and hidden layers, as these often reveal training data [96].
In the black-box setting, the attacker simply queries the ML model with input data and observes the output, which can be either confidence-based scores, hard labels, or class probabilities. The attacker exploits the differences in the ML model's behavior between training data and unseen data, often leveraging high confidence scores for training data points as a signal of membership. In a black-box setting, this type of attack is carried out through techniques such as shadow model training [90] or confidence-based score analysis [110].
By training a series of shadow models—local surrogate models—the attacker obtains an attack model that can infer the membership of a particular record in the training dataset [111]. This is performed by training a model that has the same architecture as the target model but uses its own data samples to approximate the training set of the target model. The attack only requires the exploitation of the model's output prediction vector and is well feasible against supervised machine-learning models. Instead of creating many shadow models, some approaches [112] use only one to model the loss or logit distributions for members and non-members.
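The following minimal sketch illustrates the shadow-model idea with a single shadow model and a simple confidence-based attack classifier; the synthetic dataset, model choices, and feature set are illustrative assumptions rather than any specific published attack.

# Minimal shadow-model sketch (illustrative): train one shadow model on data
# assumed to follow the target's distribution, then train an attack classifier
# on confidence features of members vs. non-members. The attack classifier is
# later applied to the target model's outputs to guess membership.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_in, y_in = X[:1000], y[:1000]            # shadow "members"
X_out, y_out = X[1000:2000], y[1000:2000]  # shadow "non-members"

shadow = RandomForestClassifier(random_state=0).fit(X_in, y_in)

def attack_features(model, X, y):
    probs = model.predict_proba(X)
    return np.column_stack([probs.max(axis=1), probs[np.arange(len(y)), y]])

feats = np.vstack([attack_features(shadow, X_in, y_in),
                   attack_features(shadow, X_out, y_out)])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])  # 1 = member
attack_model = LogisticRegression().fit(feats, labels)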
In confidence score analysis (also known as confidence-based attacks), the attacker analyzes the confidence scores returned by the model by comparing the confidence on the trained samples with that on the untrained samples (unseen data). The attacker has access to the labels and prediction vectors and thus obtains confidence scores (probabilities) for the queried input. This approach is mainly investigated in works such as [52,113]. Carlini et al. [103] use a confidence-based analysis whose performance is maximized at low false-positive rates (FPRs). To improve performance at low FPRs, Tramèr et al. [49] introduce data poisoning during training. However, these attacks can be computationally expensive, especially when used together with shadow models to simulate the behavior of the target models [114].
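A minimal sketch of a global-threshold, loss/confidence-based attack in the spirit of this line of work is shown below; the per-sample losses are synthetic stand-ins and the threshold τ is an illustrative choice, not a calibrated one.

# Minimal loss-threshold membership inference sketch (illustrative): flag a
# sample as a member if the model's loss on it falls below a threshold tau.
import numpy as np

def loss_threshold_mia(losses, tau):
    return (losses < tau).astype(int)   # 1 = predicted member

rng = np.random.default_rng(0)
member_losses = rng.exponential(0.2, size=500)      # members: small losses
nonmember_losses = rng.exponential(1.0, size=500)   # non-members: larger losses

losses = np.concatenate([member_losses, nonmember_losses])
truth = np.concatenate([np.ones(500), np.zeros(500)])
guesses = loss_threshold_mia(losses, tau=0.5)
print("TPR:", guesses[truth == 1].mean(), "FPR:", guesses[truth == 0].mean())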
In label-only MIAs, the attacker only has access to the model's predicted labels to determine whether a specific data sample was part of the model's training set. The attacker uses only the predicted labels to infer membership under input perturbation [20,115], often by leveraging inconsistencies in the model's predictions between training and non-training data. Standard label-only MIAs often require a high number of queries to assess the distance of the target sample from the model's decision boundary, making them less effective [54,116]. There are two main techniques in label-only MIAs: adaptive querying, where the inputs are slightly modified to see if the model changes the label, which could indicate that the data were part of the training set; and meta-classification, which means that a secondary model is trained to distinguish between the labels of training and non-training data to infer membership.
Transfer membership inference [117] covers the case where direct access to the target model is restricted. The attacker can train an independent model with a similar dataset or use publicly available models that have been trained on similar data. The attacker's goal is to train a local model that approximates the behavior of the target model and to use this local model to launch MIAs. There are two main techniques: model approximation [118,119], which means that the attacker approximates the decision boundary of the target model and uses black-box settings to infer membership via the surrogate model; and adversarial examples [120], meaning that the attacker generates adversarial examples that behave differently for training and non-training points to improve the accuracy of membership inference.
MIAs are widely researched in the field of ML and could serve as a basis for stronger attacks or be used to audit different types of privacy leaks. Most MIAs follow a common scheme to quantify the information leakage of ML algorithms over training data. For example, Ye et al. [113] compare different strategies for selecting loss thresholds. Yeom et al. [10] compare the use of MIAs for privacy testing with the use of a global threshold, τ, for all samples. Later, attack threshold calibration was introduced to improve the attack threshold, as some samples are more difficult to learn than others [22,103]. Another approach to MIA is defined by an indistinguishability game between a challenger and an adversary (i.e., the privacy auditor) [105]. The adversary tries to find out whether a particular data point or a sample of data points was part of the training dataset used to train a particular model. A list of privacy attacks is shown in Table 1.
Table 1. A list of alternative attacks to evaluate the privacy guarantees (attack, impacted DPML stage, type of attack, and attack techniques).

Membership inference (impacted stage: training data)
• White-box membership inference attack:
  - Gradient-based approaches: exploiting gradients to determine whether specific data points were part of the training dataset.
  - Activation analysis: exploiting the activations for training data based on the assumption that they differ in certain layers from the activations for non-training data.
• Black-box membership inference attack:
  - Training shadow models: creating and training a set of models that mimic the behavior of the targeted model.
  - Confidence score analysis: constructing and analyzing confidence scores or confidence intervals.
• Label-only membership inference attack:
  - Adaptive querying: modifying the inputs to answer queries that are individually selected, where each query depends on the answer to the previous query when the model changes the label.
  - Meta-classification: training a secondary model to distinguish between the labels of training and non-training data.
• Transfer membership inference attack:
  - Model approximation: using approximation algorithms to test the decision boundaries of the target model.
  - Adversarial examples: using adversarial techniques to evaluate privacy guarantees.

Data poisoning (impacted stage: training phase/model, data)
• Gradient manipulation attack: the gradients are intentionally altered during the model training process.
• Targeted label flipping: label modification of certain data points in the training data without changing the data themselves.
• Backdoor poisoning: inserting a specific trigger or "backdoor".
• Data injection: injecting malicious data samples that are designed to disrupt the model's training.
• Adaptive querying and poisoning: injecting a slightly modified version of data points and analyzing how these changes affect label predictions.

Model inversion (impacted stage: model)
• White-box inversion attacks: the attacker uses detailed insights into the model's structure and parameters (e.g., model weights or gradients) to recover private training data.
• Black-box inversion attacks: the attacker iteratively queries the model and uses the outputs to infer sensitive information without access to the model's internals.
• Inferring sensitive attributes from the model: balancing the privacy budget for sensitive and non-sensitive attributes.
• Gradient-based inversion attacks: the attacker tries to recover private training data from shared gradients.

Model extraction (impacted stage: model)
• Adaptive Query-Flooding Parameter Duplication (QPD) attack: allows the attacker to infer model information with black-box access and no prior knowledge of model parameters or training data.
• Equation-solving attack: targets regression models by adding high-dimensional Gaussian noise to model coefficients.
• Membership-based property inference: combines membership inference with property inference, targeting specific subpopulations with unique features.
3.3.2. Data-Poisoning Attack
In data-poisoning attacks, malicious data are injected into the training set in order to influence the behavior of the model. These attacks, whether untargeted (random) or targeted [54,121], are a form of undermining the functionality of the model. Common approaches either reduce the accuracy of the model (random) or manipulate the model into outputting a label specified by the attacker (targeted), in order to reduce the performance of the model or cause targeted misclassification or misprediction. If the attacker tries to elicit a specific behavior from the model, the attack is called targeted. A non-targeted attack, on the other hand, means that the attacker is trying to disrupt the overall functionality of the model. Targeted or non-targeted poisoning attacks can include both model poisoning and data-poisoning attacks. The impact of a poisoning attack, for example, causes the classifier to change its decision boundary and achieve the attacker's goal of violating privacy [121].
In the context of DP, during data-poisoning attacks [55,122], the attacker manipulates and falsifies the model at training time or during the inference time of the model by injecting adversarial examples into the training dataset [54]. In this way, the behavior of the model is manipulated, and meaningful information is extracted. Poisoning attacks are not only limited to training data points; they also target model weights. Among these threats, data poisoning stands out due to its potential to manipulate and undermine the integrity of AI-driven systems. It is worth noting that this type of attack is not directly related to data privacy but still poses a threat to ML modeling [123]. In model poisoning, targeted model poisoning aims to misclassify selected inputs without modifying them. This is achieved by manipulating the training process. Data-poisoning attacks are relevant for DP auditing as they can expose potential vulnerabilities in privacy-preserving models. Data-poisoning attacks typically consist of five main types (as shown in Table 1):
Gradient manipulation attacks: In gradient manipulation attacks, especially gradient inversion attacks [96,124], the attacker manipulates the gradient update by injecting false or poisoned gradients that either distort the decision boundary of the model or lead to overfitting. These attacks allow the attacker to reconstruct private training data from shared gradients and undermine the privacy guarantees of ML models. This approach also aims to investigate whether gradient clipping and noise addition can effectively protect against excessive influence of individual gradients.
Targeted label flipping: This involves the modification of the labels of certain data points in the training data without changing the data themselves, especially those in sensitive classes [125]. The attacker then checks whether this modified information can be recovered from the model.
Influence limiting: To assess how DP mechanisms limit the influence of any single data point, poisoned records are inserted into the training data to see if their impact on the model predictions and accuracy can be detected [126].
Backdoor poisoning attacks: This type of attack aims to insert a specific trigger or "backdoor" [127–129] that later manipulates the behavior of the model when it is activated in the testing or deployment phase. If the model is influenced by the backdoor pattern in a way that compromises individual data points, this may indicate vulnerability to targeted privacy risks. These types of attacks are often evaluated against specific target perturbation learners. The attacker intentionally disrupts some training samples to change the parameter distribution [130]. Backdoor attacks were originally developed for image datasets [129]. In the original backdoor attacks, the backdoor patterns are fixed [131], e.g., a small group of pixels in the corner of an image. More recent backdoor attacks can be dynamic [132] or semantic [133]. Backdoor attacks are a popular approach to poisoning ML classification models. DP can help prevent backdoor attacks by ensuring that the training process of the model includes noise addition or privacy amplification.
Data injection: In this type of attack, an attacker injects malicious data samples that are designed to disrupt the model's training [134]. This differs from backdoor attacks in that it may not involve a specific trigger pattern but simply serves to corrupt the model's decision making. Adding random, noisy samples to the training set can skew the model's weights, leading to suboptimal performance.
3.3.3. Model Inversion Attack
These attacks exploit the released model to predict sensitive attributes of individuals using available background information. Existing DP mechanisms struggle to prevent these attacks while preserving the utility of the model [108,135,136].
The model inversion attack is a technique in which the attacker attempts to recover the training dataset from learned parameters. For example, Zhang et al. [58] use model inversion attacks to reconstruct training images from a neural network-based image recognition model. In these attacks, the attackers use the released model to infer sensitive attributes of individuals in the training data or the outputs of DPML models. These attacks allow attackers to infer sensitive attributes of individuals by exploiting the outputs of the model [29,135,136].
The idea of model inversion [137] is to invert a given pre-trained model, f, in order to recover a private dataset, D, such as images, texts, or graphs. The attacker who attempts to use the model inversion attack [28,138–140] queries the model with different inputs and observes the outputs. By comparing the outputs for different inputs, the attacker identifies and recognizes patterns. By testing each feature, the attacker can consequently infer the patterns in the original training data, resulting in a data leakage.
A common approach for this type of attack is to reconstruct the input data from the confidence score vectors predicted by the target model [100]. The attacker trains a separate attack model on an auxiliary dataset that acts as the inverse of the target model [141]. The attack model takes the confidence score vectors of the target model as input and tries to output the original data of the target model [113]. Formally, let f be the target model and g be the attack model. Given a data record x, the attacker inputs x into f and receives f(x), then feeds f(x) into g and receives g(f(x)), which is expected to approximate x; that is, the output of the attack model should be very similar to the original input of the target model. These attacks can be categorized as follows:
• Learning-based methods.
• White-box inversion attacks.
• Black-box inversion attacks.
• Gradient-based inversion attacks.
Learning-based methods: These methods can reconstruct diverse data for different training samples within each class. Recent advances have improved their accuracy by regularizing the training process with semantic loss functions and introducing counterexamples to increase the diversity of class-related features [142].
White-box inversion attacks: In such attacks, the attackers have full access to the structure and parameters of the model. Auditors use white-box inversion as a worst-case scenario or use the model's parameters [143] to assess whether the DP mechanisms preserve privacy even under highly privileged access.
Black-box inversion attacks: In such attacks, the attackers only have access to the output labels of the model [144] or obtain the confidence vectors. Black-box attacks simulate typical model usage scenarios, allowing auditors to assess how much information leakage occurs purely through interactions with the model's interface.
Gradient-based inversion attacks (also known as input recovery from gradients): The attacker accesses gradients shared during the training rounds (especially in federated learning) and uses these to infer details about the training data [145]. Auditors use these attacks to check whether sensitive information is adequately masked, particularly in collaborative and decentralized working environments.
Traditional DP mechanisms often fail to prevent model inversion attacks while maintaining model utility. Model inversion attacks are a significant challenge for DP mechanisms, especially for regression models and graph neural networks (GNNs). These attacks allow attackers to infer sensitive attributes of individuals by exploiting the released model and some background information [135,136].
Model inversion enables an attacker to fully reconstruct private training samples [97]. For visual tasks, the model inversion attack is formulated as an optimization problem. The attack uses a trained classifier to extract representations of the training data. A successful model inversion attack generates diverse and realistic samples that accurately describe each class of the original private dataset.
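To illustrate the optimization view, the sketch below performs a white-box inversion of a hypothetical linear-softmax classifier by gradient ascent on a class log-probability; the weights W and b are random placeholders and the procedure is only a minimal caricature of the attacks cited above.

# Minimal white-box model inversion sketch (illustrative): reconstruct a
# representative input for a chosen class, assuming the attacker knows the
# weights of a linear-softmax model.
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 784)), rng.normal(size=10)   # hypothetical model

def invert(target_class, steps=200, lr=0.5):
    x = np.zeros(784)
    for _ in range(steps):
        logits = W @ x + b
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Gradient of log p_c w.r.t. x for a linear-softmax model: W_c - p @ W.
        x += lr * (W[target_class] - p @ W)
    return x

reconstruction = invert(target_class=3)
print(reconstruction[:5])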
3.3.4. Model Extraction Attack
Model extraction attacks (also known as reconstruction attacks) aim to steal the functionality of, replicate, and expose sensitive information from well-trained ML models [147]. The attacker can approximate or replicate a target model by sending numerous queries to infer model parameters or hyperparameters and observing its responses. In model extraction attacks [95,148,149], in the context of DP auditing, attackers attempt to derive a victim model by extensively querying model parameters or training data in order to train a surrogate model. The attacker learns a model to try to extract information and possibly fully reconstruct a target model by creating a new duplicate model that behaves very similarly to the attacked model. This type of attack only targets the model itself and not the training data.
This type of attack can be categorized into two classes [150]: (i) accuracy extraction attacks, which focus on replacing the target model; and (ii) fidelity extraction attacks, which aim to closely match the predictions of the target model.
This threat can increase the privacy risk, as a successful model extraction can enable a subsequent threat, such as model inversion. There are two approaches to creating a surrogate model [62,148]. First, the surrogate model fits the target model on a set of input points that are not necessarily related to the learning task. Second, a surrogate model is created that matches the accuracy of the target model on a test set that is related to the learning task and comes from the distribution of the input data.
Model extraction attacks pose a significant security threat to ML models, especially those provided via cloud services or public APIs. In these attacks, an attacker repeatedly queries a target model to train a surrogate model that mimics the functionality of the target model.
3.3.5. Property Inference Aacks
Property inference aacks (also called distribution inference) [151–154] aim to infer
global, aggregate properties of the training data used in machine-learning models, rather
than details of individual data points. These sensitive properties are often based on ratios,
such as the ratio of male to female records in a dataset. The aack aempts to understand
the statistical information of a training dataset from an ML model. In contrast to privacy
aacks that focus on individuals in a training dataset (e.g., membership inference), PIAs
aim to extract population-level features from trained ML models [60]. Existing property
inference aacks can be broadly categorized as follows:
The aacker aacks the training dataset and aempts to leak sensitive statistical in-
formation related to the dataset or a subset of the training dataset, such as specic
Appl. Sci. 2025, 15, 647 16 of 57
attributes, which can have a significant impact on the privacy of the model [151]. The attacker can also exploit the model's ability to memorize explicit and implicit properties of the training data [155]. This can be achieved by poisoning a subset of the training data to increase the information leakage [153]. Alternatively, an attacker can maliciously control a portion of the training data to increase the information leakage; this can lead to a significant increase in the effectiveness of the attack and is then referred to as a property inference poisoning attack [156].
Differential privacy (DP) auditing uses property inference attacks to test whether the DP mechanisms are robust against the leakage of information about specific properties, features, or patterns within the dataset.
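The following sketch (a simplified illustration; the chosen property, the shadow-model count, and the use of model parameters as features are our assumptions) shows the common shadow-model recipe for property inference: train shadow models on datasets with different values of a global property (here, the class ratio) and fit a meta-classifier on their parameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def train_shadow(class_ratio):
    """Train one shadow model on data whose global property is the class ratio."""
    X, y = make_classification(n_samples=1000, n_features=8,
                               weights=[class_ratio, 1 - class_ratio],
                               random_state=int(rng.integers(1_000_000)))
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    # Use the flattened parameters of the shadow model as the meta-feature vector.
    return np.concatenate([clf.coef_.ravel(), clf.intercept_])

# Property 0: balanced data (ratio ~0.5); property 1: skewed data (ratio ~0.8).
features, labels = [], []
for _ in range(100):
    features.append(train_shadow(0.5)); labels.append(0)
    features.append(train_shadow(0.8)); labels.append(1)

meta = LogisticRegression(max_iter=1000).fit(np.array(features), np.array(labels))

# Audit step: query the meta-classifier with the (DP-trained) target model's parameters.
target_features = train_shadow(0.8)          # stand-in for the real target model
print("inferred property:", meta.predict([target_features])[0])
```

A DP auditor would check whether the meta-classifier's accuracy on the protected target model is close to random guessing, i.e., whether the DP mechanism hides the global property.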
4. Privacy Auditing Schemes
In this section, the schemes used for differential privacy auditing are presented.
4.1. Privacy Auditing in Differential Privacy
Testing and evaluating DPML models is important to ensure that they effectively protect privacy while retaining their utility. Since differential privacy always involves a trade-off between privacy and utility, evaluating the privacy of the model helps in choosing an appropriate privacy budget: one that is high enough to ensure sufficient accuracy and low enough to ensure acceptable privacy. Which level of accuracy and/or privacy is sufficient depends on the application of the model [130]. To address security-critical privacy issues [157] and to detect potential privacy violations and biases, privacy auditing procedures can be implemented to empirically evaluate DPML.
Privacy auditing is a set of techniques for empirically verifying the privacy leakage of an algorithm to ensure that it fulfills the defined privacy guarantees and standards. Privacy auditing of DP models [45–47,158,159] aims to ensure that privacy-preserving mechanisms are effective and reliable and that they provide the promised privacy guarantees of DPML models and algorithms. For example, one approach uses a probabilistic automaton model to track privacy leakage bounds [160]; others use canaries to audit ML algorithms [161], efficiently detect (ε, δ)-violations [48], estimate the privacy loss during a single training run [118], or express privacy leakage in terms of Bayesian posterior belief bounds [34]. To ensure robust privacy auditing in differential privacy (DP), it is important to follow key steps that rigorously attack the machine-learning model and verify its privacy guarantees:
Define the scope of the privacy audit: Establish the objectives and purposes of the audit. This includes determining which specific mechanisms or algorithms are to be evaluated and which privacy guarantees are relevant for the audit. A clear delineation of the privacy guarantees expected from the DP model (differential privacy mechanisms), the definition of data protection requirements tailored to the sensitivity of the data, compliance with standards, and the justification of the privacy parameter ε (upper bound) are required. For example, the authors of [162] describe a privacy auditing pipeline that is divided into two components: the attacker scheme and the auditing scheme.
Perform privacy attacks and implement vulnerabilities: Implement privacy attacks (e.g., membership inference, model extraction, and model inversion) to evaluate the robustness of the DP mechanism or DPML algorithm. The aim is to verify that the DP mechanism provides robust privacy guarantees that effectively limit the amount of information that can be inferred about individual data points, regardless of a potential attacker's strategy or the configuration of the dataset. For example, simulate black-box or white-box membership inference [163,164] to assess the impact of model access on privacy leakage. This
gives us the opportunity to test the resilience of the model and to measure the success/failure rates of these attacks.
Analyze and interpret the audit results: The final step is to empirically estimate the privacy leakage from a DPML model, denoted as the empirical epsilon ε_emp, and compare it with the theoretical privacy budget ε [81]. An important goal of this process is the assessment of the tightness of the privacy budget. The audit is considered tight if the empirical estimate ε_emp closely approaches the theoretical ε. Such an approach can be used to effectively validate DP implementations in the model or to detect DP violations when ε_emp exceeds ε [165–167].
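As a concrete illustration of this last step, the sketch below (our own, not taken from a specific cited paper) converts the outcome of a distinguishing attack into an empirical epsilon lower bound using the standard (ε, δ) hypothesis-testing relation and Clopper–Pearson confidence intervals; the function name, significance level, and example counts are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def empirical_epsilon_lower_bound(fp, tn, fn, tp, delta=1e-5, alpha=0.05):
    """Estimate a lower bound on epsilon from an attack's confusion counts.

    fp/tn: errors/successes on non-members; fn/tp: errors/successes on members.
    Uses one-sided Clopper-Pearson upper confidence bounds on FPR and FNR, then
    applies eps >= log((1 - delta - FPR) / FNR) and the symmetric form.
    """
    n_neg, n_pos = fp + tn, fn + tp
    # One-sided upper confidence bounds for the two error rates.
    fpr_hi = stats.beta.ppf(1 - alpha, fp + 1, n_neg - fp) if fp < n_neg else 1.0
    fnr_hi = stats.beta.ppf(1 - alpha, fn + 1, n_pos - fn) if fn < n_pos else 1.0
    candidates = []
    if (1 - delta - fpr_hi) > 0:
        candidates.append(np.log((1 - delta - fpr_hi) / fnr_hi))
    if (1 - delta - fnr_hi) > 0:
        candidates.append(np.log((1 - delta - fnr_hi) / fpr_hi))
    return max(candidates) if candidates else 0.0

# Example: an attack with 9000/10,000 correct member guesses and 500 false positives.
print(empirical_epsilon_lower_bound(fp=500, tn=9500, fn=1000, tp=9000))
```

The audit then compares this empirical lower bound with the theoretical ε claimed for the training algorithm.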
4.2. Privacy Auditing Techniques
Privacy auditing techniques in differential privacy are essential to ensure that privacy guarantees are met in practical implementations. Empirical auditing techniques establish practical lower bounds on privacy leakage, complementing the theoretical upper bounds provided by DP [32]. Before we address privacy auditing schemes, it is necessary to explain the main auditing techniques (Birhane et al., 2024) that have been used to evaluate the effectiveness of DP mechanisms and algorithms against privacy attacks in ML models.
Canary-based audits: Canary-based auditing is a technique for assessing the privacy guarantees of DPML algorithms by introducing specially designed examples, known as canaries, into the dataset [43,161]. The auditor then tests whether these canaries are reflected in the outputs of the model and distinguishes between models trained with different numbers of canaries. An effective DP mechanism should limit the sensitivity of the model to the presence of these canaries, minimizing the privacy risk. Canaries must be carefully designed to ensure that they can detect potential privacy leaks without jeopardizing the overall privacy guarantees. Canary-based auditing often requires dealing with randomized datasets, which enables the development of randomized canaries. Lifted Differential Privacy (LiDP) [161], introduced by distinguishing between models trained with different numbers of canaries, can leverage statistical tests and novel confidence intervals to improve sample complexity. There are several canary strategies: (1) a random sample from the dataset distribution with a false label, (2) the use of an empty sample, (3) an adversarial sample, and (4) the canary-crafting approach [42]. A disadvantage of canaries is that an attacker must have access to the underlying dataset and knowledge of the domain and model architecture.
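A minimal canary-style check might look as follows (an illustrative sketch; the mislabeled-canary strategy corresponds to option (1) above, while the non-private stand-in model, dataset, and confidence threshold are assumptions made for the example).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=2000, n_features=20, random_state=2)

# Craft canaries: random in-distribution points with deliberately random labels (strategy 1).
canary_X = rng.normal(size=(50, X.shape[1]))
canary_y = rng.integers(0, 2, size=50)

# Randomly include half of the canaries in the training data.
included = rng.random(50) < 0.5
train_X = np.vstack([X, canary_X[included]])
train_y = np.concatenate([y, canary_y[included]])

# Stand-in for a (DP-)training procedure; in a real audit this would be DP-SGD or similar.
model = LogisticRegression(max_iter=1000).fit(train_X, train_y)

# Membership guess: a canary is flagged as "included" if the model is confident in its label.
conf = model.predict_proba(canary_X)[np.arange(50), canary_y]
guess = conf > np.median(conf)
accuracy = np.mean(guess == included)
print(f"canary membership-guessing accuracy: {accuracy:.2f} (0.5 = no leakage)")
```

Guessing accuracy close to 0.5 indicates that the mechanism hides the canaries well; accuracy well above 0.5 translates, via the hypothesis-testing bounds discussed above, into a lower bound on the privacy leakage.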
Statistical auditing: In this context, statistical methods are used to empirically evaluate privacy guarantees [168]. These include influence-based attacks and improved privacy search methods that can be used to detect privacy violations and understand information leakage in datasets, thus greatly improving the auditing performance for various models, such as logistic regression and random forest [62].
Statistical hypothesis testing interpretation: The aim of this approach is to find the optimal trade-off between type I and type II errors [169]. This means that no test can effectively determine whether a specific individual's data are included in a dataset, ensuring that high power and high significance are not attainable simultaneously [170,171]. This interpretation is used to derive theoretical upper bounds, is very useful for deriving tight compositions [172], and has even motivated a new relaxed notion of DP called f-DP [76].
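Concretely, the hypothesis-testing view of (ε, δ)-DP constrains the type I error α and type II error β of any membership test; the inequalities below state this standard constraint and the epsilon lower bound that auditors derive from it (notation is ours, consistent with the definitions used throughout this review).

```latex
\alpha + e^{\varepsilon}\beta \ \ge\ 1 - \delta,
\qquad
\beta + e^{\varepsilon}\alpha \ \ge\ 1 - \delta
\quad\Longrightarrow\quad
\varepsilon \ \ge\ \max\!\left\{
  \log\frac{1-\delta-\alpha}{\beta},\;
  \log\frac{1-\delta-\beta}{\alpha}
\right\}.
```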
Single training run auditing (also known as one-shot auditing): This enables privacy auditing during a single training run, eliminating the need for multiple retraining sessions. The technique utilizes the parallelism of independently adding or removing multiple training examples and enables meaningful empirical privacy estimates with only one training run of the model [43]. The technique is efficient and requires no prior knowledge of the model architecture or DP algorithm. This method is particularly useful
in FL settings and provides accurate estimates of the privacy loss under the Gaussian mechanism [118].
Empirical privacy estimation: In this technique, the actual privacy loss of an algorithm is evaluated through practical experiments rather than theoretical guarantees [173]. This technique is used to audit implementations of DP mechanisms or claims about models trained with DP [42]. It is also useful for estimating the privacy loss in cases where a tight analytical upper bound on ϵ is unknown.
Post hoc privacy auditing: This technique traditionally establishes a set of lower bounds for the privacy loss (e.g., thresholds). However, it requires sharing intermediate model updates and data with the auditor, which can lead to high computational costs [174].
Worst-case privacy check: In the context of differential privacy, worst-case privacy auditing [32,102] refers to the specific data points or records in a dataset that, if added to, removed from, or altered in the dataset, could potentially have the greatest impact on the output of a differentially private mechanism. Essentially, these are the most "sensitive" records, for which the privacy guarantee is most at risk.
4.3. Privacy Audits
When we focus on auditing DPML models, we first need to know whether we have enough access to information to perform a white-box audit or a black-box audit. White-box auditing can be difficult to perform at scale in practice, as the algorithm to be audited needs to be significantly modified, which is not always possible [109]. Nevertheless, auditing DPML models in a white-box environment requires minimal assumptions about the algorithms [43]. In contrast, black-box audits are more realistic in practice, as the attacker can only observe the final trained model.
Privacy auditing schemes empirically evaluate the privacy leakage of a target ML model, or of its algorithm trained with DP [42,46,47,62,96]. Such schemes use the DP definition a priori to formalize and quantify the privacy leakage [175]. Currently, most auditing techniques are based on simulating different types of attacks [114] to determine a lower bound on the privacy loss of an ML model or algorithm [62]. Privacy auditing can be performed using different attacker schemes (processes), which can be broadly categorized as follows (Table 2):
• Membership inference auditing.
• Poisoning auditing.
• Model inversion auditing.
• Model extraction auditing.
• Property inference auditing.
In summary, privacy auditing schemes leverage various techniques to strike a balance between privacy, data utility, and auditing efficiency. A comprehensive privacy auditing methodology and the corresponding privacy guarantees, with references, can be found in Appendix A. We review the most important works, starting with the use of black-box and white-box privacy attacks.
4.3.1. Differential Privacy Auditing Using Membership Inference
Membership inference audits: These audits test the resilience of the model against membership inference attacks, where an attacker tries to determine whether certain data points were included in the training set. The auditor performs MIAs to estimate how much information about individual records may have been leaked. This category is divided into the following subcategories:
• Black-box membership inference auditing: This approach relies solely on assessing the privacy guarantees of machine-learning models by evaluating their vulnerability to membership inference attacks (MIAs) without accessing the internal workings of the model.
Song et al. [176] examine how robust training, including a differential privacy mechanism, affects the vulnerability to black-box MIAs. The success of MIAs is measured using metrics such as attack accuracy and the relationship between model overfitting and privacy leakage. The work also investigates MIAs under adversarial robustness and differential privacy conditions and shows that DP models remain vulnerable under black-box conditions. Carlini et al. [103] present a DP audit method related to black-box threshold MIAs by proposing a first-principles approach. The authors introduce the likelihood ratio attack (LiRA), which analyzes the most vulnerable points in the model predictions. The authors question the use of existing methodologies that rely on average-case accuracy metrics to evaluate empirical privacy, as these do not adequately capture an attacker's ability to identify the actual members of the training dataset. Instead, they propose to measure the attacker's ability to infer membership of a dataset using the true-positive rate (TPR) at very low false-positive rates (FPR) (e.g., <0.1%), and they offer a strategy for maximizing this metric. The authors conclude that even a powerful DP mechanism can sometimes be vulnerable to carefully constructed black-box accesses. Lu et al. [173] present Eureka, a novel method for estimating relative DP guarantees in black-box settings, which defines a mechanism's privacy with respect to a specific input set. At its core, Eureka uses a hypothesis testing technique to empirically estimate the privacy loss parameters. By computing outputs on adjacent datasets, the potential leakage, and thus the degree of the privacy guarantee, is determined. The authors use classifier-based MIAs to audit (ε, δ)-DP algorithms. They demonstrate that Eureka achieves tight accuracy bounds in estimating privacy parameters with relatively low computational cost for large output spaces.
Kazmi et al. [175] present a black-box privacy auditing method for target ML models based on an MIA that uses both training data (i.e., "members", true positives) and generated data not included in the training dataset (i.e., non-members, true negatives). This method leverages membership inference as the primary tool to audit datasets used in the training of ML models without retraining them (ensembled membership auditing (EMA)); EMA aggregates the membership scores of individual data points using statistical tests. The method, which the authors call PANORAMIA, quantifies privacy leakage for large-scale ML models without controlling the training process or retraining the model. Koskela et al. [177] use the total variation distance (TVD), a statistical measure that quantifies the difference between two probability distributions, between the output distributions of a model when trained on two neighboring datasets. The authors suggest that the TV distance can serve as a robust indicator of the privacy guarantee when examining the outputs of a DP mechanism. The auditing process compares how much the output distributions generated from adjacent datasets differ from each other in order to approximate the privacy parameters. The TV distance is directly related to the privacy parameter ε and provides a tangible way to evaluate privacy loss. The auditing process utilizes a small hold-out dataset that has not been exposed during training. Their approach allows for the use of an arbitrary hockey-stick divergence to measure the distance between the score distributions of audit training and test samples. This work fits well with FL scenarios.
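As a rough illustration of distance-based auditing (this is not the exact hockey-stick procedure of [177]; the pure ε-DP relation TV ≤ 1 − e^(−ε) and the histogram binning are simplifying assumptions), one can lower-bound ε from the empirical total variation distance between member and non-member score distributions.

```python
import numpy as np

def tv_epsilon_lower_bound(member_scores, nonmember_scores, bins=50):
    """Estimate the TV distance between two score distributions and the implied
    pure-DP lower bound eps >= -log(1 - TV)."""
    lo = min(member_scores.min(), nonmember_scores.min())
    hi = max(member_scores.max(), nonmember_scores.max())
    p, _ = np.histogram(member_scores, bins=bins, range=(lo, hi))
    q, _ = np.histogram(nonmember_scores, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    tv = 0.5 * np.abs(p - q).sum()
    return tv, -np.log(max(1.0 - tv, 1e-12))

# Toy example: slightly shifted score distributions for members vs. non-members.
rng = np.random.default_rng(3)
tv, eps = tv_epsilon_lower_bound(rng.normal(0.3, 1.0, 10_000), rng.normal(0.0, 1.0, 10_000))
print(f"empirical TV = {tv:.3f}, implied epsilon lower bound = {eps:.3f}")
```

In practice, finite-sample TV estimates are biased upward, so an audit would pair such estimates with confidence intervals before reporting a violation.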
White-box membership inference auditing: White-box audits leverage full access to the internal parameters of a model, including gradients and weights. They are often used in corporate ML research, where the internal parameters of the model are available, allowing a detailed analysis of DP efficiency.
Leino and Fredrikson [107] propose a calibrated white-box membership inference attack and evaluate the resulting privacy risk, leveraging the intermediate
representations of the target model. The work investigates how MIAs exploit the tendency of deep networks to memorize specific data points, leading to overfitting. They linearly approximate each layer, launch a separate attack on each layer, and combine the outputs of the layer-wise attacks against the target model trained with DP-SGD. The high-precision calibration ensures that the attack can confidently identify whether a data point was part of the training set. Chen et al. [178] evaluate a differentially private convolutional neural network (CNN) and a Lasso regression model, with and without sparsity, using an MIA on high-dimensional training data, using genomic data as an example. They show that, in contrast to the non-private setting, model sparsity can improve the accuracy of the differentially private model. By applying a regularization technique (e.g., Lasso), the study demonstrates that sparsity can complement DP efforts.
There are seminal works that use both white-box and black-box settings. Nasr et al. [47] extended the study of Jagielski et al. [46] on empirical privacy estimation techniques by analyzing DP-SGD through an increasing series of attacks, from black-box membership inference to white-box poisoning attacks. They were the first to audit DP-SGD tightly. To do so, they use attacker-crafted datasets and active white-box attacks that insert canary gradients into the intermediate steps of DP-SGD. Tramèr et al. [49] propose a method for auditing a backpropagation clipping algorithm (a modification of the DP-SGD algorithm), assuming that it works in black-box or white-box settings. The goal is to empirically evaluate how often the outputs of the mechanism on two neighboring datasets are distinguishable. The auditor's task with the MIA is to maximize the FPR/TPR ratio to assess the strength of the privacy mechanism. Nasr et al. [42] follow up on their earlier work (Nasr et al., 2021) and design an improved auditing scheme for testing DP implementations in black-box and white-box settings for DP-SGD with gradient canaries or input-space canaries. This method provides a tight privacy estimation that significantly reduces the computational cost by leveraging tight composition theorems for DP. The authors audit each individual step of the DP-SGD algorithm; the per-step lower bounds are then composed over all training steps into a guarantee for the overall ε, yielding an understanding of the privacy of the end-to-end algorithm.
Shadow model auditing: Shadow model membership auditing is a technique used to assess the privacy of machine-learning models by replicating the target model with models trained on similar datasets. Shadow models allow the auditor to infer information about the target model's training data without direct access to it. In this method, multiple shadow models are created that mimic the behavior of the target model, so that the auditor can infer membership status based on the outputs of the shadow models. The primary purpose of using shadow models is to facilitate MIAs that determine whether specific data points were part of the training dataset.
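A compact version of this recipe is sketched below (the model types, the single attack feature per example, and the small number of shadow models are assumptions; real audits use many shadow models and richer attack features).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X, y = make_classification(n_samples=6000, n_features=15, random_state=4)

def attack_features(model, Xs, ys):
    """Attack feature: the confidence the model assigns to the true label."""
    proba = model.predict_proba(Xs)
    return proba[np.arange(len(ys)), ys].reshape(-1, 1)

# Train several shadow models on random halves of the shadow data; members = training half.
feat, member = [], []
for i in range(8):
    idx = rng.choice(len(X), size=2000, replace=False)
    tr, te = idx[:1000], idx[1000:]
    shadow = RandomForestClassifier(n_estimators=50, random_state=i).fit(X[tr], y[tr])
    feat.append(attack_features(shadow, X[tr], y[tr])); member.append(np.ones(1000))
    feat.append(attack_features(shadow, X[te], y[te])); member.append(np.zeros(1000))

attack = LogisticRegression().fit(np.vstack(feat), np.concatenate(member))

# Apply to the target model: here a stand-in model; in an audit, the DP-trained model.
tgt_idx = rng.choice(len(X), size=1000, replace=False)
target = RandomForestClassifier(n_estimators=50, random_state=99).fit(X[tgt_idx], y[tgt_idx])
scores = attack.predict_proba(attack_features(target, X, y))[:, 1]
print("mean membership score, members vs. others:",
      scores[tgt_idx].mean(), np.delete(scores, tgt_idx).mean())
```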
The groundbreaking work of Shokri et al. [52] evaluates the membership inference attack in a black-box environment in which the attacker only has access to the target model via queries. The attack algorithm is based on the concept of shadow models. The attacker trains shadow models that are similar to the target model and uses these shadow models to train a membership inference model. The MIA is modeled as a binary classification task for an attack model that is trained using the predictions of the shadow models on the attacker's dataset. Salem et al. [112] utilize data augmentation to create shadow models and analyze privacy leakage. Their work provides insights into the impact of data transformations on inference accuracy in both black-box and white-box settings.
Yeom et al. [10] investigate overfitting in ML model auditing using a threshold membership inference attack as the primary method and an attribute inference attack based on the distinction between training and testing per-instance losses. The authors provide an upper bound on the success of MIA as a function of the DP parameters. By training shadow models, the authors demonstrate how models that memorize training data are more
susceptible to MIAs, especially when DP techniques are not optimally applied. They conclude that overfitting is sufficient for an attacker to perform MIA. Sablayrolles et al. [22] focus on the comparison of black-box attacks and white-box attacks by effectively estimating the model loss for a data point. The authors use the shadow model technique to demonstrate MIAs across architectures and training methods. They also investigate Bayes-optimal strategies for MIAs that leverage knowledge of model parameters in white-box settings. Their findings suggest that white-box attacks do not require specific information about model weights and losses but can still be performed effectively using probabilistic assumptions, and that optimal attacks depend on the loss function; thus, black-box attacks are as good as white-box attacks. The authors introduce the Inverse Hessian attack (IHA), which utilizes model parameters to enhance the effectiveness of membership inference. By computing inverse-Hessian vector products, these attacks can exploit the sensitivity of model outputs to specific training examples.
Label-only membership auditing: Label-only membership inference auditing in differentially private machine learning is a privacy assessment method in which an auditor attempts to deduce whether a particular data point was part of the training dataset based on the model's predicted labels (without access to probabilities or other model details). This form of auditing is particularly relevant for real-world scenarios.
Malek et al. [179] adapt a heuristic method to evaluate label differential privacy (Label DP) in different configurations where the privacy of the labels associated with training examples is preserved, while the features may be publicly accessible. The authors propose two primary approaches: Private Aggregation of Teacher Ensembles (PATE) and Additive Laplace with Iterative Bayesian Inference (ALIBI). They apply noise exclusively to the labels in the training data, leading to the development of different label-DP mechanisms, investigate model accuracy, and estimate lower bounds for the privacy parameter values. They train several models with and without a training point, while the rest of the training set remains unchanged. Choquette-Choo et al. [180] focus on a black-box attack in which only the labels of the model, and not the full probability distribution, are available to the attackers. They investigate privacy leakage in four private prediction algorithms: PATE, CaPC, PromptPATE, and Private-KNN. The authors show that DP provides the strongest protection against privacy violations in both the average-case and worst-case scenarios, including when the model is trained with overconfidence. However, this may come at the expense of the model's test accuracy. The authors show that an effective defense against label-only MIAs involves DP and strong regularization, which significantly reduces the leakage of private information.
Single-run membership auditing: This is a technique that uses a single execution of the audit process.
Steinke et al. [43] propose a novel auditing scheme that uses only a single training run of the model and can be evaluated using a one-time model output, making audits feasible in practical applications. The authors apply their auditing scheme specifically to the DP-SGD algorithm. After training, auditors select a set of canary data points (auditing examples) and apply MIA thresholds and model parameter tuning to maximize audit assurance from a single model output. The attack estimates the sensitivity of the model to individual data points, which is an indication of the privacy risk of the model. By adjusting the MIA parameters and interpreting the model's response to the canary points, the auditor approximates the empirical privacy loss, ε_emp. This empirical estimate provides information on how closely the model's practical privacy matches the theoretical DP guarantees. Their analysis uses the parallelism of adding or removing multiple data points (training examples) independently in a single training run of the algorithm, together with statistical generalization arguments. This auditing scheme requires minimal assumptions about the underlying algorithm, making it applicable in both black-box and white-box settings. Andrew et al. [118] propose a novel one-shot auditing
framework that enables efficient auditing during a single training run without a priori knowledge of the model architecture, task, or DP training algorithm. The method is proven to provide provably correct estimates of the privacy loss under the Gaussian mechanism, demonstrating its performance on FL benchmark datasets. The proposed method is model- and dataset-agnostic, so it can be applied to any local DP task.
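The sketch below illustrates the general flavor of such Gaussian-mechanism-based estimation (it is not the exact estimator of [118]; we assume that the audit statistic, e.g., a canary/model dot product, is approximately Gaussian for included vs. excluded canaries, fit a Gaussian-DP parameter μ, and convert it to (ε, δ) with the standard Gaussian-DP conversion formula).

```python
import numpy as np
from scipy.stats import norm

def delta_for_eps(eps, mu):
    """delta(eps) for mu-Gaussian DP (standard Gaussian-DP conversion formula)."""
    return norm.cdf(-eps / mu + mu / 2) - np.exp(eps) * norm.cdf(-eps / mu - mu / 2)

def eps_from_mu(mu, delta=1e-5, hi=50.0):
    """Invert delta(eps) by bisection to report an (eps, delta) pair."""
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if delta_for_eps(mid, mu) > delta:
            lo = mid
        else:
            hi = mid
    return hi

# Audit statistics for included vs. excluded canaries (toy values; a real audit measures these).
rng = np.random.default_rng(5)
included = rng.normal(1.0, 1.0, 1000)
excluded = rng.normal(0.0, 1.0, 1000)

pooled_std = np.sqrt((included.var() + excluded.var()) / 2)
mu_hat = abs(included.mean() - excluded.mean()) / pooled_std
print(f"estimated mu = {mu_hat:.2f}, empirical epsilon at delta=1e-5: {eps_from_mu(mu_hat):.2f}")
```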
Annamalai et al. [109] present a one-shot, nearly tight black-box auditing scheme for the privacy guarantees of the DP-SGD algorithm to investigate empirical vs. theoretical aspects. The main idea behind the audit is to craft worst-case initial model parameters: since DP-SGD's guarantees are agnostic to the choice of initial model parameters, such crafting can yield tighter privacy audits, whereas earlier audits initialized the model with average-case parameters. The authors empirically estimate the privacy leakage from DP-SGD using a gradient-based membership inference attack approach. Their key finding is that, by crafting worst-case initial model parameters, more realistic privacy estimates can be obtained that address the limitations of the theoretical privacy analysis of DP-SGD.
Loss-based membership inference auditing: This is a technique that measures privacy leakage in differentially private models by evaluating the differences between the model's loss values, generated during training, when predicting on training data versus non-training data.
Wang et al. [111] introduce a novel randomized approach to privacy accounting, which aims to improve on traditional deterministic methods by achieving tighter bounds on the privacy loss. The method leverages the concept of the privacy loss distribution (PLD) to more accurately measure and track the cumulative privacy loss over a sequence of computations. This approach is particularly beneficial for large-scale data applications where the privacy budget is strict.
Confidence score membership auditing: In this type of audit, the vulnerability of the model is assessed on the basis of the confidence scores of its predictions. Higher confidence in the predictions for training data points compared to non-training points often indicates a leak. By examining the confidence values of predictions over a large sample, auditors can determine whether training points have higher confidence than non-training points and thus estimate membership leakage.
Yeom et al. [10] establish a direct link between overfitting and membership inference vulnerabilities by analyzing confidence scores. It is shown that high-confidence predictions are often associated with data memorization, which increases privacy risks, especially when attackers exploit confidence scores.
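A Yeom-style threshold check can be sketched in a few lines (illustrative; using the mean training loss as the decision threshold follows the spirit of [10], while the model and synthetic data are assumptions made for the example).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=30, random_state=6)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=6)

model = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)

def per_example_loss(model, Xs, ys):
    """Cross-entropy loss of each example under the model's predicted probabilities."""
    proba = model.predict_proba(Xs)
    return -np.log(np.clip(proba[np.arange(len(ys)), ys], 1e-12, 1.0))

train_losses = per_example_loss(model, X_tr, y_tr)
test_losses = per_example_loss(model, X_te, y_te)

# Guess "member" when the loss is below the average training loss.
threshold = train_losses.mean()
tpr = np.mean(train_losses < threshold)        # members correctly flagged
fpr = np.mean(test_losses < threshold)         # non-members wrongly flagged
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}, membership advantage = {tpr - fpr:.2f}")
```

A membership advantage close to zero suggests that confidence/loss values leak little; a large advantage feeds directly into the epsilon lower bound computations described earlier.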
Metric-based membership inference auditing: This refers to the use of various metrics and statistics calculated from an ML model's outputs to assess the privacy risks of DPML systems. This approach applies membership inference techniques and calculates metrics and statistics that allow a quantitative assessment of privacy leakage. It is often used to compare different models and privacy parameters.
Rahman et al. [181] evaluate DP mechanisms against MIAs and use accuracy and F-score as privacy leakage metrics to measure the privacy loss of models trained with DP algorithms. Jayaraman and Evans [21] evaluate private mechanisms against both membership inference and attribute inference attacks. They use a balanced prior data distribution probability; note that if the prior probability is skewed, the above-mentioned methods are not applicable. Liu et al. [170] evaluate DP mechanisms using a hypothesis testing framework. They connect precision, recall, and F-score metrics to the DP parameters (ε, δ). Based on the attacker's background knowledge, they give insights into choosing these parameter values. Balle et al. [171] explain DP through a statistical hypothesis testing interpretation, in which conditions for a privacy definition based on statistical divergence are identified, allowing for improved conversion rules between divergence and differential privacy. Carlini et al. [102] investigate how neural networks unintentionally memorize
specific training data. The authors develop an attack methodology to quantify unintended memorization by evaluating how easy it is to reconstruct specific data points (e.g., training examples with private information) from the trained model. This study uses metric-based approaches to measure memorization and unintended data retention, which are both critical components in determining membership inference. The research identifies factors contributing to memorization, including model size, training duration, and dataset characteristics.
Humphries et al. [182] investigate the effectiveness of DP in protecting against MIAs in ML. The authors perform an empirical evaluation by varying the privacy parameter values in DP-SGD and observing the effect on the success rate of MIAs, including black-box and white-box attacks. The authors suggest that DP needs to be complemented by other techniques that specifically target membership inference risk. Ha et al. [41] evaluate the impact of adjusting the privacy parameters on the effectiveness of DP in mitigating gradient-based MIAs. The authors recommend specific DP parameter settings and training procedures to improve privacy without sacrificing model utility. Askin et al. [183] explore statistical methods for quantifying and verifying differential privacy (DP) claims. Their method provides estimators and confidence intervals for the optimal privacy parameter ϵ of a randomized algorithm and avoids the complex process of event selection, which simplifies the implementation. Liu and Oh [159] report on extensive hypothesis testing of DPML using the Neyman–Pearson criterion. They give guidance on setting the privacy budget based on assumptions about the attacker's knowledge, considering different types of auxiliary information that an attacker can obtain to strengthen the MIA, such as the probability distribution of the data, record correlation, and temporal correlation.
Aerni et al. [184] design adaptive membership inference attacks based on the LiRA framework [103], which frames membership inference as a hypothesis testing problem. Given the score of the victim model on a target example, the attack applies a likelihood ratio test to distinguish between the two hypotheses. To estimate the score distributions, multiple shadow models must be trained by repeatedly sampling a training set and training models on it.
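The core of a LiRA-style test can be condensed as follows (a sketch that assumes the per-example scores from "in" and "out" shadow models are already collected and are roughly Gaussian, as in the parametric variant of the attack; the toy values are ours).

```python
import numpy as np
from scipy.stats import norm

def lira_score(victim_score, in_scores, out_scores):
    """Log-likelihood ratio of the victim model's score under the 'member' (in)
    versus 'non-member' (out) Gaussian models fitted from shadow models."""
    mu_in, sd_in = np.mean(in_scores), np.std(in_scores) + 1e-8
    mu_out, sd_out = np.mean(out_scores), np.std(out_scores) + 1e-8
    return norm.logpdf(victim_score, mu_in, sd_in) - norm.logpdf(victim_score, mu_out, sd_out)

# Toy example: shadow-model scores for one target example.
rng = np.random.default_rng(7)
in_scores = rng.normal(2.0, 0.5, 64)    # scores when the example was in training
out_scores = rng.normal(0.5, 0.5, 64)   # scores when it was held out
print("log-likelihood ratio:", lira_score(1.8, in_scores, out_scores))
```

Thresholding this ratio at different values traces out the TPR/FPR curve from which the low-FPR regime, and hence the empirical privacy estimate, is read off.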
Data augmentation-based auditing: This form of auditing involves generating synthetic or modified versions of the data in order to assess and improve privacy guarantees. This approach is useful for evaluating models with overfitting tendencies, where small perturbations could reveal privacy weaknesses.
Kong et al. [185] present a notable connection between machine unlearning and MIA. Their method provides a mechanism for privacy auditing without modifying the model. By leveraging forgeability (creating new, synthetic data samples), data owners can construct a Proof-of-Repudiation (PoR) that allows a model owner to refute claims made by MIAs, thereby enhancing privacy protection and mitigating privacy risks.
Recently, there have been works on auditing lower bounds for Rényi differential privacy (RDP). Balle et al. [171] investigate the relationship between RDP and its interpretation in terms of statistical hypothesis testing. The authors investigate the conditions for a privacy definition based on statistical divergence, which allows for improved conversion rules between divergence and differential privacy. They provide precise privacy loss bounds under RDP and interpret these in terms of type I and type II errors in hypothesis testing. Kutta et al. [186] develop a framework to estimate lower bounds for RDP parameters by investigating a mechanism in a black-box manner. Their framework allows auditors to derive minimal privacy guarantees without requiring internal access to the mechanism. Their goal is to observe how much the outputs deviate for small perturbations in the inputs. Domingo-Enrich et al. [187] propose an auditing procedure for DP with the regularized kernel Rényi divergence (KRD) to define regularized kernel Rényi differential privacy (KRDP). Their auditing
procedure can estimate privacy parameters from samples, even in high dimensions, for ε-DP, (ε, δ)-DP, and (α, ε)-Rényi DP. Their proposed auditing method does not suffer from the curse of dimensionality and has parametric rates in high dimensions. However, this approach requires knowledge of the covariance matrix of the underlying mechanism, which is impractical for most mechanisms other than the Laplace and Gaussian mechanisms and inaccessible in black-box settings.
Kong et al. [40] introduce a family of function-based testers for Rényi DP (and also for pure and approximate DP). The authors introduce DP-Auditorium, a DP auditing library implemented in Python that allows testing DP guarantees with only black-box access to the mechanism. DP-Auditorium facilitates the development and execution of privacy audits, allowing researchers and practitioners to evaluate the robustness of DP implementations against various adversarial attacks, including membership inference and attribute inference. The library also supports multiple privacy auditing protocols and integrates configurable privacy mechanisms, allowing for testing across different privacy budgets and settings. Chadha et al. [32] propose a framework for auditing private predictions with different poisoning and querying capabilities. They investigate the privacy leakage, in terms of Rényi DP, of four private prediction algorithms: PATE, CaPC, PromptPATE, and Private-KNN. The experiments show that some algorithms are easier to poison and lead to much higher privacy leakage. Moreover, the privacy leakage is significantly lower for attackers without query control than for attackers with full control.
4.3.2. Differential Privacy Auditing with Data Poisoning
Data-poisoning auditing: In data poisoning, "poisoned" data are introduced into the training dataset to observe whether they influence the model predictions and worsen the data protection guarantees. The auditor simulates various data-poisoning scenarios by inserting manipulated samples that distort the data distribution. The main scenarios that have been considered in the data-poisoning auditing literature are adversarial injection of data points, influence function analysis, manipulation of gradients in DP training, empirical evaluation of the privacy loss ϵ, simulation of worst-case poisoning scenarios, and privacy violations.
Influence function analysis is a statistical tool that helps to identify whether specific data points have an excessive influence on the model; it is used to measure the effect on the model's predictions and thus indicate possible poisoning. Influence functions provide a way to estimate how much specific training samples influence the model's behavior without needing to retrain the model. The seminal work by Koh and Liang [188] provides a robust framework for influence functions to analyze and audit the predictions made by black-box ML models. It introduces techniques for measuring the influence of individual training points on model predictions, setting the stage for analyzing poisoning attacks. The authors utilize first-order Taylor approximations to derive influence functions. This method is particularly useful for diagnosing issues related to model outputs. Lu et al. [61] audit the tightness of DP algorithms using influence-based poisoning to detect privacy violations and understand information leakage. They manipulate the training data to influence the output of the model and thus violate the privacy guarantees. Their main goal is to verify the privacy of a known mechanism whose inner workings may be hidden.
To understand how poisoned gradients can influence privacy guarantees, gradient manipulation is used in DP training. By monitoring gradients, auditors can detect anomalies due to poisoned inputs, as these may cause the differentially private model to exhibit non-robust behavior. Chen et al. [189] investigate the potential for reconstructing training data from gradient leakage analysis during the training of neural networks. The reconstruction problem is formulated as a series of optimization problems that are solved iteratively for
each layer of the neural network. An important contribution of this work is the proposal of a metric to measure the security level of DL models against gradient-based attacks. The seminal paper by Xie et al. [190] investigates the impact of gradient manipulation on both privacy guarantees and model accuracy, which is relevant to DP auditing in federated learning. Liu and Zhao [191] focus on the interaction of gradient manipulation with privacy and propose ways to improve the robustness of the model under these attacks. Ma et al. [54] establish complementary relationships between data poisoning and differential privacy by using small-scale and large-scale data-poisoning attacks based on gradient ascent on the logistic regression parameters to reach a target model. They evaluate the attack algorithms on two private learners, targeting an objective perturbation and an output perturbation. They show that differentially private learners are provably resistant to data-poisoning attacks, with the protection decreasing exponentially as the attacker poisons more data. Jagielski et al. [46] investigate privacy vulnerabilities in DP-SGD, focusing on the question of whether the theoretical privacy guarantees hold under real-world conditions. The DP-SGD algorithm was audited by simulating a model-agnostic clipping-aware poisoning attack (ClipBKD) in black-box settings on logistic regression and fully connected neural network models. The models were initialized such that the initial parameters were set for the average case. The empirical privacy estimates are derived from Clopper–Pearson confidence intervals of the FP and FN rates of the attacks. The authors provide an ε-maximization strategy to obtain a lower bound on the privacy leakage.
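A simplified version of such clipping-aware canary crafting (in the spirit of ClipBKD, but with assumed data, clipping norm, and a plain SVD; it is not the authors' exact construction) is sketched below.

```python
import numpy as np

def clipping_aware_canary(X, clip_norm=1.0):
    """Craft a poisoning canary along the least-variance direction of the data,
    scaled to the clipping norm, so that its gradient is least attenuated by clipping."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data; the last one spans the
    # direction of smallest variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    direction = vt[-1]
    return clip_norm * direction / np.linalg.norm(direction)

rng = np.random.default_rng(8)
X = rng.normal(size=(1000, 10)) @ np.diag(np.linspace(3.0, 0.3, 10))  # anisotropic data
canary_x = clipping_aware_canary(X, clip_norm=1.0)
canary_y = 1  # paired with a fixed (or flipped) label when inserted into training
print("canary point:", np.round(canary_x, 3))
```

The audit would then train DP-SGD models with and without this canary and feed the resulting distinguishing attack's error rates into a Clopper–Pearson-style epsilon lower bound, as sketched in Section 4.1.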
Empirical evaluation of the privacy loss assesses how the privacy budget is affected by poisoned data. Auditors measure the effective privacy loss, or empirical epsilon, by feeding in poisoned data and calculating whether the privacy budget remains within acceptable bounds. Steinke and Ullman [192] introduce auditing mechanisms that track the empirical privacy losses and provide insights into the impact of poisoned data on privacy guarantees in real-world applications. The authors clarify the relationship between pure and approximate DP by establishing quantitative bounds on privacy loss under different conditions and introducing adaptive data analysis. Kairouz et al. [193] develop empirical privacy assessment methods applicable to DP-SGD in high-risk inversion settings. This method allows a detailed examination of how shuffling affects privacy guarantees. The authors evaluate different parameters, such as batch size and privacy budget, in terms of privacy leakage.
Privacy violation: In the first work in the field of DP auditing, Li et al. [194] consider relaxing the DP notions to cover different types of privacy violations, such as unauthorized data collection, sharing, and targeting. The authors outline key dimensions that influence privacy, such as individual factors (e.g., awareness and knowledge), technological factors (e.g., data processing), and contextual factors (e.g., legal framework). However, data leakage is not assessed. Hay et al. [195] evaluate existing DP implementations for correctness of implementation. The authors create a privacy evaluation framework named DPBench. This framework is designed to evaluate, test, and validate privacy guarantees. Recent work proposes efficient solutions for auditing simple privacy mechanisms for scalar or vector inputs to detect DP violations (Ding et al., 2018; Bichsel et al., 2021). For each neighboring input pair, the corresponding output is determined, and Monte Carlo probabilities are measured to determine privacy. Ding et al. [45] were the first to propose practical methods for testing privacy claims with black-box access to a mechanism. The authors designed StatDP, a hypothesis testing pipeline for checking DP violations in many classical DP algorithms, including noisy argmax, and for identifying ϵ-DP violations in sparse algorithms, such as the sparse vector technique, and in local DP algorithms. Their work focuses on univariate testing of DP and evaluates the correctness of existing DP implementations.
Wang et al. [196] offer a code analysis-based tool, CheckDP, to generate or prove counterexamples for a variety of algorithms, including the sparse vector technique. Barthe et al. [197] investigate the decidability of DP. CheckDP and DiPC can not only detect violations of privacy claims but can also be used for explicit verification. Bichsel et al. [26] present a privacy violation detection tool, DP-Sniper, which shows that a black-box setting can effectively identify privacy violations. It utilizes two strategies: (1) classifier training, in which a classifier is trained to predict whether an observed output is likely to have been generated from one of two inputs; and (2) optimal attack transformation, in which this classifier is then transformed into an approximately optimal attack on differential privacy. DP-Sniper is particularly effective at exploiting floating-point vulnerabilities in naively implemented algorithms and at detecting significant privacy violations.
Niu et al. [166] present DP-Opt(imizer), a disprover that attempts to find counterexamples whose lower bounds on differential privacy exceed the level of privacy claimed to be guaranteed by the algorithm. The authors focus exclusively on ε-DP. They train a classifier to distinguish between the outputs of the mechanism on two neighboring inputs and create an attack based on this classifier. Statistical guarantees for the attack found are given. They transform the search task into an improved optimization objective that takes the empirical error into account and then solve it using various off-the-shelf optimizers. Lokna et al. [48] present a black-box attack for detecting privacy violations of (ε, δ)-DP by grouping (ε, δ) pairs, based on the observation that many pairs can be grouped together because they stem from the same algorithm. The key technical insight of their work is that many (ε, δ)-differentially private algorithms combine ε and δ into a single underlying privacy parameter. By directly measuring the robustness, or degree of privacy failure, of this parameter, one can audit multiple privacy claims simultaneously. The authors implement their method in a tool called Delta-Siege.
4.3.3. Differential Privacy Auditing with Model Inversion
This is a DPML model evaluation scheme that examines how much information about individual data records can be inferred from the outputs of the trained model (usually confidence score values or gradients) in order to understand the level of privacy leakage. The model is inverted to extract information. The audit might be white-box, with access to gradients or internal layers, or black-box, with access only to the output labels. The main challenge in detecting model inversion attacks in differential privacy auditing is the need to prevent the inference of sensitive attributes of individuals from the shared model, especially in black-box scenarios. The main scenarios that have been considered in the literature for model inversion auditing are sensitivity analyses, gradient and weight analyses, empirical privacy loss, and embedding and reconstruction tests.
Sensitivity analyses quantify how much private information is embedded in the model's outputs that could potentially be reversed. Auditors evaluate gradients or outputs to determine how well they reflect the characteristics of the data. This analysis often involves running a series of model inversions to assess how DP mechanisms (e.g., DP-SGD) protect against the disclosure of sensitive attributes.
Fredrikson et al. [100] present a seminal paper that introduces model inversion attacks that use confidence scores from model predictions to reconstruct sensitive input data. It explores how certain types of models, even when protected with DP, can be vulnerable to model inversion attacks revealing certain features. Wang et al. [136] analyze the vulnerability of existing DP mechanisms. They use a functional mechanism method that perturbs the coefficients of the polynomial representation of the objective function, balancing the privacy budget between sensitive and non-sensitive attributes to mitigate model inversion attacks. Hitaj et al. [198] focus primarily on collaborative learning settings, which is relevant to DP as it shows how generative adversarial networks (GANs) can be
used for model inversion to reconstruct sensitive information, providing insights into potential vulnerabilities in DP-protected models. Song et al. [199] discuss how machine-learning models can memorize training data in a way that allows attackers to perform inversion attacks. The authors analyze scenarios in which DP cannot completely prevent the leakage of private data features through inversion techniques. Fang et al. [135] examine the vulnerability of existing DP mechanisms using a functional mechanism method. They propose a differential privacy allocation model and optimize the regression model by adjusting the allocation of the privacy budget within the objective function. Cummings et al. [200] introduce individual sensitivity metric techniques, such as smooth sensitivity and sensitivity preprocessing, to improve the accuracy of private data analysis by reducing sensitivity, which is crucial for mitigating model inversion risk.
Gradient and weight analyses show whether and how gradients expose sensitive attributes. By auditing gradients and weights, privacy auditors can check whether protected data attributes can be inferred directly or indirectly. Since model inversion often leverages gradients, gradient clipping in DP-SGD helps mitigate this exposure.
Works such as that of Phan et al. [201] investigate how model inversion can circumvent the standard DP defense by exploiting subtle dependencies in the model parameters. Several works use gradient-inversion attacks. Zhu et al. [202] show that gradient information, as commonly shared in DP or federated learning, can reveal sensitive training data by inversion, i.e., by minimizing the difference between the observed gradients and those that would be expected from the true input data. It is shown that, even with DP mechanisms, gradient-based inversion attacks can reconstruct data and thus pose a privacy risk. Huang et al. [203] align the gradients of dummy data with those of the actual data, making the dummy images resemble the private images. The paper describes in detail how gradient inversion attacks work by recovering training patterns from model gradients shared during federated learning. Wu et al. [204] use gradient compression to reduce the effectiveness of gradient inversion attacks. Zhu et al. [205] introduce a novel generative gradient inversion attack algorithm (GGI), in which the dummy images are generated from low-dimensional latent vectors through a pre-trained generator.
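A toy gradient-inversion loop in the spirit of these attacks is shown below (a sketch assuming PyTorch, a single linear regression model, and one private example; real attacks target deep networks and batched, possibly noised and clipped, gradients).

```python
import torch

torch.manual_seed(0)
d = 16
model = torch.nn.Linear(d, 1)
loss_fn = torch.nn.MSELoss()

# The private example and the gradient the attacker observes (e.g., a shared update).
x_private = torch.randn(1, d)
y_private = torch.randn(1, 1)
true_grads = torch.autograd.grad(loss_fn(model(x_private), y_private), model.parameters())

# Attacker: optimize dummy data so that its gradient matches the observed gradient.
dummy_x = torch.randn(1, d, requires_grad=True)
dummy_y = torch.randn(1, 1, requires_grad=True)
optimizer = torch.optim.Adam([dummy_x, dummy_y], lr=0.1)

for step in range(500):
    optimizer.zero_grad()
    dummy_grads = torch.autograd.grad(loss_fn(model(dummy_x), dummy_y),
                                      model.parameters(), create_graph=True)
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    optimizer.step()

print("reconstruction error:", torch.norm(dummy_x - x_private).item())
```

An auditor can rerun such a loop against gradients released by a DP mechanism and check whether the reconstruction error remains high, i.e., whether the added noise and clipping actually prevent inversion.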
The empirical privacy loss approach calculates the difference between theoretical and empirical privacy losses in inversion scenarios. Auditors measure the privacy loss, ε, by performing a model inversion on a DP-protected model and comparing the result with the theoretical privacy budget. Large deviations indicate a possible weakness of DP in protecting against inversion.
Yang et al. [206] investigate defense mechanisms against model inversion and propose prediction purification techniques, which involve modifying the outputs of the model to obscure sensitive information while still providing useful predictions. It is shown how adding additional processing to predictions can mitigate the effects of inversion attacks. Zhang et al. [207] apply DP to software defect prediction (SDP) sharing models and investigate privacy disclosure through model inversion attacks. The authors introduce class-level and subclass-level DP and use DPRF (differentially private random forest) as part of an enhanced DP mechanism.
Embedding and reconstruction test: This examines whether latent representations or embeddings could be reversed to obtain private data. The auditors question whether the embeddings of DP models are resistant to inversion by attempting to reconstruct data points from compressed representations.
Manchini et al. [208] show that stricter privacy restrictions can lead to a strong bias in inference, affecting the statistical performance of the model. They propose an approach to improve data privacy in regression models under heteroscedasticity. In addition, there are methods, such as Graph Model Inversion (GraphMI), that are
specifically designed to address the unique challenges of graph data [146]. Park et al. [139] recover the training images from the predictions of the model to evaluate the privacy loss of a face recognition model and measure the success of model inversion attacks based on the performance of an evaluation model. The results show that even a high privacy budget of ε = 8 can provide protection against model inversion attacks.
4.3.4. Differential Privacy Auditing Using Model Extraction
When auditing DPML models using model extraction attacks, auditors evaluate how resistant a DP-protected model is to extraction attacks, in which an attacker attempts to replicate or approximate the model by querying it multiple times and using the outputs to train a surrogate. This form of auditing is essential to verify that DP implementations truly protect against unintentional model replication, which can jeopardize privacy by allowing the attacker to learn sensitive information from the original model. The main scenario that has been considered in the literature for auditing model extraction is query analysis.
Query analyses measure the extent to which queries can reveal model parameters or behaviors. Auditors simulate extraction attacks by extensively querying the model and analyzing how well they can replicate its outputs or decision boundaries.
Carlini et al. [101] show that embeddings can reveal private data, advancing research on the robustness of embeddings for DP models. Dziedzic et al. [209] require users to perform computational tasks before accessing model predictions. The proposed calibrated proof-of-work mechanism (PoW) can deter attackers by increasing the model extraction effort and creating a balance between robustness and utility. Their work contributes to the broader field of privacy auditing by proposing proactive defenses instead of reactive measures in ML applications. Li et al. [210] investigate a novel personalized local differential privacy mechanism to defend against equation-solving attacks and query-based attacks, which solve for the model parameters through multiple queries. The authors conclude that such attacks are particularly effective against regression models and can be mitigated by adding high-dimensional Gaussian noise to the model coefficients. Li et al. [147] use an active-verification, two-stage privacy auditing method to detect suspicious users based on their query patterns and to verify whether they are attackers. By analyzing how well the queries cover the feature space of the victim model, it can detect potential model extraction. Once suspicious users are identified, an active verification module is employed to confirm whether these users are indeed attackers. Their proposed method is particularly useful for object detection models through its innovative use of feature space analysis and perturbation strategies. Zheng et al. [211] propose a novel privacy-preserving mechanism, boundary differential privacy (ε-BDP), which modifies the output layer of the model. BDP is designed to introduce carefully controlled noise around the decision boundary of the model. This method guarantees that an attacker cannot learn the decision boundary between two classes with a certain accuracy, regardless of the number of queries. A special layer, the so-called boundary DP layer, which applies differential privacy principles, is implemented in the ML model. By integrating BDP into this layer, the model produces an output that preserves privacy around the boundary and effectively obscures information that could be exploited in extraction attacks. This boundary randomized response algorithm was developed for binary models and can be generalized to multiclass models. Extensive experiments (Zheng et al., 2022) [18] have shown that BDP obscures the prediction responses with noise and thus prevents attackers from learning the decision boundary between any two classes, regardless of the number of queries issued.
Yan et al. [212] propose an alternative to the BDP layer in which the privacy loss is adapted accordingly. The authors propose an adaptive query-flooding parameter duplication (QPD) extraction attack that allows the auditor to infer model information with
black-box access and without prior knowledge of the model parameters or training data. A defense strategy called monitoring-based DP (MDP) dynamically adjusts the noise added to the model responses based on real-time evaluations, providing effective protection against QPD attacks. Pillutla et al. [161] introduce a method in which multiple randomized canaries are added to the dataset in order to audit privacy guarantees by distinguishing between models trained with different numbers of canaries. Based on this, the authors introduce Lifted Differential Privacy (LiDP), which can effectively audit differentially private models, together with novel confidence intervals that adapt to empirical higher-order correlations to improve the accuracy and reliability of the auditing process.
4.3.5. Differential Privacy Auditing Using Property Inference
Auditing DP with property inference attacks typically focuses on extracting global features or statistical properties of a dataset used to train an ML model, such as the average age, the frequency of diseases, or the frequency of geographic locations, rather than specific data records, in order to reveal sensitive information. The goal is to ensure that DP mechanisms effectively prevent attackers from inferring sensitive characteristics even if they have access to the model outputs. The auditor checks whether the model reveals statistical properties of the training data that could violate privacy. The literature on property inference for differential privacy auditing considers different scenarios, such as evaluating property sensitivity with model outputs and attribute-based simulated worst-case scenarios.
Evaluating property sensitivity with model outputs tests how well DP obscures statistical dataset properties. Auditors analyze the extent to which an attacker could infer information at the aggregate or property level by examining model outputs across multiple queries. For example, changes in the distribution of outputs when querying specific demographics can reveal hidden patterns. An attribute-based simulation of a worst-case scenario is a case in which an attacker has partial information on certain attributes of the dataset. Auditors test the DP model by combining partially known data (e.g., location or age) with the model's predictions to see whether the model can reveal other attributes. This type of adversarial testing helps validate DP protections against more informed attacks.
Suri et al. [213] introduce the concept of distribution inference attacks in both white-box and black-box models, which motivated later DP studies to counteract these vulnerabilities. This type of inference attack aims to uncover sensitive properties of the underlying training data distribution, potentially exposing private information about individuals or groups within the dataset. The authors discuss auditing information disclosure at three granularity levels: the distribution, user, and record level. This multifaceted approach allows for a comprehensive evaluation of the privacy risks associated with ML models. Ganju et al. [214] show how property inference attacks on attributes can reveal characteristics of datasets even when neural networks use DP. The authors introduce an approach for inferring properties that a model inadvertently memorizes, using both synthetic and real-world datasets.
Melis et al. [215] focus on collaborative learning, using property inference in the context of shared model updates and focusing in particular on how unintended feature leakage can jeopardize privacy. By analyzing the model's outputs, the authors identify which features can leak information and which features contribute to the privacy risks. They introduce a method for inferring sensitive attributes that may only apply to subgroups, thereby revealing potential privacy vulnerabilities. Property inference attacks, in this case, rely on the detection of sensitive features in the training data. Attackers can exploit the linear property of queries to obtain multiple responses from DP mechanisms, leading to unexpected information leakage [215]. Huang and Zhou [215] address critical concerns about the limits of differential privacy (DP), especially in the context of linear queries. They show
how the inherent linear properties of certain queries can lead to unexpected information leaks that undermine the privacy guarantees that DP is supposed to provide. Ben Hamida et al. [217] investigate how the implementation of differential privacy can reduce the likelihood of successful property inference attacks by obscuring the relationships between the model parameters and the underlying data properties. Song et al. [218] provide a comprehensive evaluation of privacy attacks on ML models, including property inference attacks. The authors propose attack strategies that target unintended model memorization, with empirical evaluations on DP-protected models. A list of privacy auditing schemes is shown in Table 2.
Table 2. A list of privacy auditing schemes.

Membership inference audits:
- White-box membership inference auditing: Auditors analyze gradients, hidden layers, and intermediate activations, measuring how the training data influence model behavior.
- Black-box membership inference auditing: Auditors observe probability distributions and confidence scores, analyzing these outputs to assess the likelihood that certain samples were part of the training data.
- Shadow model membership auditing: Auditors use "shadow models" to mimic the behavior of the target model.
- Label-only membership inference auditing: The auditor evaluates the privacy guarantee by leveraging only output labels, training shadow models, generating a separate classifier, and quantifying the true-positive rate and accuracy.
- Single-training membership inference run auditing: The auditor leverages the ability to add or remove multiple training examples independently during the run. This approach focuses on estimating lower bounds on the privacy parameters without extensive retraining of the models.
- Metric-based membership inference auditing: The auditor assesses privacy guarantees by directly evaluating metrics and statistics derived from the model's outputs (precision, recall, and F1-score) on data points.
- Data augmentation-based auditing: The auditor generates or augments data samples similar to the training set, testing whether these samples reveal membership risk.

Data poisoning auditing:
- Influence-function analysis: The auditor evaluates privacy by introducing malicious data.
- Gradient manipulation in DP training: The auditor alters the training data using back-gradient optimization, gradient ascent poisoning, etc.
- Empirical evaluation of privacy loss: The auditor conducts quantitative analyses of how the privacy budget is affected.
- Simulation of worst-case poisoning scenarios: The auditor constructs approximate upper bounds on the privacy loss.

Model inversion auditing:
- Sensitivity analysis: The auditor quantifies how much private information is embedded in the model outputs.
- Gradient and weight analyses: The auditor attempts to recreate input features or private data points from model outputs using gradient-based or optimization methods.
- Empirical privacy loss: The auditor calculates the difference between the theoretical and empirical privacy losses.
- Embedding and reconstruction test: The auditor examines whether latent representations or embeddings could be reversed to obtain private data.

Model extraction auditing:
- Query analysis: Auditors simulate extraction attacks by extensively querying the model and analyzing how well they can replicate its outputs or decision boundaries.

Property inference auditing:
- Evaluating property sensitivity with model outputs: The auditor performs a test to infer whether certain properties can be derived from the model and whether the privacy parameters are sufficient to obscure such data properties.
5. Discussion and Future Research
This paper presents the current trends in privacy auditing in DPML using membership inference, data poisoning, model inversion, model extraction, and property inference attacks.
We consider the advantages of using membership inference for privacy auditing in DPML models in terms of quantification of privacy risk, empirical evaluation, improved audit performance, and guidance for privacy parameter selection. MIAs can effectively quantify the privacy risk (the amount of private information) that a model leaks about individual data points in its training set. This makes them a valuable tool for auditing the privacy guarantees of DP models [41]. They provide a practical lower bound on inference risk, complementing the theoretical upper bound of DP [16]. MIAs enable an empirical evaluation of privacy guarantees in DP models, helping to identify potential privacy leaks and implementation errors [161]. They can be used to calculate empirical identifiability scores that enable a more accurate assessment of privacy risks [34]. Advanced methods that combine MIAs with other techniques, such as influence-based poisoning attacks, have been shown to provide significantly improved audit performance compared to previous approaches [61]. MIAs can help in selecting appropriate privacy parameters (ε, δ) by providing insights into the trade-off between privacy and model utility [11,164]. These attacks require relatively weak assumptions about the adversary's knowledge, which allows for broad applicability in real-world settings where attackers may have limited information about the model [46].
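This lower-bound view can be made operational. For an (ε, δ)-DP mechanism, any membership inference attack must satisfy TPR ≤ e^ε · FPR + δ, so measured attack rates yield an empirical lower bound ε ≥ ln((TPR - δ)/FPR). A minimal sketch is shown below, assuming the attack outcomes have already been collected; it uses Clopper-Pearson intervals from SciPy to keep the bound statistically conservative, and the counts in the example call are purely illustrative.

```python
import numpy as np
from scipy.stats import beta

def clopper_pearson(successes, trials, alpha=0.05):
    """Exact (conservative) confidence interval for a binomial proportion."""
    lower = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lower, upper

def empirical_epsilon_lower_bound(tp, n_members, fp, n_nonmembers, delta=1e-5):
    """Convert attack TPR/FPR counts into a lower bound on epsilon via TPR <= e^eps * FPR + delta."""
    tpr_low, _ = clopper_pearson(tp, n_members)       # conservative (low) TPR estimate
    _, fpr_high = clopper_pearson(fp, n_nonmembers)   # conservative (high) FPR estimate
    if tpr_low <= delta or fpr_high <= 0.0:
        return 0.0                                    # no privacy violation demonstrated
    return max(0.0, np.log((tpr_low - delta) / fpr_high))

# Illustrative example: 620/1000 members and 380/1000 non-members flagged as members.
print(empirical_epsilon_lower_bound(620, 1000, 380, 1000))
```

If the resulting empirical ε approaches the ε claimed for the training algorithm, the implementation deserves closer inspection; a large gap does not prove correctness but is consistent with the claimed guarantee.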
The drawbacks of using membership inference for privacy auditing in DPML models include impacts on model utility, the complexity of parameter selection, non-uniform risk across classes, and computational overhead. Implementing DP to defend against MIAs often leads to a trade-off in which increased privacy lowers model accuracy [217]. The excessive addition of noise required for strong privacy guarantees can significantly degrade the utility of the model, especially in scenarios with imbalanced datasets. Choosing the right privacy parameters is challenging due to the variability of data sensitivity and distribution, making it difficult to effectively balance privacy and utility [35]. Legal and social norms for anonymization are not directly addressed by particular privacy parameters, adding to the complexity. Some MIA methods, especially those that require additional model training or complex computations, entail significant computational overhead [96]. The development of robust auditing tools that can provide empirical assessments of privacy guarantees in DP models is crucial. These tools should take into account real-world data dependencies and provide practical measures of privacy loss [34]. Future research should focus on adaptive privacy mechanisms that can dynamically adjust privacy parameters based on the specific characteristics of the training data and the desired level of privacy.
Using data poisoning to audit privacy in DP provides valuable insight into vulnerabilities and helps quantify privacy guarantees, thus improving the understanding of model robustness. Data-poisoning attacks can reveal weaknesses in DP models by showing how easily an attacker can manipulate training data to influence model outputs. This helps identify vulnerabilities that are not obvious through standard auditing methods [46]. Data poisoning can also help evaluate the robustness of DP mechanisms against adversarial attacks. By understanding how models react to poisoned data, we can improve
their design and implementation [61]. By using data-poisoning techniques, auditors can quantitatively measure the privacy guarantees of differentially private algorithms [2]. This empirical approach complements theoretical analyses and provides a clearer understanding of how privacy is maintained in practice [43]. The use of data poisoning for auditing can be generalized to different models and algorithms, making it a versatile tool for evaluating the privacy of different machine-learning implementations [61].
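As an illustration of how such a quantitative measurement might look in code, the sketch below follows the canary-style logic behind many poisoning audits: retrain with and without a crafted record and check how distinguishable the two worlds are from a score on that record. The helpers `train_model` and `score_canary` are assumptions rather than a specific published audit, and the resulting TPR/FPR pair can be converted into an empirical ε bound as in the membership inference sketch above.

```python
import numpy as np

def poisoning_audit(train_model, score_canary, base_dataset, canary,
                    n_trials=100, threshold=None):
    """Sketch of a canary-based poisoning audit for a DP training procedure."""
    scores_with, scores_without = [], []
    for _ in range(n_trials):
        model_with = train_model(base_dataset + [canary])   # world that contains the canary
        model_without = train_model(base_dataset)           # world that does not
        scores_with.append(score_canary(model_with, canary))
        scores_without.append(score_canary(model_without, canary))
    if threshold is None:
        threshold = float(np.median(scores_without))        # simple data-driven decision threshold
    tpr = float(np.mean(np.array(scores_with) > threshold))    # canary detected when present
    fpr = float(np.mean(np.array(scores_without) > threshold)) # false alarms when absent
    return tpr, fpr
```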
However, data poisoning also poses challenges in terms of complexity, potential misuse, and a limited scope of application, which must be carefully considered in practice. Conducting data-poisoning attacks requires significant computational resources and expertise. Developing effective poisoning strategies can be complex and may not be feasible for all organizations [43,61]. Data-poisoning attacks may not cover all aspects of privacy auditing. While they may reveal certain vulnerabilities, they may not address other types of privacy breaches or provide a comprehensive view of the overall security of a model [43]. Techniques developed for data poisoning in the context of audits could be misused by malicious actors to exploit vulnerabilities in ML models. This dual use raises ethical concerns about the impact of such research [43]. The effectiveness of data-poisoning attacks can vary greatly depending on the specific characteristics of the model being audited. If the model is robust against certain types of poisoning attacks, the auditing process may produce misleading results regarding its privacy guarantees.
Future research could focus on developing more robust data-poisoning techniques that can effectively audit DP models. By refining these methods, auditors can better assess the resilience of models to different types of poisoning attacks, leading to improved privacy guarantees. As federated learning becomes more widespread, interest in how data-poisoning attacks can be used in this context is likely to grow. Researchers could explore how to audit federated learning models for DP, taking into account the particular challenges posed by decentralized data and model updates. The development of automated frameworks that utilize data poisoning for auditing could streamline the process of evaluating differentially private models. Such frameworks would allow organizations to routinely assess the privacy guarantees of their models without the need for extensive manual intervention. There is a trend towards introducing standardized quantitative metrics for evaluating the effectiveness of DP mechanisms using data poisoning [63]. This could lead to more consistent and comparable assessments across different models and applications.
Model inversion attacks can expose vulnerabilities in DP models by showing how easily an attacker can reconstruct sensitive training data from the model outputs. This helps identify vulnerabilities that may not be obvious through standard auditing methods [58]. Model inversion serves as a benchmark for evaluating the effectiveness of different DP mechanisms. By evaluating a model's resilience to inversion attacks, auditors can assess whether the model fulfills the claimed privacy guarantees, enhancing trust in privacy protection technologies [201]. By using model inversion techniques, auditors can quantitatively measure the privacy guarantees provided by DP algorithms [200]. This empirical approach provides a better understanding of how well privacy is maintained in practice. The insights gained from model inversion can inform developers about necessary adjustments to strengthen privacy protection. This iterative feedback loop can lead to continuous improvement of model security against potential attacks. Model inversion techniques can be applied to different types of ML models, making them versatile tools for evaluating the privacy of different implementations.
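As a simple illustration of such an audit, the sketch below implements a gradient-based inversion probe in PyTorch under the assumption of white-box access to a classifier: starting from random noise, an input is optimized to maximize the model's confidence for a target class, and the fidelity of the resulting reconstruction indicates how much class-level information the audited model exposes. The model, input shape, and hyperparameters are placeholders, not a definitive attack configuration.

```python
import torch

def invert_class(model, target_class, input_shape=(1, 3, 32, 32),
                 steps=500, lr=0.1, reg=1e-4):
    """Gradient-based model inversion probe: reconstruct a representative input for a class."""
    model.eval()
    x = torch.randn(input_shape, requires_grad=True)      # start from random noise
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x)                                  # assumes output shape (1, num_classes)
        # Maximize the target-class logit; the small norm penalty keeps the input bounded.
        loss = -logits[0, target_class] + reg * x.norm()
        loss.backward()
        optimizer.step()
    return x.detach()                                      # candidate reconstruction for the auditor
```

An auditor can compare such reconstructions against held-out class exemplars; visually or numerically faithful reconstructions suggest that the DP mechanism is not masking class-level features as intended.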
Performing model inversion audits can be computationally expensive and time-consuming, requiring significant resources for both implementation and analysis. This can limit accessibility for smaller organizations or projects with limited budgets [48]. Model inversion methods often rely on strong assumptions about the attacker's capabilities, including knowledge of the architecture and parameters of the model [199]. This may not
reflect real-world scenarios where attackers have limited access, which can lead to an overestimation of privacy risks. Practical implementations of algorithms with varying levels of privacy often contain subtle vulnerabilities, making it difficult to audit at scale, especially in federated environments [118]. The results of model inversion audits can be complex and may require expert interpretation to fully understand their implications. This complexity can hinder the effective communication of results to stakeholders who may not have a technical background [48]. While model inversion attacks are effective in detecting certain vulnerabilities, they may not cover all aspects of privacy auditing. Although DP is an effective means of protecting the confidentiality of data, it has problems preventing model inversion attacks in regression models [136]. Other types of privacy violations may not be captured by this method, resulting in an incomplete overview of the overall security of a model.
Future research could investigate the implementation of DP at the class and subclass level to strengthen defenses against model inversion attacks. Such approaches could enable more granular privacy guarantees that protect sensitive attributes related to specific data classes while still providing useful model outputs [58]. Using the stochastic gradient descent (SGD) algorithm to address the challenge of selecting an appropriate value for the privacy budget points to a possible future application of model inversion for optimizing privacy budget selection. There may also be a trend towards dynamic privacy budgeting, where the privacy budget is adjusted in real time based on the context and sensitivity of the data being processed. This could help to better balance the trade-off between privacy and utility, especially in scenarios that are prone to model inversion attacks [136].
Model extraction attacks pose a significant privacy risk, even when DP mechanisms are used [101]. These attacks aim to replicate the functionality of a target model by querying it and using the answers to infer its parameters or training data. Model extraction attacks can derive the parameters of a machine-learning model through public queries [209]. Even with DP, which adds noise to the model outputs to protect privacy, these attacks can still be effective. For example, the adaptive query-flooding parameter duplication (QPD) attack can infer model information with black-box access and without prior knowledge of the model parameters or training data [212].
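A black-box extraction audit can be sketched along the same lines: query the deployed (DP-protected) model, fit a surrogate on the observed responses, and report how often the surrogate agrees with the target on held-out inputs. The helper `query_target`, the surrogate architecture, and the query budget below are illustrative assumptions, not part of the QPD attack itself.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def extraction_audit(query_target, query_pool, holdout, n_queries=5000, seed=0):
    """Sketch of a model extraction audit: how well can a surrogate replicate the target?"""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(query_pool), size=min(n_queries, len(query_pool)), replace=False)
    X_query = query_pool[idx]
    y_query = query_target(X_query)                     # labels observed through the public API
    surrogate = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300).fit(X_query, y_query)
    agreement = float(np.mean(surrogate.predict(holdout) == query_target(holdout)))
    return agreement                                    # high agreement indicates easy extraction
```

Tracking how the agreement rate grows with the query budget gives the auditor a practical measure of how much protection the noisy outputs actually provide against extraction.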
Current trends in privacy auditing in the context of DPML show that the focus is on developing efficient, effective frameworks and methods for evaluating privacy guarantees. As the field continues to advance, ongoing research is critical to refine these auditing techniques and schemes, address the challenges related to the privacy–utility trade-off, and improve the practical applicability of DPML systems in real-world settings. We hope that this article provides insight into privacy auditing in both local and global DP.
Author Contributions: Conceptualization, I.N., K.S. and K.O.; methodology, I.N.; formal analysis,
I.N.; investigation, I.N.; resources, I.N.; writing—original draft preparation, I.N.; writing—review
and editing, K.S., A.N. and K.O.; project administration, K.O.; funding acquisition, K.O. All authors
have read and agreed to the published version of the manuscript.
Funding: This work is the result of activities within the “Digitalization of Power Electronic Appli-
cations within Key Technology Value Chains” (PowerizeD) project, which has received funding
from the Chips Joint Undertaking under grant agreement No. 101096387. The Chips-JU is supported
by the European Union’s Horizon Europe Research and Innovation Programme, as well as by Aus-
tria, Belgium, Czech Republic, Finland, Germany, Greece, Hungary, Italy, Latvia, Netherlands,
Spain, Sweden, and Romania.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
Appendix A
The table of privacy-auditing schemes (Table A1) provides an overview of the key privacy attacks, references, privacy guarantees, methods, and main contributions discussed in Section 4.
Table A1. Privacy auditing schemes. (Columns: Privacy-Attack Methodology; Reference; Privacy Guarantees; Methodology and the Main Contribution.)
Membership inference auditing
Black-box mem-
bership infer-
ence auditing
Song et al. [176]
Membership inference at-
tack analysis:
Investigates the vulnerabil-
ity of adversarial robust DL
to MIAs and shows that
there are signicant privacy
risks despite the claimed ro-
bustness.
Methodology: Performs a comprehensive analysis of
MIAs targeting robust models proposing new bench-
mark aacks that improve existing methods by lever-
aging prediction entropy and other metrics to evaluate
privacy risks. Empirical evaluations show that even ro-
bust models can leak sensitive information about train-
ing data.
Contribution: Reveals that adversarial robustness does
not inherently protect against MIAs and challenges the
assumption that such protection is sucient for pri-
vacy. Introduces the privacy risk score, a new metric
that quanties the likelihood of an individual sample
being part of the training set providing a more nu-
anced understanding of privacy vulnerabilities in ML
models.
Carlini et al.
[103]
Analyzes the eectiveness
of MIAs against ML mod-
els:
Shows that existing metrics
may underestimate the vul-
nerability of a model to
MIAs.
Methodology: Introduces a new attack framework based on quantile regression of models' confidence scores. Proposes a likelihood ratio attack (LiRA) that significantly improves the TPR at low FPR.
Contribution: Establishes a more rigorous evaluation
standard for MIAs and presents a likelihood ratio at-
tack (LiRA) method to increase the eectiveness of
MIAs by improving the accuracy in identifying train-
ing data members.
Lu et al. [173]
Introduces a black-box es-
timator for DP:
Allows domain experts to
empirically estimate the pri-
vacy of arbitrary mecha-
nisms without requiring de-
tailed knowledge of these
mechanisms.
Methodology: Combines dierent estimates of DP pa-
rameters with Bayes optimal classiers. Proposes a rel-
ative DP framework that denes privacy with respect
to a nite input set, T, which improves scalability and
robustness.
Contribution: Establishes a theoretical foundation for
linking black-box poly-time parameter estimates
to classier performance and demonstrates the ability
to handle large output spaces with tight accuracy
bounds, thereby improving the understanding of pri-
vacy risks. Introduces a distributional DP estimator
and compares its performance on dierent mecha-
nisms.
Kazmi et al. [175]
Measuring privacy violations in DPML models:
Introduces a framework for measuring privacy leakage through MIAs without the need to retrain or modify the model.
Methodology: PANORAMIA uses generated data
from non-members to assess privacy leakage, eliminat-
ing the dependency on in-distribution non-members
included in the distribution from the same dataset.
This approach enables privacy measurement with min-
imal access to the training dataset.
Contribution: The framework was evaluated with var-
ious ML models for image and tabular data classica-
tion, as well as with large-scale language models,
demonstrating its eectiveness in auditing privacy
without altering existing models or their training pro-
cesses.
Koskela et al.
[177]
DP:
Proposes a method for auditing DP that does not require prior knowledge of the noise distribution or subsampling ratio in black-box settings.
Methodology: Uses a histogram-based density estima-
tion technique to compare lower bounds for the total
variance distance (TVD) between outputs from two
neighboring datasets.
Contribution: The method generalizes existing thresh-
old-based membership inference auditing techniques
and improves prior approaches, such as f-DP auditing,
by addressing the challenges of accurately auditing the
subsampled Gaussian mechanism.
Kua et al. [186]
Rényi DP:
Establishes new lower
bounds for Rényi DP in
black-box seings provid-
ing statistical guarantees for
privacy leakage that hold
with high probability for
large sample sizes.
Methodology: Introduces a novel estimator for the Ré-
nyi divergence between the output distributions of al-
gorithms. This estimator is converted into a statistical
lower bound that is applicable to a wide range of algo-
rithms.
Contribution: The work pioneers the treatment of Ré-
nyi DP in black-box scenarios and demonstrates the ef-
fectiveness of the proposed method by experimenting
with previously unstudied algorithms and privacy en-
hancement techniques.
Domingo-Enrich
et al. [187]
DP:
Proposes auditing procedures for different DP guarantees, including pure DP, approximate DP, and Rényi DP.
Methodology: The regularized kern Rényi divergence
can be estimated from random samples, which enables
eective auditing even in high-dimensional seings.
Contribution: Introduces relaxations of DP using the
kernel Rényi divergence and its regularized version.
White-box mem-
bership infer-
ence auditing
Leino and Fred-
rikson [107]
Membership inference at-
tack analysis:
Introduces a calibrated at-
tack that signicantly im-
proves the precision of
membership inference
Methodology: Exploits the internal workings of deep
neural networks to develop a white-box membership
inference aack.
Contribution: Demonstrates how MIAs can be utilized
as a tool to quantify the privacy risks associated with
ML models.
Chen et al. [178]
DP:
Evaluates the eectiveness
of dierential privacy as a
defense mechanism by per-
turbating the model
weights.
Methodology: Evaluate the dierential private convo-
lutional neural networks (CNNs) and Lasso regression
model with and without sparsity.
Contribution: Investigate the impact of sparsity on
privacy guarantees in CNNs and regression models
and provide insights into model design for improved
privacy.
Black- and
white-box mem-
bership infer-
ence auditing
Nasr et al. [47]
DP:
Determines lower bounds
on the eectiveness of MIAs
against DPML models and
shows that existing privacy
guarantees may not be as
robust as previously
thought.
Methodology: Instantiates a hypothetical aacker that
is able to distinguish between two datasets that dier
only by a single example. Develops two algorithms,
one for crafting these datasets and another for predict-
ing which dataset was used to train a particular model.
This approach allows users to analyze the impact of
the aacker’s capabilities on the privacy guarantees of
DP mechanisms such as DP-SGD.
Contribution: Provides empirical and theoretical in-
sights into the limitations of DP in practical scenarios.
It is shown that existing upper bounds may not hold
up under stronger aacker conditions, and it is sug-
gested that beer upper bounds require additional as-
sumptions on the aacker’s capabilities.
Tramèr et al. [49]
DP:
Investigates the reliability
of DP guarantees in an
open-source implementa-
tion of a DL algorithm.
Methodology: Explores auditing techniques inspired
by recent advances in lower bound estimation for DP
algorithms. Performs a detailed audit of a specic im-
plementation to assess whether it satises the claimed
DP guarantees.
Contribution: Shows that the audited implementation
does not satisfy the claimed dierential privacy guar-
antee with 99.9% condence. This emphasizes the im-
portance of audits in identifying errors in purported
DP systems and shows that even well-established
methods can have critical vulnerabilities.
Nasr et al. [42]
DP:
Provides tight empirical pri-
vacy estimates.
Methodology: Adversary instantiation to establish
lower bounds for DP.
Contribution: Develops techniques to evaluate the ca-
pabilities of aackers, providing lower bounds that in-
form practical privacy auditing.
Sablayrolles et
al. [22]
Membership inference at-
tack analysis:
Analyzes MIAs in both
white-box and black-box
seings and shows that op-
timal aack strategies de-
pend primarily on the loss
function and not on the
model architecture or access
type.
Methodology: Derives the optimal strategy for mem-
bership inference under certain assumptions about pa-
rameter distributions and shows that both white-box
and black-box seings can achieve similar eective-
ness by focusing on the loss function. Provides approx-
imations for the optimal strategy, leading to new infer-
ence methods.
Contribution: Establishes a formal framework for
MIAs and presents State-of-the-Art results for various
ML models, including logistic regression and complex
architectures such as ResNet-101 on datasets such as
ImageNet.
Shadow model-
ing membership
inference audit-
ing
Shokri et al. [52]
Membership inference at-
tack analysis:
Develop a MIA that utilizes
a shadow training tech-
nique.
Methodology: Investigates membership inference at-
tacks using black-box access to models.
Contribution: Quantitatively analyzes how ML mod-
els leak membership information and introducing a
shadow training technique for aacks.
Salem et al. [112]
Membership inference at-
tack analysis:
Demonstrates that MIAs
can be performed without
needing to know the archi-
tecture of the target model
or the distribution of the
training data, highlighting a
broader vulnerability in ML
models.
Methodology: Introduces a new approach called
“shadow training”. This involves training multiple
shadow models that mimic the behavior of the target
model using similar but unrelated datasets. These
shadow models are used to generate outputs that in-
form an aack model designed to distinguish between
training and non-training data.
Contribution: Presents a comprehensive assessment of
membership inference aacks across dierent datasets
and domains, highlighting the signicant privacy risks
associated with ML models. It also suggests eective
defenses that preserve the benets of the model while
mitigating these risks.
Memorization
auditing
Yeom et al. [10]
Membership inference and
aribute inference analy-
sis: Analyzes how
Methodology: Conducts both formal and empirical
analyses to examine the relationship between overt-
ting, inuence, and privacy risk. Introduces
overing and inuence
can increase the risk of
membership inference and
aribute inference aacks
on ML models, highlighting
that overing is sucient
but not necessary for these
aacks.
quantitative measures of aacker advantage that at-
tempt to infer training data membership or aributes
of training data. The study evaluates dierent ML al-
gorithms to illustrate how generalization errors and in-
uential features impact privacy vulnerability.
Contribution: This work provides new insights into
the mechanisms behind membership and aribute in-
ference aacks. It establishes a clear connection be-
tween model overing and privacy risks, while iden-
tifying other factors that can increase an aacker’s ad-
vantage.
Carlini et al.
[102]
Membership inference at-
tack analysis:
Identies the risk of unin-
tended memorization in
neural networks, especially
in generative models
trained on sensitive data,
and shows that unique se-
quences can be extracted
from the models.
Methodology: Develops a testing framework to quan-
titatively assess the extent of memorization in neural
networks. It uses exposure metrics to assess the likeli-
hood that specic training sequences will be memo-
rized and subsequently extracted. The study includes
hands-on experiments with Google’s Smart Compose
system to illustrate the eectiveness of their approach.
Contribution: It becomes clear that unintentional
memorization is a common problem with dierent
model architectures and training strategies, and it oc-
curs early in training and is not just a consequence of
overing. Strategies to mitigate the problem are also
discussed. These include DP, which eectively reduces
the risk of memorization but may introduce utility
trade-os.
Label-only
membership in-
ference auditing
Malek et al. [179]
Label dierential privacy:
Proposes two new ap-
proaches—PATE (Private
Aggregation of Teacher En-
sembles) and ALIBI (addi-
tive Laplace noise coupled
with Bayesian inference)—
to achieve strong label dif-
ferential privacy (LDP)
guarantees in machine-
learning models.
Methodology: Analyzes and compares the eective-
ness of PATE and ALIBI in the delivering LDP. It
demonstrates how PATE leverages a teacher–student
framework to ensure privacy, while ALIBI is more
suitable for typical ML tasks by adding Laplacian
noise to the model outputs. The study includes a theo-
retical analysis of privacy guarantees and empirical
evaluations of memorization properties for both ap-
proaches.
Contribution: It demonstrates that traditional compar-
isons of algorithms based solely on provable DP guar-
antees can be misleading, advocating for a more nu-
anced understanding of privacy in ML. Additionally, it
illustrates how strong privacy can be achieved with
the proposed methods in specic contexts.
Choquee-Choo
et al. [180]
Membership inference at-
tack analysis:
Introduces aacks that infer
membership inference
based only on labels and
evaluate model predictions
without access to con-
dence scores and shows that
these aacks can eectively
infer membership status.
Methodology: It proposes a novel aack strategy that
evaluates the robustness of a model’s predicted labels
in the presence of input perturbations such as data
augmentation and adversarial examples. It is empiri-
cally conrmed that their label-only aacks are compa-
rable to traditional methods that require condence
scores.
Contribution: The study shows that existing protec-
tion mechanisms based on condence value masking
are insucient against label-only aacks. The study
also highlights that training with DP or strong L2
regularization is a currently eective strategy to re-
duce membership leakage, even for outlier data points.
Single-training
membership in-
ference auditing
Steinke et al. [43]
DP:
Proposes a novel auditing
scheme for DPML systems
that can be performed with
a single training run and in-
creases the efficiency of pri-
vacy assessments.
Methodology: It utilizes the ability to independently
add or remove multiple training examples during a
single training run. It analyzes the relationship be-
tween DP and statistical generalization to develop its
auditing framework. This approach can be applied in
both black-box and white-box settings with minimal
assumptions about the underlying algorithm.
Contribution: It provides a practical solution for pri-
vacy auditing in ML models without the need for ex-
tensive retraining. This reduces the computational bur-
den while ensuring robust privacy assessment.
Andrew et al.
[118]
DP:
Introduces a novel “one-
shot” approach for estimat-
ing privacy loss in federated
learning.
Methodology: Develops a one-shot empirical privacy
evaluation method for federated learning.
Contribution: Provides a method for estimating pri-
vacy guarantees in federated learning environments
using a single training run, improving the efficiency of
privacy auditing in decentralized environments with-
out a priori knowledge of the model architecture, tasks
or DP training algorithm.
Annamalai et al.
[109]
DP:
Proposes an auditing proce-
dure for the Differentially
Private Stochastic Gradient
Descent (DP-SGD) algo-
rithm that provides tighter
empirical privacy estimates
compared to previous
methods, especially in
black-box settings.
Methodology: It introduces a novel auditing technique
that crafts worst-case initial model parameters, which
significantly affects the privacy analysis of DP-SGD.
Contribution: This work improves the understanding
of how the initial parameters affect the privacy guaran-
tees in DP-SGD and provides insights for detecting po-
tential privacy violations in real-world implementa-
tions, improving the robustness of differential privacy
auditing.
Loss-based
membership in-
ference auditing
Wang et al. [111]
DP:
Introduces a new dieren-
tial privacy paradigm called
estimate–verify–release
(EVR).
Methodology: Develops a randomized privacy veri-
cation procedure using Monte Carlo techniques and
proposes an estimate–verify–release (EVR) paradigm.
Contribution: Introduces a tight and ecient auditing
procedure that converts estimates of privacy parame-
ters into formal guarantees, allowing for eective pri-
vacy accounting with only one training run and aver-
ages the concept of Privacy Loss Distribution (PLD) to
more accurately measure and track the cumulative pri-
vacy loss through a sequence of computations.
Condence
score member-
ship inference
auditing
Askin et al. [183]
DP:
Introduces a statistical
method for quantifying dif-
ferential privacy in a black-
box seing, providing esti-
mators for the optimal pri-
vacy parameter and con-
dence intervals.
Methodology: Introduces a local approach for the sta-
tistical quantication of DP in a black-box seing.
Contribution: Develops estimators and condence in-
tervals for optimal privacy parameters, avoiding event
selection issues and demonstrating fast convergence
rates through experimental validation.
Metric-based
membership in-
ference auditing
Rahman et al.
[181]
DP:
Examines the eectiveness
of dierential privacy in
protecting deep-learning
Methodology: Investigates MIAs on DPML models
through membership inference.
Contribution: Analyzes the vulnerability of DP mod-
els to MIAs and shows that they can still leak
models against membership
inference aacks.
information about training data under certain condi-
tions, using accuracy and F-score as privacy leakage
metrics.
Liu et al. [170]
DP:
Focuses on how dierential
privacy can be understood
through hypothesis testing.
Methodology: Explores statistical privacy frameworks
through the lens of hypothesis testing.
Contribution: Provides a comprehensive analysis of
privacy frameworks, emphasizing the role of hypothe-
sis testing in evaluating privacy guarantees in ML
models, linking precision, recall, and F-score metrics to
the privacy parameters; and uses hypothesis testing
techniques.
Balle et al. [171]
Rényi DP:
Explores the relationship
between dierential privacy
and hypothesis testing in-
terpretations.
Methodology: Examines hypothesis testing interpreta-
tions in relation to Rényi DP.
Contribution: Establishes connections between statisti-
cal hypothesis testing and Rényi dierential privacy,
improving the theoretical understanding of privacy
guarantees in the context of ML.
Humphries et al.
[182]
Membership inference at-
tack analysis:
Conducts empirical evalua-
tions of various DP models
across multiple datasets to
assess their vulnerability to
membership inference at-
tacks.
Methodology: Analyzes the limitations of DP in the
bounding of MIAs.
Contribution: Shows that DP does not necessarily pre-
vent MIAs and points out vulnerabilities in current
privacy-preserving techniques.
Ha et al. [41]
DP:
Investigates how DP can be
aected by MIAs.
Methodology: Analyzes the impact of MIAs on DP
mechanisms.
Contribution: Examines how MIAs can be used as an
audit tool to quantify training data leaks in ML models
and proposes new metrics to assess vulnerability dis-
parities across demographic groups.
Data augmenta-
tion-based au-
diting
Kong et al. [185]
Membership inference at-
tack analysis:
Investigates the relationship
between forgeability in ML
models and the vulnerabil-
ity to MIAs and uncovers
vulnerabilities that can be
exploited by aackers.
Methodology: It proposes a framework to analyze
forgeability—dened as the ability of an aacker to
generate outputs that mimic a model’s behavior—and
its connection to membership inference. It conducts
empirical evaluations to show how certain model
properties inuence both forgeability and the risk of
MIAs.
Contribution: It shows how the choice of model de-
sign can inadvertently increase vulnerability to MIAs.
This suggests that understanding forgeability can help
in the development of secure ML systems.
Data-poisoning auditing
Inuence-func-
tion analysis
Koh and Ling
[188]
Model
Interpretation:
Investigates how inuence
functions can be used to
trace predictions back to
training data and thus gain
insight into the behavior of
the model without direct ac-
cess to the internal work-
ings of the model.
Methodology: Uses inuence functions from robust
statistics to nd out which training points have a sig-
nicant inuence on a particular prediction. Develops
an ecient implementation that only requires oracle
access to gradients and Hessian-vector products, al-
lowing scalability in modern ML contexts.
Contribution: Demonstrates the usefulness of inu-
ence functions for various applications, including un-
derstanding model behavior, debugging, detecting
dataset errors, and creating aacks on training sets,
improving the interpretability of black-box models.
Jayaraman and
Evans [21]
DP:
Investigates the limitations
of DPML, particularly fo-
cusing on the impact of the
privacy parameter on pri-
vacy leakage.
Methodology: Evaluates the practical implementation
of dierential privacy in machine-learning systems.
Contribution: Conducts an empirical analysis of dif-
ferentially private machine-learning algorithms, as-
sessing their performance and privacy guarantees in
real-world applications.
Lu et al. [61]
DP:
Focuses on the auditing of
DPML models for the em-
pirical evaluation of privacy
guarantees.
Methodology: Proposes a general framework for au-
diting dierentially private machine-learning models.
Contribution: Introduces a comprehensive tight audit-
ing framework that assesses the eectiveness and ro-
bustness of dierential privacy mechanisms in various
machine-learning contexts.
Gradient manip-
ulation in DP
training.
Chen et al. [189]
Gradient leakage analysis:
Investigates the potential
for training data leakage
from gradients in neural
networks, highlighting that
gradients can be exploited
to reconstruct training im-
ages.
Methodology: Analyzes training-data leakage from
gradients in neural networks for image classication.
Contribution: Provides a theoretical framework for
understanding how training data can be reconstructed
from gradients, proposing a metric to measure model
security against such aacks.
Xie et al. [190]
Generalization improve-
ment:
Focuses on improving gen-
eralization in DL models
through the manipulation
of stochastic gradient noise
(SGN).
Methodology: Introduces Positive–Negative Momen-
tum (PNM) to manipulate stochastic gradient noise for
improved generalization in machine-learning models.
Contribution: Proposes a novel approach that demon-
strates the convergence guarantees and generalization
of the model using PNM approach that leverages sto-
chastic gradient noise more eectively without increas-
ing computational costs.
Ma et al. [54]
DP:
Investigates the resilience of
dierentially private learn-
ers against data-poisoning
aacks.
Methodology: Designs specic aack algorithms tar-
geting two common approaches in DP, objective per-
turbation and output perturbation.
Contribution: Analyzes vulnerabilities of dierentially
private models to data-poisoning aacks and proposes
defensive strategies to mitigate these risks.
Jagielski et al.
[46]
DP:
Investigates the practical
privacy guarantees of Dif-
ferentially Private Stochas-
tic Gradient Descent (DP-
SGD).
Methodology: Audits dierentially private machine-
learning models, specically examining the privacy
guarantees of stochastic gradient descent (SGD).
Contribution: Evaluates the eectiveness of dieren-
tial privacy mechanisms in SGD, providing insights
into how private the training process really is under
various conditions.
Empirical evalu-
ation of privacy
loss.
Steinke and
Ullman [192]
DP:
Establishes a new lower
bound on the sample com-
plexity of dieren-
tially private algorithms for
accurately answering statis-
tical queries.
Methodology: Derives a necessary condition for the
number of records, n, required to satisfy dier-
ential privacy while achieving a specied accuracy.
Contribution: Introduces a framework that interpo-
lates between pure and approximate dierential pri-
vacy, providing optimal sample size requirements for
answering statistical queries in high-dimensional data-
bases.
Kairouz et al.
[193]
DP:
Presents a new approach for
training DP models without
relying on sampling or
shuing, addressing the
limitations of Dierentially
Private Stochastic Gradient
Descent (DP-SGD).
Methodology: Proposes a method for practical and
private deep learning without relying on sampling
through shuing techniques.
Contribution: Develops auditing procedure for evalu-
ating the eectiveness of shuing in DPML models by
leveraging various network parameters and likelihood
ratio functions.
Privacy viola-
tion
Li et al. [194]
Information privacy:
Reviews various theories re-
lated to online information
privacy, analyzing how
they contribute to under-
standing privacy concerns.
Methodology: Conducts a critical review of theories in
online information privacy research and proposes an
integrated framework.
Contribution: Conducts a critical review of theories in
online information privacy research and proposes an
integrated framework.
Hay et al. [195]
DP:
Emphasizes the importance
of rigorous evaluation of
DP algorithms.
Methodology: Develops DPBench, a benchmarking
suite for evaluating dierential privacy algorithms.
Contribution: Propose a systematic benchmarking
methodology that includes various metrics to evaluate
the privacy loss, utility, and robustness of algorithms
with dierent privacy.
Ding et al. [45]
DP:
Addresses the issue of veri-
fying whether algorithms
claiming DP actually adhere
to their stated privacy guar-
antees.
Methodology: Develops a statistical approach to de-
tect violations of dierential privacy in algorithms.
Contribution: Proposes the rst counterexample gen-
erator that produces human-understandable counter-
examples specically designed to detect violations to
DP in algorithms.
Wang et al. [196]
DP:
Introduces CheckDP, an au-
tomated framework de-
signed to prove or disprove
claims of DP for algorithms.
Methodology: Utilizes a bidirectional Counterexam-
ple-Guided Inductive Synthesis (CEGIS) approach em-
bedded in CheckDP, allowing it to generate proofs for
correct systems and counterexamples for incorrect
ones.
Contribution: Presents an integrated approach that
automates the verication process for dierential pri-
vacy claims, enhancing the reliability of privacy-pre-
serving mechanisms.
Barthe et al. [197]
DP:
Addresses the problem of
deciding whether probabil-
istic programs satisfy DP
when restricted to nite in-
puts and outputs.
Methodology: Develops a decision procedure that lev-
erages type systems and program analysis techniques
to check for dierential privacy in a class of probabilis-
tic computations.
Contribution: Explores theoretical aspects of dieren-
tial privacy, providing insights into the conditions un-
der which dierential privacy can be eectively de-
cided in computational seings.
Niu et al. [166]
DP:
Presents DP-Opt, a frame-
work designed to identify
violations of DP in algo-
rithms by optimizing for
counterexamples.
Methodology: Utilizes optimization techniques to
search for counterexamples that demonstrate when the
lower bounds on dierential privacy exceed the
claimed values.
Contribution: Develops a disprover that searches for
counterexamples where the lower bounds on dieren-
tial privacy exceed claimed values, enhancing the abil-
ity to detect and analyze privacy violations in algo-
rithms.
Lokna et al. [48]
DP:
Introduces a novel method for auditing (ε, δ)-differential privacy, highlighting that many (ε, δ) pairs can be grouped, as they result in the same algorithm.
Methodology: Develops a novel method for auditing differential privacy violations using a combined privacy parameter.
Contribution: Introduces Delta-Siege, an auditing tool that efficiently discovers violations of differential privacy across multiple claims simultaneously, demonstrating superior performance compared to existing tools and providing insights into the root causes of vulnerabilities.
Model inversion auditing
Sensitivity anal-
ysis.
Frederikson et al.
[100]
Model inversion aack
analysis: Explores vulnera-
bilities in ML models
through model inversion at-
tacks that exploit con-
dence information and pose
signicant risks to user pri-
vacy.
Methodology: A new class of model inversion aacks
is developed that exploits the condence values given
next to the predictions. It empirically evaluates these
aacks in two contexts: decision trees for lifestyle sur-
veys and neural networks for face recognition. The
study includes experimental results that show how at-
tackers can infer sensitive information and recover rec-
ognizable images based solely on model outputs.
Contribution: It demonstrates the eectiveness of
model inversion aacks in dierent contexts and pre-
sents basic countermeasures, such as training algo-
rithms that obfuscate condence values, that can miti-
gate the risk of these aacks while preserving the util-
ity.
Wang et al. [136]
DP:
Proposes a DP regression
model that aims to protect
against model inversion at-
tacks while preserving the
model utility.
Methodology: A novel approach is presented that uti-
lizes the functional mechanism to perturb the coe-
cients of the regression model. It analyzes how existing
DP mechanisms cannot eectively prevent model in-
version aacks. It provides a theoretical analysis and
empirical evaluations showing that their approach can
balance privacy for sensitive and non-sensitive arib-
utes while preserving model performance.
Contribution: It demonstrates the limitations of tradi-
tional DP in protecting sensitive aributes in model in-
version aacks and presents a new method that eec-
tively mitigates these risks while ensuring that the util-
ity of the regression model is preserved.
Hitaj et al. [198]
Information leakage analy-
sis: Investigates vulnerabili-
ties in collaborative DL
models and shows that
these models are suscepti-
ble to information leakage
despite aempts to protect
privacy through parameter
sharing and DP.
Methodology: Develops a novel aack that exploits
the real-time nature of the learning process in collabo-
rative DL environments. They show how an aacker
can train a generative adversarial network (GAN) to
generate prototypical samples from the private train-
ing data of honest participants. It criticizes existing pri-
vacy-preserving techniques, particularly record-level
DP at the dataset level, and highlights their ineective-
ness against their proposed aack.
Contribution: Reveals fundamental aws in the design
of collaborative DL systems and emphasizes that cur-
rent privacy-preserving measures do not provide ade-
quate protection against sophisticated aacks such as
those enabled by GANs. It calls for a re-evaluation of
privacy-preserving strategies in decentralized ML set-
tings.
Song et al. [199]
Model inversion aack
analysis: Investigates the
risks of overing in ML
models and shows that
models can inadvertently
memorize sensitive training
data, leading to potential
privacy violations.
Methodology: Analyzes dierent ML models to assess
their vulnerability to memorization aacks. Introduces
a framework to quantify the amount of information a
model stores about its training data and conduct em-
pirical experiments to illustrate how certain models
can reconstruct sensitive information from their out-
puts.
Contribution: The study highlights the implications of
model overing on privacy, showing that even well-
regulated models can leak sensitive information. The
study emphasizes the need for robust privacy-preserv-
ing techniques in ML to mitigate these risks.
Fang et al. [135]
DP:
Provides a formal guarantee
that the output of the analy-
sis will not change signi-
cantly if an individual’s
data are altered.
Methodology: Utilizes a functional mechanism that
adds calibrated noise to the regression outputs, balanc-
ing privacy protection with data utility.
Contribution: Introduces a functional mechanism for
regression analysis under DP. Evaluates the perfor-
mance of the model in terms of noise reduction and re-
silience to model inversion aacks.
Cummings et al.
[200]
DP:
Ensures that the output of
the regression analysis does
not change signicantly
when the data of a single in-
dividual are changed.
Methodology: Introduces individual sensitivity pre-
processing techniques for enhancing data privacy.
Contribution: Proposes preprocessing methods that
adjust data sensitivity on an individual level, improv-
ing privacy protection while allowing for meaningful
data analysis. Introduces an individual sensitivity met-
ric technique to improve the accuracy of private data.
Gradient and
weight analyses
Zhu et al. [201]
Model inversion aack
analysis:
Utilizes gradients to recon-
struct inputs from model
outputs.
Methodology: Explores model inversion aacks en-
hanced by adversarial examples in ML models.
Contribution: Demonstrates how adversarial exam-
ples can signicantly boost the eectiveness of model
inversion aacks, providing insights into potential vul-
nerabilities in machine-learning systems.
Zhu et al. [202]
Gradient leakage analysis:
Exchanges gradients that
lead to the leakage of pri-
vate training data.
Methodology: Investigates deep leakage from gradi-
ents in machine-learning models.
Contribution: Analyzes how gradients can leak sensi-
tive information about training data, contributing to
the understanding of privacy risks associated with
model training.
Huang et al.
[203]
Gradient inversion aack
analysis:
Evaluates gradient inver-
sion aacks in federated
learning.
Methodology: Explores model inversion aacks en-
hanced by adversarial examples in ML models.
Contribution: Assesses the eectiveness of gradient
inversion aacks in federated learning seings and
proposes defenses to mitigate these vulnerabilities.
Wu et al. [204]
Gradient inversion aack
analysis:
Introduces a new gradient
inversion method, Learning
to Invert (LIT).
Methodology: Develops adaptive aacks for gradient
inversion in federated learning environments.
Contribution: Introduces simple adaptive aack strat-
egies to enhance the success rate of gradient inversion
aacks (gradient compression), highlighting the risks
in federated learning scenarios.
Zhu et al. [205]
Gradient inversion aack
analysis:
Proposes a generative gra-
dient inversion aack (GGI)
Methodology: Utilizes generative models to perform
gradient inversion without requiring prior knowledge
of the data distribution.
in federated learning con-
texts.
Contribution: Presents a novel aack that utilizes gen-
erative models to enhance gradient inversion aacks,
demonstrating new avenues for information leakage in
collaborative seings.
Empirical pri-
vacy loss
Yang et al. [206]
DP:
Proposes a method to en-
hance privacy by purifying
predictions.
Methodology: Proposes a defense mechanism against
model inversion and membership inference aacks
through prediction purication.
Contribution: Demonstrates that a purier dedicated
to one type of aack can eectively defend against the
other, establishing a connection between model inver-
sion and membership inference vulnerabilities, em-
ploying a prediction purication technique.
Zhang et al. [207]
DP:
Incorporates additional
noise mechanisms speci-
cally designed to counter
model inversion aacks.
Methodology: Broadens dierential privacy frame-
works to enhance protection against model inversion
aacks in deep learning.
Contribution: Introduces new techniques to
strengthen dierential privacy guarantees specically
against model inversion, improving the robustness of
deep-learning models against such aacks, and pro-
pose class and subclass DP within context of random
forest algorithms.
Reconstruction
test
Manchini et al.
[208]
DP:
Use dierential privacy in
regression models that ac-
counts for heteroscedastic-
ity.
Methodology: Proposes a new approach to data dier-
ential privacy using regression models under hetero-
scedasticity.
Contribution: Develops methods to enhance dieren-
tial privacy in regression analysis, particularly for da-
tasets with varying levels of noise, improving privacy
guarantees for ML applications.
Park et al. [139]
DP:
Evaluates the eectiveness
of dierentially private
learning models against
model inversion aacks.
Methodology: Evaluates dierentially private learning
against model inversion aacks through an aack-
based evaluation method.
Contribution: Introduces an evaluation framework
that assesses the robustness of dierentially private
models against model inversion aacks, providing in-
sights into the eectiveness of privacy-preserving tech-
niques.
Model extraction auditing
Query analysis
Carlini et al.
[101]
Model extraction aack
analysis:
Demonstrates that large lan-
guage models, such as GPT-
2, are vulnerable to training
data-extraction aacks.
Methodology: Employs a two-stage approach for
training data extraction, sux generation and sux
ranking.
Contribution: Shows that aackers can recover indi-
vidual training examples from large language models
by querying them, highlighting vulnerabilities in
model training processes and discussing potential safe-
guards.
Dziedzic et al.
[209]
Model extraction aack
analysis:
Addresses model extraction
aacks, where aackers can
steal ML models by query-
ing them.
Methodology: Proposes a calibrated proof of work
mechanism to increase the cost of model extraction at-
tacks.
Contribution: Introduces a novel approach, BDPL
(Boundary Dierential Private Layer), that raises the
resource requirements for adversaries aempting to
extract models, thereby enhancing the security of ma-
chine-learning systems against such aacks.
Li et al. [210]
Local DP:
Introduces a personalized
local dierential privacy
(PLDP) mechanism de-
signed to protect regression
models from model extrac-
tion aacks.
Methodology: Uses a novel perturbation mechanism
that adds high-dimensional Gaussian noise to the
model outputs based on personalized privacy parame-
ters.
Contribution: Personalized local dierential privacy
(PLDP) ensures that individual user data are per-
turbed before being sent to the model, thereby protect-
ing sensitive information from being extracted through
queries.
Li et al. [147]
Model extraction aack
analysis:
Proposes a framework de-
signed to protect object de-
tection models from model
extraction aacks by focus-
ing on feature space cover-
age.
Methodology: Uses a novel detection framework that
identies suspicious users based on their query trac
and feature coverage.
Contribution: Develops a detection framework that
identies suspicious users based on feature coverage
in query trac, employing an active verication mod-
ule to conrm potential aackers, thereby enhancing
the security of object detection models and distin-
guishing between malicious and benign queries.
Zheng et al. [211]
Boundary Differential Privacy (ε-BDP):
Introduces Boundary Differential Privacy (ε-BDP), which protects against model extraction attacks by obfuscating prediction responses near the decision boundary.
Methodology: Uses a perturbation algorithm called boundary randomized response, which achieves ε-BDP by adding noise to the model's outputs based on their proximity to the decision boundary.
Contribution: Introduces a novel layer that obfuscates prediction responses near the decision boundary to prevent adversaries from inferring model parameters, demonstrating effectiveness through extensive experiments.
Yan et al. [212]
DP:
Proposes a monitoring-
based dierential privacy
(MDP) mechanism that en-
hances the security of ma-
chine-learning models
against query ooding at-
tacks.
Methodology: Introduces a novel real-time model ex-
traction status assessment scheme called “Monitor”,
which evaluates the model’s exposure to potential ex-
traction based on incoming queries.
Contribution: Proposes a mechanism that monitors
query paerns to detect and mitigate model extraction
aempts, enhancing the resilience of machine-learning
models against ooding aacks.
Property inference auditing
Evaluating
property sensi-
tivity with
model outputs.
Suri et al. [213]
Distribution inference attack analysis: Investigates distribution inference attacks, which aim to infer statistical properties of the training data used by ML models.
Methodology: Introduces a distribution inference attack that infers statistical properties of training data using a KL divergence approach.
Contribution: Develops a novel black-box attack that outperforms existing white-box methods, evaluating the effectiveness of various defenses against distribution inference risks; performs disclosure at three granularities, namely distribution, user, and record levels; and proposes metrics to quantify observed leakage from models under attack (a toy rendering of the KL comparison follows below).
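A toy rendering of the KL comparison, assuming the adversary can collect the target model's outputs on a fixed probe set and compare their histogram against shadow-model histograms for two candidate training distributions; all numbers below are synthetic placeholders, and the attack in [213] is considerably more refined.

```python
# Toy distribution-inference comparison (our simplification): pick the candidate
# training distribution whose shadow-model output histogram has the smaller KL
# divergence from the target model's output histogram on a shared probe set.
import math
from collections import Counter


def histogram(samples, bins=10):
    """Smoothed empirical histogram of values in [0,1]."""
    counts = Counter(min(int(s * bins), bins - 1) for s in samples)
    total = float(len(samples))
    return [(counts.get(i, 0) + 1e-9) / total for i in range(bins)]


def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Hypothetical confidence scores on the same probe inputs.
    shadow_low_ratio = [0.20, 0.25, 0.30, 0.35, 0.30, 0.28, 0.33, 0.31, 0.27, 0.29]
    shadow_high_ratio = [0.60, 0.72, 0.68, 0.75, 0.70, 0.66, 0.74, 0.69, 0.71, 0.73]
    target_outputs = [0.65, 0.70, 0.73, 0.69, 0.68, 0.72, 0.67, 0.74, 0.70, 0.71]

    p_target = histogram(target_outputs)
    scores = {
        "low property ratio": kl_divergence(p_target, histogram(shadow_low_ratio)),
        "high property ratio": kl_divergence(p_target, histogram(shadow_high_ratio)),
    }
    print("inferred training distribution:", min(scores, key=scores.get))
```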
Property inference framework
Ganju et al. [214]
Property inference attack analysis: Explores property inference attacks on fully connected neural networks (FCNNs), demonstrating that attackers can infer global properties of the training data.
Methodology: Leverages permutation invariant representations to reduce the complexity of inferring properties from FCNNs.
Contribution: Analyzes how permutation invariant representations can be exploited to infer sensitive properties of training data, highlighting vulnerabilities in neural network architectures (the invariance idea is sketched below).
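The permutation-invariance observation can be shown in a few lines: hidden units of a fully connected layer can be reordered without changing the network's function, so sorting the units into a canonical order removes that nuisance before a meta-classifier inspects the weights. The sorting key below is our own choice and not necessarily the representation used in [214].

```python
# Minimal illustration of a permutation-invariant layer representation: two weight
# matrices that differ only by a permutation of hidden units map to the same
# canonical feature vector, which is what a property-inference meta-classifier needs.
def canonicalize_layer(weights, biases):
    """weights[i] is the incoming weight vector of hidden unit i; returns a sorted, flattened list."""
    units = [tuple(w) + (b,) for w, b in zip(weights, biases)]
    units.sort(key=lambda u: (sum(x * x for x in u), u))   # canonical order: norm, then lexicographic
    return [x for unit in units for x in unit]


if __name__ == "__main__":
    w1 = [[0.2, -0.1], [0.9, 0.4], [-0.3, 0.7]]
    b1 = [0.0, 0.1, -0.2]
    perm = [2, 0, 1]                                # reorder the hidden units
    w2 = [w1[i] for i in perm]
    b2 = [b1[i] for i in perm]
    print(canonicalize_layer(w1, b1) == canonicalize_layer(w2, b2))   # True
```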
Melis et al. [215]
Feature leakage analysis: Reveals that collaborative learning frameworks inadvertently leak sensitive information about participants’ training data through model updates.
Methodology: Uses both passive and active inference attacks to exploit unintended feature leakage.
Contribution: Examines how collaborative learning frameworks can leak sensitive features, providing insights into the risks associated with sharing models across different parties (a passive-leakage sketch follows below).
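A passive-leakage sketch in this spirit, under the simplifying assumption of a bag-of-embeddings model in which the gradient row of a token is non-zero exactly when that token appears in the participant's batch; the vocabulary and batch below are invented for illustration.

```python
# Toy illustration of unintended feature leakage from shared updates: whoever observes
# the embedding-layer gradient can read off exactly which tokens a participant used.
def embedding_gradient(vocab_size, batch_token_ids, error_signal=1.0):
    """Toy gradient of a bag-of-embeddings model: each occurrence adds error_signal to that row."""
    grad = [0.0] * vocab_size
    for token_id in batch_token_ids:
        grad[token_id] += error_signal
    return grad


def infer_tokens_from_update(grad_rows, tolerance=1e-12):
    """Recover the tokens present in the batch from the non-zero gradient rows."""
    return [i for i, g in enumerate(grad_rows) if abs(g) > tolerance]


if __name__ == "__main__":
    vocab = ["the", "patient", "has", "diabetes", "flu", "allergy"]   # hypothetical vocabulary
    private_batch = [1, 3, 3, 0]                                      # "patient diabetes diabetes the"
    update = embedding_gradient(len(vocab), private_batch)
    print("tokens leaked by the update:", [vocab[i] for i in infer_tokens_from_update(update)])
```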
Empirical evaluation of linear queries
Huang and Zhou [216]
DP: Discusses how DP mechanisms can inadvertently leak sensitive information when linear queries are involved.
Methodology: Studies unexpected information leakage in differential privacy due to linear properties of queries.
Contribution: Analyzes how certain (linear) query structures can lead to information leakage despite differential privacy guarantees, suggesting improvements for privacy-preserving mechanisms (a numerical illustration follows below).
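The effect can be illustrated numerically in a simplified setting that we assume here (independent Laplace noise on every released answer and one query that is the sum of two others): exploiting the linear dependence yields an estimate with a lower error than the single, nominally calibrated answer, which is the kind of leakage the paper warns about.

```python
# Numerical illustration of leakage through linear query structure (simplified setting):
# if Q3 = Q1 + Q2 and each answer carries independent Laplace noise, averaging the
# direct answer to Q3 with the reconstruction Q1 + Q2 reduces the error below what the
# per-query noise calibration alone would suggest.
import math
import random


def laplace(scale: float) -> float:
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


if __name__ == "__main__":
    random.seed(3)
    q1_true, q2_true = 120.0, 80.0
    q3_true = q1_true + q2_true               # Q3 is linearly dependent on Q1 and Q2
    scale, trials = 1.0, 20000

    direct_err = combined_err = 0.0
    for _ in range(trials):
        a1, a2, a3 = q1_true + laplace(scale), q2_true + laplace(scale), q3_true + laplace(scale)
        combined = 0.5 * (a3 + (a1 + a2))     # average the two estimates of Q3
        direct_err += (a3 - q3_true) ** 2
        combined_err += (combined - q3_true) ** 2

    print(f"MSE of the direct answer to Q3: {direct_err / trials:.3f}")
    print(f"MSE after exploiting linearity: {combined_err / trials:.3f}")
```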
Analysis of DP implementation
Ben Hamida et al. [217]
DP: Discusses how differential privacy (DP) enhances the privacy of machine-learning models by ensuring that individual data contributions do not significantly affect the model’s output.
Methodology: Explores various techniques for implementing DPML, including adding noise to gradients during training and employing mechanisms that ensure statistical outputs mask individual contributions.
Contribution: Explores the interplay between differential privacy techniques and their effectiveness in enhancing model security against various types of attacks (a generic DP-SGD-style step is sketched below).
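To make the "noise on clipped gradients" idea concrete, a generic DP-SGD-style training step on a one-dimensional regression model is sketched below; the clipping bound, noise multiplier, learning rate, and toy data are illustrative assumptions, not an implementation taken from [217].

```python
# Generic DP-SGD-style step (sketch): clip each per-example gradient to a norm bound,
# sum the clipped gradients, add Gaussian noise scaled to that bound, and update.
import random


def dp_sgd_step(theta, batch, clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    """One noisy step for a 1-D model y ~ theta * x with squared loss."""
    clipped_sum = 0.0
    for x, y in batch:
        g = 2.0 * (theta * x - y) * x                        # per-example gradient
        g *= min(1.0, clip_norm / (abs(g) + 1e-12))          # clip its magnitude
        clipped_sum += g
    noisy_sum = clipped_sum + random.gauss(0.0, noise_multiplier * clip_norm)
    return theta - lr * noisy_sum / len(batch)


if __name__ == "__main__":
    random.seed(4)
    data = [(0.1 * i, 2.0 * (0.1 * i) + random.gauss(0.0, 0.1)) for i in range(1, 51)]
    theta = 0.0
    for _ in range(300):
        theta = dp_sgd_step(theta, random.sample(data, 10))
    print(f"slope learned under noisy training: {theta:.2f} (true slope 2.0)")
```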
Song et al. [218]
Privacy risk evaluation:
Methodology: Conducts a systematic evaluation of privacy risks in machine-learning models across different scenarios.
Contribution: Provides a comprehensive framework for assessing the privacy risks associated with machine-learning models, identifying key vulnerabilities and suggesting mitigation strategies.
References
1. Choudhury, O.; Gkoulalas-Divanis, A.; Salonidis, T.; Sylla, I.; Park, Y.; Hsu, G.; Das, A. Differential Privacy-Enabled Federated
Learning For Sensitive Health Data. arXiv 2019, arXiv:1910.02578. Available online: https://arxiv.org/abs/1910.02578 (accessed
on 1 December 2024).
2. Dwork, C.; McSherry, F.; Nissim, K.; Smith, A. Calibrating Noise To Sensitivity In Private Data Analysis. In Theory of
Cryptography; Halevi, S., Rabin, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 265–284.
3. Williamson, S.M.; Prybutok, V. Balancing Privacy and Progress: A Review of Privacy Challenges, Systemic Oversight, and Patient Perceptions in AI-Driven Healthcare. Appl. Sci. 2024, 14, 675. https://doi.org/10.3390/app14020675.
4. Barbierato, E.; Gatti, A. The Challenges of Machine Learning: A Critical Review. Electronics 2024, 13, 416.
https://doi.org/10.3390/electronics13020416.
5. Noor, M.H.M.; Ige, A.O. A Survey on State-of-the-art Deep Learning Applications and Challenges. arXiv 2024, arXiv:2403.17561.
Available online: https://arxiv.org/abs/2403.17561 (accessed on 1 December 2024).
6. Du Pin Calmon, F.; Fawaz, N. Privacy Against Statistical Inference. In Proceedings of the 2012 50th Annual Allerton Conference
on Communication, Control, and Computing, Allerton, Monticello, IL, USA, 1–5 October 2012; pp. 1401–1408.
7. Dehghani, M.; Azarbonyad, H.; Kamps, J.; de Rijke, M. Share your Model instead of your Data: Privacy Preserving Mimic Learning for Ranking. arXiv 2017, arXiv:1707.07605. Available online: https://arxiv.org/abs/1707.07605 (accessed on 1 December 2024).
8. Bouke, M.; Abdullah, A. An Empirical Study Of Pattern Leakage Impact During Data Preprocessing on Machine Learning-
Based Intrusion Detection Models Reliability. Expert Syst. Appl. 2023, 230, 120715. https://doi.org/10.1016/j.eswa.2023.120715.
9. Xu, J.; Wu, Z.; Wang, C.; Jia, X. Machine Unlearning: Solutions and Challenges. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8,
2150–2168.
10. Yeom, S.; Giacomelli, I.; Fredrikson, M.; Jha, S. Privacy risk in machine learning: Analyzing the connection to overfitting. In
Proceedings of the 2018 IEEE 31st Computer Security Foundations Symposium (CSF), Oxford, UK, 9–12 July 2018; pp. 268–282.
11. Li, Y.; Yan, H.; Huang, T.; Pan, Z.; Lai, J.; Zhang, X.; Chen, K.; Li, J. Model Architecture Level Privacy Leakage In Neural
Networks. Sci. China Inf. Sci. 2024, 67, 3.
12. Del Grosso, G.; Pichler, G.; Palamidessi, C.; Piantanida, P. Bounding information leakage in machine learning. Neurocomputing
2023, 534, 1–17. https://doi.org/10.1016/j.neucom.2023.02.058.
13. McSherry, F.; Talwar, K. Mechanism Design via Differential Privacy. In Proceedings of the 48th Annual IEEE Symposium on
Foundations of Computer Science (FOCS 2007), Providence, RI, USA, 20–23 October 2007; pp. 94–103.
14. Mulder, V.; Humbert, M. Differential privacy. In Trends in Data Protection and Encryption Technologies; Springer:
Berlin/Heidelberg, Germany, 2023; pp. 157–161.
15. Gong, M.; Xie, Y.; Pan, K.; Feng, K.; Qin, A. A Survey on Differential Private Machine Learning. IEEE Comput. Intell. Mag. 2020,
15, 49–64.
16. Liu, B.; Ding, M.; Shaham, S.; Rahayu, W.; Farokhi, F.; Lin, Z. When Machine Learning Meets Privacy: A Survey and Outlook.
ACM Comput. Surv. 2021, 54, 31:1–31:36. https://doi.org/10.1145/3436755.
17. Blanco-Justicia, A.; Sanchez, A.; Domingo-Ferrer, J.; Muralidhar, K. A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning. ACM Comput. Surv. 2023, 55, 1–16. https://doi.org/10.1145/3547139.
18. Zheng, H.; Ye, Q.; Hu, H.; Fang, C.; Shi, J. Protecting Decision Boundary of Machine Learning Model With Differential Private
Perturbation. IEEE Trans. Dependable Secur. Comput. 2022, 19, 2007–2022. https://doi.org/10.1109/TDSC.2022.3143927.
19. Ponomareva, N.; Hazimeh, H.; Kurakin, A.; Xu, Z.; Denison, C.; McMahan, H.B.; Vassilvitskii, S.; Chien, S.; Thakurta, A.G. A
Practical Guide to Machine Learning with Differential Privacy. J. Artif. Intell. Res. 2023, 77, 1113–1201.
https://doi.org/10.1613/jair.1.14649.
20. Choquette-Choo, C.A.; Dullerud, N.; Dziedzic, A.; Zhang, Y.; Jha, S.; Papernot, N.; Wang, X. CaPC Learning: Confidential and
Private Collaborative Learning. In Proceedings of the International Conference on Learning Representations (ICLR), Vienna,
Austria, 4 May 2021.
21. Jayaraman, B.; Evans, D. Evaluating Differentially Private Machine Learning in Practice. In Proceedings of the 28th USENIX
Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 1895–1912.
https://doi.org/10.5555/3361338.3361469.
22. Sablayrolles, A.; Douze, M.; Schmid, C.; Ollivier, Y.; Jégou, H. White-box vs black-box: Bayes optimal strategies for membership
inference. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019;
pp. 5558–5567.
23. Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep Learning with Differential Privacy.
In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS), Vienna, Austria, 24–
28 October 2016; pp. 308–318. https://doi.org/10.1145/2976749.2978318.
24. Bagdasaryan, E. Differential Privacy Has Disparate Impact on Model Accuracy. Adv. Neural Inf. Process. Syst. 2019, 32, 161263.
https://doi.org/10.5555/3454287.3455674.
25. Tran, C.; Dinh, M.H. Differential Private Empirical Risk Minimization under the Fairness Lens. Adv. Neural Inf. Process. Syst.
2021, 33, 27555–27565. https://doi.org/10.5555/3540261.3542371.
26. Bichsel, B.; Steffen, S.; Bogunovic, I.; Vechev, M. DP-Sniper: Black-Box Discovery of Differential Privacy Violations Using
Classifiers. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021;
pp. 391–409. https://doi.org/10.1109/SP46214.2021.00042.
27. Dwork, C. Differential Privacy. In Automata, Languages and Programming; Bugliesi, M., Preneel, B., Sassone, V., Wegener, I., Eds.;
Lecture Notes in Computer Science; Springer: Berlin, Germany, 2006; pp. 1–12. https://doi.org/10.1007/11787006_1.
28. He, J.; Cai, L.; Guan, X. Differential Private Noise Adding Mechanism and Its Application on Consensus Algorithm. IEEE Trans.
Signal Process. 2020, 68, 4069–4082.
29. Wang, R.; Fung, B.C.M.; Zhu, Y.; Peng, Q. Differentially Private Data Publishing for Arbitrary Partitioned Data. Inf. Sci. 2021,
553, 247–265. https://doi.org/10.1016/j.ins.2020.10.051.
30. Baraheem, S.S.; Yao, Z. A Survey on Differential Privacy with Machine Learning and Future Outlook. arXiv 2022,
arXiv:2211.10708. Available online: https://arxiv.org/abs/2211.10708 (accessed on 1 December 2024).
31. Dwork, C.; Roth, A. The Algorithmic Foundations Of Differential Privacy. Found. Trends Theor. Comput. Sci. 2014, 9, 211–407.
Available online: https://www.nowpublishers.com/article/Details/TCS-042 (accessed on 1 December 2024).
32. Chadha, K.; Jagielski, M.; Papernot, N.; Choquette-Choo, C.A.; Nasr, M. Auditing Private Prediction. arXiv 2024,
arXiv:2402.09403. https://doi.org/10.48550/arXiv.2402.09403.
33. Papernot, N.; Abadi, M.; Erlingsson, Ú.; Goodfellow, I.; Talwar, K. Semi-Supervised Knowledge Transfer for Deep Learning from
Private Training Data. International Conference on Learning Representations. 2016. Available online:
https://openreview.net/forum?id=HkwoSDPgg (accessed on 1 December 2024).
34. Bernau, D.; Robl, J.; Grassal, P.W.; Schneider, S.; Kerschbaum, F. Comparing Local and Central Differential Privacy Using
Membership Inference Attacks. In IFIP Annual Conference on Data and Applications Security and Privacy; Springer:
Berlin/Heidelberg, Germany, 2021; pp. 22–42. https://doi.org/10.1007/978-3-030-81242-3_2.
35. Hsu, J.; Gaboardi, M.; Haeberlen, A.; Khanna, S.; Narayan, A.; Pierce, B.C.; Roth, A. Differential Privacy: An economic method
for choosing epsilon. In Proceedings of the Computer Security Foundations Workshop, Vienna, Austria, 19–22 July 2014; pp.
398–410. https://doi.org/10.1109/CSF.2014.35.
36. Mehner, L.; Voigt, S.N.V.; Tschorsch, F. Towards Explaining Epsilon: A Worst-Case Study of Differential Privacy Risks. In
Proceedings of the 2021 IEEE European Symposium on Security and Privacy Workshop, Euro S and PW, Virtual, 6–10
September 2021; pp. 328–331.
37. Busa-Fekete, R.I.; Dick, T.; Gentile, C.; Medina, A.M.; Smith, A.; Swanberg, M. Auditing Privacy Mechanisms via Label Inference
Attacks. arXiv 2024, arXiv:2406.02797. Available online: https://arxiv.org/abs/2406.02797 (accessed on 1 December 2024).
38. Desfontaines, D.; Pejó, B. SoK: Differential Privacies. arXiv 2022, arXiv:1906.01337. Available online:
https://arxiv.org/abs/1906.01337 (accessed on 1 December 2024).
39. Lycklama, H.; Viand, A.; Küchler, N.; Knabenhans, C.; Hithnawi, A. Holding Secrets Accountable: Auditing Privacy-Preserving
Machine Learning. arXiv 2024, arXiv:2402.15780. Available online: https://arxiv.org/abs/2402.15780 (accessed on 1 December
2024).
40. Kong, W.; Medina, A.M.; Ribero, M.; Syed, U. DP-Auditorium: A Large Scale Library for Auditing Differential Privacy. arXiv
2023, arXiv:2307.05608. Available online: https://arxiv.org/abs/2307.05608 (accessed on 1 December 2024).
41. Ha, T.; Vo, T.; Dang, T.K. Differential Privacy Under Membership Inference Attacks. Commun. Comput. Inf. Sci. 2023, 1925, 255–
269.
42. Nasr, M.; Hayes, J.; Steinke, T.; Balle, B.; Tramer, F.; Jagielski, M.; Carlini, N.; Terzis, A. Tight Auditing of Differentially Private
Machine Learning. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–
11 August 2023; pp. 1631–1648.
43. Steinke, T.; Nasr, M.; Jagielski, M. Privacy Auditing with One (1) Training Run. arXiv 2023, arXiv:2305.08846. Available online:
https://arxiv.org/abs/2305.08846 (accessed on 1 December 2024).
44. Wairimu, S.; Iwaya, L.H.; Fritsch, L.; Lindskog, S. Assessment and Privacy Risk Assessment Methodologies: A Systematic
Literature Review. IEEE Access 2024, 12, 19625–19650. https://doi.org/10.1109/ACCESS.2024.3360864.
45. Ding, Z.; Wang, Y.; Wang, G.; Zhang, D.; Kifer, D. Detecting Violations Of Differential Privacy. In Proceedings of the 2018 ACM
SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 475–489.
https://doi.org/10.1145/3243734.3243818.
46. Jagielski, M.; Ullman, J.; Oprea, A. Auditing Differentially Private Machine Learning: How Private is Private sgd? Adv. Neural
Inf. Process. Syst. 2020, 33, 22205–22216. https://doi.org/10.48550/arXiv.2006.07709.
47. Nasr, M.; Song, S.; Thakurta, A.; Papernot, N.; Carlini, N. Adversary instantiation: Lower bounds for differentially private
machine learning. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27
May 2021; pp. 866–882.
48. Lokna, J.; Paradis, A.; Dimitrov, D.I.; Vechev, M. Group and Attack: Auditing Differential Privacy. In Proceedings of the 2023
ACM SIGSAC Conference on Computer and Communications Security (CCS ’23), Copenhagen, Denmark, 26–30 November
2023; ACM: New York, NY, USA, 2023; pp. 1–22. https://dl.acm.org/doi/10.1145/3576915.3616607.
49. Tramèr, F.; Terzis, A.; Steinke, T.; Song, S.; Jagielski, M.; Carlini, N. Debugging differential privacy: A case study for privacy
auditing. arXiv 2022, arXiv:2202.12219. Available online: https://arxiv.org/abs/2202.12219 (accessed on 1 December 2024).
50. Kifer, D.; Messing, S.; Roth, A.; Thakurta, A.; Zhang, D. Guidelines for Implementing and Auditing Differentially Private
Systems. arXiv 2020, arXiv:2002.04049. Available online: https://arxiv.org/abs/2002.04049 (accessed on 1 December 2024).
51. Homer, N.; Szelinger, S.; Redman, M.; Duggan, D.; Tembe, W.; Muehling, J.; Pearson, J.V.; Stephan, D.A.; Nelson, S.F.; Craig,
D.W. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP
genotyping microarrays. PLoS Genet. 2008, 4, e1000167. https://doi.org/10.1371/journal.pgen.1000167.
52. Shokri, R.; Stronati, M.; Song, C.; Shmatikov, V. Membership Inference Attacks against Machine Learning Models. In
Proceedings of the 2017 IEEE Symposium on Security and Privacy (S&P), San Jose, CA, USA, 22–26 May 2017; pp. 3–18.
53. Cui, G.; Ge, L.; Zhao, Y.; Fang, T. A Membership Inference Attack Defense Method Based on Differential Privacy and Data
Enhancement. In Proceedings of the Communication in Computer and Information Science, Manchester, UK, 9–11 September
2024; Volume 2015 CCIS, pp. 258–270.
54. Ma, Y.; Zhu, X.; Hsu, J. Data Poisoning against Differentially-Private Learners: Attacks and Defences. arXiv 2019,
arXiv:1903.09860. Available online: https://arxiv.org/abs/1903.09860 (accessed on 1 December 2024).
55. Cinà, A.E.; Grosse, K.; Demontis, A.; Biggio, B.; Roli, F.; Pelillo, M. Machine Learning Security Against Data Poisoning: Are We
There Yet? Computer 2024, 7, 26–34. https://doi.org/10.1109/MC.2023.3299572.
56. Cheng, Z.; Li, Z.; Zhang, L.; Zhang, S. Differentially Private Machine Learning Model against Model Extraction Attack. In Proceedings of the 2020 IEEE International Conferences on Internet of Things (iThings), IEEE Green Computing and Communications (GreenCom), IEEE Cyber, Physical and Social Computing (CPSCom), IEEE Smart Data (SmartData) and IEEE Congress on Cybermatics (Cybermatics), Rhodes, Greece, 2020; pp. 722–728. Available online: https://ieeexplore.ieee.org/document/9291542 (accessed on 9 January 2025).
57. Miura, T.; Hasegawa, S.; Shibahara, T. MEGEX: Data-free model extraction attack against gradient-based explainable AI. arXiv
2021, arXiv:2107.08909. Available online: https://arxiv.org/abs/2107.08909 (accessed on 1 December 2024).
58. Ye, Z.; Luo, W.; Naseem, M.L.; Yang, X.; Shi, Y.; Jia, Y. C2FMI: Coarse-to-Fine Black-Box Model Inversion Attack. IEEE Trans. Dependable Secur. Comput. 2024, 21, 1437–1450. Available online: https://ieeexplore.ieee.org/document/10148574
(accessed on 9 January 2025).
59. Qiu, Y.; Yu, H.; Fang, H.; Yu, W.; Chen, B.; Wang, X.; Xia, S.-T.; Xu, K. MIBench: A Comprehensive Benchmark for Model
Inversion Attack and Defense. arXiv 2024, arXiv:2410.05159. Available online: https://arxiv.org/abs/2410.05159 (accessed on 9
January 2025).
60. Stock, J.; Lange, L.; Erhard, R.; Federrath, H. Property Inference as a Regression Problem: Attacks and Defense. In Proceedings
of the International Conference on Security and Cryptography, Bengaluru, India, 18–19 April 2024; pp. 876–885. Available
online: https://www.scitepress.org/publishedPapers/2024/128638/pdf/index.html (accessed on 30 December 2024).
61. Lu, F.; Munoz, J.; Fuchs, M.; LeBlond, T.; Zaresky-Williams, E.; Raff, E.; Ferraro, F.; Testa, B. A General Framework for Auditing
Differentially Private Machine Learning. In Advances in Neural Information Processing Systems; Oh, A.H., Belgrave, A., Cho, K.,
Eds.; The MIT Press: Cambridge, MA, USA, 2022. Available online: https://openreview.net/forum?id=AKM3C3tsSx3 (accessed
on 1 December 2024).
62. Zanella-Béguelin, S.; Wutschitz, L.; Tople, S.; Salem, A.; Rühle, V.; Paverd, A.; Naseri, M.; Köpf, B.; Jones, D. Bayesian Estimation
Of Differential Privacy. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29
July 2023; Volume 202, pp. 40624–40636.
63. Cowan, E.; Shoemate, M.; Pereira, M. Hands-On Differential Privacy; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2024; ISBN
9781492097747.
64. Bailie, J.; Gong, R. Differential Privacy: General Inferential Limits via Intervals of Measures. Proc. Mach. Learn. Res. 2023, 215,
11–24. Available online: https://proceedings.mlr.press/v215/bailie23a/bailie23a.pdf (accessed on 30 December 2024).
65. Kilpala, M.; Kärkkäinen, T. Artificial Intelligence and Differential Privacy: Review of Protection Estimate Models. In Artificial Intelligence for Security: Enhancing Protection in a Changing World; Springer Nature Switzerland: Cham, Switzerland, 2024; pp.
35–54.
66. Balle, B.; Wang, Y.-X. Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal
Denoising. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July
2018; pp. 394–403. Available online: http://proceedings.mlr.press/v80/balle18a/balle18a.pdf (accessed on 30 December 2024).
67. Chen, B.; Hale, M. The Bounded Gaussian Mechanism for Differential Privacy. J. Priv. Confidentiality 2024, 14, 1.
https://doi.org/10.29012/jpc.850.
68. Zhang, K.; Zhang, Y.; Sun, R.; Tsai, P.-W.; Ul Hassan, M.; Yuan, X.; Xue, M.; Chen, J. Bounded and Unbiased Composite
Differential Privacy. In Proceedings of the IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 19–23 May 2024;
pp. 972–990.
69. Nanayakkara, P.; Smart, M.A.; Cummings, R.; Kaptchuk, G. What Are the Chances? Explaining the Epsilon Parameter in
Differential Privacy. In Proceedings of the 32nd USENIX Security Symposium, Anaheim, CA, USA, 9–11 August 2023; Volume
3, pp. 1613–1630. https://doi.org/10.5555/3620237.3620328.
70. Canonne, C.; Kamath, G.; McMillan, A.; Smith, A.; Ullman, J. The Structure of Optimal Private Tests for Simple Hypotheses.
In Proceedings of the Annual ACM Symposium on Theory of Computing, Phoenix, AZ, USA, 23–26 June 2019; pp. 310–321.
Available online: https://arxiv.org/abs/1811.11148 (accessed on 30 December 2024).
71. Dwork, C.; Feldman, V. Privacy-preserving Prediction. arXiv 2018, arXiv:1803.10266. Available online:
https://arxiv.org/abs/1803.10266 (accessed on 1 December 2024).
72. Mironov, I. Rényi Differential Privacy. In Proceedings of the 30th IEEE Computer Security Foundations Symposium, CSF, Santa
Barbara, CA, USA, 21–25 August 2017; pp. 263–275. https://doi.org/10.1109/CSF.2017.33.
73. Sarathy, R.; Muralidhar, K. Evaluating Laplace noise addition to satisfy differential privacy for numeric data. Trans. Data Priv.
2011, 4, 1–17. https://doi.org/10.2202/tdp.2011.001.
74. Kumar, G.S.; Premalatha, K.; Uma Maheshwari, G.; Rajesh Kanna, P.; Vijaya, G.; Nivaashini, M. Differential privacy scheme
using Laplace mechanism and statistical method computation in deep neural network for privacy preservation. Eng. Appl. Artif.
Intell. 2024, 128, 107399. https://doi.org/10.1016/j.engappai.2023.107399.
75. Liu, F. Generalized Gaussian Mechanism for Differential Privacy. IEEE Trans. Knowl. Data Eng. 2018, 31, 747–756.
https://doi.org/10.1109/TKDE.2018.2845388.
76. Dong, J.; Roth, A.; Su, W.J. Gaussian Differential privacy. arXiv 2019, arXiv:1905.02383. Available online:
https://arxiv.org/abs/1905.02383 (accessed on 1 December 2024).
77. Geng, Q.; Ding, W.; Guo, R.; Kumar, S. Tight Analysis of Privacy and Utility Tradeoff in Approximate Differential Privacy. Proc.
Mach. Learn. Res. 2020, 108, 89–99. Available online: http://proceedings.mlr.press/v108/geng20a/geng20a.pdf (accessed on 30
December 2024).
78. Whitehouse, J.; Ramdas, A.; Rogers, R.; Wu, Z.S. Fully-Adaptive Composition in Differential Privacy. arXiv 2023,
arXiv:2203.05481. Available online: https://arxiv.org/abs/2203.05481 (accessed on 30 December 2024).
79. Dwork, C.; Kenthapadi, K.; McSherry, F.; Mironov, I.; Naor, M. Our Data, Ourselves: Privacy Via Distributed Noise Generation.
In Advances in Cryptology—EUROCRYPT; Vaudenay, S., Ed.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 486–503.
80. Zhu, K.; Fioretto, F.; Van Hentenryck, P. Post-processing of Differentially Private Data: A Fairness Perspective. In Proceedings
of the 31st International Joint Conference on Artificial Intelligence (IJCAI), Vienna, Austria, 23–29 July 2022; pp. 4029–4035.
https://doi.org/10.24963/ijcai.2022/559.
81. Ganev, G.; Annamalai, M.S.M.S.; De Cristofaro, E. The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing,
Debugging. arXiv 2024, arXiv:2406.13985. Available online: https://arxiv.org/abs/2406.13985 (accessed on 1 December 2024).
82. Naseri, M.; Hayes, J.; De Cristofaro, E. Local and Central Differential Privacy for Robustness and Privacy in Federated Learning.
arXiv 2022, arXiv:2009.03561. Available online: https://arxiv.org/abs/2009.03561 (accessed on 1 December 2024).
83. Bebensee, B. Local Differential Privacy: A tutorial. arXiv 2019, arXiv:1907.11908. Available online:
https://arxiv.org/abs/1907.11908 (accessed on 1 December 2024).
84. Nasr, M.; Shokri, R.; Houmansadr, A. Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box
Inference Attacks against Centralized and Federated Learning. arXiv 2020, arXiv:1812.00910. Available online:
https://arxiv.org/abs/1812.00910 (accessed on 1 December 2024).
85. Galli, F.; Biswas, S.; Jung, K.; Cucinotta, T.; Palamidessi, C. Group privacy for personalized federated learning. arXiv 2022,
arXiv:2206.03396. Available online: https://arxiv.org/abs/2206.03396 (accessed on 1 December 2024).
86. Cormode, G.; Jha, S.; Kulkarni, T.; Li, N.; Srivastava, D.; Wang, T. Privacy At Scale: Local Differential Privacy in Practice. In
Proceedings of the ACM SIGMOD International Conference on Management of Data, Houston, TX, USA, 10–15 June 2018; pp.
1655–1658. https://doi.org/10.1145/3183713.3197390.
87. Yang, M.; Guo, T.; Zhu, T.; Tjuawinata, I.; Zhao, J.; Lam, K.-Y. Local Differential Privacy And Its Applications: A Comprehensive
Survey. Comput. Stand. Interfaces 2024, 89, 103827. https://doi.org/10.1016/j.csi.2023.103827.
88. Duchi, J.; Wainwright, M.J.; Jordan, M.I. Local Privacy And Minimax Bounds: Sharp Rates For Probability Estimation. Adv.
Neural Inf. Process. Syst. 2013, 26, 1529–1537. https://doi.org/10.5555/2999611.2999782.
89. Ruan, W.; Xu, M.; Fang, W.; Wang, L.; Wang, L.; Han, W. Private, Efficient, and Accurate: Protecting Models Trained by Multi-
party Learning with Differential Privacy. In Proceedings of the–IEEE Symposium on Security and Privacy, San Francisco, CA,
USA, 21–25 May 2023; pp. 1926–1943.
90. Pan, K.; Ong, Y.-S.; Gong, M.; Li, H.; Qin, A.K.; Gao, Y. Differential privacy in deep learning: A literature review. Neurocomputing
2024, 589, 127663. https://doi.org/10.1016/j.neucom.2024.127663.
91. Kang, Y.; Liu, Y.; Niu, B.; Tong, X.; Zhang, L.; Wang, W. Input Perturbation: A New Paradigm between Central and Local
Differential Privacy. arXiv 2020, arXiv:2002.08570. Available online: https://arxiv.org/abs/2002.08570 (accessed on 1 December
2024).
92. Chaudhuri, K.; Monteleoni, C.; Sarwate, A.D. Differentially Private Empirical Risk Minimization. J. Mach. Learn. Res. 2011, 12,
1069–1109. https://doi.org/10.5555/1953048.2021036.
93. De Cristofaro, E. Critical Overview of Privacy in Machine Learning. IEEE Secur. Priv. 2021, 19, 19–27.
https://doi.org/10.1109/MSEC.2021.9433648.
94. Shen, Z.; Zhong, T. Analysis of Application Examples of Differential Privacy in Deep Learning. Comput. Intell. Neurosci. 2021,
2021, e4244040. https://doi.org/10.1155/2021/4244040.
95. Rigaki, M.; Garcia, S. A Survey of Privacy Attacks in Machine Learning. ACM Comput. Surv. 2023, 56, 101.
https://doi.org/10.1145/3624010.
96. Wu, D.; Qi, S.; Li, Q.; Cai, B.; Guo, Q.; Cheng, J. Understanding and Defending against White-Box Membership Inference Attack
in Deep Learning. Knowl. Based Syst. 2023, 259, 110014. https://doi.org/10.1016/j.knosys.2022.110014.
97. Fang, H.; Qiu, Y.; Yu, H.; Yu, W.; Kong, J.; Chong, B.; Chen, B.; Wang, X.; Xia, S.-T. Privacy Leakage on DNNs: A Survey of
Model Inversion Attacks and Defenses. arXiv 2024, arXiv:2402.04013. Available online: https://arxiv.org/abs/2402.04013
(accessed on 1 December 2024).
98. He, X.-M.; Wang, X.S.; Chen, H.-H.; Dong, Y.-H. Study on Choosing the Parameter in Differential Privacy. Tongxin Xuebao/J.
Commun. 2015, 36, 12.
99. Mazzone, F.; Al Badawi, A.; Polyakov, Y.; Everts, M.; Hahn, F.; Peter, A. Investigating Privacy Attacks in the Gray-Box Setting
to Enhance Collaborative Learning Schemes. arXiv 2024, arXiv:2409.17283. https://arxiv.org/abs/2409.17283 (accessed on 9
January 2025).
100. Fredrikson, M.; Jha, S.; Ristenpart, T. Model inversion attacks that exploit confidence information and basic countermeasures.
In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, ACM, Denver, CO, USA,
12–16 October 2015; pp. 1322–1333.
101. Carlini, N.; Tramèr, F.; Wallace, E.; Jagielski, M.; Herbert-Voss, A.; Lee, K.; Roberts, A.; Brown, T.; Song, D.; Erlingsson, U.;
Oprea, A.; Raffel, C. Extracting training data from large language models. arXiv 2020, arXiv:2012.07805. Available online:
https://arxiv.org/abs/2012.07805 (accessed on 1 December 2024).
102. Carlini, N.; Liu, C.; Erlingsson, Ú.; Kos, J.; Song, D. The secret sharer: Evaluating and testing unintended memorization in neural
networks. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August
2019; pp. 267–284.
103. Carlini, N.; Chien, S.; Nasr, M.; Song, S.; Terzis, A.; Tramèr, F. Membership Inference Attacks from First Principles. arXiv 2021,
arXiv:2112.03570. Available online: https://arxiv.org/abs/2112.03570 (accessed on 9 January 2025).
104. Hu, H.; Salcic, Z.; Sun, L.; Dobbie, G.; Yu, P.S.; Zhang, X. Membership Inference Attacks on Machine Learning: A Survey. arXiv
2022, arXiv:2103.07853. Available online: https://arxiv.org/abs/2103.07853 (accessed on 1 December 2024).
105. Zarifzadeh, S.; Liu, P.; Shokri, R. Low-Cost High-Power Membership Inference Attacks. arXiv 2023, arXiv:2312.03262. Available
online: https://arxiv.org/abs/2312.03262 (accessed on 1 December 2024).
106. Aubinais, E.; Gassiat, E.; Piantanida, P. Fundamental Limits of Membership Inference attacks on Machine Learning Models.
arXiv 2024, arXiv:2310.13786. Available online: https://arxiv.org/html/2310.13786v4 (accessed on 1 December 2024).
107. Leino, K.; Fredrikson, M. Stolen memories: Leveraging model memorization for calibrated white box membership inference. In
Proceedings of the 29th {USENIX} Security Symposium {USENIX} Security 20, Online, 12–14 August 2020; pp. 1605–1622.
Available online: https://www.usenix.org/conference/usenixsecurity20/presentation/leino (accessed on 28 December 2024).
108. Liu, R.; Wang, D.; Ren, Y.; Wang, Z.; Guo, K.; Qin, Q.; Liu, X. Unstoppable Attack: Label-Only Model Inversion via Conditional
Diffusion Model. IEEE Trans. Inf. Forensics Secur. 2024, 19, 3958–3973. https://doi.org/10.1109/TIFS.2024.3372815.
109. Annamalai, M.S.M.S. Nearly Tight Black-Box Auditing of Differentially Private Machine Learning. arXiv 2024, arXiv:2405.14106.
Available online: https://arxiv.org/abs/2405.14106 (accessed on 1 December 2024).
110. Lin, S.; Bun, M.; Gaboardi, M.; Kolaczyk, E.D.; Smith, A. Differentially Private Confidence Intervals for Proportions Under Stratified Random Sampling. Electron. J. Stat. 2024, 18, 1455–1494. https://doi.org/10.1214/24-EJS2234.
111. Wang, J.T.; Mahloujifar, S.; Wu, T.; Jia, R.; Mittal, P. A Randomized Approach to Tight Privacy Accounting. arXiv 2023,
arXiv:2304.07927. Available online: https://arxiv.org/abs/2304.07927 (accessed on 1 December 2024).
112. Salem, A.; Zhang, Y.; Humbert, M.; Fritz, M.; Backes, M. ML-Leaks: Model and Data Independent Membership Inference
Attacks and Defenses on Machine Learning Models. arXiv 2019, arXiv:1806.01246. Available online:
https://arxiv.org/abs/1806.01246 (accessed on 9 January 2025).
113. Ye, D.; Shen, S.; Zhu, T.; Liu, B.; Zhou, W. One Parameter Defense—Defending against Data Inference Attacks via Differential
Privacy. arXiv 2022, arXiv:2203.06580. Available online: https://arxiv.org/abs/2203.06580 (accessed on 1 December 2024).
114. Cummings, R.; Desfontaines, D.; Evans, D.; Geambasu, R.; Huang, Y.; Jagielski, M.; Kairouz, P.; Kamath, G.; Oh, S.; Ohrimenko,
O.; et al. Advancing Differential Privacy: Where We are Now and Future Directions. Harv. Data Sci. Rev. 2024, 6, 475–489.
https://doi.org/10.1162/99608f92.d3197524.
115. Zhang, G.; Liu, B.; Zhu, T.; Ding, M.; Zhou, W. Label-Only Membership Inference attacks and Defense in Semantic Segmentation
Models. IEEE Trans. Dependable Secur. Comput. 2023, 20, 1435–1449. https://doi.org/10.1109/TDSC.2023.00049.
116. Wu, Y.; Qiu, H.; Guo, S.; Li, J.; Zhang, T. You Only Query Once: An Efficient Label-Only Membership Inference Attack. In
Proceedings of the 12th International Conference on Learning Representations, ICLR 2024, Hybrid, Vienna, 7–11 May 2024.
Available online: https://openreview.net/forum?id=7WsivwyHrS&noteId=QjoAoa8UVW (accessed on 30 December 2024).
117. Li, N.; Qardaji, W.; Su, D.; Wu, Y.; Yang, W. Membership privacy: A Unifying Framework for Privacy Definitions. In Proceedings
of the ACM Conference on Computers and Communication Security (CCS), Berlin, Germany, 4–8 November 2013; pp. 889–900.
https://doi.org/10.1145/2508859.2516686.
118. Andrew, G.; Kairouz, P.; Oh, S.; Oprea, A.; McMahan, H.B.; Suriyakumar, V. One-shot Empirical Privacy for Federated
Learning. arXiv 2024, arXiv:2302.03098.
119. Patel, N.; Shokri, R.; Zick, Y. Model Explanations with Differential Privacy. In Proceedings of the 2022 ACM Conference on
Fairness, Accountability, and Transparency (FAccT ’22), Seoul, Republic of Korea, 21–24 June 2022; ACM: New York, NY, USA,
2022; 10p. https://doi.org/10.1145/3531146.3533235.
120. Ding, Z.; Tian, Y.; Wang, G.; Xiong, J. Regularization Mixup Adversarial Training: A Defense Strategy for Membership Privacy
with Model Availability Assurance. In Proceedings of the 2024 2nd International Conference on Big Data and Privacy Computing,
BDPC, Macau, China, 10–12 January 2024; pp. 206–212.
121. Qiu, W. A Survey on Poisoning Attacks Against Supervised Machine Learning. arXiv 2022, arXiv:2202.02510. Available online:
https://arxiv.org/abs/2202.02510 (accessed on 9 January 2025).
122. Zhao, B. Towards Class-Oriented Poisoning Attacks Against Neural Networks. In Proceedings of the 2022 IEEE/CVF Winter
Conference on Application of Computer Vision, WACV, Waikoloa, HI, USA, 3–8 January 2022; pp. 2244–2253.
https://doi.org/10.1109/WACV51468.2022.00244.
123. Koh, P.W.; Steinhardt, J.; Liang, P. Stronger data poisoning attacks break data sanitization defenses. arXiv 2021, arXiv:1811.00741.
Available online: https://arxiv.org/abs/1811.00741 (accessed on 1 December 2024).
124. Zhang, R.; Gou, S.; Wang, J.; Xie, X.; Tao, D. A Survey on Gradient Inversion Attacks, Defense and Future Directions. In
Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-22), 2022; pp. 5678–5685. Available online:
https://www.ijcai.org/proceedings/2022/0791.pdf (accessed on 10 January 2025).
125. Yan, H.; Wang, Y.; Yao, L.; Zhong, X.; Zhao, J. A Stationary Random Process based Privacy-Utility Tradeoff in Differential Privacy. In Proceedings of the 2023 International Conference on High Performance Big Data and Intelligence Systems, HDIS
2023, Macau, China, 6–8 December 2023; pp. 178–185.
126. D’Oliveira, R.G.L.; Salamatian, S.; Médard, M. Low Influence, Utility, and Independence in Differential Privacy: A Curious Case
of (32). IEEE J. Sel. Areas Inf. Theory 2021, 2, 240–252. https://doi.org/10.1109/JSAIT.2021.3083939.
127. Chen, M.; Liu, C.; Li, B.; Lu, K.; Song, D. Targeted Backdoor attacks on deep learning systems using data poisoning. arXiv 2017,
arXiv:1712.05526. Available online: https://arxiv.org/abs/1712.05526 (accessed on 1 December 2024).
128. Feng, S.; Tramèr, F. Privacy Backdoors: Stealing Data with Corrupted Pretrained Models. arXiv 2024, arXiv:2404.00473.
Available online: https://arxiv.org/abs/2404.00473 (accessed on 1 December 2024).
129. Gu, T.; Dolan-Gavitt, B.; Garg, S. BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv 2019,
arXiv:1708.06733. Available online: https://arxiv.org/abs/1708.06733 (accessed on 1 December 2024).
130. Demelius, L.; Kern, R.; Trügler, A. Recent Advances of Differential Privacy in Centralized Deep Learning: A Systematic Survey.
arXiv 2023, arXiv:2309.16398. Available online: https://arxiv.org/abs/2309.16398 (accessed on 1 December 2024).
131. Oprea, A.; Singhal, A.; Vassilev, A. Poisoning attacks against machine learning: Can machine learning be trustworthy? Computer
2022, 55, 94–99, Available online: https://ieeexplore.ieee.org/document/9928202 (accessed on 1 December 2024).
132. Salem, A.; Wen, R.; Backes, M.; Ma, S.; Zhang, Y. Dynamic Backdoor Attacks Against Machine Learning Models. In Proceedings
of the IEEE European Symposium Security Privacy (EuroS&P), Genoa, Italy, 6–10 June 2022; pp. 703–718.
https://doi.org/10.1109/EuroSP53844.2022.00049.
133. Xu, X.; Chen, Y.; Wang, B.; Bian, Z.; Han, S.; Dong, C.; Sun, C.; Zhang, W.; Xu, L.; Zhang, P. CSBA: Covert Semantic Backdoor
Attack Against Intelligent Connected Vehicles. IEEE Trans. Veh. Technol. 2024, 73, 17923–17928.
https://doi.org/10.1109/TVT.2024.10598360.
134. Li, X.; Li, N.; Sun, W.; Gong, N.Z.; Li, H. Fine-grained Poisoning attack to Local Differential Privacy Protocols for Mean and
Variance Estimation. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security), Anaheim, CA, USA, 9–11
August 2023; Volume 3, pp. 1739–1756. Available online: https://www.usenix.org/conference/usenixsecurity23/presentation/li-
xiaoguang (accessed on 30 December 2024).
135. Fang, X.; Yu, F.; Yang, G.; Qu, Y. Regression Analysis with Differential Privacy Preserving. IEEE Access 2019, 7, 129353–129361.
https://doi.org/10.1109/ACCESS.2019.2940714.
136. Wang, Y.; Si, C.; Wu, X. Regression Model Fitting under Differential Privacy and Model Inversion Attack. In Proceedings of the
24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 25–31 July 2015; pp. 1003–1009.
137. Dibbo, S.V. SoK: Model Inversion Attack Landscape: Taxonomy, Challenges, and Future Roadmap. In Proceedings of the IEEE
36th Computer Security Foundations Symposium (CSF), Dubrovnik, Croatia, 10–14 July 2023. Available online:
https://ieeexplore.ieee.org/document/10221914 (accessed on 1 December 2024).
138. Wu, X.; Fredrikson, M.; Jha, S.; Naughton, J.F. A methodology for formalizing model-inversion attacks. In Proceedings of the
2016 IEEE 29th Computer Security Foundations Symposium (CSF), Lisbon, Portugal, 27 June–1 July 2016; pp. 355–370. Available
online: https://ieeexplore.ieee.org/document/7536387 (accessed on 30 December 2024).
139. Park, C.; Hong, D.; Seo, C. An Attack-Based Evaluation Method for Differentially Private Learning Against Model Inversion
Attack. IEEE Access 2019, 7, 124988–124999. https://doi.org/10.1109/ACCESS.2019.2938759.
140. Zhao, J.; Chen, Y.; Zhang, W. Differential Privacy Preservation in Deep Learning: Challenges, Opportunities and Solutions.
IEEE Access 2019, 7, 48901–48911. https://doi.org/10.1109/ACCESS.2019.2901678.
141. Yang, Z.; Zhang, J.; Chang, E.-C.; Liang, Z. Neural Network Inversion in Adversarial Setting via Background Knowledge Alignment. In
Proceedings of the 2019 ACM SIGSAC Conf. on Computing and Communication Security, London, UK, 11–15 November 2019;
pp. 225–240. https://doi.org/10.1145/3319535.3354261.
142. Han, G.; Choi, J.; Lee, H.; Kim, J. Reinforcement Learning-Based Black-Box Model Inversion Attacks. arXiv 2023,
arXiv:2304.04625. Available online: https://arxiv.org/abs/2304.04625 (accessed on 10 January 2025).
143. Han, G.; Choi, J.; Lee, H.; Kim, J. Reinforcement Learning-Based Black-Box Model Inversion Attacks. In Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023;
pp. 20504–20513. https://doi.org/10.1109/CVPR42600.2023.020504.
144. Bekman, T.; Abolfathi, M.; Jafarian, H.; Biswas, A.; Banaei-Kashani, F.; Das, K. Practical Black Box Model Inversion Attacks
Against Neural Nets. Commun. Comput. Inf. Sci. 2021, 1525, 39–54. https://doi.org/10.1007/978-3-030-93733-1_3.
145. Du, J.; Hu, J.; Wang, Z.; Sun, P.; Gong, N.Z.; Ren, K. SoK: Gradient Leakage in Federated Learning. arXiv 2024, arXiv:2404.05403.
Available online: https://arxiv.org/abs/2404.05403 (accessed on 10 January 2025).
146. Zhang, Z.; Liu, Q.; Huang, Z.; Wang, H.; Lu, C.; Liu, C.; Chen, E. GraphMI: Extracting Private Graph Data from Graph Neural
Networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Montreal, QC, Canada, 19–27
August 2021; pp. 3749–3755. https://doi.org/10.24963/ijcai.2021/516.
147. Li, Z.; Pu, Y.; Zhang, X.; Li, Y.; Li, J.; Ji, S. Protecting Object Detection Models From Model Extraction Attack via Feature Space
Coverage. In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI), Jeju, Republic of Korea,
3–9 August 2024; pp. 431–439. https://doi.org/10.24963/ijcai.2024/48.
148. Tramér, F.; Zhang, F.; Juels, A.; Reiter, M.K.; Ristenpart, T. Stealing Machine Learning Models and Prediction APIs. In
Proceedings of the USENIX Security Symposium (SEC), Austin, TX, USA, 10–12 August 2016. Available online:
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/tramer (accessed on 30 December 2024).
149. Liang, J.; Pang, R.; Li, C.; Wang, T. Model Extraction Attacks Revisited. arXiv 2023, arXiv:2312.05386. Available online:
https://arxiv.org/abs/2312.05386 (accessed on 1 December 2024).
150. Liu, S. Model Extraction Attack and Defense on Deep Generative Models. J. Phys. Conf. Ser. 2021, 2189, 012024.
https://doi.org/10.1088/1742-6596/2189/1/012024.
151. Parisot, M.P.M.; Pejó, B.; Spagnuelo, D. Property Inference Attacks on Convolutional Neural Networks: Influence and Implications of Target Model’s Complexity. arXiv 2021, arXiv:2104.13061. Available online: https://arxiv.org/abs/2104.13061
(accessed on 10 January 2025).
152. Zhang, W.; Tople, S.; Ohrimenko, O. Leakage of dataset properties in Multi-Party machine learning. In Proceedings of the 30th
USENIX Security Symposium (USENIX Security), virtual, 11–13 August 2021; USENIX Association: Berkeley, CA, USA, 2021;
pp. 2687–2704. Available online: https://www.usenix.org/conference/usenixsecurity21/presentation/zhang-wanrong (accessed
on 1 December 2024).
153. Mahloujifar, S.; Ghosh, E.; Chase, M. Property Inference from Poisoning. In Proceedings of the 2022 IEEE Symposium on
Security and Privacy (SP), San Francisco, CA, USA, 22–26 May 2022; pp. 1120–1137.
https://doi.org/10.1109/SP46214.2022.9833623.
154. Horigome, H.; Kikuchi, H.; Fujita, M.; Yu, C.-M. Robust Estimation Method against Poisoning Attacks for Key-Value Data Local
Differential Privacy. Appl. Sci. 2024, 14, 6368. https://doi.org/10.3390/app14146368.
155. Parisot, M.P.M.; Pejó, B.; Spagnuelo, D. Property Inference Attacks on Convolutional Neural Networks: Influence and
Implications of Target Model’s Complexity, In Proceedings of the 18th International Conference on Security and Cryptography,
SECRYPT, Online, 6–8 July 2021; pp. 715–721. https://doi.org/10.5220/0010555607150721.
156. Chase, M.; Ghosh, E.; Mahloujifar, S. Property Inference from Poisoning. arXiv 2021, arXiv:2101.11073. Available online:
https://arxiv.org/abs/2101.11073 (accessed on 1 December 2024).
157. Liu, X.; Xie, L.; Wang, Y.; Zou, J.; Xiong, J.; Ying, Z.; Vasilakos, A.V. Privacy and Security Issues in Deep Learning: A Survey. In
IEEE Access 2020, 9, 4566–4593. https://doi.org/10.1109/ACCESS.2020.3045078.
158. Gilbert, A.C.; McMillan, A. Property Testing for Differential Privacy. In Proceedings of the 56th Annual Allerton Conference on
Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2–5 October 2018; pp. 249–258.
https://doi.org/10.1109/ALLERTON.2018.8636068.
159. Liu, X.; Oh, S. Minimax Optimal Estimation of Approximate Differential Privacy on Neighbouring Databases. In Proceedings
of the 33rd International Conference on Neural Information Processing Systems (NIPS'19), 2019; Article 217; pp. 2417–2428. Available online: https://dl.acm.org/doi/10.5555/3454287.3454504.
160. Tschantz, M.C.; Kaynar, D.; Datta, A. Formal Verification of Differential Privacy for Interactive Systems (Extended Abstract).
Electron. Notes Theor. Comput. Sci. 2011, 276, 61–79. https://doi.org/10.1016/j.entcs.2011.05.005.
161. Pillutla, K.; McMahan, H.B.; Andrew, G.; Oprea, A.; Kairouz, P.; Oh, S. Unleashing the Power of Randomization in Auditing
Differential Private ML. Adv. Neural Inf. Process. Syst. 2023, 36, 198465. Available online: https://arxiv.org/abs/2305.18447
(accessed on 30 December 2024).
162. Cebere, T.; Bellet, A.; Papernot, N. Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model. arXiv 2024,
arXiv:2405.14457. Available online: https://arxiv.org/abs/2405.14457 (accessed on 1 December 2024).
163. Zhang, J.; Das, D.; Kamath, G.; Tramèr, F. Membership Inference Attacks Cannot Prove that a Model Was Trained On Your
Data. arXiv 2024, arXiv:2409.19798. Available online: https://arxiv.org/abs/2409.19798 (accessed on 9 January 2025).
164. Yin, Y.; Chen, K.; Shou, L.; Chen, G. Defending Privacy against More Knowledge Membership Inference Attackers. In
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Singapore, 14–18
August 2021; pp. 2026–2036. https://doi.org/10.1145/3447548.3467444.
165. Bichsel, B.; Gehr, T.; Drachsler-Cohen, D.; Tsankov, P.; Vechev, M. DP-Finder: Finding Differential Privacy Violations, by
Sampling and Optimization. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
(CCS ’18), Toronto, ON, Canada, 15–19 October 2018; ACM: New York, NY, USA, 2018; 17p.
https://doi.org/10.1145/3243734.3243863.
166. Niu, B.; Zhou, Z.; Chen, Y.; Cao, J.; Li, F. DP-Opt: Identify High Differential Privacy Violation by Optimization. In Wireless
Algorithms, Systems, and Applications. WASA 2022; Wang, L., Segal, M., Chen, J., Qiu, T., Eds.; Lecture Notes in Computer Science;
Springer, Cham, Switzerland, 2022; Volume 13472. https://doi.org/10.1007/978-3-031-19214-2_34.
167. Birhane, A.; Steed, R.; Ojewale, V.; Vecchione, B.; Raji, I.D. AI auditing: The broken bus on the road to AI accountability. arXiv
2024, arXiv:2401.14462. Available online: https://arxiv.org/abs/2401.14462 (accessed on 1 December 2024).
168. Dwork, C. A Firm Foundation for Private Data Analysis. Commun. ACM 2011, 54, 86–95.
https://doi.org/10.1145/1866739.1866758.
169. Dwork, C.; Su, W.J.; Zhang, L. Differential Private False Discovery Rate. J. Priv. Confidentiality 2021, 11, 2.
https://doi.org/10.29012/jpc.755 .
170. Liu, C.; He, X.; Chanyaswad, T.; Wang, S.; Mittal, P. Investigating Statistical Privacy Frameworks from the Perspective of
Hypothesis Testing. Proc. Priv. Enhancing Technol. (PoPETs) 2019, 2019, 234–254.
171. Balle, B.; Barthe, G.; Gaboardi, M.; Hsu, J.; Sato, T. Hypothesis Testing Interpretations and Rényi Differential Privacy. In
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), Online, 26–28 August 2020;
Volume 108, pp. 2496–2506.
172. Kairouz, P.; Oh, S.; Viswanath, P. The Composition Theorem for Differential Privacy. In Proceedings of 32nd International
Conference on Machine Learning, ICML, Lille, France, 6–11 July 2015; pp. 1376–1385. Available online:
https://proceedings.mlr.press/v37/kairouz15.html (accessed on 30 December 2024).
173. Lu, Y.; Magdon-Ismail, M.; Wei, Y.; Zikas, V. Eureka: A General Framework for Black-box Differential Privacy Estimators. In
Proceedings of the IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 19–23 May 2024; pp. 913–931.
174. Shamsabadi, A.S.; Tan, G.; Cebere, T.I.; Bellet, A.; Haddadi, H.; Papernot, N.; Wang, X.; Weller, A. Confidential-DPproof: Confidential Proof of Differentially Private Training. In Proceedings of the 12th International Conference on Learning Representations, ICLR, Hybrid, Vienna, 7–11 May 2024. Available online: https://openreview.net/forum?id=PQY2v6VtGe#tab-accept-oral (accessed on 10 January 2025).
175. Kazmi, M.; Lautraite, H.; Akbari, A.; Soroco, M.; Tang, Q.; Wang, T.; Gambs, S.; Lécuyer, M. PANORAMIA: Privacy Auditing
of Machine Learning Models without Retraining. arXiv 2024, arXiv:2402.09477. Available online:
https://arxiv.org/abs/2402.09477 (accessed on 1 December 2024).
176. Song, L.; Shokri, R.; Mittal, P. Membership Inference Attacks Against Adversarially Robust Deep Learning Models. In
Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), San Francisco, CA, USA, 19–23 May 2019.
177. Koskela, A.; Mohammadi, J. Black Box Differential Privacy Auditing Using Total Variation Distance. arXiv 2024,
arXiv:2406.04827. Available online: https://arxiv.org/abs/2406.04827 (accessed on 1 December 2024).
178. Chen, J.; Wang, W.H.; Shi, X. Differential Privacy Protection Against Membership Inference Attack on Machine Learning for
Genomic Data. Pac. Symp. Biocomput. 2021, 26, 26–37. https://doi.org/10.1101/2020.08.03.235416.
179. Malek, M.; Mironov, I.; Prasad, K.; Shilov, I.; Tramèr, F. Antipodes of Label Differential Privacy: PATE and ALIBI. arXiv 2021,
arXiv:2106.03408. Available online: https://arxiv.org/abs/2106.03408 (accessed on 1 December 2024).
180. Choquette-Choo, C.A.; Tramèr, F.; Carlini, N.; Papernot, N. Label-only Membership Inference Attacks. In Proceedings of the
38th International Conference on Machine Learning (ICML), Virtual, 18–24 July 2021, pp. 1964–1974.
181. Rahman, M.A.; Rahman, T.; Laganière, R.; Mohammed, N.; Wang, Y. Membership Inference Attack against Differentially
Private Deep Learning Models. Trans. Data Priv. 2018, 11, 61–79.
182. Humphries, T.; Rafuse, M.; Lindsey, T.; Oya, S.; Goldberg, I.; Kerschbaum, F. Differentially Private Learning Does Not Bound
Membership Inference. arXiv 2020, arXiv:2010.12112. Available online: http://www.arxiv.org/abs/2010.12112v1 (accessed on 28
December 2024).
183. Askin, Ö .; Kutta, T.; Dette, H. Statistical Quantification of Differential Privacy. arXiv 2022, arXiv:2108.09528. Available online:
https://arxiv.org/abs/2108.09528 (accessed on 1 December 2024).
184. Aerni, M.; Zhang, J.; Tramèr, F. Evaluation of Machine Learning Privacy Defenses are Misleading. arXiv 2024, arXiv:2404.17399.
Available online: https://arxiv.org/abs/2404.17399 (accessed on 1 December 2024).
185. Kong, Z.; Chowdhury, A.R.; Chaudhuri, K. Forgeability and Membership Inference Attacks. In Proceedings of the 15th ACM
Workshop on Artificial Intelligence and Security (AISec ’22), Los Angeles, CA, USA, 11 November 2022.
https://doi.org/10.1145/3560830.3563731.
186. Kutta, T.; Askin, Ö.; Dunsche, M. Lower Bounds for Rényi Differential Privacy in a Black-Box Setting. arXiv 2022,
arXiv:2212.04739. Available online: https://arxiv.org/abs/2212.04739 (accessed on 1 December 2024).
187. Domingo-Enrich, C.; Mroueh, Y. Auditing Differential Privacy in High Dimensions with the Kernel Quantum Rényi
Divergence. arXiv 2022, arXiv:2205.13941. Available online: https://arxiv.org/abs/2205.13941 (accessed on 1 December 2024).
188. Koh, P.W.; Liang, P. Understanding Black-box Predictions via Influence Functions. arXiv 2017, arXiv:1703.04730. Available
online: https://arxiv.org/abs/1703.04730 (accessed on 1 December 2024).
189. Chen, C.; Campbell, N.D. Understanding training-data leakage from gradients in neural networks for image classification. arXiv
2021, arXiv:2111.10178. Available online: https://arxiv.org/abs/2111.10178 (accessed on 1 December 2024).
190. Xie, Z.; Yan, L.; Zhu, Z.; Sugiyama, M. Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve
Generalization. arXiv 2021, arXiv:2103.17182. Available online: https://arxiv.org/abs/2103.17182 (accessed on 2 December 2024).
191. Liu, F.; Zhao, X. Disclosure Risk from Homogeneity Attack in Differentially Private Frequency Distribution. arXiv 2021, arXiv:2101.00311. Available online: https://arxiv.org/abs/2101.00311 (accessed on 24 December 2024).
192. Steinke, T.; Ullman, J. Between Pure and Approximate Differential Privacy. arXiv 2015, arXiv:1501.06095. Available online:
https://arxiv.org/abs/1501.06095 (accessed on 24 December 2024).
193. Kairouz, P.; McMahan, B.; Song, S.; Thakkar, O.; Xu, Z. Practical and Private (Deep) Learning Without Sampling or Shuffling. In Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 5213–5225.
Available online: https://proceedings.mlr.press/v139/kairouz21b.html (accessed on 30 December 2024).
194. Li, Y. Theories in Online Information Privacy Research: A Critical Review and an Integrated Framework. Decis. Support Syst. 2012, 54, 471–481. https://doi.org/10.1016/j.dss.2012.06.010.
195. Hay, M.; Machanavajjhala, A.; Miklau, G.; Chen, Y.; Zhang, D. Principled evaluation of differentially private algorithms using DPBench. In Proceedings of the ACM SIGMOD Conference on Management of Data, San Francisco, CA, USA, 26 June–1 July
2016; pp. 919–938. https://doi.org/10.1145/2882903.2882931.
196. Wang, Y.; Ding, Z.; Kifer, D.; Zhang, D. Checkdp: An Automated and Integrated Approach for Proving Differential Privacy or
Finding Precise Counterexamples. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications
Security, Virtual Event, 9–13 November 2020; pp. 919–938. https://doi.org/10.1145/3372297.3417282.
197. Barthe, G.; Chadha, R.; Jagannath, V.; Sistla, A.P.; Viswanathan, M. Deciding Differential Privacy for Programs with Finite Inputs and Outputs. arXiv 2022, arXiv:1910.04137. Available online: https://arxiv.org/abs/1910.04137 (accessed on 2 December
2024).
198. Hitaj, B.; Ateniese, G.; Perez-Cruz, F. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning.
arXiv 2017, arXiv:1702.07464. Available online: https://arxiv.org/abs/1702.07464 (accessed on 1 December 2024).
199. Song, C.; Ristenpart, T.; Shmatikov, V. Machine Learning Models that Remember Too Much. In Proceedings of the ACM
SIGSAC Conference on Computer and Communications Security (CCS), Dallas, TX, USA, 30 October–3 November 2017; pp.
587–601. https://doi.org/10.1145/3133956.3134077.
200. Cummings, R.; Durfee, D. Individual Sensitivity Preprocessing for Data Privacy. In Proceedings of the Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA), Salt Lake City, UT, USA, 5–8 January 2020; pp. 528–547.
201. Zhou, S.; Zhu, T.; Ye, D.; Yu, X.; Zhou, W. Boosting Model Inversion Attacks With Adversarial Examples. IEEE Trans. Dependable
Secur. Comput. 2023, 21, 1451–1468.
202. Zhu, L.; Liu, Z.; Han, S. Deep Leakage from Gradients. arXiv 2019, arXiv:1906.08935. Available online:
https://arxiv.org/abs/1906.08935 (accessed on 1 December 2024).
203. Huang, Y.; Gupta, S.; Song, Z.; Li, K.; Arora, S. Evaluating Gradient Inversion Attacks and Defenses in Federated Learning. Adv.
Neural Inf. Process. Syst. 2021, 34, 7232–7241. Available online:
https://proceedings.neurips.cc/paper_files/paper/2021/hash/3b3fff6463464959dcd1b68d0320f781-Abstract.html (accessed on 30
December 2024).
204. Wu, R.; Chen, X.; Guo, C.; Weinberger, K.Q. Learning to Invert: Simple Adaptive Attacks for Gradient Inversion in Federated
Learning. In Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI), Pittsburgh, PA, USA, 31 July–4
August 2023; Volume 216, pp. 2293–2303. Available online: https://proceedings.mlr.press/v216/wu23a.html (accessed on 30
December 2024).
205. Zhu, H.; Huang, L.; Xie, Z. GGI: Generative Gradient Inversion Attack in Federated Learning. In Proceedings of the 6th
International Conference on Data-Driven Optimization of Complex Systems (DOCS), Hangzhou, China, 16–18 August 2024; pp.
379–384. Available online: http://arxiv.org/pdf/2405.10376.pdf (accessed on 30 December 2024).
206. Yang, Z.; Zhang, B.; Chen, G.; Li, T.; Su, D. Defending Model Inversion and Membership Inference Attacks via Prediction Purification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA,
USA, 14–19 June 2020; pp. 1234–1243.
207. Zhang, Q.; Ma, J.; Xiao, Y.; Lou, J.; Xiong, L. Broadening Differential Privacy for Deep Learning against Model Inversion Attacks.
In Proceedings of the 2020 IEEE International Conference on Big Data, Atlanta, GA, USA, 10–13 December 2020; pp. 1061–1070.
https://doi.org/10.1109/BigData50022.2020.9360425.
208. Manchini, C.; Ospina, R.; Leiva, V.; Martin-Barreiro, C. A new approach to data differential privacy based on regression models
under heteroscedasticity with applications to machine learning repository data. Inf. Sci. 2023, 627, 280–300.
https://doi.org/10.1016/j.ins.2022.10.076.
209. Dziedzic, A.; Kaleem, M.A.; Lu, Y.S.; Papernot, N. Increasing the Cost of Model Extraction with Calibrated Proof of Work. In
Proceedings of the 10th International Conference on Learning Representations (ICLR), Virtual, 25 April 2022. Available online:
https://openreview.net/forum?id=EAy7C1cgE1L (accessed on 30 December 2024).
210. Li, X.; Yan, H.; Cheng, Z.; Sun, W.; Li, H. Protecting Regression Models with Personalized Local Differential Privacy. IEEE Trans.
Dependable Secur. Comput. 2023, 20, 960–974. https://doi.org/10.1109/TDSC.2022.3144690.
211. Zheng, H.; Ye, Q.; Hu, H.; Fang, C.; Shi, J. BDPL: A Boundary Differential Private Layer Against Machine Learning Model
Extraction Attacks. In Computer Security—ESORICS 2019; Sako, K., Schneider, S., Ryan, P., Eds.; Lecture Notes in Computer
Science; Springer: Cham, Switzerland, 2019; Volume 11735. https://doi.org/10.1007/978-3-030-29959-0_4.
212. Yan, H.; Li, X.; Li, H.; Li, J.; Sun, W.; Li, F. Monitoring-Based Differential Privacy Mechanism Against Query Flooding-based
Model Extraction Attack. IEEE Trans. Dependable Secur. Comput. 2022, 19, 2680–2694. https://doi.org/10.1109/TDSC.2021.3089670.
213. Suri, A.; Lu, Y.; Chen, Y.; Evans, D. Dissecting Distribution Inference. In Proceedings of the 2023 IEEE Conference on Secure and
Trustworthy Machine Learning (SaTML), Raleigh, NC, USA, 8–10 February 2023; pp. 150–164.
214. Ganju, K.; Wang, Q.; Yang, W.; Gunter, C.A.; Borisov, N. Property Inference Attacks on Fully Connected Neural Networks
using Permutation Invariant Representations. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 619–633.
https://dl.acm.org/doi/10.1145/3243734.3243834.
215. Melis, L.; Song, C.; De Cristofaro, E.; Shmatikov, V. Exploiting Unintended Feature Leakage in Collaborative Learning. In
Proceedings of the Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 691–706.
216. Huang, W.; Zhou, S. Unexpected Information Leakage of Differential Privacy Due to the Linear Properties of Queries. IEEE
Trans. Inf. Forensics Secur. 2021, 16, 3123–3137.
217. Ben Hamida, S.; Hichem, M.; Jemai, A. How Differential Privacy Reinforces Privacy of Machine Learning Models? In
Proceedings of the International Conference on Computational Collective Intelligence (ICCI), Leipzig, Germany, 9–11
September 2024.
218. Song, L.; Mittal, P.; Gong, N.Z. Systematic Evaluation of Privacy Risks in Machine Learning Models. In Proceedings of the ACM
on Asian Conference on Computer and Communication Security, Taipei, Taiwan, 5–9 October 2020.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.