Strong Machine Learning Attack against PUFs
with No Mathematical Model
Fatemeh Ganji, Shahin Tajik, Fabian Fäßler, and Jean-Pierre Seifert
Security in Telecommunications,
Technische Universität Berlin and Telekom Innovation Laboratories,
Berlin, Germany
Abstract. Although numerous attacks revealed the vulnerability of dif-
ferent PUF families to non-invasive Machine Learning (ML) attacks, the
question is still open whether all PUFs might be learnable. Until now,
virtually all ML attacks rely on the assumption that a mathematical
model of the PUF functionality is known a priori. However, this is not
always the case, and attention should be paid to this important aspect
of ML attacks. This paper aims to address this issue by providing a
provable framework for ML attacks against a PUF family, whose under-
lying mathematical model is unknown. We prove that this PUF family is
inherently vulnerable to our novel PAC (Probably Approximately Cor-
rect) learning framework. We apply our ML algorithm on the Bistable
Ring PUF (BR-PUF) family, which is one of the most interesting and
prime examples of a PUF with an unknown mathematical model. We
practically evaluate our ML algorithm through extensive experiments on
BR-PUFs implemented on Field-Programmable Gate Arrays (FPGA).
In line with our theoretical findings, our experimental results strongly
confirm the effectiveness and applicability of our attack. This is also
interesting since our complex proof heavily relies on the spectral proper-
ties of Boolean functions, which are known to hold only asymptotically.
Along with this proof, we further provide the theorem that all PUFs
must have some challenge bit positions, which have larger influences on
the responses than other challenge bits.
Keywords: Machine Learning, PAC Learning, Boosting Technique, Fourier Analysis,
Physically Unclonable Functions (PUFs).
©IACR 2016. This article is the final version submitted by the author(s) to the
IACR and to Springer-Verlag on June 6th, 2016. The version published by Springer-
Verlag is available at 10.1007/978-3-662-53140-2_19. Personal use of this material
is permitted. Permission from Springer-Verlag and IACR must be obtained for all
other uses, in any current or future media, including reprinting/ republishing this
material for advertising or promotional purposes, creating new collective work, for
resale or redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work.
1 Introduction
Nowadays, it is broadly accepted that Integrated Circuits (ICs) are subject to
overbuilding and piracy due to the adoption of authentication methods relying on
insecure key storage techniques [24]. To overcome the problem of secure key stor-
age, Physically Unclonable Functions (PUFs) have been introduced as promising
solutions [15, 30]. For PUFs, manufacturing process variations eventually lead
to instance-specific, inherent physical properties that can generate virtually
unique responses when the instance is given some challenges. Therefore,
PUFs can be utilized as either device fingerprints for secure authentication or as
a source of entropy in secure key generation scenarios. In this case, there is no
need for permanent key storage, since the desired key is generated instantly upon
powering up the device. Owing to these instance-specific, inherent physical
properties, PUFs are assumed to be unclonable and unpredictable, and therefore
trustworthy and robust against attacks [26]. However, more than a decade after
the invention of PUFs, the design of a truly unclonable physical function is
still a challenging task. Most of the security schemes relying on
the notion of PUFs are designed based on a “design-break-patch” rule, instead
of a thorough cryptographic approach.
Along with the construction of a wide variety of PUFs, several different types
of attacks, ranging from non-invasive to semi-invasive attacks [18,19,33,39], have
been launched on these primitives. Machine learning (ML) attacks are one of
the most common types of non-invasive attacks against PUFs, whose popularity
stems from their characteristics, namely being cost-effective and non-destructive.
Moreover, these attacks require the adversary to solely observe the input-output
(i.e., the so-called challenge-response) behavior of the targeted PUF. In this
attack scenario, a relatively small subset of challenges along with their respective
responses is collected by the adversary, attempting to come up with a model
describing the challenge-response behavior of the PUF. In addition to heuris-
tic learning techniques, e.g., what has been proposed in [33, 34], the authors
of [12–14] have proposed the probably approximately correct (PAC) learning
framework to ensure the delivery of a model for prespecified levels of accuracy
and confidence. One of the key results reported in [12–14] is that knowing about
the mathematical model of the PUF functionality enables the adversary to estab-
lish a proper hypothesis representation (i.e., mathematical model of the PUF),
and then try to PAC learn this representation. This gives rise to the question
of whether a PUF can be PAC learned without prior knowledge of a precise
mathematical model of the PUF.
Bistable Ring PUFs (BR-PUFs) [7] and Twisted Bistable Ring PUFs (TBR-
PUFs) [37] are examples of PUFs whose functionality cannot easily be translated
into a precise mathematical model. In an attempt to do so, the authors of [37, 41]
suggested simplified mathematical models for BR-PUFs and TBR-PUFs. However,
their models do not precisely reflect the physical behavior of these architectures.
In this paper, we present a sound mathematical machine learning framework,
which enables us to PAC learn the BR-PUF family (i.e., including BR- and
TBR-PUFs) without knowing their precise mathematical model. In particular,
our framework contributes the following novel aspects related to the security
assessment of PUFs in general:
Exploring the inherent mathematical properties of PUFs. One of the
most natural and commonly accepted mathematical representations of a PUF is
a Boolean function. This representation enables us to investigate properties of
PUFs, which are observed in practice, although they have not been precisely
and mathematically described. One of these properties exhaustively studied in
our paper is related to the “silent” assumption that each and every bit of a
challenge has equal influence on the respective response of a PUF. We prove that
this assumption is invalid for all PUFs. While this phenomenon has occasionally
been observed in practice and is most often attributed to implementation
imperfections, we give a rigorous mathematical proof of the existence of
influential bit positions, which holds for every PUF.
Strong ML attacks against PUFs without available mathematical
model. We prove that even in a worst-case scenario, where the internal functionality
of the BR-PUF family cannot be mathematically modeled, the challenge-
response behavior of these PUFs can be PAC learned for given levels of accuracy
and confidence.
Evaluation of the applicability of our framework in practice. In order
to evaluate the effectiveness of our theoretical framework, we conduct extensive
experiments on BR-PUFs and TBR-PUFs, implemented on a commonly used
Field Programmable Gate Array (FPGA).
2 Notation and preliminaries
This section serves as a brief introduction to the required background knowledge
and known results to understand the approaches taken in this paper. For some
more complex topics we will occasionally refer the reader to important references.
2.1 PUFs
Note that elaborate and formal definitions as well as formalizations of PUFs
are beyond the scope of this paper, and for more details on them we refer the
reader to [3, 4]. In general, PUFs are physical input to output mappings, which
map given challenges to responses. Intrinsic properties of the physical primitive
embodying the PUF determine the characteristics of this mapping. Two main
classes of PUFs, namely strong PUFs and weak PUFs, have been discussed in the
literature [16]. In this paper we consider strong PUFs, briefly called PUFs.
Here we focus only on two characteristics of PUFs, namely unclonability and
unpredictability (i.e., so-called unforgeability). Let a PUF be described by the
mapping $f_{\mathrm{PUF}}: \mathcal{C} \to \mathcal{Y}$, where $f_{\mathrm{PUF}}(c) = y$. In this paper, we assume that the
issue of noisy responses (i.e., the output is not stable for a given input) has
been resolved by the PUF manufacturer. For an ideal PUF, unclonability
means that for a given PUF $f_{\mathrm{PUF}}$ it is virtually impossible to create another
physical mapping $g_{\mathrm{PUF}} \neq f_{\mathrm{PUF}}$, whose challenge-response behavior is similar to
that of $f_{\mathrm{PUF}}$ [3].
Moreover, an ideal PUF is unpredictable. This property of PUFs is closely
related to the notion of learnability. More precisely, given a single PUF $f_{\mathrm{PUF}}$ and
a set of challenge-response pairs (CRPs) $U = \{(c, y) \mid y = f_{\mathrm{PUF}}(c) \text{ and } c \in \mathcal{C}\}$, it
is (almost) impossible to predict $y' = f_{\mathrm{PUF}}(c')$, where $c'$ is a random challenge
such that $(c', \cdot) \notin U$. In this paper we stick to this (simple, but) classical
definition of the unpredictability of a PUF, and refer the reader to [3, 4] for more
refined definitions.
2.2 Boolean Functions as representations of PUFs
Defining PUFs as mappings (see Section 2.1), the most natural mathematical
model for them are Boolean functions over the finite field $\mathbb{F}_2$. Let $V_n =
\{c_1, c_2, \ldots, c_n\}$ denote the set of Boolean attributes or variables, where each
attribute can be true or false, commonly denoted by "1" and "0", respectively.
In addition, $C_n = \{0,1\}^n$ contains all binary strings with $n$ bits. We associate
each Boolean attribute $c_i$ with two literals, namely $c_i$ and $\overline{c_i}$ (the complement
of $c_i$). An assignment is a mapping from $V_n$ to $\{0,1\}$, i.e., a mapping from each
Boolean attribute to either "0" or "1". In other words, an assignment is an
$n$-bit string, where the $i$th bit of this string indicates the value of $c_i$ (i.e., "0" or "1").
An assignment is mapped by a Boolean formula into the set $\{0,1\}$. Thus,
each Boolean attribute can also be thought of as a formula, i.e., $c_i$ and $\overline{c_i}$ are
two possible formulas. If by evaluating a Boolean formula under an assignment
we obtain "1", the assignment is called a positive example of the "concept
represented by the formula", and otherwise a negative example. Each Boolean
formula defines a respective Boolean function $f: C_n \to \{0,1\}$. The conjunction of
Boolean attributes (i.e., a Boolean formula) is called a term, and it can be true
or false ("1" or "0") depending on the values of its Boolean attributes. Similarly,
a clause, which is the disjunction of Boolean attributes, can be defined. The
number of literals forming a term or a clause is called its size. The size 0 is
associated only with the term true and the clause false.
In the related literature, several representations of Boolean functions have
been introduced, e.g., juntas, monomials ($M_n$), Decision Trees (DTs), and
Decision Lists (DLs), cf. [29, 31].
A Boolean function depending on solely an unknown set of $k$ variables is
called a $k$-junta. A monomial $M_{n,k}$ defined over $V_n$ is the conjunction of at most
$k$ clauses, each having only one literal. A DT is a binary tree, whose internal
nodes are labeled with a Boolean variable, and each leaf with either "1" or "0".
A DT can be built from a Boolean function in the following way: for each
assignment a unique path from the root to a leaf is defined, where at each internal
node, e.g., at the $i$th level of the tree, the labeled edge is chosen depending on
the value of the $i$th literal. The leaf is labeled with the value of the function,
given the respective assignment as the input. The depth of a DT is the maximum
length of the paths from the root to the leaves. The set of Boolean functions
represented by decision trees of depth at most $k$ is denoted by $k$-DT. A DL is a
list $L$ that contains $r$ pairs $(f_1, v_1), \ldots, (f_r, v_r)$, where each Boolean formula
$f_i$ is a term and $v_i \in \{0,1\}$, with $1 \leq i \leq r-1$. For $i = r$, the formula $f_r$ is the
constant function $v_r = 1$. A Boolean function can be transformed into a decision
list such that for a string $c \in C_n$ we have $L(c) = v_j$, where $j$ is the smallest index
in $L$ so that $f_j(c) = 1$. $k$-DL denotes the set of all DLs, in which each $f_i$ is a
term of maximum size $k$.
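As a toy illustration of the decision-list representation just defined, the following sketch (our own example, not from the paper) evaluates a small 2-DL by returning the value attached to the first term that fires; the last term is the constant true:

```python
def make_term(literals):
    """Build a term from a list of (index, value) pairs; the empty list is
    the constant-true term."""
    return lambda c: all(c[i] == v for i, v in literals)

def evaluate_dl(dl, c):
    """Return v_j for the smallest j with f_j(c) = 1."""
    for term, v in dl:
        if term(c):
            return v
    raise ValueError("the last term must be the constant true")

# A 2-DL over 3 variables: output 1 if c_1=1 and c_2=0, else 0 if c_0=1, else 1.
dl = [
    (make_term([(1, 1), (2, 0)]), 1),
    (make_term([(0, 1)]), 0),
    (make_term([]), 1),          # constant-true default entry
]

print(evaluate_dl(dl, (0, 1, 0)))  # first term fires -> 1
print(evaluate_dl(dl, (1, 0, 0)))  # second term fires -> 0
print(evaluate_dl(dl, (0, 0, 1)))  # default entry -> 1
```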
Linearity of Boolean Functions Here, our focus is on Boolean linearity, which
must not be confused with linearity over domains other than $\mathbb{F}_2$.
A linear Boolean function $f: \{0,1\}^n \to \{0,1\}$ features the following equivalent
properties, cf. [29]:

$\forall c, c' \in \{0,1\}^n: \; f(c + c') = f(c) + f(c')$;

$\exists a \in \{0,1\}^n: \; f(c) = a \cdot c$.

Equivalently, we can define a linear Boolean function $f$ as follows. There is some
set $S \subseteq \{1, \ldots, n\}$ such that $f(c) = f(c_1, c_2, \ldots, c_n) = \sum_{i \in S} c_i$.
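Both characterizations can be checked by brute force for a small parity function. The following sketch (an illustration we add, with the set S chosen arbitrarily) verifies the additivity property $f(c + c') = f(c) + f(c')$ over $\mathbb{F}_2$ on every pair of inputs:

```python
from itertools import product

def parity(S, c):
    """f(c) = XOR of c_i for i in S: the general linear function over F_2."""
    return sum(c[i] for i in S) % 2

S = {0, 2}   # arbitrary example set
n = 4
# Check f(c + c') = f(c) + f(c'), all arithmetic mod 2, on every input pair.
for c in product((0, 1), repeat=n):
    for cp in product((0, 1), repeat=n):
        s = tuple((a + b) % 2 for a, b in zip(c, cp))
        assert parity(S, s) == (parity(S, c) + parity(S, cp)) % 2
print("linearity holds for the parity over S =", sorted(S))
```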
Boolean linearity, or linearity over $\mathbb{F}_2$, is closely related to the notion of
correlation immunity. A Boolean function $f$ is called $k$-correlation immune if,
for an assignment $c$ chosen uniformly at random from $\{0,1\}^n$, it holds that $f(c)$ is
independent of every $k$-tuple $(c_{i_1}, c_{i_2}, \ldots, c_{i_k})$, where $1 \leq i_1 < i_2 < \cdots < i_k \leq n$.
Now let $\deg(f)$ denote the degree of the $\mathbb{F}_2$-polynomial representation of the
Boolean function $f$; it is straightforward to show that such a representation
exists. Siegenthaler proved the following theorem, which states how correlation
immunity can be related to the degree of $f$.
Theorem 1. (Siegenthaler Theorem [29, 38]) Let $f: \{0,1\}^n \to \{0,1\}$ be a
Boolean function that is $k$-correlation immune. Then $\deg(f) \leq n - k$.
Average Sensitivity of Boolean Functions The Fourier expansion of Boolean
functions serves as an excellent tool for analyzing them, cf. [29]. In order to
define the Fourier expansion of a Boolean function $f: \mathbb{F}_2^n \to \mathbb{F}_2$, we should first
define an encoding scheme as follows: $\chi(0_{\mathbb{F}_2}) := +1$ and $\chi(1_{\mathbb{F}_2}) := -1$. Now the
Fourier expansion of a Boolean function can be written as

$$f(c) = \sum_{S \subseteq [n]} \hat{f}(S)\,\chi_S(c),$$

where $[n] := \{1, \ldots, n\}$, $\chi_S(c) := \prod_{i \in S} c_i$, and $\hat{f}(S) := \mathrm{E}_{c \in U}[f(c)\chi_S(c)]$. Here,
$\mathrm{E}_{c \in U}[\cdot]$ denotes the expectation over uniformly chosen random examples. The
influence of variable $i$ on $f: \mathbb{F}_2^n \to \mathbb{F}_2$ is defined as

$$\mathrm{Inf}_i(f) := \Pr_{c \in U}[f(c) \neq f(c^{\oplus i})],$$

where $c^{\oplus i}$ is obtained by flipping the $i$th bit of $c$. Note that $\mathrm{Inf}_i(f) = \sum_{S \ni i} \hat{f}(S)^2$,
cf. [29]. Next we define the average sensitivity of a Boolean function $f$ as

$$I(f) := \sum_{i=1}^{n} \mathrm{Inf}_i(f).$$
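For small $n$, the influence and average sensitivity defined above can be computed exhaustively over the whole cube. A minimal sketch, using the 3-bit majority function as an example of our own choosing:

```python
from itertools import product

def influence(f, n, i):
    """Inf_i(f) = Pr_c[f(c) != f(c with bit i flipped)], c uniform."""
    total = 0
    for c in product((0, 1), repeat=n):
        flipped = c[:i] + (1 - c[i],) + c[i + 1:]
        total += (f(c) != f(flipped))
    return total / 2 ** n

def avg_sensitivity(f, n):
    """I(f) = sum over i of the single-bit influences."""
    return sum(influence(f, n, i) for i in range(n))

maj3 = lambda c: int(sum(c) >= 2)   # 3-bit majority
# Flipping bit i changes the majority exactly when the other two bits differ,
# so each influence is 1/2 and the average sensitivity is 3/2.
print([influence(maj3, 3, i) for i in range(3)])  # [0.5, 0.5, 0.5]
print(avg_sensitivity(maj3, 3))                   # 1.5
```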
2.3 Our Learning Model
The Probably Approximately Correct (PAC) model provides a firm basis for
analyzing the efficiency and effectiveness of machine learning algorithms. We
briefly introduce the model and refer the reader to [23] for more details. In the
PAC model the learner, i.e., the learning algorithm, is given a set of examples to
generate with high probability an approximately correct hypothesis. This can be
formally defined as follows. Let $F = \bigcup_{n \geq 1} F_n$ denote a target concept class, that
is, a collection of Boolean functions defined over the instance space $C_n = \{0,1\}^n$.
Moreover, each example is drawn according to an arbitrary probability distribution
$D$ on the instance space $C_n$. A hypothesis $h \in F_n$, which is a Boolean
function over $C_n$, is called an $\varepsilon$-approximator for $f \in F_n$ if

$$\Pr_{c \in D}[f(c) = h(c)] \geq 1 - \varepsilon.$$

Let the mapping $\mathrm{size}: F \to \mathbb{N}$ associate a natural number $\mathrm{size}(f)$ with
a target concept $f \in F$ that is a measure of the complexity of $f$ under a target
representation, e.g., $k$-DT. The learner is a polynomial-time algorithm denoted
by $A$, which is given labeled examples $(c, f(c))$, where $c \in C_n$ and $f \in F_n$. The
examples are drawn independently according to the distribution $D$. Now we can
define strong and weak PAC learning algorithms.
Definition 1 An algorithm $A$ is called a strong PAC learning algorithm for the
target concept class $F$, if for any $n \geq 1$, any distribution $D$, any $0 < \varepsilon, \delta < 1$,
and any $f \in F_n$ the following holds. When $A$ is given a polynomial number of
labeled examples, it runs in time polynomial in $n$, $1/\varepsilon$, $\mathrm{size}(f)$, $1/\delta$, and returns
an $\varepsilon$-approximator for $f$ under $D$, with probability at least $1 - \delta$.
The weak learning framework was developed to answer the question of whether
a PAC learning algorithm with constant, but insufficiently low, levels of $\varepsilon$ and $\delta$
can be useful at all. This notion is defined as follows.
Definition 2 For some constant $\delta > 0$, let algorithm $A$ return with probability
at least $1 - \delta$ a $(1/2 - \gamma)$-approximator for $f$, where $\gamma > 0$. $A$ is called a weak
PAC learning algorithm, if $\gamma = \Omega(1/p(n, \mathrm{size}(f)))$ for some polynomial $p(\cdot)$.
The equivalence of weak PAC learning and strong PAC learning has been
proved by Freund and Schapire in the early nineties in their seminal papers [9, 35].
For that purpose, boosting algorithms have been introduced.
Definition 3 An algorithm $B$ is called a boosting algorithm if the following
holds. Given any $f \in F_n$, any distribution $D$, $0 < \varepsilon, \delta < 1$, $0 < \gamma \leq 1/2$, a
polynomial number of labeled examples, and a weak learning algorithm WL
returning a $(1/2 - \gamma)$-approximator for $f$, then $B$ runs in time polynomial
in $n$, $\mathrm{size}(f)$, $1/\varepsilon$, $1/\delta$, $1/\gamma$, and generates with probability at least $1 - \delta$ an
$\varepsilon$-approximator for $f$ under $D$.
Algorithm 1 Canonical Booster
Require: Weak PAC learner WL, $0 < \varepsilon, \delta < 1$, $0 < \gamma \leq 1/2$, a polynomial number of examples, and $i$ that
is the number of iterations
Ensure: Hypothesis $h$ that is an $\varepsilon$-approximator for $f$
1: $D_0 = D$, use WL to generate an approximator $h_0$ for $f$ under $D_0$
2: $k = 1$
3: while $k \leq i - 1$ do
4: Build a distribution $D_k$ consisting of examples, where the previous approximators
$h_0, \ldots, h_{k-1}$ can predict the value of $f$ poorly
5: use WL to generate an approximator $h_k$ for $f$ under $D_k$
6: $k = k + 1$
7: od
8: Combine the hypotheses $h_0, \ldots, h_{i-1}$ to obtain $h$, where each $h_i$ is a $(1/2 - \gamma)$-approximator
for $f$ under $D_i$, and finally $h$ is an $\varepsilon$-approximator for $f$ under $D$
9: return $h$
The construction of virtually all existing boosting algorithms is based primarily
on the fact that if WL is given examples drawn from any distribution $D'$,
WL returns a $(1/2 - \gamma)$-approximator for $f$ under $D'$. At a high level, the skeleton
of all such boosting algorithms is shown in Algorithm 1.
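To make the skeleton of Algorithm 1 concrete, the following sketch instantiates the reweighting idea in the style of AdaBoost [9, 35], with single-bit decision stumps as the weak learner WL. The target function (3-bit majority, a weighted vote of stumps) and all parameters are our own illustrative choices, not the construction used later in the paper:

```python
import math
from itertools import product

def stump_learner(samples, weights):
    """Weak learner WL: the best single-bit (possibly negated) predictor
    under the current example weights."""
    n = len(samples[0][0])
    best = None
    for i in range(n):
        for flip in (0, 1):
            err = sum(w for (c, y), w in zip(samples, weights)
                      if (c[i] ^ flip) != y)
            if best is None or err < best[0]:
                best = (err, i, flip)
    err, i, flip = best
    return (lambda c, i=i, flip=flip: c[i] ^ flip), err

def boost(samples, rounds):
    """AdaBoost-style instantiation of the canonical booster: reweight so that
    later weak hypotheses focus on previously misclassified examples."""
    m = len(samples)
    weights = [1.0 / m] * m
    ensemble = []
    for _ in range(rounds):
        h, err = stump_learner(samples, weights)
        err = min(max(err, 1e-9), 1 - 1e-9)     # guard the log below
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Increase the weight of examples the current hypothesis got wrong.
        weights = [w * math.exp(alpha if h(c) != y else -alpha)
                   for (c, y), w in zip(samples, weights)]
        z = sum(weights)
        weights = [w / z for w in weights]
    # Final hypothesis: weighted vote of the weak hypotheses.
    return lambda c: int(sum(a if h(c) == 1 else -a for a, h in ensemble) > 0)

# Target: 3-bit majority. A single stump errs on 1/4 of the cube, i.e., it is
# only a (1/2 - gamma)-approximator, yet boosting stumps learns the target.
samples = [(c, int(sum(c) >= 2)) for c in product((0, 1), repeat=3)]
H = boost(samples, rounds=40)
acc = sum(H(c) == y for c, y in samples) / len(samples)
print("training accuracy:", acc)  # 1.0 after enough boosting rounds
```

With stumps as weak learner, each round's weighted error stays at most 1/3 for this target, so the standard AdaBoost bound drives the training error of the combined vote to zero.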
2.4 Non-linearity of PUFs over $\mathbb{F}_2$ and the Existence of Influential Bits
Section 2.2 introduced the notion of Boolean linearity. Focusing on this notion,
and taking into account the definition of PUFs given in Section 2.1, we now
prove the following theorem, which is our first important result: for every PUF,
when represented as a Boolean function, its degree as an $\mathbb{F}_2$-polynomial
is strictly greater than one. This leads us to the following dramatic
consequence: there exists no PUF in which all challenge bits have an
equal influence.
Theorem 2. For every PUF $f_{\mathrm{PUF}}: \{0,1\}^n \to \{0,1\}$, we have $\deg(f_{\mathrm{PUF}}) \geq 2$.
Consequently, for every PUF it holds that not all bit positions within respective
challenges are equally influential in generating the corresponding response.
Proof: Towards a contradiction, assume that $f_{\mathrm{PUF}}$ is Boolean linear over $\mathbb{F}_2$ and
unpredictable. From the unpredictability of $f_{\mathrm{PUF}}$ it follows that the adversary
has access to a set of CRPs $U = \{(c, y) \mid y = f_{\mathrm{PUF}}(c) \text{ and } c \in \mathcal{C}\}$, which are
chosen uniformly at random; however, the adversary has only a negligible
probability of success in predicting the response to a new random challenge
$(c', \cdot) \notin U$ (as he cannot apply $f_{\mathrm{PUF}}$ to this unseen challenge). Note that the
size of $U$ is actually polynomial in $n$. Now, by the definition of linearity over
$\mathbb{F}_2$, cf. Section 2.2, we deduce that the only linear functions over $\mathbb{F}_2$ are the
parity functions, see also [29, 38]. However, there are well-known algorithms to
PAC learn parity functions in general [8, 20]. Thus, we simply feed the right
number of samples from our CRP set $U$ into such a PAC learner. For the right
parameter setting, the respective PAC algorithm then delivers with high
probability an $\varepsilon$-approximator $h$ for our PUF $f_{\mathrm{PUF}}$ such that
$\Pr[f_{\mathrm{PUF}}(c') = h(c')] \geq 1 - \varepsilon$. This means that with high probability, the
response to every randomly chosen challenge can be calculated in polynomial
time. This is of course a contradiction to the definition of $f_{\mathrm{PUF}}$ being a PUF.
Hence, $f_{\mathrm{PUF}}$ cannot be linear over $\mathbb{F}_2$; in other words, for every PUF
$f_{\mathrm{PUF}}$ we have $\deg(f_{\mathrm{PUF}}) \geq 2$. Moreover, in conjunction with the above-mentioned
Siegenthaler Theorem, we deduce that every PUF is at most an $(n-2)$-correlation
immune function, which indeed means that not all of its challenge bits have an
equal influence on the respective PUF response.

Fig. 1: (a) The logical circuit of an SRAM cell. (b) The small-signal model of a
bistable element in metastability
Theorem 2 states that every PUF has some challenge bits that have a larger
influence on the responses than other challenge bits. We loosely call these bits
"influential bits"¹.
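The proof above invokes known algorithms for PAC learning parity functions [8, 20]. In the noise-free case this amounts to solving a linear system over $\mathbb{F}_2$. A minimal sketch (our illustration, with a hypothetical hidden parity standing in for the CRP source) using Gaussian elimination:

```python
import random

def learn_parity(samples, n):
    """Recover a in f(c) = a.c (mod 2) from noiseless labeled examples by
    Gaussian elimination over F_2. Returns None if underdetermined."""
    pivot_rows = {}          # pivot column -> (reduced row, rhs)
    for c, y in samples:
        row, rhs = list(c), y
        # Reduce the new equation by all existing pivot rows.
        for col in sorted(pivot_rows):
            if row[col]:
                prow, prhs = pivot_rows[col]
                row = [x ^ p for x, p in zip(row, prow)]
                rhs ^= prhs
        lead = next((i for i, x in enumerate(row) if x), None)
        if lead is not None:
            pivot_rows[lead] = (row, rhs)
    if len(pivot_rows) < n:
        return None          # need more random examples for full rank
    # Back-substitute from the highest pivot column down.
    a = [0] * n
    for col in sorted(pivot_rows, reverse=True):
        row, rhs = pivot_rows[col]
        for j in range(col + 1, n):
            rhs ^= row[j] & a[j]
        a[col] = rhs
    return a

random.seed(0)
n, S = 16, {1, 5, 10, 13}                       # hypothetical hidden parity
f = lambda c: sum(c[i] for i in S) % 2
crps = [(tuple(random.randint(0, 1) for _ in range(n)),) for _ in range(100)]
crps = [(c, f(c)) for (c,) in crps]
a = learn_parity(crps, n)
print(sorted(i for i, bit in enumerate(a) if bit))  # recovers [1, 5, 10, 13]
```

With 100 uniformly random challenges over 16 bits, the system is full rank with overwhelming probability, so the hidden set S is recovered exactly.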
3 PUF Architectures
In this section, we explain the architectures of two intrinsic silicon PUFs, namely
the BR- and TBR-PUFs, whose internal mathematical models are more compli-
cated than those of other intrinsic PUF constructions. As a first attempt, we apply
simple models to describe the functionality of these PUFs. However, we believe
that these models cannot completely reflect the real characteristics of the BR-PUF
family, and that their concrete, yet unknown, model should be much more complex.
3.1 Memory-Based PUFs
BR-PUFs can be thought of as a combination of memory-based and delay-based
PUFs. Memory-based PUFs exploit the settling state of digital memory circuits,
e.g., SRAM cells [16, 21] consisting of two inverters in a loop (see Figure 1a)
and two transistors for read and write operations. Due to manufacturing process
variations, the inverters have different electrical gains when the cell is in
the metastable condition. In the metastable condition the voltage of one of the
inverters is equal to $V_m$, where $V_m$ is an invalid logic level. Moreover, the
inverters have different propagation delays due to the differences in their output
resistance and load capacitance. One can model the SRAM cell architecture as
¹ Note that the existence of such influential bits has also been noticed by several other
experimental research papers. However, none of them has been able to correctly and
precisely pinpoint the mathematical origin of this phenomenon.
Fig. 2: The schematic of a BR-PUF with $n$ stages. The response of the PUF can
be read between two arbitrary stages. For a given challenge, the reset signal can
be set low to activate the PUF. After a transient period, the BR-PUF might
settle to an allowed logical state.
a linear amplifier with gain $G$ when $V_{\mathrm{initial}}$ is close to the metastable voltage
$V_m$ [40], see Figure 1b. In order to predict the metastable behavior, we have [40]

$$V_{\mathrm{initial}}(0) = V_m + V(0),$$

where $V(0)$ is a small-signal offset from the metastable point. To derive $V(t)$ we
can write the equation of the circuit as follows:

$$\frac{(G-1)\,V(t)}{R} = C \cdot \frac{dV(t)}{dt}.$$

By solving this equation, we obtain $V(t) = V(0) \cdot e^{t/\tau_s}$, where $\tau_s = RC/(G-1)$,
cf. [40]. The time required to reach a stable condition increases as $V_{\mathrm{initial}}$
approaches the metastable point and $V(0)$ approaches 0. In the extreme, the
settling time approaches infinity if $V(0) = 0$; in practice, however, this is not
the case due to the presence of noise. Nevertheless, there is no upper bound on
the settling time of the SRAM cell to one of the stable states. Therefore, the
settling state of an SRAM cell cannot be predicted after power-on. One can thus
use the logical addresses of SRAM cells as different challenges and the states of
the SRAM cells after power-on as PUF responses.
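The exponential growth of the offset, $V(t) = V(0)\,e^{t/\tau_s}$, implies that the settling time grows only logarithmically in $1/V(0)$, yet diverges as $V(0) \to 0$. A small numeric sketch of this relation (all component values below are assumed purely for illustration):

```python
import math

def settling_time(v0, v_stable, tau_s):
    """Time for the small-signal offset to grow from V(0) to V_stable,
    solved from V(t) = V(0) * exp(t / tau_s). Diverges as V(0) -> 0."""
    return tau_s * math.log(v_stable / v0)

tau_s = 1e-10                     # tau_s = RC/(G-1); assumed value in seconds
for v0 in (1e-3, 1e-6, 1e-9):     # assumed initial offsets in volts
    t = settling_time(v0, 0.5, tau_s)
    print(f"V(0) = {v0:.0e} V  ->  settling time = {t:.3e} s")
```

Each thousandfold reduction of the initial offset adds only a constant increment to the settling time, which is why most cells settle quickly while cells starting arbitrarily close to the metastable point have no bounded settling time.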
3.2 Bistable Ring PUF
SRAM PUFs are believed to be secure against modeling attacks. This can be
explained by the fact that knowing the state of one SRAM PUF after power-on
does not help the attacker to predict the response of other SRAM cells. However,
in contrast to delay-based PUFs, e.g., arbiter PUFs [25], the challenge space of
an SRAM PUF is not exponential. Therefore, if an adversary gets access to
the initial values stored in the SRAM cells, the challenge-response behavior of
the SRAM PUF can be emulated. In order to combine the advantages offered by
Fig. 3: The schematic of a TBR-PUF with $n$ stages. The response of the PUF is
read after the last stage. For a given challenge, the reset signal can be set low
to activate the PUF. After a transient period, the TBR-PUF might settle to
an allowed logical state.
delay-based PUFs and memory-based PUFs, namely an exponential challenge space
and unpredictability, a new architecture called the BR-PUF was introduced
in [7]. A BR-PUF consists of $n$ stages ($n$ is an even number), where each stage
consists of two NOR gates, one demultiplexer and one multiplexer, see Figure 2.
Based on the value of the $i$th bit of a challenge applied to the $i$th stage, one of
the NOR gates is selected. Setting the reset signal to low, the signal propagates
in the ring, which behaves like an SRAM cell with a larger number of inverters.
The response of the PUF is a binary value, which can be read from a predefined
location on the ring between two stages, see Figure 2.
The final state of the inverter ring is a function of the gains and the propa-
gation delays of the gates. According to the model of the SRAM circuit in the
metastable state provided in Section 3.1, one might be able to extend the elec-
trical model and analyze the behavior of the inverter ring. Upon applying a
challenge, the ring may settle at a stable state after an oscillation period. However,
for a specific set of challenges the ring might stay in the metastable state for an
infinite time, and oscillation can be observed at the output of the PUF.
The analytical models of the metastable circuits introduced in Section 3.1
are valid for an ASIC implementation and respective simulations. Although few
simulation results of BR-PUF are available in the literature, to the best of our
knowledge there are no results for a BR-PUF implemented on an ASIC, and ex-
perimental results have been limited to FPGA implementations. In this case, the
BR-PUF model can be further simplified by considering the internal architecture
of the FPGAs. The NOR gates of the BR-PUF are realized by dedicated Lookup
Tables (LUTs) inside the FPGA. The output of a LUT is read from one of
the memory cells of the LUT, which always has a stable condition. Hence, it
can be assumed that there is almost no difference in the gains of different LUTs.
As a result, the random behavior of the BR-PUF could be defined by the delay
differences between the LUTs.
3.3 Twisted Bistable Ring PUF
Although the mathematical model of the functionality of a BR-PUF is unknown,
it has been observed that this construction is vulnerable to bias and simple linear
Fig. 4: Our roadmap for proving the PAC learnability of the BR-PUF family,
whose mathematical model is unknown
approximations [37]. Hence, the TBR-PUF has been introduced as an enhancement
of the BR-PUF [37]. Similar to a BR-PUF, a TBR-PUF consists of $n$ stages ($n$
is an even number), where each stage consists of two NOR gates. In contrast
to the BR-PUF, where for a given challenge only one of the NOR gates in each
stage is selected, all $2n$ gates are selected in a TBR-PUF. This is achieved
by placing two multiplexers before and two multiplexers after each stage and
having feedback lines between different stages, see Figure 3. As all NOR gates
are always in the circuit, challenge-specific bias can be reduced.
4 PAC Learning of PUFs without Prior Knowledge of
Their Mathematical Model
When discussing the PAC learnability of PUFs as a target concept, two scenarios
should be distinguished. First, the precise mathematical model of the PUF
functionality is known, and hence a hypothesis representation is known to learn
the PUF. This scenario has been considered in several studies, e.g., [12–14],
where different hypothesis representations have been presented for each individual
PUF family. Second, due to the lack of a precise mathematical model of the
respective PUF functionality, a more sophisticated approach is required to learn
the PUF. Therefore, the following question arises: is it possible to PAC learn a
PUF family, even if we have no mathematical model of the physical functionality
of the respective PUF family? We answer this question at least for the BR-PUF
family. Our roadmap for answering this question, more specifically, the steps
taken to prove the PAC learnability of the BR-PUF family in the second scenario,
is illustrated in Figure 4. While theoretical insights into the notions related to
the first two blocks have been presented in Section 2.4, which are valid for all
PUF families, Section 4.1 provides more specific results for the BR-PUF family.
According to these new insights, in Section 4.2 we eventually prove that the
BR-PUF family (which lacks a precise mathematical model) can nevertheless be
PAC learned (see the last two blocks in Figure 4).
4.1 A Constant Upper Bound on the Number of Influential Bits
First, we reflect the fact that our Theorem 2 is in line with the empirical results
obtained by applying heuristic approaches, which are reported in [37, 42]. Although
here we compare their results for BR- and TBR-PUFs with our results,
our proof of the existence of influential bits in PUF families in general speaks
for itself, and is one of the novel aspects of this paper.

Table 1: Statistical analysis of the 2048 CRPs given to a 64-bit BR-PUF [42].
The first column shows the rule found in the samples, whereas the second column
indicates the estimated probability of predicting the response.

Rule | Est. Pr.
(c_1 = 0) ⇒ y = 1 | 0.684
(c_9 = 0) ∧ (c_6 = 1) ⇒ y = 1 | 0.762
(c_25 = 0) ∧ (c_18 = 1) ∧ (c_1 = 0) ⇒ y = 1 | 0.852
(c_27 = 0) ∧ (c_25 = 0) ∧ (c_18 = 1) ∧ (c_6 = 1) ⇒ y = 1 | 0.932
(c_53 = 0) ∧ (c_51 = 0) ∧ (c_45 = 0) ∧ (c_18 = 1) ∧ (c_7 = 0) ⇒ y = 1 | 1
In an attempt to assess the security of BR-PUFs, Yamamoto et al. implemented
BR-PUFs on several FPGAs to analyze the influence of challenge bits
on the respective responses [42]. They explicitly underlined the existence
of influential bits, and found so-called prediction rules. Table 1 summarizes their
results, where for each type of rule (monomials of different sizes) we report
only the one with the highest estimated response-prediction probability. In
addition to providing evidence for the existence of influential bits, the size of the
respective monomials is of particular importance for us. As shown in Table 1,
their size is surprisingly small, i.e., at most five.
Similarly, the authors of [37] translate the influence of the challenge bits into
the weights needed in artificial neural networks that represent the challenge-
response behavior of BR-PUFs and TBR-PUFs. They observed a pattern in
these weights, which models the influence of the challenge bits and clearly
reflects the fact that there are influential bits determining the response of
the respective PUF to a given challenge. From the results presented in [37], we
conclude that there is at least one influential bit; however, the precise number
of influential bits has not been further investigated by the authors.
Inspired by the above results from [37, 42], we conduct further experiments. We collect 30000 CRPs from BR-PUFs and TBR-PUFs implemented on Altera Cyclone IV FPGAs. In all of our PUF instances at least one influential bit is found, and the maximum number of influential bits (corresponding to the size of the monomials) is just a constant value in all cases. For the sake of readability, we present here only the results obtained for one arbitrary PUF instance.
Our results, shown in Table 2, are not only aligned with the results reported in [37, 42], but also reflect our previous theoretical findings. We can conclude this section as follows: there is at least one influential bit determining the response of a BR-PUF (respectively, TBR-PUF) to a given challenge. However, for the purpose of our framework the existence of such bits alone is not enough; we also need an upper bound on their number.
Table 2: Our statistical analysis of the 30000 CRPs, given to a 64-bit BR-PUF.
The first column shows the rule found in the sample, whereas the second column
indicates the estimated probability of predicting the response.

Rule | Est. Pr.
(c61 = 1) ⇒ y = 1 | 0.71
(c11 = 1) ⇒ y = 1 | 0.72
(c29 = 1) ⇒ y = 1 | 0.725
(c39 = 1) ⇒ y = 1 | 0.736
(c23 = 1) ⇒ y = 1 | 0.74
(c46 = 1) ⇒ y = 1 | 0.745
(c50 = 1) ⇒ y = 1 | 0.75
(c61 = 1) ∧ (c23 = 1) ⇒ y = 1 | 0.82
(c61 = 1) ∧ (c11 = 0) ⇒ y = 1 | 0.80
(c23 = 1) ∧ (c46 = 1) ⇒ y = 1 | 0.86
(c39 = 1) ∧ (c50 = 1) ⇒ y = 1 | 0.85
(c61 = 1) ∧ (c11 = 1) ∧ (c29 = 1) ⇒ y = 1 | 0.88
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ⇒ y = 1 | 0.93
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ∧ (c39 = 0) ⇒ y = 1 | 0.97
(c50 = 1) ∧ (c23 = 1) ∧ (c11 = 0) ∧ (c39 = 0) ∧ (c29 = 1) ⇒ y = 1 | 0.98
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ∧ (c39 = 0) ∧ (c29 = 1) ⇒ y = 1 | 0.99
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ∧ (c39 = 0) ∧ (c29 = 1) ∧ (c11 = 0) ⇒ y = 1 | 0.994
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ∧ (c39 = 0) ∧ (c29 = 1) ∧ (c61 = 1) ⇒ y = 1 | 0.995
(c50 = 1) ∧ (c23 = 1) ∧ (c46 = 1) ∧ (c39 = 0) ∧ (c29 = 1) ∧ (c61 = 1) ∧ (c11 = 0) ⇒ y = 1 | 1

Looking more carefully into the three different datasets, namely our own and the data reported in [37, 42], we observe that the total number of influential bits is always very small. Motivated by this commonly observed phenomenon, we compute for our PUFs (implemented on FPGAs) the average sensitivity of their respective Boolean functions². Averaging over many instances of our BR-PUFs, we obtain the results shown in Table 3 (TBR-PUFs scored similarly). This striking result³ leads us to the following plausible heuristic.
“Constant Average Sensitivity of BR-PUF family”: for all practical values of n it holds that the average sensitivity of a Boolean function associated with a physical n-bit PUF from the BR-PUF family is only a constant value.
Finally, some relation between the average sensitivity and the strict avalanche criterion (SAC) can be recognized, although we believe that the average sensitivity is a more direct metric to evaluate the security of PUFs under ML attacks.
² As explained in Section 2.2, for a Boolean function f, the influence of a variable and the total average sensitivity can be calculated by employing Fourier analysis. However, in practice this analysis is computationally expensive. Instead, it suffices to simply approximate the respective average sensitivity. This idea has been extensively studied in the learning-theory and property-testing literature (see [22] for a survey). Here we describe how the average sensitivity of a Boolean function representing a PUF can be approximated. We follow the simple and effective algorithm explained in [32]. The central idea behind their algorithm is to collect enough random pairs of labeled examples (c, f(c)) and (c^⊕i, f(c^⊕i)), i.e., pairs whose inputs differ in a single Boolean variable.
³ Note that it is a well-known folklore result, cf. [29], that randomly chosen n-bit Boolean functions have an expected average sensitivity of exactly n/2.
Table 3: The average sensitivity of n-bit BR-PUFs.
n The average sensitivity
4 1.25
8 1.86
16 2.64
32 3.6
64 5.17
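The approximation procedure described in footnote 2 is straightforward to implement. The sketch below estimates the total average sensitivity by flipping one uniformly chosen bit per sample and scaling the observed flip rate by n; a toy 3-junta stands in for the unknown PUF function (an illustrative assumption, not a BR-PUF model):

```python
import random

random.seed(1)
n = 16

def f(c):
    # Toy stand-in for the PUF's unknown Boolean function: a 3-junta
    # whose total average sensitivity is 2 regardless of n.
    return (c[0] & c[1]) ^ c[2]

def approx_avg_sensitivity(f, n, samples=20000):
    # Draw pairs (c, c^{flip i}) differing in a single uniformly random
    # bit and count how often f changes value; scaling the flip rate by
    # n estimates the total average sensitivity (sum of all influences).
    flips = 0
    for _ in range(samples):
        c = [random.randint(0, 1) for _ in range(n)]
        i = random.randrange(n)
        c_flip = c.copy()
        c_flip[i] ^= 1
        flips += f(c) != f(c_flip)
    return n * flips / samples

estimate = approx_avg_sensitivity(f, n)
print(round(estimate, 2))
```

On a real device, the calls f(c) and f(c^⊕i) would be replaced by measured stable responses to the two challenges.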
4.2 Weak Learning and Boosting of BR-PUFs
The key idea behind our learning framework is the provable existence of influ-
ential bits for any PUF and the constant average sensitivity of BR-PUFs in our
scenario. These facts are taken into account to prove the existence of weak learn-
ers for the BR-PUF family. We start with the following theorem (Theorem 3)
proved by Friedgut [11].
Theorem 3. Every Boolean function f: {0,1}^n → {0,1} with I(f) = k can be ε-approximated by another Boolean function h depending on only a constant number K of Boolean variables, where K = exp((2 + √(2ε log₂(4k/ε)/k)) · k/ε) and ε > 0 is an arbitrary constant.
We explain now how Theorem 3, in conjunction with the results presented in Section 4.1, helps us to prove the existence of a weak learner (Definition 2) for the BR-PUF family.
Theorem 4. Every PUF from the BR-PUF family is weakly learnable.
Proof: For an arbitrary PUF from the BR-PUF family, consider its associated but unknown Boolean function, denoted by fPUF (i.e., our target concept). Our weak learning framework has two main steps. In the first step, we identify a (weak) approximator for fPUF, and in the second step this approximator is PAC learned (in a strong sense). Still, we can guarantee only that the total error of the learner does not exceed 1/2 − γ, where γ > 0, as we start with a weak approximator of fPUF. The first step relies on the fact that Theorem 2 ensures the existence of influential bits for fPUF, while we can also upper bound I(fPUF) by some small constant value k due to the Constant Average Sensitivity heuristic. According to Theorem 3, there is a Boolean function h that is an ε-approximator of fPUF and depends only on a constant number K of Boolean variables, since k and ε are constant values independent of n. However, note that h depends on an unknown set of K variables. Thus, our Boolean function h is a so-called K-junta function, cf. [29]. More importantly, for constant K it is known that a K-junta function can be PAC learned by a trivial algorithm within O(n^K) steps, cf. [2, 5, 6]. This PAC algorithm is indeed our algorithm WL that weakly learns fPUF. Carefully choosing the parameters of our approximators as well as of the PAC learning algorithm, we ensure that WL returns a hypothesis whose error does not exceed 1/2 − γ for some γ > 0.
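The trivial learner behind the O(n^K) bound admits a compact sketch: exhaustively try every K-subset of variables and keep the first one whose induced truth table is consistent with the sample. The target below is a hypothetical 2-junta (the hidden bit positions and the XOR structure are illustrative assumptions):

```python
import itertools
import random

random.seed(2)
n, K = 8, 2

def target(c):
    # Unknown K-junta to be learned; the relevant bits (3 and 5) are
    # hidden from the learner and chosen here only for illustration.
    return c[3] ^ c[5]

challenges = [[random.randint(0, 1) for _ in range(n)] for _ in range(200)]
sample = [(c, target(c)) for c in challenges]

def learn_junta(sample, n, K):
    # Trivial O(n^K)-style learner: for every K-subset of variables,
    # build a truth table over that subset from the sample and return
    # the first subset that explains all labels consistently.
    for subset in itertools.combinations(range(n), K):
        table, consistent = {}, True
        for c, y in sample:
            key = tuple(c[i] for i in subset)
            if table.setdefault(key, y) != y:
                consistent = False
                break
        if consistent:
            return subset, table
    return None, None

subset, table = learn_junta(sample, n, K)
print(subset)
```

The search over subsets dominates the cost; for constant K it is polynomial in n, which is exactly what the proof requires of WL.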
Applying now the canonical booster introduced in Section 2.3 to our WL proposed in the proof of Theorem 4, and according to Definition 3, our weak learning algorithm can be transformed into an efficient and strong PAC learning algorithm.

Fig. 5: The settling time of the BR-PUF response: (a) after a transient time the PUF response reaches a stable logical state “1”. (b) after a transient time the PUF response is “0”. (c) the PUF response does not settle and oscillates for an undefined time period.

Corollary 1. BR-PUFs are strongly PAC learnable, regardless of any mathematical model representing their challenge-response behavior.
5 Results
5.1 PUF implementation
We implement BR- and TBR-PUFs with 64 stages on an Altera Cyclone IV FPGA, manufactured in a 60 nm technology [1]. It turns out that most PUF implementations are highly biased towards one of the responses. Therefore, we apply different manual routing and placement configurations to identify PUFs with a minimum bias in their responses. However, it is known that by reducing the bias in PUF responses, the number of noisy responses increases [27].
Finding and resolving noisy responses are two of the main challenges in the CRP measurement process. In almost all PUF constructions it can be predicted at which point in time a generated response is valid and can be measured. For instance, for an arbiter PUF one can estimate the maximum propagation delay (evaluation period) between the enable point and the arbiter. After this time period the response is at a valid logical level (either “0” or “1”) and does not change; afterwards, stable CRPs can be collected by applying majority voting to the responses generated for a given challenge. However, in the case of the BR-PUF family, for a given challenge the settling time of the response to a valid logical level is not known a priori, see Figure 5. Furthermore, it is not known whether the response to a given challenge might become unstable again after a stable response has been observed for some time period (see Section 3.1). Therefore, the majority voting technique cannot be employed for BR-PUFs and TBR-PUFs. To deal with this problem, for a given challenge we read the response of the PUF at different points in time, where at each point in time 11 additional measurements are conducted. We consider a response stable if it is the same at all these measurement time points. Otherwise, the response is considered unstable, and the respective CRP is excluded from our dataset.
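The stability filter just described can be sketched as follows; the measurement model is purely illustrative (on the FPGA, the reads come from the actual PUF):

```python
import random

random.seed(3)

def measure(noisy):
    # Hypothetical read-out: a noisy CRP flips between "0" and "1",
    # while a stable CRP always returns the same bit (illustrative).
    if noisy:
        return 1 if random.random() < 0.3 else 0
    return 1

def stable_response(noisy, time_points=5, repeats=11):
    # Read the response at several points in time; at each point,
    # 11 additional measurements are conducted. The CRP is kept only
    # if every single read agrees; otherwise it is discarded.
    reads = [measure(noisy)
             for _ in range(time_points) for _ in range(repeats)]
    if all(r == reads[0] for r in reads):
        return reads[0]
    return None  # unstable response: exclude the CRP from the dataset

print(stable_response(noisy=False))  # a stable "1" is kept
print(stable_response(noisy=True))   # almost surely excluded (None)
```

The probability that a genuinely noisy response passes all 55 reads is negligible, so the filter leaves an essentially stable CRP set for learning.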
In order to observe the impact of the existing influential bits on our PUF responses, we first apply a large set of challenges chosen uniformly at random, and then measure their respective responses.

Fig. 6: The impact of the influential bits on the responses of the PUF: (a) the response of the PUF is “0”. (b) unstable responses. Here the y-axis shows the percentage of the challenges, whose bits are set to either “0” or “1”, whereas the x-axis shows the bit position.

Afterwards, for both possible responses of the PUF (i.e., “0” and “1”) we count the number of challenge bits that are set to either “0” or “1”, see Figure 6. It can be seen that some challenge bits are more influential towards a certain response. These results are the basis for our statistical analysis presented in Section 4.1. We also repeat this experiment in the scenario where the response of the PUF is unstable; in this case we observe almost no influential challenge bits. The most important conclusion that we can draw from these experiments is that a PUF with stable responses has at least one influential bit, which can already predict the response of the PUF to a respective challenge with probability slightly better than a random guess.
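The per-bit counting analysis behind Figure 6 is easily reproduced. In the sketch below, a hypothetical response model with a single influential bit stands in for the hardware (the bit position and probabilities are illustrative assumptions); influential bits show a clear deviation from the 50% expected of irrelevant bits:

```python
import random

random.seed(5)
n = 16

def respond(c):
    # Hypothetical response model with one influential bit (bit 4);
    # all other bits are irrelevant. Illustrative only, not BR-PUF data.
    if c[4] == 1:
        return 1 if random.random() < 0.9 else 0
    return 1 if random.random() < 0.2 else 0

crps = []
for _ in range(5000):
    c = [random.randint(0, 1) for _ in range(n)]
    crps.append((c, respond(c)))

# For each response value, compute the percentage of challenges whose
# bit i equals 1 (the analysis behind Figure 6).
pcts = {}
for r in (0, 1):
    subset = [c for c, y in crps if y == r]
    pcts[r] = [100.0 * sum(c[i] for c in subset) / len(subset)
               for i in range(n)]
    print(r, [round(p, 1) for p in pcts[r]])
```

Bit 4 ends up far from 50% in both response classes, while every irrelevant bit stays near 50%, exactly the signature visible in Figure 6.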
5.2 ML results
To evaluate the effectiveness of our learning framework, we conduct experiments on CRPs collected from our PUFs, whose implementation is described in Section 5.1. As discussed and proved in Section 4, having influential bits enables us to define a prediction rule, which can serve as a hypothesis representation that fulfills the requirements of a weak learner. The algorithm WL proposed in the proof of Theorem 4 relies on the PAC learnability of K-juntas, where K is a small constant. However, it is known that every efficient algorithm for learning K-DTs (i.e., decision trees with at most 2^K leaves) is an efficient algorithm for learning K-juntas, see, e.g., [28]. Furthermore, it is known that DLs generalize K-DTs [31]. Moreover, a monomial M_{n,K} is a very simple type of K-junta, where only the conjunction of the relevant variables is taken into account. Therefore, for our experiments we let our weak learning algorithms deliver DLs, Monomials, and DTs.
To learn the challenge-response behavior of BR- and TBR-PUFs using these representations, we use the open-source machine learning software Weka [17]. One may argue that more advanced tools might be available, but here we only aim to demonstrate that publicly accessible, off-the-shelf software can be used to launch our proposed attacks. All experiments are conducted on a MacBook Pro with a 2.6 GHz Intel Core i5 processor and 10 GB of RAM. To boost the prediction accuracy of the model established by our weak learners, we apply the Adaptive Boosting (AdaBoost) algorithm [10]; nevertheless, any other boosting framework can be employed as well. For AdaBoost, it is known that the error of the final model delivered by the boosted algorithm after T iterations is theoretically upper bounded by ∏_{t=1}^{T} √(1 − 4γ_t²), cf. [36]. To provide a better understanding of the relation between K, the number of iterations, and the theoretical bound on the error of the final model, a corresponding graph⁴ is shown in Figure 7.

Fig. 7: The relation between the theoretical upper bound on the error of the final model returned by AdaBoost, the number of iterations, and K. The graph is plotted for k = 2, ε₀ = 0.01, and n = 64. Here, ε₀ = 0.01 denotes the error of the K-junta learner.
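For a constant edge γ_t = γ, the AdaBoost error bound ∏_{t=1}^{T} √(1 − 4γ_t²) collapses to (1 − 4γ²)^(T/2). A few lines make the exponential decay in T explicit (the chosen edge value is illustrative):

```python
def adaboost_error_bound(gamma, T):
    # Theoretical upper bound on the training error of the boosted
    # model after T iterations, assuming every weak learner has the
    # same constant edge gamma:
    #   prod_{t=1}^{T} sqrt(1 - 4 * gamma_t**2) = (1 - 4 * gamma**2)**(T / 2)
    return (1.0 - 4.0 * gamma ** 2) ** (T / 2.0)

# Even a weak learner only slightly better than guessing (gamma = 0.05)
# drives the bound down exponentially with the number of iterations.
for T in (10, 50, 200, 500):
    print(T, round(adaboost_error_bound(0.05, T), 4))
```

This is the same trade-off plotted in Figure 7: a smaller edge merely stretches, but never breaks, the exponential decay.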
Our experiments in Weka consist of a training phase and a testing phase. In the training phase a model is established from the training data based on the chosen representation. Afterwards, the established model is evaluated on the test set, which contains an unseen subset of CRPs. The sizes of the training sets in our experiments are 100 and 1000 CRPs, whereas the test set contains 30000 CRPs. Our experiments demonstrate that our weak learners always deliver a model with more than 50% accuracy on the test set, as shown in the first rows of Table 4 and Table 5.
By boosting the respective models with AdaBoost, the accuracy is dramatically increased, see Table 4 and Table 5. It can be observed that after 50 iterations of AdaBoost applied to the weak model generated from 100 CRPs, the prediction accuracy of the boosted model increases to more than 80% for all three representations. By increasing the number of samples to 1000 CRPs, the prediction accuracy is further increased, up to 98.32% for learning the BR-PUFs and 99.37% for learning the TBR-PUFs under the DL representation. It is interesting to observe that the simplest representation class, i.e., Monomials, clearly benefits the most from the boosting technique. As explained in [36], this is due to their avoiding any overfitting tendency.
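The complete attack flow, weak single-literal hypotheses plus boosting, can be reproduced end-to-end on synthetic data. The sketch below implements AdaBoost from scratch; the "PUF" is a stand-in whose response is a majority vote of three influential bits, an assumption made only for illustration (not a BR-PUF model), and the weak learner is deliberately a 1-junta rule in the spirit of Tables 1 and 2:

```python
import math
import random

random.seed(4)
n = 16

def puf_like(c):
    # Synthetic stand-in for a PUF: the response is the majority vote
    # of three influential bits (illustrative assumption only).
    return 1 if c[2] + c[5] + c[11] >= 2 else -1

X = [[random.randint(0, 1) for _ in range(n)] for _ in range(1000)]
y = [puf_like(c) for c in X]

def best_stump(X, y, w):
    # Weak learner: the single-literal hypothesis "predict v if bit i
    # is set, else -v" with the lowest weighted error (a 1-junta rule).
    best = None
    for i in range(len(X[0])):
        for v in (1, -1):
            err = sum(wt for c, yt, wt in zip(X, y, w)
                      if (v if c[i] else -v) != yt)
            if best is None or err < best[0]:
                best = (err, i, v)
    return best

def adaboost(X, y, T):
    m = len(X)
    w = [1.0 / m] * m
    ensemble = []
    for _ in range(T):
        err, i, v = best_stump(X, y, w)
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, i, v))
        # Reweight: misclassified examples gain weight, then normalize.
        w = [wt * math.exp(-alpha * yt * (v if c[i] else -v))
             for c, yt, wt in zip(X, y, w)]
        total = sum(w)
        w = [wt / total for wt in w]
    return ensemble

def predict(ensemble, c):
    score = sum(alpha * (v if c[i] else -v) for alpha, i, v in ensemble)
    return 1 if score >= 0 else -1

ensemble = adaboost(X, y, T=60)
acc = sum(predict(ensemble, c) == yt for c, yt in zip(X, y)) / len(X)
print(round(acc, 3))
```

A single stump reaches roughly 75% accuracy on this target, mirroring the first rows of Tables 4 and 5, while the boosted ensemble approaches 100%, mirroring the later rows.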
⁴ Note that at first glance the graph may seem odd, as after a few iterations the error bound is still close to 1, although we start from a weak learner whose error rate is strictly below 0.5. As explained in [36, pp. 57–60], and shown in their Figure 3.1, this is due to AdaBoost's theoretical worst-case analysis, which holds only asymptotically (in T).
Table 4: Experimental results for learning a 64-bit BR-PUF and a 64-bit TBR-PUF, when m = 100. The accuracy (1 − ε) is reported for the three weak learners. The first row shows the accuracy of the weak learners, whereas the other rows show the accuracy of the boosted learners.
# boosting iterations BR-PUF TBR-PUF
0 (No Boosting) 54.48 % 66.79 % 67.24 % 65.18 % 72.29 % 74.84 %
10 67.12 % 74.25 % 76.99 % 76.96 % 79.22 % 81.36 %
20 77.53 % 80.53 % 80.89 % 82.05 % 85.73 % 86.71 %
30 81.32 % 83.13 % 83.14 % 84.93 % 88.34 % 89.4 %
40 82.65 % 83.91 % 84.6 % 88.11 % 89.67 % 90.22 %
50 82.65 % 85.62 % 85.5 % 90.05 % 89.69 % 91.58 %
Table 5: Experimental results for m = 1000 (the same setting as for Table 4).
# boosting iterations BR-PUF TBR-PUF
0 (No Boosting) 63.73 % 75.69 % 84.59 % 64.9 % 75.6 % 84.34 %
10 81.09 % 85.49 % 94.2 % 79.9 % 87.12 % 95.05 %
20 89.12 % 91.08 % 96.64 % 88.28 % 91.57 % 97.89 %
30 93.24 % 93.24 % 97.50 % 93.15 % 93.9 % 98.75 %
40 95.69 % 94.28 % 97.99 % 96.73 % 95.05 % 99.13 %
50 96.80 % 95.04 % 98.32 % 98.4 % 95.96 % 99.37 %
6 Conclusion
As a central result, which speaks for itself, we have proved that in general the responses of all PUF families are not equally determined by each and every bit of their respective challenges. Moreover, the present paper has further addressed the issue of strong PAC learning of the challenge-response behavior of PUFs whose functionality lacks a precise mathematical model. We have demonstrated that, by representing BR- and TBR-PUFs as Boolean functions, we are able to precisely describe the characteristics of these PUFs as observed in practice. This fact results in the development of a new and generic machine learning framework that strongly PAC learns the challenge-response behavior of the BR-PUF family. The effectiveness and applicability of our framework have also been evaluated by conducting extensive experiments on BR-PUFs and TBR-PUFs implemented on FPGAs, similar to the experimental platforms used in the most relevant literature.
Last but not least, although our strong PAC learning framework has its own novelty value, we feel that our Theorem 3 and the precise mathematical description of the characteristics of BR-PUFs and TBR-PUFs are the most important aspects of our paper. We strongly believe that this description can help to fill the gap between the mathematical design of cryptographic primitives and the design of PUFs in the real world. As evidence thereof, we feel that the Siegenthaler Theorem and the Fourier analysis, which are well known and widely used in modern cryptography, may provide special insights into the physical design of secure PUFs in the future.
Acknowledgments

We would like to thank Prof. Dr. Frederik Armknecht for the fruitful discussions as well as for pointing out Siegenthaler's paper. Furthermore, the authors greatly appreciate the support they received from the Helmholtz Research School on Security Technologies.
References

1. Altera: Cyclone IV Device Handbook. Altera Corporation, San Jose (2014)
2. Angluin, D.: Queries and Concept Learning. Machine Learning 2(4), 319–342 (1988)
3. Armknecht, F., Maes, R., Sadeghi, A., Standaert, O.X., Wachsmann, C.: A For-
malization of the Security Features of Physical Functions. In: Security and Privacy
(SP), 2011 IEEE Symp. on. pp. 397–412 (2011)
4. Armknecht, F., Moriyama, D., Sadeghi, A.R., Yung, M.: Towards a Unified Secu-
rity Model for Physically Unclonable Functions. In: Topics in Cryptology-CT-RSA
2016: The Cryptographers’ Track at the RSA Conf. vol. 9610, p. 271. Springer
5. Arvind, V., K¨obler, J., Lindner, W.: Parameterized Learnability of K-juntas and
Related Problems. In: Algorithmic Learning Theory. pp. 120–134. Springer (2007)
6. Blum, A.L., Langley, P.: Selection of Relevant Features and Examples in Machine
Learning. Artificial Intelligence 97(1), 245–271 (1997)
7. Chen, Q., Csaba, G., Lugli, P., Schlichtmann, U., R¨uhrmair, U.: The Bistable
Ring PUF: A New Architecture for Strong Physical Unclonable Functions. In:
Hardware-Oriented Security and Trust (HOST), 2011 IEEE Intl. Symp. on. pp.
134–141. IEEE (2011)
8. Fischer, P., Simon, H.U.: On Learning Ring-Sum-Expansions. SIAM Journal on
Computing 21(1), 181–192 (1992)
9. Freund, Y.: Boosting a Weak Learning Algorithm by Majority. Information and
Computation 121(2), 256–285 (1995)
10. Freund, Y., Schapire, R.E.: A Decision-Theoretic Generalization of On-line Learn-
ing and an Application to Boosting. Journal of Comp. and System Sciences 55(1),
119–139 (1997)
11. Friedgut, E.: Boolean Functions with Low Average Sensitivity Depend on Few
Coordinates. Combinatorica 18(1), 27–35 (1998)
12. Ganji, F., Tajik, S., Seifert, J.P.: Let Me Prove it to You: RO PUFs are Provably
Learnable. In: The 18th Annual Intl. Conf. on Information Security and Cryptology (2015)
13. Ganji, F., Tajik, S., Seifert, J.P.: Why Attackers Win: On the Learnability of XOR
Arbiter PUFs. In: Trust and Trustworthy Computing, pp. 22–39. Springer (2015)
14. Ganji, F., Tajik, S., Seifert, J.P.: PAC Learning of Arbiter PUFs. Journal of Cryp-
tographic Engineering Special Section On Proofs 2014, 1–10 (2016)
15. Gassend, B., Clarke, D., Van Dijk, M., Devadas, S.: Silicon Physical Random Func-
tions. In: Proc. of the 9th ACM Conf. on Comp. and Communications Security.
pp. 148–160 (2002)
16. Guajardo, J., Kumar, S.S., Schrijen, G.J., Tuyls, P.: FPGA Intrinsic PUFs and
their Use for IP Protection. In: Cryptographic Hardware and Embedded Systems–
CHES 2007, pp. 63–80. Springer (2007)
17. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The
WEKA Data Mining Software: An Update. ACM SIGKDD Explorations Newslet-
ter 11(1), 10–18 (2009)
18. Helfmeier, C., Boit, C., Nedospasov, D., Seifert, J.P.: Cloning Physically Unclon-
able Functions. In: Hardware-Oriented Security and Trust (HOST), 2013 IEEE
Intl. Symp. on. pp. 1–6 (2013)
19. Helfmeier, C., Nedospasov, D., Tarnovsky, C., Krissler, J.S., Boit, C., Seifert, J.P.:
Breaking and Entering through the Silicon. In: Proc. of the 2013 ACM SIGSAC
Conf. on Comp. & Communications Security. pp. 733–744. ACM (2013)
20. Helmbold, D., Sloan, R., Warmuth, M.K.: Learning Integer Lattices. SIAM Journal
on Computing 21(2), 240–266 (1992)
21. Holcomb, D.E., Burleson, W.P., Fu, K.: Initial SRAM State as a Fingerprint and
Source of True Random Numbers for RFID Tags. In: Proc. of the Conf. on RFID
Security. vol. 7 (2007)
22. Kalai, G., Safra, S.: Threshold Phenomena and Influence: Perspectives from Mathe-
matics, Comp. Science, and Economics. Computational Complexity and Statistical
Physics, St. Fe Inst. Studies in the Science of Complexity pp. 25–60 (2006)
23. Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory.
MIT press (1994)
24. Koushanfar, F.: Hardware Metering: A Survey. In: Introduction to Hardware Se-
curity and Trust, pp. 103–122. Springer (2012)
25. Lee, J.W., Lim, D., Gassend, B., Suh, G.E., Van Dijk, M., Devadas, S.: A Technique
to Build a Secret Key in Integrated Circuits for Identification and Authentication
Applications. In: VLSI Circuits, 2004. Digest of Technical Papers. 2004 Symp. on.
pp. 176–179 (2004)
26. Maes, R.: Physically Unclonable Functions: Constructions, Properties and Appli-
cations. Springer Berlin Heidelberg (2013)
27. Maes, R., van der Leest, V., van der Sluis, E., Willems, F.: Secure Key Generation
from Biased PUFs. In: Cryptographic Hardware and Embedded Systems–CHES
2015, pp. 517–534. Springer (2015)
28. Mossel, E., O’Donnell, R., Servedio, R.A.: Learning Functions of k Relevant Vari-
ables. Journal of Comp. and System Sciences 69(3), 421–434 (2004)
29. O’Donnell, R.: Analysis of Boolean Functions. Cambridge University Press (2014)
30. Pappu, R., Recht, B., Taylor, J., Gershenfeld, N.: Physical One-way Functions.
Science 297(5589), 2026–2030 (2002)
31. Rivest, R.L.: Learning Decision Lists. Machine learning 2(3), 229–246 (1987)
32. Ron, D., Rubinfeld, R., Safra, M., Samorodnitsky, A., Weinstein, O.: Approximat-
ing the Influence of Monotone Boolean Functions in O(√n) Query Complexity.
ACM Trans. on Computation Theory (TOCT) 4(4), 11 (2012)
33. Rührmair, U., Sehnke, F., Sölter, J., Dror, G., Devadas, S., Schmidhuber, J.: Mod-
eling Attacks on Physical Unclonable Functions. In: Proc. of the 17th ACM Conf.
on Comp. and Communications Security. pp. 237–249 (2010)
34. Saha, I., Jeldi, R.R., Chakraborty, R.S.: Model Building Attacks on Physically
Unclonable Functions using Genetic Programming. In: Hardware-Oriented Security
and Trust (HOST), 2013 IEEE Intrl. Symp. on. pp. 41–44. IEEE (2013)
35. Schapire, R.E.: The Strength of Weak Learnability. Machine Learning 5(2), 197–227 (1990)
36. Schapire, R.E., Freund, Y.: Boosting: Foundations and Algorithms. MIT Press (2012)
37. Schuster, D., Hesselbarth, R.: Evaluation of Bistable Ring PUFs using Single Layer
Neural Networks. In: Trust and Trustworthy Computing, pp. 101–109. Springer (2014)
38. Siegenthaler, T.: Correlation-Immunity of Nonlinear Combining Functions for
Cryptographic Applications (Corresp.). Information Theory, IEEE Transactions
on 30(5), 776–780 (1984)
39. Tajik, S., Dietz, E., Frohmann, S., Seifert, J.P., Nedospasov, D., Helfmeier, C., Boit,
C., Dittrich, H.: Physical Characterization of Arbiter PUFs. In: Cryptographic
Hardware and Embedded Systems–CHES 2014, pp. 493–509. Springer (2014)
40. Weste, N.H.E., Harris, D.: CMOS VLSI Design: A Circuits and Systems Perspec-
tive. Addison Wesley, fourth edn. (2010)
41. Xu, X., R¨uhrmair, U., Holcomb, D.E., Burleson, W.P.: Security Evaluation and
Enhancement of Bistable Ring PUFs. In: Radio Frequency Identification, pp. 3–16.
Springer (2015)
42. Yamamoto, D., Takenaka, M., Sakiyama, K., Torii, N.: Security Evaluation of
Bistable Ring PUFs on FPGAs using Differential and Linear Analysis. In: Comp.
Science and Information Systems (FedCSIS), 2014 Federated Conf. on. pp. 911–918 (2014)
... The security analysis of these advanced applications and protocols 2 relies on assuming that a PUF behaves like a random oracle; upon receiving a challenge, a uniform random response with replacement is selected, measurement noise is added, and the resulting response is returned. This assumption turns out to be too strong because (1) in practical implementations, the PUF returns biased response bits, and (2) classical ML and advanced ML attacks [11][12][13][14][15][16][17] demonstrate that a prediction model for response bits with accuracy typically up to 75% can be trained and this defeats the random oracle assumption. For example, FPGA implementations of the interpose PUF in [18] showed that the bias of individual Arbiter PUFs ranges from 50.2% to 61.6%. ...
... This assumption can be realized by means of 'hardware isolation' where PUF P is only accessible through the hardware interface defined by GetResponse. For example, in a secure processor architecture like Intel SGX [46] this access control 14 can be implemented by micro code which represents GetResponse and only allows access to the PUF by this micro-code. ...
Full-text available
Analysis of advanced physical unclonable function (PUF) applications and protocols relies on assuming that a PUF behaves like a random oracle; that is, upon receiving a challenge, a uniform random response with replacement is selected, measurement noise is added, and the resulting response is returned. In order to justify such an assumption, we need to rely on digital interface computation that to some extent remains confidential—otherwise, information about PUF challenge–response pairs leak with which the adversary can train a prediction model for the PUF. We introduce a theoretical framework that allows the adversary to have a prediction model (with a typical accuracy of 75% for predicting response bits for state-of-the-art silicon PUF designs). We do not require any confidential digital computing or digital secrets, while we can still prove rigorous statements about the bit security of a system that interfaces with the PUF. In particular, we prove the bit security of a PUF-based random oracle construction; this merges the PUF framework with fuzzy extractors.
... The premise underlying the protection of hardware IP using ML is that, if properly trained, ML models can identify any slight change in hardware behavior. Different ML-based techniques are proven to be resistant against hardware trojan attacks [45-50], side-channel attacks [51,52], IC counterfeit/reverse engineering attacks [53,54], and attacks on Physical Unclonable Functions (PUF) [55][56][57][58][59]. The features on which these models primarily focus are power usage, latency in run-time, electromagnetic emission data, memory access pattern, current supply, aging deterioration of recycled ICs etc. ...
... (iv) Defense against Attacks on PUFs: When it comes to authenticating and identifying FPGA hardware, PUFs are essential for solving the key generation and storage challenges which were initially imposed by the volatile nature of memory in FPGA [59]. Assuming that PUF with unknown mathematical model can be contaminated with ML attacks, Ganji et al. [58] proposed an ML method to defend FPGA based Bistable Ring PUF (BR-PUF). ...
Full-text available
Intellectual Property (IP) includes ideas, innovations, methodologies, works of authorship (viz., literary and artistic works), emblems, brands, images, etc. This property is intangible since it is pertinent to the human intellect. Therefore, IP entities are indisputably vulnerable to infringements and modifications without the owner’s consent. IP protection regulations have been deployed and are still in practice, including patents, copyrights, contracts, trademarks, trade secrets, etc., to address these challenges. Unfortunately, these protections are insufficient to keep IP entities from being changed or stolen without permission. As for this, some IPs require hardware IP protection mechanisms, and others require software IP protection techniques. To secure these IPs, researchers have explored the domain of Intellectual Property Protection (IPP) using different approaches. In this paper, we discuss the existing IP rights and concurrent breakthroughs in the field of IPP research; provide discussions on hardware IP and software IP attacks and defense techniques; summarize different applications of IP protection; and lastly, identify the challenges and future research prospects in hardware and software IP security.
... Modeling Attacks. PUFs are susceptible to modeling attacks [54], [55], [56]. Rührmair et al. [57] apply machine learning to challenge-response pairs of PUFs with many inputs and predict their outputs, which requires the ability to eavesdrop PUF responses. ...
Security is essential for the Internet of Things (IoT). Cryptographic operations for authentication and encryption commonly rely on random input of high entropy and secure, tamper-resistant identities, which are difficult to obtain on constrained embedded devices. In this paper, we design and analyze a generic integration of physically unclonable functions (PUFs) into the IoT operating system RIOT that supports about 250 platforms. Our approach leverages uninitialized SRAM to act as the digital fingerprint for heterogeneous devices. We ground our design on an extensive study of PUF performance in the wild, which involves SRAM measurements on more than 700 IoT nodes that aged naturally in the real-world. We quantify static SRAM bias, as well as the aging effects of devices and incorporate the results in our system. This work closes a previously identified gap of missing statistically significant sample sizes for testing the unpredictability of PUFs. Our experiments on COTS devices of 64 kB SRAM indicate that secure random seeds derived from the SRAM PUF provide 256 Bits-, and device unique keys provide more than 128 Bits of security. In a practical security assessment we show that SRAM PUFs resist moderate attack scenarios, which greatly improves the security of low-end IoT devices.
Physical Unclonable Functions (PUFs) have been widely considered an attractive security primitive. They use the deviations in the fabrication process to have unique responses from each device. Due to their nature, they serve as a DNA-like identity of the device. But PUFs have also been targeted for attacks. It has been proven that machine learning (ML) can be used to effectively model a PUF design and predict its behavior, leading to leakage of the internal secrets. To combat such attacks, several designs have been proposed to make it harder to model PUFs. One design direction is to use Non-Volatile Memory (NVM) as the building block of the PUF. NVM typically are multi-level cells, i.e, they have several internal states, which makes it harder to model them. However, the current state of the art of NVM-based PUFs is limited to ‘weak PUFs’, i.e., the number of outputs grows only linearly with the number of inputs, which limits the number of possible secret values that can be stored using the PUF. To overcome this limitation, in this work we design the Arbiter Non-Volatile PUF (ANV-PUF) that is exponential in the number of inputs and that is resilient against ML-based modeling. The concept is based on the famous delay-based Arbiter PUF (which is not resilient against modeling attacks) while using NVM as a building block instead of switches. Hence, we replace the switch delays (which are easy to model via ML) with the multi-level property of NVM (which is hard to model via ML). Consequently, our design has the exponential output characteristics of the Arbiter PUF and the resilience against attacks from the NVM-based PUFs. Our results show that the resilience to ML modeling, uniqueness, and uniformity are all in the ideal range of 50%. Thus, in contrast to the state-of-the-art, ANV-PUF is able to be resilient to attacks, while having an exponential number of outputs.
Future Industry 4.0 scenarios are characterized by seamless integration between computational and physical processes. To achieve this objective, dense platforms made of small sensing nodes and other resource-constrained devices are ubiquitously deployed. All these devices have a limited number of computational resources, just enough to perform the simple operations they are in charge of. The remaining operations are delegated to powerful gateways that manage sensing nodes, but resources are never unlimited, and as more and more devices are deployed on Industry 4.0 platforms, gateways face growing problems in handling massive machine-type communications. Although the problems are diverse, those related to security are especially critical. To enable sensing nodes to establish secure communications, several semiconductor companies are currently promoting a new generation of devices based on Physical Unclonable Functions, whose usage grows every year in many real industrial scenarios. Those hardware devices do not consume any computational resources but force the gateway to keep large key-value catalogues for each individual node. In this context, memory usage is not scalable, and processing delays increase exponentially with each new node on the platform. In this paper, we address this challenge through predictor-corrector models representing the key-value catalogues. The models are mathematically complex, but we argue that they consume fewer computational resources than current approaches. The lightweight models are based on complex functions managed as Laurent series, cubic spline interpolations, and Boolean functions also developed as series. Unknown parameters in these models are predicted, and eventually corrected, to calculate the output value for each given key. The initial parameters are based on the Kane Yee formula.
An experimental analysis and a performance evaluation are provided in the experimental section, showing that the proposed approach significantly reduces resource consumption.
This chapter introduces the concept of Physical Unclonable Functions (PUFs) and formally defines security properties and metrics for PUFs. It discusses typical attacker models considered in the study of PUFs and gives an overview of PUF modeling attacks. Hardware aspects of PUF security are considered briefly. Keywords: Formal definition; PUF; Attacker model; Hardware security; Machine learning attack; Modeling attack; Security metrics; Intra-distance requirement; Reliability; Uniqueness; Bias; Chosen message attack; Known message attack.
Physical Unclonable Functions (PUFs) have been increasingly used as an alternative to non-volatile memory for the storage of cryptographic secrets. Research on side channel and fault attacks with the goal of extracting these secrets has begun to gain interest but no fault injection attack targeting the necessary error correction within a PUF device has been shown so far. This work demonstrates one such attack on a hardware fuzzy commitment scheme implementation and thus shows a new potential attack threat existing in current PUF key storage systems. After presenting evidence for the overall viability of the profiled attack by performing it on an FPGA implementation, countermeasures are analysed: we discuss the efficacy of hashing helper data with the PUF-derived key to prevent the attack as well as codeword masking, a countermeasure effective against a side channel attack. The analysis shows the limits of these approaches. First, we demonstrate the criticality of timing in codeword masking by confirming the attack’s effectiveness on ostensibly protected hardware. Second, our work shows a successful attack without helper data manipulation and thus the potential for sidestepping helper data hashing countermeasures.
When the applied PUF in a PUF-based key generator does not produce full entropy responses, information about the derived key material is leaked by code-offset helper data. If the PUF’s entropy level is too low, the PUF-derived key is even fully disclosed by the helper data. In this work we analyze this entropy leakage, and provide several solutions for preventing leakage for PUFs suffering from i.i.d. biased bits. Our methods pose no limit on the amount of PUF bias that can be tolerated for achieving secure key generation, with only a moderate increase in the required PUF size. This solves an important open problem in this field. In addition, we also consider the reusability of PUF-based key generators and present a variant of our solution which retains the reusability property. In an exemplary application of these methods, we are able to derive a secure 128-bit key from a 15 %-noisy and 25 %-biased PUF requiring only 4890 PUF bits for the non-reusable variant, or 7392 PUF bits for the reusable variant.
Threshold phenomena refer to settings in which the probability for an event to occur changes rapidly as some underlying parameter varies. Threshold phenomena play an important role in probability theory and statistics, physics, and computer science, and are related to issues studied in economics and political science. Quite a few questions that come up naturally in those fields translate to proving that some event indeed exhibits a threshold phenomenon, and then finding the location of the transition and how rapid the change is. The notions of sharp thresholds and phase transitions originated in physics, and many of the mathematical ideas for their study came from mathematical physics. In this chapter, however, we will mainly discuss connections to other fields. A simple yet illuminating example that demonstrates the sharp threshold phenomenon is Condorcet's jury theorem, which can be described as follows. Say one is running an election process, where the results are determined by simple majority, between two candidates, Alice and Bob. If every voter votes for Alice with probability p > 1/2 and for Bob with probability 1 — p, and if the probabilities for each voter to vote either way are independent of the other votes, then as the number of voters tends to infinity the probability of Alice getting elected tends to 1. The probability of Alice getting elected is a monotone function of p, and when there are many voters it rapidly changes from being very close to 0 when p < 1/2 to being very close to 1 when p > 1/2. The reason usually given for the interest of Condorcet's jury theorem to economics and political science [535] is that it can be interpreted as saying that even if agents receive very poor (yet independent) signals, indicating which of two choices is correct, majority voting nevertheless results in the correct decision being taken with high probability, as long as there are enough agents, and the agents vote according to their signal. 
This is referred to in economics as asymptotically complete aggregation of information.
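The sharp threshold in Condorcet's jury theorem can be computed exactly: the probability that a simple majority of n independent voters is correct is a binomial tail sum, and for any fixed p > 1/2 it climbs rapidly toward 1 as n grows. A short self-contained computation:

```python
from math import comb

def majority_correct_prob(n, p):
    """Exact probability that a simple majority of n independent voters,
    each correct with probability p, reaches the correct decision.
    n is assumed odd so no ties can occur."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# for p = 0.6 the probability rises quickly with the number of voters,
# while for p = 0.4 it collapses toward 0 -- the threshold at p = 1/2
for n in (1, 11, 101):
    print(n, round(majority_correct_prob(n, 0.6), 4),
             round(majority_correct_prob(n, 0.4), 4))
```

This is the "asymptotically complete aggregation of information" described above: poor but independent signals, aggregated by majority, yield an almost-certainly correct outcome.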
This paper introduces a new representation for Boolean functions, called decision lists, and shows that they are efficiently learnable from examples. More precisely, this result is established for k-DL – the set of decision lists with conjunctive clauses of size k at each decision. Since k-DL properly includes other well-known techniques for representing Boolean functions such as k-CNF (formulae in conjunctive normal form with at most k literals per clause), k-DNF (formulae in disjunctive normal form with at most k literals per term), and decision trees of depth k, our result strictly increases the set of functions that are known to be polynomially learnable, in the sense of Valiant (1984). Our proof is constructive: we present an algorithm that can efficiently construct an element of k-DL consistent with a given set of examples, if one exists.
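The constructive proof mentioned above is a greedy procedure: repeatedly find a test that covers only same-labeled remaining examples, emit it as the next rule, and discard the covered examples. A sketch for the simplest case, k = 1 (tests on single literals); function and variable names here are illustrative, not from the paper:

```python
from itertools import product

def learn_1dl(examples):
    """Greedy construction of a 1-decision list consistent with the
    labeled examples (a sketch of the algorithm for k = 1).
    examples: list of (bits, label), bits a tuple of 0/1 values."""
    n = len(examples[0][0])
    remaining, rules = list(examples), []
    while remaining:
        for i, v in product(range(n), (0, 1)):
            # labels of all remaining examples satisfying the test x[i] == v
            covered = [y for x, y in remaining if x[i] == v]
            if covered and len(set(covered)) == 1:
                rules.append((i, v, covered[0]))
                remaining = [(x, y) for x, y in remaining if x[i] != v]
                break
        else:
            return None  # no consistent 1-decision list exists
    return rules

def predict(rules, x, default=0):
    for i, v, label in rules:
        if x[i] == v:
            return label
    return default
```

On the OR of two bits the learner succeeds, while on XOR it correctly reports that no 1-DL is consistent (every single-literal test covers both labels), illustrating that expressiveness, not the algorithm, is the limit.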
The use of Physically Unclonable Functions (PUFs) in cryptographic protocols has attracted increased interest over recent years. Since sound security analysis requires a concise specification of the alleged properties of the PUF, there have been numerous attempts to provide formal security models for PUFs. However, all these approaches have been tailored to specific types of applications or specific PUF instantiations. For the sake of applicability, composability, and comparability, however, there is a strong need for a unified security model for PUFs (for example, to answer whether the requirements of a future protocol are matched by the properties of a newly proposed PUF realization). In this work, we propose a PUF model which generalizes various existing PUF models and includes security properties that have not been modeled so far. We prove the relation between some of the properties, and also discuss the relation of our model to existing ones.
This paper presents an analysis of a bistable ring physical unclonable function (BR-PUF) implemented on a field-programmable gate array (FPGA) using a single-layer artificial neural network (ANN). The BR-PUF was proposed as a promising circuit-based strong PUF candidate, given that no simple model of its behaviour has been known so far and modeling-based attacks would hence be hard. In contrast to this, we were able to find a strongly linear influence in the mapping of challenges to responses in this architecture. Further, we show how an alternative implementation of a bistable ring, the twisted bistable ring PUF (TBR-PUF), leads to improved response behaviour. The effectiveness and a possible explanation of the improvements are demonstrated using our machine learning analysis approach.
PUF-based key generators have been widely considered as a root-of-trust in digital systems. They typically require an error-correcting mechanism (e.g. based on the code-offset method) for dealing with bit errors between the enrollment and reconstruction of keys. When the used PUF does not have full entropy, entropy leakage between the helper data and the device-unique key material can occur. If the entropy level of the PUF becomes too low, the PUF-derived key can be attacked through the publicly available helper data. In this work we provide several solutions for preventing this entropy leakage for PUFs suffering from i.i.d. biased bits. The methods proposed in this work pose no limit on the amount of bias that can be tolerated, which solves an important open problem for PUF-based key generation. Additionally, the solutions are all evaluated based on reliability, efficiency, leakage and reusability showing that depending on requirements for the key generator different solutions are preferable.
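The code-offset method discussed above can be made concrete in a few lines: enrollment publishes helper data equal to the PUF response XORed with a codeword, and reconstruction XORs the helper off a noisy re-read and decodes. The sketch below uses a simple repetition code (an illustrative choice, not the paper's construction); it also makes the leakage concern visible, since a biased PUF response leaks codeword bits through the helper.

```python
import random

random.seed(7)
REP = 5  # repetition-code length per key bit (illustrative parameter)

def enroll(puf_bits, key_bits):
    """Code-offset enrollment: helper data = PUF response XOR codeword."""
    codeword = [b for b in key_bits for _ in range(REP)]
    return [p ^ c for p, c in zip(puf_bits, codeword)]

def reconstruct(puf_reread, helper):
    """Key reconstruction: XOR off the helper, then majority-decode."""
    offset = [p ^ h for p, h in zip(puf_reread, helper)]
    return [int(sum(offset[i * REP:(i + 1) * REP]) > REP // 2)
            for i in range(len(offset) // REP)]

key = [random.randint(0, 1) for _ in range(16)]
puf = [random.randint(0, 1) for _ in range(16 * REP)]
helper = enroll(puf, key)

# a noisy re-read: one flipped bit per 5-bit block is still corrected
noisy = [b ^ (1 if i % REP == 0 else 0) for i, b in enumerate(puf)]
print(reconstruct(noisy, helper) == key)
```

With unbiased `puf` bits the helper is statistically independent of the key; with biased bits each helper bit correlates with its codeword bit, which is exactly the entropy leakage the methods above are designed to prevent.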
The Bistable Ring (BR) Physical Unclonable Function (PUF) is a newly proposed hardware security primitive in the PUF family. In this work, we comprehensively evaluate its resilience against Machine Learning (ML) modeling attacks. Based on the success of ML attacks, we propose XOR strategies to enhance the security of BR PUFs. Our results show that the XOR BR PUF with more than four parallel BR PUFs is able to resist the ML modeling methods in this work. We also evaluate the other PUF metrics of reliability, uniqueness and uniformity, and find that the XOR function is also effective in improving the uniformity of BR PUFs.
The general concept of Physically Unclonable Functions (PUFs) has nowadays been widely accepted and adopted to meet the requirements of secure identification and key generation/storage for cryptographic ciphers. However, shattered by different attacks, e.g., modeling attacks, it has been proved that the promised security features of arbiter PUFs, including unclonability and unpredictability, are not supported unconditionally. So far, however, the success of existing modeling attacks relies on pure trial-and-error estimates. This means that neither the probability of obtaining a useful model (confidence), nor the sufficient number of CRPs, nor the probability of correct prediction (accuracy) is guaranteed. To address these issues, this work presents a Probably Approximately Correct (PAC) learning algorithm. Based on a crucial discretization process, we are able to define a Deterministic Finite Automaton (of polynomial size), which exactly accepts the regular language corresponding to the challenges that the given PUF maps to one of the responses.
Physically Unclonable Functions (PUFs) are expected to be an innovation in anti-counterfeiting devices for secure ID generation, authentication, etc. In this paper, we propose novel methods for evaluating the difficulty of predicting PUF responses (i.e., PUF outputs), inspired by well-known differential and linear cryptanalysis. Using the proposed methods, we perform the first third-party evaluation of the Bistable Ring PUF (BR-PUF), proposed in 2011. BR-PUFs have been claimed to resist response prediction. Through our experiments using FPGAs, we demonstrate, however, that BR-PUFs exhibit two types of correlations between challenges and responses, which may enable easy prediction of PUF responses. First, the same responses are frequently generated for two challenges (i.e., PUF inputs) with small Hamming distance. Randomly generated challenges and their variants at Hamming distance one generate the same responses with probability 0.88, much larger than the 0.5 expected of ideal PUFs. Second, particular bits of the challenges in BR-PUFs have a great impact on the responses. The response becomes '1' with high probability, 0.71 (> 0.5), when just 5 particular bits of 64-bit random challenges are forced to zero or one. In conclusion, the proposed evaluation methods reveal that BR-PUFs on FPGAs have challenge-response correlations that help an attacker predict the responses.
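The first metric above (response agreement under single-bit challenge flips) is easy to estimate empirically for any challenge-response oracle. A sketch with an illustrative toy PUF whose response depends on only a few challenge bits, mimicking the strong-influence behaviour the paper reports; the toy model is hypothetical, not the BR-PUF:

```python
import random

random.seed(3)

def hd1_similarity(puf, n_bits, trials=1000):
    """Estimate the probability that flipping a single random challenge
    bit leaves the response unchanged (an ideal PUF would give ~0.5)."""
    same = 0
    for _ in range(trials):
        c = [random.randint(0, 1) for _ in range(n_bits)]
        c2 = c[:]
        c2[random.randrange(n_bits)] ^= 1  # Hamming distance 1
        same += puf(c) == puf(c2)
    return same / trials

# toy stand-in: the response depends on only 4 of 64 challenge bits,
# so most single-bit flips leave the response unchanged
toy_puf = lambda c: c[0] ^ c[3] ^ (c[5] & c[9])
print(hd1_similarity(toy_puf, 64))
```

For comparison, the full parity of all 64 bits yields a similarity of exactly 0 (every flip changes the response), while a similarity far above 0.5, like the reported 0.88, signals that few challenge bits actually influence the response.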