An In-depth and Black-box Characterization of the Effects of Clock Glitches on 8-bit MCUs.
ABSTRACT The literature about fault analysis typically describes fault injection mechanisms, e.g. glitches and lasers, and cryptanalytic techniques to exploit faults based on some assumed fault model. Our work narrows the gap between both topics. We thoroughly analyse how clock glitches affect a commercial low-cost processor by performing a large number of experiments on five devices. We observe that the effects of fault injection on two-stage pipeline devices are more complex than commonly reported in the literature. While injecting a fault is relatively easy, injecting an exploitable fault is hard. We further observe that the easiest to inject and reliable fault is to replace instructions, and that random faults do not occur. Finally we explain how typical fault attacks can be mounted on this device, and describe a new attack for which the fault injection is easy and the cryptanalysis trivial.
Josep Balasch, Benedikt Gierlichs, and Ingrid Verbauwhede
Katholieke Universiteit Leuven, COSIC and IBBT
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
Keywords-Fault effect characterization; AVR MCU; clock
I. INTRODUCTION
Physical attacks came to the attention of the scientific
community in the late 1990s. Today it is a well-known
fact that variations of e.g. the execution time, the
instantaneous power consumption and the electromagnetic
emanations of a device implementing a cryptographic
algorithm leak information about secret data. Attacks
exploiting such information are called (passive) side
channel attacks. It is further well understood that unusual
conditions in the close physical environment of a
cryptographic implementation have an impact on the operation of
a device. An adversary can exploit this by temporarily
modifying the operating environment of a device
to induce computational errors, known as transient faults.
Such and other active attacks are fault attacks. Typical fault
injection mechanisms include glitches on the clock signal or
the power supply, and the photoelectric effects caused
by lasers or white light. The former mechanisms are
typically inexpensive to implement but affect the entire chip,
while laser setups are more expensive but allow an attacker
to stimulate specific regions of a chip.
The first fault attack, published at Eurocrypt’97, is the
Bellcore attack, targeting implementations of RSA-CRT.
The attack is particularly powerful as any single
fault injected into either of the CRT branches allows an
attacker to factor the RSA modulus. Fault attacks quickly gained
importance due to their severe power to break cryptographic
Many later works focus on attacking implementations of
symmetric-key algorithms. Today, we know about e.g.
Differential Fault Analysis (DFA), Collision Fault Analysis
(CFA) and Ineffective Fault Analysis (IFA).
These generic attack techniques all represent a union of
physical fault injection and cryptanalytic fault exploitation.
As a consequence, most publications in this area of research
focus on either fault exploitation assuming that a certain
fault pattern can be injected physically (fault model), or on
how to physically inject faults, sometimes in combination
with verifying a theoretical attack. At the same time, there
is a rich body of literature about countermeasures against
fault attacks. Many of them are based on some form of
Related Work. Our work focuses on the case of non-
invasive fault attacks, particularly, on the injection of faults
via glitches in the external clock signal provided to a
MicroController Unit (MCU). This type of attack has been
successfully implemented using cryptographic hardware
coprocessors as target platforms. However,
the effects of glitches (either in the clock signal or in the
power supply) on software implementations in embedded
MCUs have only been addressed in a handful of articles.
Kömmerling and Kuhn enumerate a series of potential
vulnerabilities that can be triggered by inducing glitches. For
instance, glitches can be used to extend the runtime of loops
in serial port output routines to see more of the memory
after the output buffer, or to
reduce the number of loop iterations, e.g. to convert a secure
iterated block cipher into a weaker single-round variant.
Bar-El et al. enumerate some effects observed on MCUs
when inducing spikes on the power supply. Depending on
some conditions, the authors are able to inject faults such
that instructions are skipped or the data manipulated by the
processor is modified. However, the technical details on how
to perform such attacks, including the characteristics of the
target MCU, are not provided in these papers.
Further work has focused on the injection of spikes in the
power supply with the aim of skipping critical instructions:
2011 Workshop on Fault Diagnosis and Tolerance in Cryptography
978-0-7695-4526-4/11 $26.00 © 2011 IEEE
Choukri and Tunstall reduce the number of rounds of
an AES implementation on a PIC16F877 smart card; Kim
and Quisquater skip two subroutine calls in an RSA-CRT
implementation on an AVR MCU; similarly, Schmidt
and Herbst prevent a subroutine call in a square-and-multiply
RSA implementation on an AVR MCU. Perhaps
closer to our work, Barenghi et al. perform a study of
the fault effects on a 32-bit multi-stage pipeline ARM MCU;
the results are later used to break implementations of AES
The goal of our work is to thoroughly analyse the effects
of clock glitches on the instructions being executed by an
MCU. To this end, we choose a legacy 8-bit AVR controller
as target platform, a typical representative of low-cost em-
bedded devices. This MCU allows a potential attacker to
manipulate the external clock signal at will. It should be
noted that secure smart cards have built-in countermeasures,
typically an internal clock or sensors, in order to prevent
fault injection via clock glitches. However, our research
does not aim to propose or experimentally prove theoretical
fault attacks on cryptographic implementations, but rather
to answer questions such as how, when, and why do faults
This paper makes two main contributions. First, we
provide a complete study and characterization of the effects
of clock glitches on legacy smart card architectures without
hardware coprocessor modules. The analysis is done in a
black-box setting, using as target platform an 8-bit AVR
controller operating on a two-stage pipeline. Second,
we put forward a more concrete foundation for future work
in fault attacks and countermeasures, by highlighting which
types of faulty behaviour can be expected and/or exploited
as a result of glitches in the clock signal.
Outline. The rest of this paper is organized as follows.
We describe our experimental setup and our target platform
in Section II and Section III, respectively. The testing
framework is introduced in Section IV. Section V describes
the effects of fault injections on the program flow of an
application, whereas Section VI focuses on the effects on
the application’s data flow. We summarize our findings
and enumerate potential applications in Section VII, and
conclude in Section VIII.
II. EXPERIMENTAL SETUP
The experimental setup used in this work is depicted in
Figure 1. We have implemented a custom ISO/IEC 7816-3
compliant smart card reader with a fully controllable
clock signal using a Virtex-II Pro XC2VP30 FPGA.
The interface with the smart card at the link and physical
layers follows the ISO/IEC 7816-3 standard, while the
communication is performed via Application Protocol Data
Units (APDUs) as specified in ISO/IEC 7816-4. The
computer, acting as a user interface, communicates with the
FPGA via an RS232 interface.
The FPGA behaves as an off-the-shelf smart card reader.
The clock signal provided to the smart card has a fixed
nominal frequency in accordance with the MCU specifications.
The APDU commands exchanged between the computer and
the smart card during communication are simply forwarded
by the FPGA to the end receiver. Note that the T0 and
T1 protocols specified in ISO/IEC 7816-3 are based on
request/response commands, i.e. the computer is always the
device that triggers an action of the smart card.
Figure 1. Experimental setup.
Glitch generation. The effect of injecting a glitch in the
clock signal is depicted in Figure 2. We denote the nominal
period of the clock signal as Tn and the period (or duration)
of a glitch as Tg. The idea of injecting a glitch is to
temporarily overclock the smart card, i.e. to insert a clock
period Tg << Tn that potentially causes a transient
malfunction of the MCU. Notice that after injecting a glitch,
the following clock period is reduced from Tn to Tn − Tg in
our setup. However, given that Tg << Tn, this “post-glitch”
period does not affect the normal behaviour of the MCU.
Figure 2. Injection of a glitch in the clock signal.
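The timing relation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the FPGA implementation: the function names and the example values (Tn = 1000 ns, Tg = 59 ns, glitch in the third clock slot) are assumptions chosen for the example.

```python
def clock_periods(t_n, t_g, glitch_cycle, n_cycles):
    """List the clock periods (in ns) seen by the card.

    In the slot `glitch_cycle` a short period t_g is inserted and the
    period that follows is shortened to t_n - t_g, so the total
    duration of that slot is still t_n.
    """
    periods = []
    for cycle in range(n_cycles):
        if cycle == glitch_cycle:
            periods += [t_g, t_n - t_g]  # glitch, then post-glitch period
        else:
            periods.append(t_n)          # nominal period
    return periods


def rising_edges(periods):
    """Accumulate periods into rising-edge timestamps, starting at 0."""
    edges = [0]
    for period in periods:
        edges.append(edges[-1] + period)
    return edges


periods = clock_periods(t_n=1000, t_g=59, glitch_cycle=2, n_cycles=4)
print(periods)
print(rising_edges(periods))
```

Because the glitch and the post-glitch period together last exactly Tn, every edge after the glitched slot falls back on the nominal clock grid.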
We have developed two different mechanisms to introduce
glitches in the clock signal. In the first mechanism, illus-
trated in Figure 3, the FPGA generates the output CLK fed to
the smart card using a combination of two reference signals
denoted as nominal CLK (with period Tn) and high-freq.
CLK (with period Tg). Glitches in the frequency of output
CLK are injected when indicated by the selection signal. For
the sake of reproducibility, both nominal CLK and high-freq.
CLK signals have to be perfectly phase-aligned. This can
be easily achieved by generating the nominal CLK signal
from the high-freq. CLK signal, which in turn determines
the granularity of the glitch width. For instance, by fixing
the frequency of the nominal CLK to 1 MHz (such that Tn
is 1 μs), the possible frequency values of the high-freq. CLK
are restricted to multiples of 2 MHz. In other words, the set
of possible glitch periods is given by Tg (in μs) = 1/(2i),
where i = 1, 2, 3, .... The accuracy of the glitch period has
a standard deviation of 60 ps.
Figure 3. Glitch generation using high-frequency signal.
In the second mechanism, shown in Figure 4, we follow
an approach similar to that of earlier work. In this case, a glitch
in output CLK is generated by switching between three
signals with the same period Tn but with different phases.
The advantage of this mechanism w.r.t. the first one is
that it provides finer granularity in the glitch period, in
particular for low frequencies. We can increase the glitch
period in steps of approximately 1 ns such that the set of
possible glitch periods is given by Tg (in ns) = i, where
i = 1, 2, 3, .... The standard deviation of the glitch period
for this mechanism is 70 ps.
Figure 4. Glitch generation using phase-shifted signals.
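To make the difference in granularity concrete, the following Python sketch enumerates the glitch periods each mechanism can produce. It assumes a nominal period of 1000 ns (1 MHz), reads the first mechanism as Tg = Tn/(2i) and the second as Tg = i ns, and uses an arbitrary window of 60 ns to 125 ns for the comparison. The function names are hypothetical.

```python
from fractions import Fraction


def periods_high_freq(t_n_ns, max_i):
    """Mechanism 1: Tg = Tn / (2 i) for i = 1 .. max_i (exact fractions, in ns)."""
    return [Fraction(t_n_ns, 2 * i) for i in range(1, max_i + 1)]


def periods_phase_shift(max_ns):
    """Mechanism 2: Tg = i ns for i = 1 .. max_ns."""
    return [Fraction(i) for i in range(1, max_ns + 1)]


def within(periods, low_ns, high_ns):
    """Keep only the periods that fall inside [low_ns, high_ns]."""
    return [p for p in periods if low_ns <= p <= high_ns]


m1 = within(periods_high_freq(1000, 50), 60, 125)
m2 = within(periods_phase_shift(125), 60, 125)
print([float(p) for p in m1])  # the few periods reachable with mechanism 1
print(len(m1), len(m2))        # number of reachable periods per mechanism
```

In this window the first mechanism only reaches a handful of periods, with gaps of up to 25 ns between neighbours, whereas the second mechanism covers every integer nanosecond.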
As a final comment, note that the selection signal in these
mechanisms allows a wide range of glitch injection patterns,
which are by no means restricted to one glitch per trial
execution. The selection of all parameters involved in the
generation of the output clock signal (e.g. glitch mechanism,
nominal period Tn, glitch period Tg, glitch position, etc.)
is completely configurable by the user via commands sent
from the PC to the FPGA, thus obtaining a highly flexible
yet automated experimental setup to carry out our study.
III. TARGET PLATFORM
We choose as target platform an MCU belonging to the
8-bit Atmel AVR family, namely the ATMega163
microcontroller. There are several reasons for this choice.
First, and most important, this device operates on an external
clock signal, such that it is possible to inject faults to the
device using this interface. Second, the characterization of
the effects of fault injection on AVR MCUs is a challenging
task: AVR controllers have a modified Harvard architecture,
i.e. although access to program code (flash memory) and
data (internal SRAM) is physically separated in the chip
(strict Harvard architecture), the CPU can concurrently use
both buses in a clock cycle. This characteristic, combined
with a RISC architecture in which most instructions
execute in a single cycle, yields a two-stage
pipeline: while one instruction is being executed, the next
one is pre-fetched from program memory as shown in
Figure 5. In contrast to von Neumann architectures (e.g.
8051 MCUs), several critical operations happen in parallel
during one clock cycle. Consequently, fault injections can
have multiple and complex effects.
Figure 5. Pipeline in AVR controllers (source: ATMega163 datasheet).
Finally, Atmel AVR controllers are well-known devices widely
used in the related literature, not only in fault analysis
but also in side channel attacks. So far no study
has been made to fully characterize and understand the
reaction of these devices to fault injection via clock glitches.
We aim to fill this gap in the following sections.
We stress that although Atmel offers a family of AVR
MCUs specifically designed for security applications,
the smart cards used in our tests have no security claims
whatsoever. Our research motivation is not to evaluate the
level of resistance of such MCUs to fault attacks, but rather
to understand and characterize the effects of fault injection
via clock glitches on one model of the low-cost family. Note
that the analysis is done in a black-box setting, i.e. we only
have access to the publicly available data sheets.
IV. TESTING FRAMEWORK
The approach followed in our experiments consists in
decreasing the glitch period Tg, starting with a value such as
125 ns (or 8 MHz) for which the MCU functions correctly,
until 15 ns. This lower bound is determined by the switching
speed of the FPGA board’s I/O pins as well as some external
analogue circuitry of our experimental setup. When faults
start occurring, we analyse them in order to be able to
characterize the chip’s behaviour. Our experiments show that
the critical path (i.e. maximum frequency tolerated by the
MCU) is determined by the access to Program Memory.
In other words, the first effect noticed when decreasing the
glitch period is an erroneous behaviour of the pre-fetching
In the following we will make a distinction between which
pipeline stage is affected by the glitch. We will begin by
focusing on the effects of clock glitches on the pre-fetching
stage, analysing how it is possible to inject faults such that
the program flow is altered. After this, we will focus on
the effects of the glitches on the execution stage, studying
how the expected data flow of a program is changed. In our
experiments we have used a total of five ATMega163 smart
cards to verify that they all respond to fault injection in a
very similar way.
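The sweep over glitch periods described at the start of this section can be outlined as follows. This is a minimal sketch, not the authors' tooling: `run_trial` is a hypothetical callback that runs the test application once with a glitch of the given period (or without a glitch when passed `None`) and returns the output data, and `toy_device` is a made-up stand-in used only to make the example executable.

```python
def first_faulty_period(run_trial, start_ns=125, stop_ns=15, trials=10):
    """Decrease Tg from start_ns to stop_ns and return the first period
    for which any trial output differs from the glitch-free reference.
    Returns None if no fault is observed in the whole range."""
    reference = run_trial(None)  # expected output, no glitch injected
    for t_g in range(start_ns, stop_ns - 1, -1):
        outputs = [run_trial(t_g) for _ in range(trials)]
        if any(out != reference for out in outputs):
            return t_g
    return None


def toy_device(t_g):
    """Hypothetical device model: correct unless the glitch is 61 ns or shorter.
    The threshold is illustrative and does not replace real measurements."""
    return "ok" if t_g is None or t_g > 61 else "fault"


print(first_faulty_period(toy_device))
```

In a real setup the callback would send the glitch parameters to the FPGA and collect the card's response; repeating each period several times helps to separate reproducible faults from sporadic ones.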
We have implemented several test applications (see footnote 1)
in assembly language and executed them a large number of
times in order to obtain and analyse the effects of clock
glitches. To make the interpretation of the results
clearer, we provide some exemplary code fragments.
Although such tests do not correspond to any particular
cryptographic implementation, the results obtained can be
easily and perfectly extrapolated to the general case.
Finally, note that by targeting a device with a two-
stage pipeline and without access to details of the inner
workings of the MCU, the analysis of the faults’ outcome
becomes an arduous task. The only information available
for the interpretation of the faults’ effects consists of an
array of output data. Before running a test, we bring the
MCU to a state A such that all possible variables (SRAM
values, program memory, registers, flags, and even room
temperature) are fixed and known. A normal execution of
the test application brings the MCU to an “expected” state
B, whereas a faulty execution brings it to an “incorrect”
state B'. Manually reverse-engineering the chain of events
that explains the transition from state B to state B' is far
V. EFFECTS OF CLOCK GLITCHES ON PROGRAM FLOW
The AVR instruction set consists of 130 commands, most
of them executing within a single clock cycle. Instruction
opcodes are typically encoded and stored in Program Memory
in 16-bit words. Although it is an 8-bit device, the
AVR MCU has a 16-bit Program Bus. This means that in
the pre-fetching phase, the 16-bit opcode pointed to by the
Program Counter (PC) is loaded at once. In turn, the PC is
also incremented in this stage, such that the next opcode is
correctly loaded in the following clock cycle. The behaviour
of multi-cycle instructions differs from that of single-cycle
instructions; these differences will be discussed later.
A. NOP: No Operation
We start our analysis by testing the effects on the
simplest command available, namely NOP. As this instruction
does not perform any operation in the execution phase,
glitches will only affect the pre-fetching stage. Our first test
is depicted in Figure 6, where Inst refers to any of the
available AVR instructions. By injecting a glitch in clock
cycle i (when NOP is being executed), one can possibly cause
an erroneous behaviour in the pre-fetching phase.
Footnote 1: These routines have been executed using a modified version of the
T0 protocol compliant Simple Operating System for Smartcard Education
(SOSSE), available here: http://www.mbsks.franken.de/sosse/
Cycle   Instruction   Opcode (bin)
i       NOP           0000 0000 0000 0000
i+1     Inst
Figure 6. Code example for NOP (I).
For testing purposes, let us assume that Inst is the
command EOR R15,R5 (Exclusive OR) as illustrated in
the top part of Figure 7. When injecting a glitch with period
smaller than or equal to 59 ns in clock cycle i, we observe
that EOR R15,R5 is never executed. Intuitively, one can
assume that the MCU does not have time to load the next
command from Program Memory as a consequence of the
glitch. So a reasonable explanation is that the opcode being
executed at the time of the glitch (e.g. NOP in cycle i) is
executed again in cycle i+1, as shown in the lower part of
Figure 7. Note however, that the PC is clearly not affected
by the glitch and is correctly incremented. Otherwise, the
MCU would simply pre-fetch the command EOR R15,R5
in cycle i+1 and execute it in cycle i+2. A likely explanation
for the PC not being affected by the glitch in any of our
experiments is that incrementing the PC simply requires less
time than 15 ns.
Cycle   Glitch in cycle i   Instruction executed   Opcode (bin)
i       -                   NOP                    0000 0000 0000 0000
i+1     none                EOR R15,R5             0010 0100 1111 0101
i+1     Tg ≤ 59 ns          NOP                    0000 0000 0000 0000
Figure 7. Code example for NOP (II).
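The explanation given for Figure 7 can be expressed as a small behavioural model. The sketch below is an assumption-laden simplification: it only covers single-cycle, single-word instructions and the case where the glitch is short enough for the fetch to fail completely, and the third opcode in the example program is an arbitrary addition to show that execution resumes normally. The name `executed_opcodes` is hypothetical.

```python
NOP = 0x0000
EOR_R15_R5 = 0x24F5  # 0010 0100 1111 0101
SER_R18 = 0xEF2F     # 1110 1111 0010 1111, used here only as a follow-up instruction


def executed_opcodes(program, glitch_cycles):
    """Two-stage pipeline model: in each cycle the current opcode executes
    while the next one is pre-fetched. A glitched cycle leaves the
    instruction register unchanged, but the PC still advances, so the
    opcode that should have been fetched is never executed."""
    executed = []
    instr_reg = program[0]  # assume the first opcode is already fetched
    for cycle in range(len(program)):
        executed.append(instr_reg)
        next_pc = cycle + 1  # PC increments even when the fetch fails
        if next_pc < len(program) and cycle not in glitch_cycles:
            instr_reg = program[next_pc]
    return executed


program = [NOP, EOR_R15_R5, SER_R18]
print([hex(op) for op in executed_opcodes(program, set())])
print([hex(op) for op in executed_opcodes(program, {0})])
```

With a glitch in the first cycle the NOP is executed twice and EOR R15,R5 never runs, while the following instruction is fetched and executed as usual because the PC was not disturbed.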
Suppose now that Inst is the command SER R18
(Set Bits in Register) as illustrated in Figure 8. In this
case, we observe that depending on the glitch period the
command SER R18 is substituted by instructions other than
NOP. In particular, for a glitch period equal to 61 ns the
command LDI R18,0xEF (Load Immediate to Register)
is executed. Decreasing the glitch period to 60 ns produces
the appearance of the command SBC R2,R15 (Subtract
with Carry). Finally, for any glitch period smaller than or
equal to 59 ns, we observe the same effect as shown in
Figure 7, namely, NOP is executed.
Cycle   Glitch in cycle i   Instruction executed   Opcode (bin)
i       -                   NOP                    0000 0000 0000 0000
i+1     none                SER R18                1110 1111 0010 1111
i+1     Tg ≤ 61 ns          LDI R18,0xEF           1110 1110 0010 1111
i+1     Tg ≤ 60 ns          SBC R2,R15             0000 1000 0010 1111
i+1     Tg ≤ 59 ns          NOP                    0000 0000 0000 0000
Figure 8. Code example for NOP (III).
The results shown in Figure 8 illustrate the transition
in which the MCU internally updates the opcode to be
executed. As one can notice, there is a progression from the
expected command (SER R18) to the previous command
(NOP), in the sense that more bits of the erroneous opcodes
are degraded to zero as the glitch period decreases. Strictly
speaking, at this point it is not fully correct to describe
the effect of the glitch as skipping an instruction. Since
another command is executed instead of the expected
instruction, a more accurate description of the
glitch effect is replacing an instruction.
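The bit-level progression can be checked directly on the opcodes listed in Figure 8. The following sketch is an illustration only; the helper names are hypothetical and the opcode values are the binary strings from the figure written in hexadecimal.

```python
SER_R18 = 0xEF2F  # expected opcode, 1110 1111 0010 1111

# Opcodes observed instead of SER R18, keyed by the glitch period bound (ns)
OBSERVED = {61: 0xEE2F, 60: 0x082F, 59: 0x0000}


def only_clears_bits(expected, observed):
    """True if `observed` can be obtained from `expected` by clearing bits
    only, i.e. no bit is set that was not set in the expected opcode."""
    return (observed & ~expected & 0xFFFF) == 0


def cleared_bits(expected, observed):
    """Number of bits that are 1 in `expected` but 0 in `observed`."""
    return bin(expected & ~observed & 0xFFFF).count("1")


for bound, opcode in sorted(OBSERVED.items(), reverse=True):
    print(bound, hex(opcode),
          only_clears_bits(SER_R18, opcode),
          cleared_bits(SER_R18, opcode))
```

Every observed opcode is a bitwise subset of the expected one, and the number of cleared bits grows as the glitch period shrinks, which matches the degradation towards the all-zero NOP opcode.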
Note that when LDI or SBC are executed instead of
SER, some registers are overwritten with the values resulting
from the execution of such commands; thus, a single fault
injection disrupts at the same time both the program flow
and the data flow.
The effects depicted in Figure 8 are observed in all five
ATMega163 smart cards, although there are some slight
differences. First, the glitch periods for which instructions
are replaced can vary from card to card. And second, it
is possible that instructions different than those in Figure 8
appear in cycle i+1. We have however verified that for glitch
widths smaller than approximately 52 ns a NOP is always
effectively executed in all cards.
A particularly interesting case in the test is observed when
the command Inst has a 32-bit opcode. Consider, as shown
in Figure 9, that this instruction is LDS R22,0x0128
(Load Direct From Data Space). As the program bus of the
AVR is 16-bit wide, LDS requires an extra cycle to fetch
the second half of the opcode, i.e. the value 0x0128, from
Program Memory. By injecting a clock glitch with period
59 ns in cycle i, LDS is replaced by NOP in cycle i+1.
However, as the skipped command has a 32-bit opcode, the
value 0x0128 is pre-fetched from Program Memory in cycle
i+1 and interpreted as a command in cycle i+2. As a result, a
completely wrong instruction is inserted in the program flow.
For this particular example, the instruction corresponding to
opcode 0x0128 is MOVW R4,R16 (Copy Register Word),
which moves a 16-bit word from a pair of registers to another
pair of registers in a single cycle.
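Cycle   Glitch in cycle i   Instruction executed   Opcode (bin)
i       -                   NOP                    0000 0000 0000 0000
i+1     none                LDS R22,0x0128         1001 0001 0110 0000
                                                   0000 0001 0010 1000
i+1     Tg ≤ 59 ns          NOP                    0000 0000 0000 0000
i+2     Tg ≤ 59 ns          MOVW R4,R16            0000 0001 0010 1000
Figure 9. Code example for NOP (IV).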
If the second half of the 32-bit opcode is not a valid
command, for instance, LDS R22,0x0060, the MCU will
execute the opcode 0x0060 in cycle i+2 as a consequence of
the fault. However, the execution of illegal opcodes in AVR
MCUs is carried out without affecting the program flow; in
fact, they have the same effect as NOPs (see footnote 2).
Footnote 2: It is possible that other MCUs behave differently when interpreting an
invalid opcode, for example, resetting the chip. This is, however, not what we
observed for the AVR controllers we have used.
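The 32-bit opcode case can be illustrated with a short word-stream model. The sketch below is an assumption-based simplification: it only recognises LDS as a two-word instruction and MOVW as a decodable opcode, using the bit patterns 1001 000d dddd 0000 and 0000 0001 dddd rrrr from the AVR instruction set, and the function names are hypothetical.

```python
NOP = 0x0000


def is_lds_first_word(word):
    """LDS Rd,k occupies two words; the first is 1001 000d dddd 0000."""
    return (word & 0xFE0F) == 0x9000


def decode_movw(word):
    """Return (Rd, Rr) for a MOVW opcode (0000 0001 dddd rrrr), else None.
    The 4-bit fields select even registers, hence the factor 2."""
    if (word >> 8) != 0x01:
        return None
    return 2 * ((word >> 4) & 0xF), 2 * (word & 0xF)


def words_run_as_opcodes(words, replaced_index=None):
    """List the words that end up being interpreted as opcodes. If
    `replaced_index` is given, the opcode at that position is replaced
    by NOP (the fault), so a following operand word is no longer skipped."""
    run, i = [], 0
    while i < len(words):
        opcode = NOP if i == replaced_index else words[i]
        run.append(opcode)
        i += 2 if is_lds_first_word(opcode) else 1
    return run


words = [NOP, 0x9160, 0x0128]  # NOP; LDS R22,0x0128
print([hex(w) for w in words_run_as_opcodes(words)])
print([hex(w) for w in words_run_as_opcodes(words, replaced_index=1)])
print(decode_movw(0x0128), decode_movw(0x0060))
```

In the faulty run the address word 0x0128 is executed as an opcode and decodes to MOVW R4,R16, while 0x0060 does not match the MOVW pattern.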
B. Branching instructions
For the second set of experiments we target branching
instructions. These commands do not have an execution
phase that directly affects data; instead, they modify the
value of the PC according to a tested condition. Consider the
code example shown in the top part of Figure 10. In cycle
i, the TST command checks whether register R12 holds a
value equal to zero. If so, it sets the Zero flag in the Status
Register (SREG); otherwise, the flag is cleared. In cycle i+1,
the BREQ (Branch if Equal) command checks the value of
the Zero flag: if the flag is set, it modifies the value of the PC
in order to branch to a different code segment; otherwise, PC
is incremented such that the next instruction is SER R26.
The former option requires two cycles to complete, while
the latter executes in a single cycle.
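The behaviour of the TST and BREQ pair can be summarised in a few lines. This is a sketch under the assumption that BREQ is encoded as 1111 00kk kkkk k001 with a signed 7-bit offset k and branches to PC + k + 1; the opcode used in the example is the BREQ opcode listed in Figure 10, and the function names are hypothetical.

```python
def tst(value):
    """TST Rd: return the Zero flag, which is set when the register is zero."""
    return (value & value) == 0


def decode_breq_offset(opcode):
    """Return the signed branch offset k of a BREQ opcode, or None if the
    word does not match the pattern 1111 00kk kkkk k001."""
    if (opcode & 0xFC07) != 0xF001:
        return None
    k = (opcode >> 3) & 0x7F
    return k - 128 if k & 0x40 else k  # 7-bit two's complement


def breq(pc, zero_flag, k):
    """Return (next_pc, cycles): two cycles when the branch is taken,
    a single cycle when execution falls through to the next instruction."""
    if zero_flag:
        return pc + k + 1, 2
    return pc + 1, 1


k = decode_breq_offset(0xF009)  # 1111 0000 0000 1001
print(k)
print(breq(pc=100, zero_flag=tst(0x00), k=k))  # R12 == 0: branch taken
print(breq(pc=100, zero_flag=tst(0x5A), k=k))  # R12 != 0: fall through
```

The fall-through case takes a single cycle and only increments the PC, which is why it is a natural candidate for comparison with the NOP experiments.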
If the Zero flag is cleared, BREQ simply increments the
PC in a single cycle, thus behaving similarly to a NOP. By
injecting a fault in cycle i+1 one would expect a faulty
behaviour such as in the previous experiments. However,
as shown in Figure 10, the number of faulty instructions
executed instead of SER R26 and the glitch periods for
which errors appear are quite different.
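Cycle   Glitch in cycle i+1   Instruction executed   Opcode (bin)
i       -                     TST R12                0010 0000 1100 1100
i+1     -                     BREQ PC+0x02           1111 0000 0000 1001
i+2     none                  SER R26                1110 1111 1010 1111
i+2     Tg ≤ 57 ns            LDI R26,0xEF           1110 1110 1010 1111
i+2     Tg ≤ 56 ns            LDI R26,0xCF           1110 1100 1010 1111
i+2     Tg ≤ 52 ns            LDI R26,0x0F           1110 0000 1010 1111
i+2     Tg ≤ 45 ns            LDI R16,0x09           1110 0000 0000 1001
i+2     Tg ≤ 32 ns            LDD R0,Y+1             1000 0000 0000 1001
i+2     Tg ≤ 28 ns            LD R0,Y                1000 0000 0000 1000
i+2     Tg ≤ 27 ns            LDI R16,0x09           1110 0000 0000 1001
i+2     Tg ≤ 15 ns            BREQ PC+0x02           1111 0000 0000 1001
Figure 10. Code example for BREQ.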
In the range 28 ns ≤ Tg ≤ 57 ns the opcode values
describe a clear transition towards zero, i.e. a stuck-at-zero
pattern. The first erroneous command to be executed (LDI
R26,0xEF) differs from the expected (SER R26) in that
bit 8 is zero instead of one; for the second command
(LDI R26,0xCF) bit 9 is also cleared. This progression
is observed until for LD R0,Y a total of 11 bits are cleared
compared to those of SER R26.
By decreasing the glitch period further than 27 ns, one
would expect to obtain an erroneous opcode consisting of
only zeroes. However, results show that after LD R0,Y
(with only two bits set to one), there is a transition to LDI
R16,0x09 (with five bits set to one). Finally, for a glitch
period equal to 15 ns, the command BREQ PC+0x02 is
executed again instead of SER R26.
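The bit counts quoted above can be reproduced from the opcodes of Figure 10. In the sketch below, the pairing of each glitch period bound with an opcode follows the order in which they are listed in the figure, which is an assumption about the figure layout; the lookup helper `opcode_for` is hypothetical.

```python
SER_R26 = 0xEFAF  # expected opcode, 1110 1111 1010 1111

# (upper bound on Tg in ns, opcode executed instead of SER R26)
FAULTS = [(57, 0xEEAF), (56, 0xECAF), (52, 0xE0AF), (45, 0xE009),
          (32, 0x8009), (28, 0x8008), (27, 0xE009), (15, 0xF009)]


def popcount(word):
    """Number of bits set to one in a 16-bit word."""
    return bin(word & 0xFFFF).count("1")


def cleared_bits(expected, observed):
    """Bits that are 1 in the expected opcode but 0 in the observed one."""
    return popcount(expected & ~observed)


def opcode_for(t_g):
    """Opcode executed for glitch period t_g: the entry with the smallest
    bound that is still >= t_g, or the expected opcode if none applies."""
    matches = [(bound, op) for bound, op in FAULTS if t_g <= bound]
    return min(matches)[1] if matches else SER_R26


for bound, opcode in FAULTS:
    print(bound, hex(opcode), popcount(opcode), cleared_bits(SER_R26, opcode))
print(hex(opcode_for(30)), hex(opcode_for(60)))
```

The listing confirms that 11 bits are cleared at the 28 ns bound, where only two bits remain set, and that the opcode observed at the 27 ns bound has five bits set again.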
At this point, it is clear that the figure shows the transition
from the expected opcode to the previous opcode, most
probably going through an intermediate all-zero state. De-
pending on the period of the glitch injected, the instruction