A 65nm Edge-Chasing Quantizer-Based Digital LDO Featuring 4.58ps-FoM and Side-Channel-Attack Resistance
Yan He, Kaiyuan Yang
Rice University, Houston TX 77006
2020 IEEE International Solid-State Circuits Conference (ISSCC)
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other
uses, in any current or future media, including reprinting/republishing this material for advertising or
promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse
of any copyrighted component of this work in other works.
Low-Dropout Regulators (LDOs) are widely used for fine-grained power management in SoCs because of their compact area, high
current efficiency, and small output ripple. Digital LDOs (DLDOs) have been increasingly adopted in recent years thanks to their voltage
and process scalability. However, to respond quickly to a load change, a conventional synchronous DLDO must either raise its sampling
frequency at a large power overhead, or include a large output capacitor (COUT) at the cost of increased chip area. Various control
schemes have been proposed to mitigate these drawbacks. In synchronous designs, more complex control algorithms have been employed
to achieve faster response [1], and the PMOS switch array has been replaced with switched capacitors to reduce dynamic energy
consumption [2]. On the other hand, asynchronous DLDOs with event-driven [4] and beat-frequency VCO-based structures [3] show
promise across the key performance, power, and area metrics. Recently, voltage regulators have also been found useful in enhancing the
resistance of cryptographic engines and processors against power and EM side-channel attacks (SCAs) [5-7], which are physical attacks
that pose a severe threat to mobile and embedded devices. Employing regulators for SCA defense is promising because they are already
used in most systems and, like other circuit-level defenses [8-9], require no modifications to existing computing architectures and
algorithms. In this paper, we present a high-performance and SCA-aware DLDO leveraging the unique properties of the Edge-Chasing
Quantizer (ECQ). The 65nm DLDO prototype achieves: 1) a 101.7mV droop and 506ns settling time after a 20mA, 0.1ns step load
change, with only a 0.1nF capacitor; 2) a 0.018mm² active area and 99.4% peak current efficiency; and 3) more than 14000x
improvement in power SCA resistance on a 128b AES engine, with 27.5% area, 19.4% power, and 4.54% performance overheads over a
standalone AES, and negligible overhead when compared to an AES design integrated with conventional DLDOs.
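The headline numbers can be cross-checked against the 4.58ps FoM quoted in the title. The sketch below assumes the figure of merit commonly used for LDO comparisons, FoM = (COUT * Vdroop / ΔILOAD) * (IQ / ΔILOAD); the text does not spell out its FoM definition, so this formula is an assumption, and the function name is hypothetical. The 180µA quiescent current is the value reported with the measured results.

```python
def ldo_fom_seconds(c_out, v_droop, i_step, i_q):
    """Assumed FoM: (C_OUT * V_droop / dI_LOAD) * (I_Q / dI_LOAD), in seconds."""
    return (c_out * v_droop / i_step) * (i_q / i_step)

# Values quoted in the text: 0.1nF capacitor, 101.7mV droop, 20mA load step,
# 180uA quiescent current.
fom = ldo_fom_seconds(c_out=0.1e-9, v_droop=101.7e-3, i_step=20e-3, i_q=180e-6)
print(f"FoM = {fom * 1e12:.2f} ps")  # prints: FoM = 4.58 ps
```

Under this assumed definition, the quoted droop, capacitor, load step, and quiescent current reproduce the 4.58ps figure.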
The proposed DLDO consists of an ECQ, a second-path booster, a digital PI controller, a PMOS switch array, and a 10b output register
to eliminate switching glitches (Fig. 25.3.1, top left). Inspired by the edge-pursuit comparator in [10-11], the proposed ECQ quantizes the
input-voltage difference using an even-stage ring oscillator, by measuring the time needed to integrate the difference between two
input-voltage-controlled delay paths up to a threshold. This time is inversely proportional to the input difference (the voltage error in
LDOs), making the ECQ a continuously running quantizer with input-dependent sampling frequency and non-linear quantization
(Fig. 25.3.1, top right). Both properties are desirable for DLDOs because they allow the DLDO to respond quickly after load changes, and
to update more slowly but more accurately during steady state, saving power and reducing steady-state error (Fig. 25.3.1, bottom right).
This breaks the direct trade-off between transient performance and power consumption in conventional DLDOs. However, due to
integration, a sudden increase in error voltage (from 0 to Emax) requires the same integration time as a constant error voltage of
1/2 Emax, limiting the DLDO’s response speed, especially when the supply voltage is low (<0.9V). To alleviate the ECQ’s integration
effects, a second boosting path is designed to boost the current when the droop is larger than a threshold (Fig. 25.3.1, bottom left).
The boost current is provided by a PMOS switch, and an RC filter is added to slow down the recovery in order to avoid abrupt changes
in the output voltage.
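The input-dependent sampling rate described above can be illustrated with a minimal behavioral sketch. It assumes an ideal integrator that accumulates k_int * |error| until it reaches a fixed threshold; the names and the values of `k_int` and `threshold` are hypothetical and in arbitrary units, not parameters of the prototype.

```python
def ecq_conversion_time(v_err, k_int=1.0e9, threshold=1.0):
    """Time for k_int * |v_err| * t to reach `threshold`, for a constant error.

    k_int (1/(V*s)) and threshold are illustrative assumptions only.
    """
    if v_err == 0:
        return float("inf")  # zero error never reaches the threshold
    return threshold / (k_int * abs(v_err))

# A larger error gives a shorter conversion, i.e. a higher sampling rate.
for v_err_mv in (1, 10, 100):
    t_conv = ecq_conversion_time(v_err_mv * 1e-3)
    print(f"{v_err_mv:>4} mV error: {t_conv * 1e9:8.1f} ns per sample, "
          f"{1 / t_conv / 1e6:8.1f} MS/s")
```

In this idealized model the sampling rate scales linearly with the error magnitude, which is why updates are frequent right after a load step and sparse near steady state.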
The ECQ is implemented with an even-stage ring oscillator (RO) having two NAND gates at opposite positions to inject edges simultaneously
[10]. The RO is forced into oscillation until the two injected edges collapse due to different propagation paths. By connecting the
current-limiting PMOS of alternating stages to different input voltages, the RO behaves like a comparator. As modeled in [10], the final
collapse state is decided by the polarity of the input difference, and the collapse time is inversely proportional to the absolute voltage
difference (Fig. 25.3.2, top right). However, since the reference voltage (VREF) and LDO output voltage (VOUT) in a high-performance
condition (VOUT close to 1.2V) are too high to directly control the current-limiting transistors, they are first shifted down using a
programmable Input Level Shifter (ILS) (Fig. 25.3.2, top left). Configuring the PMOS width in the ILS effectively changes the frequency
of the ECQ. To further configure the gain and resolution of the quantizer, a tunable resistor (6kΩ to 200kΩ) is inserted between the
output nodes of the ILS. To keep the ECQ operating continuously and asynchronously, a self-triggered resetting circuit is designed to
detect oscillation collapse and restart the RO without using an external clock (Fig. 25.3.2, bottom left). Conceptually, oscillation
collapse is detected by sampling the RO outputs, UP/DN, with their properly delayed signals (Fig. 25.3.2, bottom right). In practice, the
delay is implemented with an RO replica for PVT tracking, and simply delaying the original UP/DN signals fails because the last narrow
pulse disappears after the replica. The circuit shown in Fig. 25.3.2 (bottom left) produces a wider pulse to replace the narrow one for
robust detection. Combined with a counter and a trigger generator for the PI controller, this completes an asynchronous, non-linear ECQ
with error-dependent conversion time.
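The collapse behavior can be sketched with a simple lap-counting model. This is a behavioral illustration only, not the circuit in [10]: it assumes each edge sees a per-stage delay that falls linearly with its control voltage, that the edges start half a ring apart, and that collapse occurs once the faster edge closes that gap. The function name and the values of `d0`, `k_delay`, and `n_stages` are hypothetical.

```python
def laps_to_collapse(v_a, v_b, n_stages=8, d0=100e-12, k_delay=50e-12,
                     max_laps=1_000_000):
    """Return (polarity, laps) for two edges chasing each other around a ring.

    Assumed model: per-stage delay d = d0 - k_delay * v (linear, illustrative).
    polarity is +1 if the edge steered by v_a is faster, -1 otherwise.
    """
    d_a = d0 - k_delay * v_a
    d_b = d0 - k_delay * v_b
    if d_a == d_b:
        return 0, None  # equal delays: the edges never meet in this model
    gap = (n_stages / 2) * 0.5 * (d_a + d_b)     # initial spacing: half a ring
    closing_per_lap = n_stages * abs(d_a - d_b)  # time gained per full lap
    laps = 0
    while gap > 0 and laps < max_laps:
        gap -= closing_per_lap
        laps += 1
    return (1 if d_a < d_b else -1), laps

print(laps_to_collapse(0.50, 0.49))  # small difference: many laps
print(laps_to_collapse(0.50, 0.48))  # doubled difference: about half the laps
print(laps_to_collapse(0.49, 0.50))  # swapped inputs: opposite polarity
```

Counting laps until collapse is what turns the comparator into a quantizer in this picture: the count carries the magnitude of the error, and the collapse polarity carries its sign.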
Compared to conventional DLDOs, the ECQ-based DLDO also presents stronger obfuscation of power signatures, thanks to its nonlinearity
and varying frequency. The ECQ oscillation can also be used as a frequency-changing clock for digital blocks to add temporal distortion.
Aside from these inherent randomizations, a resistance randomizer fits inside the ILS to randomize both the resolution and gain of the
ECQ (Fig. 25.3.3, left). Lastly, a reference-voltage randomizer, similar to [5], is added with reasonable overhead. 11b LFSRs are
employed as the randomizers in the prototype, which is expected to be sufficient because of the randomizers’ complex and
difficult-to-invert impact on the current traces. However, a truly random source, such as a digital synthesizable true random number
generator like [11], can also be easily deployed with small overhead and sufficient throughput. A 128b parallel AES engine is included
for SCA resistance tests.
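For illustration, a minimal 11-bit LFSR can be written as follows. The text does not give the feedback polynomial or how the LFSR bits map to resistor or reference-voltage settings, so the tap positions below (bits 11 and 9, the positions used by the standard PRBS-11 sequence) and the function names are assumptions.

```python
def lfsr11_step(state):
    """Advance an 11-bit Fibonacci LFSR by one step.

    Assumed taps: bits 11 and 9 (PRBS-11 tap positions); the prototype's
    actual polynomial is not stated in the text.
    """
    feedback = ((state >> 10) ^ (state >> 8)) & 1
    return ((state << 1) | feedback) & 0x7FF

def lfsr11_period(seed=1):
    """Count steps until the state returns to a non-zero seed."""
    state, steps = lfsr11_step(seed), 1
    while state != seed:
        state = lfsr11_step(state)
        steps += 1
    return steps

print(lfsr11_period())  # maximal length for 11 bits is 2**11 - 1 = 2047
```

A maximal-length sequence visits every non-zero 11-bit state once per period, but it is still deterministic, which is why the text points to a true random number generator as a drop-in alternative.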
The test chip is fabricated in a 65nm LP process. Fig. 25.3.4 presents the measured DLDO performance. Under a 20mA/100ps load step at
the nominal condition (VIN = 1.2V, VOUT = 1.15V), the DLDO achieves a minimum voltage droop (Vdroop) of 101.9mV and 180µA quiescent
current (IQ) without turning on the second path (Fig. 25.3.4, top left). When VIN is 0.9V, activating the second-path booster improves
Vdroop by 46% (Fig. 25.3.4, top right). By reconfiguring the ILS and ECQ oscillation frequency, the DLDO can be reconfigured for
performance and power trade-offs (Fig. 25.3.4, middle left). The measurements also show that the effect of the second path is limited
when the input voltage is high, since the response time of the main path is comparable to that of the second path at high VIN. At 1.2V
VIN and 30mA ILOAD, a maximum current efficiency of 99.4% is achieved. The PI controller works as expected, with the integration
coefficient (KI) having a greater impact on the DLDO’s settling time (Fig. 25.3.4, bottom left). Load and line regulations of the DLDO
demonstrate stable operation across 0.65V to 1.2V variations of VIN (Fig. 25.3.4, bottom right).
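The peak current efficiency follows from the usual definition, ILOAD / (ILOAD + IQ). The short check below assumes that the 180µA quiescent current reported for the nominal condition also applies at the 30mA load point; the text does not state this explicitly.

```python
def current_efficiency(i_load, i_q):
    """Fraction of the input current that reaches the load."""
    return i_load / (i_load + i_q)

# Assumption: I_Q at the 30mA point equals the 180uA nominal value.
eta = current_efficiency(i_load=30e-3, i_q=180e-6)
print(f"current efficiency = {eta * 100:.1f}%")  # prints: current efficiency = 99.4%
```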
To evaluate the DLDO’s resistance against power side-channel attacks, 32K traces are collected in each design configuration for Test Vector
Leakage Assessment (TVLA), and 7 million traces are collected for Correlation Power Analysis (CPA). The peak t-statistic in TVLA reduces
significantly from 82.9 in the baseline AES to less than 4.5 (the failure line for TVLA) when the reference-voltage randomizer is enabled.
CPA extracts the 2nd key byte with 500 traces in the baseline, while no key byte is revealed after 7 million traces for the DLDO-AES with
both randomizers turned on, a more than 14000x improvement in Minimum Traces to Disclosure (MTD). With both randomizers turned on,
the DLDO output has a 50mV ripple, effectively causing digital performance to degrade by 4.54%. The DLDO consumes 330µA current at
1.2V VIN, representing a 19.4% power overhead. The area overhead of the whole DLDO is 27.5% compared to the baseline AES. In most
cases, however, LDOs are already necessary in SoCs, so the overheads are amortized. Fig. 25.3.6 and Fig. 25.3.7 compare the proposed
DLDO with state-of-the-art DLDOs and SCA countermeasures. A die micrograph is in Fig. 25.3.7.
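As a rough illustration of the two metrics, the sketch below computes a TVLA-style peak t-statistic using Welch's t-test, applied per time sample between a fixed-input and a random-input trace set, and then the MTD ratio quoted above. The traces are synthetic Gaussian noise with an artificial leak at one sample; they are not measured data, and the helper names are hypothetical.

```python
import random
from statistics import mean, variance

TVLA_THRESHOLD = 4.5  # |t| above this is treated as detectable leakage

def welch_t(a, b):
    """Welch's t-statistic for two samples with unequal variances."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5

def peak_abs_t(fixed_traces, random_traces):
    """Largest |t| over all time samples of the two trace sets."""
    n_points = len(fixed_traces[0])
    return max(
        abs(welch_t([tr[i] for tr in fixed_traces], [tr[i] for tr in random_traces]))
        for i in range(n_points)
    )

def synthetic_traces(n_traces, n_points, leak, rng):
    """Unit-variance noise traces with a mean offset `leak` at sample 0 only."""
    return [[rng.gauss(leak if i == 0 else 0.0, 1.0) for i in range(n_points)]
            for _ in range(n_traces)]

rng = random.Random(0)
leaky_peak = peak_abs_t(synthetic_traces(500, 4, 1.0, rng),
                        synthetic_traces(500, 4, 0.0, rng))
print("leak detected:", leaky_peak > TVLA_THRESHOLD)

# MTD ratio from the text: no key byte after 7 million traces vs. 500 traces.
mtd_gain = 7_000_000 / 500
print(f"MTD improvement > {mtd_gain:.0f}x")  # prints: MTD improvement > 14000x
```

Because no key byte was recovered within 7 million traces, the 14000x ratio is a lower bound on the improvement rather than an exact value.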
References:
[1] X. Sun et al., "A 0.6-to-1.1V Computationally Regulated Digital LDO with 2.79-Cycle Mean Settling Time and Autonomous Runtime
Gain Tracking in 65nm CMOS," ISSCC, pp. 230-232, 2019.
[2] L. G. Salem et al., "A Sub-1.55mV-Accuracy 36.9ps-FOM Digital-Low-Dropout Regulator Employing Switched-Capacitor Resistance,"
ISSCC, pp. 312-314, 2018.
[3] S. Kundu et al., "A Fully Integrated 40pF Output Capacitor Beat-Frequency-Quantizer-Based Digital LDO with Built-In Adaptive
Sampling and Active Voltage Positioning," ISSCC, pp. 308-310, 2018.
[4] D. Kim et al., "A 0.5V-VIN 1.44mA-Class Event-Driven Digital LDO with a Fully Integrated 100pF Output Capacitor," ISSCC,
pp. 346-347, 2017.
[5] A. Singh et al., "A 128b AES Engine with Higher Resistance to Power and Electromagnetic Side-Channel Attacks Enabled by a
Security-Aware Integrated All-Digital Low-Dropout Regulator," ISSCC, pp. 404-406, 2019.
[6] M. Kar et al., "Reducing Power Side-Channel Information Leakage of AES Engines Using Fully Integrated Inductive Voltage
Regulator," IEEE JSSC, vol. 53, no. 8, pp. 2399-2414, Aug. 2018.
[7] D. Das et al., "ASNI: Attenuated Signature Noise Injection for Low-Overhead Power Side-Channel Attack Immunity," TCAS-I, vol.
65, no. 10, pp. 3300-3311, Oct. 2018.
[8] S. Lu et al., "1.32GHz High-Throughput Charge-Recovery AES Core with Resistance to DPA Attacks," IEEE Symp. VLSI Circuits,
pp. C246-C247, 2015.
[9] C. Tokunaga et al., "Secure AES Engine with a Local Switched-Capacitor Current Equalizer," ISSCC, pp. 64-65, 2009.
[10] M. Shim et al., "Edge-Pursuit Comparator: An Energy-Scalable Oscillator Collapse-Based Comparator with Application in a
74.1dB SNDR and 20kS/s 15b SAR ADC," IEEE JSSC, vol. 52, no. 4, pp. 1077-1090, April 2017.
[11] K. Yang et al., "16.3 A 23Mb/s 23pJ/b Fully Synthesized True-Random-Number Generator in 28nm and 65nm CMOS," ISSCC,
pp. 280-281, 2014.
Figure 25.3.1: Architecture and principles of the proposed DLDO with an Edge Chasing Quantizer (ECQ) (top); diagrams of
the second-path booster (bottom left); and illustrations of the DLDO's load response (bottom right).
Figure 25.3.2: Circuit diagrams and operating waveforms of the edge-chasing quantizer.
Figure 25.3.3: Diagrams of DLDO with added security modules (top); trace measurement setup and flowchart for trace
processing (right); measured current traces under different configurations (bottom).
Figure 25.3.4: Measured DLDO step responses (top); droop and quiescent current at different ECQ oscillation frequencies
(middle left); current efficiency (middle right); impacts of KP, KI (bottom left); load and line regulation (bottom right).
Figure 25.3.5: TVLA results for different DLDO configurations (32K traces per configuration) (top); CPA results for baseline
AES and DLDO with both randomizers turned on (bottom).
Figure 25.3.6: Summary of the DLDO performance and comparison with prior art.
Figure 25.3.7: Comparison table of state-of-the-art defenses against power side-channel attacks and die micrograph of the
DLDO.
Figure 25.3.S1: Testing setup using a 20GS/s oscilloscope with a 1GHz-bandwidth active probe, a source meter for
measuring power consumption, and LabVIEW for chip interface.
Figure 25.3.S2: Testing PCB and its simplified schematic.