A low-cost concurrent error detection technique for processor control logic
Jacob A. Abraham†
† Computer Engineering Research Center
University of Texas at Austin
‡ Design and Technology Solutions
This paper presents a concurrent error detection technique
for processor control logic with emphasis on low area
overhead. Rather than detect all modeled transient
faults, the technique selects faults which have
a high probability of causing damage to the architectural
state of the processor and protects the circuit against these
faults. Fault detection is achieved through a series of
assertions. Each assertion is an implication from inputs to the
outputs of a combinational circuit. Fault simulation exper-
iments performed on control logic modules of an industrial
processor suggest that high reduction in damage causing
faults can be achieved with a low overhead.
1 Introduction
Transient faults can occur in a processor as a result of
electrical noise, like crosstalk, or high energy particles, like
neutrons and alpha particles. These faults can cause a pro-
gram running on the processor to behave erratically, if they
propagate and change the architectural state of the proces-
sor. These faults can occur in memory arrays, sequential el-
ements or in the combinational logic in the processor. Pro-
tection against transient faults in combinational logic has
not received much attention traditionally because combina-
tional logic has a natural barrier stopping the propagation
of the faults. Three masking factors (logical, electrical
and latching-window) reduce the probability that a transient
fault propagates and latches on to sequential elements.
With the current trends in the processor industry, however,
the masking provided by these factors is diminishing.
Reduced logical depth between latches means that there are
more sensitized paths and hence more paths for a transient
fault at a gate to propagate and latch on. Decreasing feature
sizes and lower operating voltages result in less
charge stored at any node. Thus the electrical noise or the
energy of the particle strikes required for triggering a tran-
sient fault is decreasing. High operating frequencies mean
that there are more latching windows per unit time, thus
increasing the probability that a transient fault is latched.
For these reasons, the combinational portion of
the processor is projected to become a dominant source of
failures due to transient faults.
Various techniques have been proposed to detect tran-
sient faults in combinational circuits. Residue codes have
been found to be very effective in detecting faults in data-
path circuits. For control logic circuits, codeword based
schemes have been proposed. Codes like parity, Berger and
Bose-Lin are predicted for the outputs of the circuit, and
the codes of real-time outputs are
matched against the predicted codes. These techniques for
control logic circuits are viable for mission critical applica-
tions where reliability is of primary concern and area, tim-
ing and power take second place. There is another class
of techniques which do not attempt to detect all the mod-
eled faults like the above methods. Rather, they try to de-
tect most of the errors at a reasonable overhead. Such tech-
niques are more viable for mainstream applications which
do not have the same stringent FIT (Failure In Time)
requirements as the mission critical applications. The
work presented in this paper falls under this category of
techniques.
This paper presents a new low-cost technique for concurrent
error detection (CED) in processor control logic. The
proposed technique takes advantage of the fact that transient
faults in gates along some paths are much more likely to
propagate to an architectural state under normal running of
the processor than others, and protects against errors in these
paths. The technique automatically extracts the control
conditions (input value combinations) under which these paths
are sensitized and converts these conditions into assertions.
Each assertion is an implication from the control conditions
to the value of an output. Depending on the area overhead
budget and the required transient error reduction, a subset
of the extracted assertions can be selected for CED. The
work presented here is similar to previously proposed work
in the sense that outputs are predicted for a few input
combinations. As opposed to that technique, the present work
does not make any assumptions as to the duration of
transient faults and has a very low latency. The work presented
here can be considered a more fine-grained approach than
the concept of architectural vulnerability factor (AVF).
Instead of just determining which modules are more
vulnerable, we determine which flops in these modules are the
most vulnerable and protect the combinational logic which
feeds these flops.

978-3-9810801-3-1/DATE08 © 2008 EDAA

Figure 1: Block diagram for the proposed CED
Fault simulation experiments were performed on control
logic modules of an industrial processor using
sample segments of real applications. The proposed
technique was implemented for each of these modules for fault
escape reductions of 50%, 75%, 90%, 95% and 99%. A
fault escape happens if a transient fault propagates unde-
tected to the architectural state of the processor. Results
show that high fault escape reductions can be achieved at
low costs. Over 95% fault escape reduction can be obtained
with just 25% area overhead.
The rest of the paper is organized as follows. Section 2
gives an overview of the proposed technique and provides
an insight as to why high fault escape reductions can be ob-
tained with a few assertions. Section 3 and Section 4 pro-
vide a detailed discussion about the algorithm. Section 5
presents the experimental results and Section 6 provides
the conclusions.
2 Overview of the proposed technique
In this section, we will present an overview of the
proposed technique. The technique protects the combinational
portion of the circuit against transient errors. Figure 1
shows a circuit which has CED capability. The technique
introduces an Assertion Checker which takes as inputs the
inputs and the outputs of the combinational circuit and
which gives out a signal whether they conform to a series
of assertions. Each assertion is an implication of the form
antecedent =⇒ consequent (if antecedent is true, then
consequent is true).

Table 1: Distribution of input vectors of the combinational
portion of an example module (columns: number of unique
vectors, % of total vectors)

The antecedent in each assertion is a
minterm on a subset of inputs to the circuit. All the inputs
need not be part of the antecedent. In many cases, the an-
tecedent is a minterm on just one or two of the inputs. The
consequent in each assertion is a literal of an output of the
circuit. For example, the following would be a valid
assertion according to our technique: $i_2 \overline{i_3} \Rightarrow o_5$, where
i2 and i3 are two of the inputs to the circuit and o5 is an
output of the circuit. This assertion states that o5 should
have a value of 1 when i2 = 1 and i3 = 0. An assertion
will detect all the faults which propagate to the output in the
consequent when the antecedent is true. The above exam-
ple assertion will detect all faults propagating to o5 when
i2,i3 = 1,0.
In testing terminology, the antecedent of an assertion
would form a test vector for a stuck-at fault for the out-
put in the consequent. In the above example, i2 = 1 and
i3 = 0 would form a test vector for a stuck-at-0 fault at o5.
In fact, any test vector which detects any stuck-at fault at an
output of the combinational circuit can be converted into an
assertion on that output.
An assertion can also be viewed as checking for a subset
of the truth table for the corresponding output. The above
example checks for that subset of the o5 truth table which
has i2 = 1 and i3 = 0.
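As a minimal sketch of what such an assertion checks, the following Python models an assertion as an antecedent (a partial assignment over inputs) and a consequent (an expected output value). The names `Assertion`, `holds` and `check`, and the dictionary encoding of signals, are illustrative assumptions and do not come from the original work.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Assertion:
    # Hypothetical encoding: antecedent maps input name -> required value.
    antecedent: Dict[str, int]
    output: str   # output named in the consequent
    value: int    # value the output must take when the antecedent is true

    def holds(self, inputs: Dict[str, int], outputs: Dict[str, int]) -> bool:
        """An implication is violated only if the antecedent is true
        and the observed output differs from the consequent."""
        fires = all(inputs[name] == v for name, v in self.antecedent.items())
        return (not fires) or outputs[self.output] == self.value


def check(assertions: List[Assertion], inputs, outputs) -> bool:
    """Return True when no assertion is violated (no error is flagged)."""
    return all(a.holds(inputs, outputs) for a in assertions)


# The example assertion from the text: i2 = 1 and i3 = 0 implies o5 = 1.
EXAMPLE = Assertion({"i2": 1, "i3": 0}, "o5", 1)
```

When the antecedent is false the assertion says nothing about o5, which is why it covers only a subset of the truth table.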
In order to keep the overhead for concurrent error de-
tection to a minimum, we need to select the minimal set of
assertions such that the required transient fault coverage is
achieved. To assist us in selecting the assertions to be in-
cluded in the assertion checker, we use transient fault simu-
lations using sample segments of real applications and take
into consideration only the faults which propagate all the
way to the architectural states of the processor. This differs
from the experimental methodology followed in almost all
previously proposed CED schemes,
where random vectors are used as inputs of the combina-
tional circuits and all the faults which propagate to the out-
puts of the combinational portion are considered important.
The methodology followed in this paper is similar to the one
used in earlier work on logic derating.
The effectiveness of an assertion in detecting transient
faults which propagate to the primary outputs varies widely
depending on the following factors.
Figure 2: Asymmetry among flops contributing to faults
which propagate to primary outputs
• Faults in some paths are more likely to propagate and
latch on to sequential elements than others.
In the control logic of a processor, the distribution of
vectors applied at run-time to the inputs of the combinational
portion (primary inputs as well as outputs of
latches/flops) is highly skewed. A small subset of input
vectors is applied for a large percentage of clock cy-
cles. This is due to the fact that some state transitions in
the finite state machine (FSM) of a control logic mod-
ule are more common than others. Additionally, some
input combinations are invalid and hence cannot occur.
As a result, some paths in the circuit are more often ex-
ercised than others. Transient faults in these paths are
more likely to propagate to the outputs of the combina-
tional part of the circuit and hence to the inputs of the
sequential elements (latches and flops). To show their
skewed nature, we collected vectors that were applied
at the inputs of the combinational portions of a control
logic module (module3 in Table 3) during 703547 cy-
cles. The module has 390 inputs to the combinational
portion. We collected these vectors from traces of sample
programs running on the processor. The unique vectors
are sorted according to the number of times they
occur and their distribution is shown in Table 1. The
703547 vectors have a total of 72353 unique vectors.
The skew is evident from the table. Just 32 unique vectors contribute to about
50% of all the vectors.
• Due to clock gating, bit flips at the inputs of some of the
sequential elements are more likely to get latched on
than others.
• Bit flips in certain sequential elements are more likely to
affect the architectural states of the processor than oth-
ers. Bit flips in latches (flops) may be masked logically
in the combinational portion which prevents them from
propagating to the architectural states. Due to asymme-
try in the vectors applied at the inputs of combinational
logic, bit flips in some latches are more likely to prop-
agate to the next level of latches. For example, if the
output of a latch is fed to an AND gate whose other
input is predominantly 0, a bit flip in that latch has a
very low probability of propagating. The greater the
pipeline distance between a latch and the architectural
states, the more probable it is that a bit flip in that latch
is masked. To show the asymmetry in the importance
of latches in terms of bit flips in them propagating, we
randomly injected transient faults in the sequential ele-
ments of the modules listed in Table 3. We then marked
the faults which propagate to the primary outputs of the
module in which each of those modules is instantiated.
Such faults are taken to affect the architectural states, and
Figure 2 shows the results. We can see
that bit flips in just 5% of the flops contribute to more
than 90% of all the faults propagating to the primary
outputs.
The net effect of the above observations is that some asser-
tions detect more transient faults propagating to the archi-
tectural states of the processor than others. An effective as-
sertion is one whose consequent is on a combinational out-
put which feeds a vulnerable latch. The antecedent of the
assertion will cover the most common subset of the output
truth table. The next few sections deal with how we auto-
matically extract such assertions based on fault simulations
on the circuit.
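The skew in the run-time input vectors described above can be measured with a short script. The sketch below, assuming the trace is available as a list of hashable vectors (for example bit strings), counts how many of the most frequent unique vectors are needed to account for a given fraction of all cycles. The function name and the trace format are assumptions made for illustration.

```python
from collections import Counter


def unique_vectors_for_coverage(trace, fraction):
    """Number of most-frequent unique vectors needed to cover
    at least `fraction` of all vectors in `trace`."""
    counts = Counter(trace)
    total = len(trace)
    covered = 0
    # most_common() yields (vector, count) pairs sorted by decreasing count.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= fraction * total:
            return rank
    return len(counts)
```

For the module discussed in the text, such a measurement would report that 32 of the 72353 unique vectors cover about 50% of the 703547 cycles.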
3 Algorithm for assertion extraction
This section describes the algorithm for extracting all the
assertions which are valid for a particular input vector. In
the next section, we integrate this algorithm with the rest
of the flow for finding the minimal set of assertions. For
ease of explanation, we define the term control assignment
(CA). For any particular vector, the CA for any net in the
circuit defines the assignments of values to inputs which
guarantee the current value of the net (value of the net when
the current input vector is applied). In testing terminology,
a CA is a controllability condition which is true for the
current vector. A
CA is in sum of products (SOP) format where each prod-
uct defines a different controllability condition. A CA of
$i_1 + \overline{i_2}$ for a net means that the net is guaranteed to have
the current value if i1 = 1 or if i2 = 0. Additionally, i1
and i2 have values 1 and 0 in the current vector. A product
in the control assignment for an output of the combinational
circuit is a test vector for a stuck-at fault at that output, since
the propagation condition is also met (in addition to the
controllability condition).
If we know the CAs of all the inputs of any gate, the CA
of its output can be calculated according to the rules stated
below.
Table 2: Propagation of CAs for an AND gate
• If the gate has at least one controlling input, the CA of
the output is the sum of the CAs of all the gate inputs which
have controlling values.
• If the gate has all non-controlling values at the inputs,
the CA of the output is the product of the CAs of all the
inputs.
Table 2 illustrates the propagation of CAs for an AND gate
with inputs i1 and i2. CA1, CA2 and CAo represent the
CAs of i1, i2 and the output of the gate. The propagation
tables for other types of gates can be similarly obtained.
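Applying the two rules to a two-input AND gate, whose controlling value is 0, gives the case analysis below. It is derived here from the rules as stated, using the symbols $CA_1$, $CA_2$ and $CA_o$ for the CAs of i1, i2 and the gate output.

```latex
CA_o =
\begin{cases}
CA_1 + CA_2      & \text{if } i_1 = 0,\ i_2 = 0 \\
CA_1             & \text{if } i_1 = 0,\ i_2 = 1 \\
CA_2             & \text{if } i_1 = 1,\ i_2 = 0 \\
CA_1 \cdot CA_2  & \text{if } i_1 = 1,\ i_2 = 1
\end{cases}
```

For an OR gate the same cases apply with the roles of 0 and 1 exchanged, since its controlling value is 1.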
We will now present an algorithm for extracting the
assertions on the outputs of the circuit for a given vector.
Initially, all the nets in the circuit are ordered topologically (all
inputs of a gate are listed before the output of the gate). For
each net in the ordered list:
1. If the net is an input of the circuit, the CA of the net is
the positive literal of the net if the net has a value 1 in
the simulation vector. The CA is the negative literal of
the net if it has a value 0.
2. If the net is not an input to the circuit, calculate the
CA of the net from the CAs of the inputs of the gate
driving the net according to the rules stated above.
3. Convert the CA into the SOP format if it is not al-
ready in the format.
4. Trim the CA to contain only those products which
have fewer than (n + thresh) literals, where
n is the minimum number of literals in all the products
and thresh is a parameter of the algorithm.
Assertions on the outputs of the circuit are then extracted
as follows. Each product in the output CA can be made an
antecedent of a different assertion on that output. If the out-
put has a value 0, then the consequent of the assertions will
be the negative literal of the output. It will be the positive
literal of the output otherwise.
We need to trim the CAs of the nets (step 4 in the al-
gorithm above) to prevent the explosion of the number of
terms in the CAs calculated subsequently from this CA.
We trim away the products which have a large number of
literals. The intuition behind this trimming is that the fewer
the literals in the antecedent of any assertion,
the more probable it is to occur very often and hence the
more probable it is to be picked among the most effective
assertions. On the other hand, trimming away some of the
products may lead to dropping some of the assertions which
may detect a large number of transient faults.
Figure 3 gives an example circuit and shows how the
control assignments are propagated. Each net in the
circuit is accompanied by the tuple (net name, value, control
assignment). The circuit has inputs a, b, c and d and has an
output y. A vector 0001 is applied to the circuit. The
calculated control assignments are given in the figure. The AND
gate gives an example of how CAs are propagated when
both gate inputs are controlling. The output gate gives an
example of how CAs are propagated when both gate inputs
are non-controlling. The output y has a CA of
$\overline{a}d + \overline{b}d$.
Two assertions can then be extracted for the given vector,
($\overline{a}d \Rightarrow \overline{y}$) and ($\overline{b}d \Rightarrow \overline{y}$).

Figure 3: Example control assignment propagation
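The steps of this section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: a CA is encoded as a set of products, each product as a frozenset of (input, value) literals, and the netlist format and function names are hypothetical. The sketch does not apply absorption (for example a + ab = a). The small netlist at the end is a hypothetical circuit chosen only because it reproduces the control assignment quoted for y; it is not claimed to be the circuit of Figure 3.

```python
from itertools import product as cartesian

# Controlling input value per gate type (a 0 forces an AND output, etc.).
CONTROLLING = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}


def gate_value(kind, in_vals):
    """Evaluate a gate on 0/1 input values."""
    if kind == "NOT":
        return 1 - in_vals[0]
    c = CONTROLLING[kind]
    out = c if c in in_vals else 1 - c
    return 1 - out if kind in ("NAND", "NOR") else out


def gate_ca(kind, in_vals, in_cas):
    """Apply the two propagation rules to obtain the output CA in SOP form."""
    if kind == "NOT":
        return set(in_cas[0])
    c = CONTROLLING[kind]
    controlling = [ca for v, ca in zip(in_vals, in_cas) if v == c]
    if controlling:
        # Rule 1: sum of the CAs of the inputs holding the controlling value.
        return set().union(*controlling)
    # Rule 2: product of all input CAs, expanded into sum of products.
    return {frozenset().union(*combo) for combo in cartesian(*in_cas)}


def trim(ca, thresh):
    """Keep products with fewer than (n + thresh) literals; thresh >= 1."""
    n = min(len(p) for p in ca)
    return {p for p in ca if len(p) < n + thresh}


def extract_assertions(netlist, outputs, vector, thresh=2):
    """netlist: list of (net, gate kind, input nets) in topological order.
    Returns a set of (antecedent product, (output, expected value))."""
    val = dict(vector)
    ca = {name: {frozenset([(name, v)])} for name, v in vector.items()}
    for net, kind, ins in netlist:
        in_vals = [val[i] for i in ins]
        val[net] = gate_value(kind, in_vals)
        ca[net] = trim(gate_ca(kind, in_vals, [ca[i] for i in ins]), thresh)
    return {(prod, (out, val[out])) for out in outputs for prod in ca[out]}


# Hypothetical circuit: y = OR(AND(a, b), NOR(c, d)), vector abcd = 0001.
NETLIST = [("n1", "AND", ["a", "b"]),
           ("n2", "NOR", ["c", "d"]),
           ("y", "OR", ["n1", "n2"])]
ASSERTIONS = extract_assertions(NETLIST, ["y"], {"a": 0, "b": 0, "c": 0, "d": 1})
```

Each element of `ASSERTIONS` pairs an antecedent with the consequent literal of y, so the two products of the CA of y become two separate assertions.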
4 Algorithm for low-cost CED
In this section, we describe the algorithm for construct-
ing the assertion checker for a given circuit. The algorithm
takes as inputs the description of the circuit and the func-
tional vectors applied to the circuit. The algorithm works
for a given target reduction in fault escapes. A fault escape
is a fault which propagates to the architectural state of the
processor without being detected. In the absence of any
concurrent error detection (CED), all the faults which
propagate to the architectural states are fault escapes. In the
presence of CED, some of these faults are detected and hence
there is a reduction in the number of fault escapes. The tar-
get reduction in fault escapes that is required is given as a
parameter to the algorithm. Given below are the steps in-
volved in implementing the algorithm.
Step 1: Performing fault simulations and building fault
database. In this step we inject m transient faults in each
cycle of the functional vectors, where m is a parameter to
the algorithm. The transient faults are injected in the
combinational portion of the design according to any given fault
model (single-event transients, cross-talk faults etc.). We
set as observation points the architectural state as well as
the outputs of combinational portions. For each fault which
propagates to the architectural state, we note the outputs of
the combinational portion to which the fault propagates be-
fore being first latched on to a sequential element. For these
faults, we store the vector, the fault site and the outputs of
the combinational portion to which the fault propagates in a
fault database. Since we store only the faults which
propagate to the architectural state, we automatically consider the
various masking factors mentioned in Section 2.
Step 2: Extracting assertions. For each unique vector
in the fault database, we extract assertions as described in
Section 3 for all combinational outputs to which any fault
injected in that vector propagates.

Table 3: Details of modules used for evaluation

Step 3: Building the assertion database. For each
extracted assertion, we find out all the faults which are
detected by that assertion. An assertion detects a fault if the
antecedent of the assertion is true for the vector in which
the fault is injected and the fault propagates to the output in
the consequent of the assertion. We store the list of asser-
tions and the faults they detect in an assertion database.
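The detection rule of Step 3 can be sketched as below. The record layouts are assumptions made for illustration: an assertion is a pair (antecedent, (output, value)) with the antecedent a frozenset of (input, value) literals, and each fault record holds the input vector of the injection cycle and the set of combinational outputs the fault reached.

```python
def detects(assertion, fault):
    """True if the antecedent holds in the fault's vector and the
    fault propagated to the output named in the consequent."""
    antecedent, (out, _value) = assertion
    vector, reached = fault["vector"], fault["outputs"]
    return out in reached and all(vector[n] == v for n, v in antecedent)


def build_assertion_db(assertions, faults):
    """Map each assertion to the set of fault ids it detects.
    `faults` maps a fault id to its record (hypothetical layout)."""
    return {a: {fid for fid, f in faults.items() if detects(a, f)}
            for a in assertions}
```

The resulting mapping from assertions to detected faults is the input to the selection in Step 4.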
Step 4: Picking top assertions for a given reduction in
fault escapes. Ideally we would like to pick the minimal
number of assertions for detecting a given number of faults.
This problem is similar to the set-covering problem and is
NP-complete. We employ a simple greedy approximation
algorithm to pick assertions for a given reduction in fault
escapes. The target number of faults required to be detected
for achieving the target reduction in fault escapes is calcu-
lated. The assertions are greedily picked till the target num-
ber of faults is detected.
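A minimal sketch of Step 4 follows, assuming the standard greedy heuristic for set cover: at each step pick the assertion that detects the most not-yet-detected faults. The text only says the assertions are picked greedily, so this exact selection rule, the function name and the argument layout are assumptions.

```python
import math


def pick_assertions(assertion_db, total_escapes, target_reduction):
    """assertion_db: mapping assertion -> set of fault ids it detects.
    total_escapes: number of fault escapes without any CED.
    target_reduction: required reduction as a fraction, e.g. 0.95."""
    target = math.ceil(target_reduction * total_escapes)
    remaining = dict(assertion_db)
    covered, picked = set(), []
    while len(covered) < target and remaining:
        # Greedy choice: largest number of newly detected faults.
        best = max(remaining, key=lambda a: len(remaining[a] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining assertion detects a new fault
        picked.append(best)
        covered |= gain
    return picked, covered
```

The loop stops as soon as the target number of detected faults is reached, so lower reduction targets yield fewer assertions and hence a smaller checker.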
Step 5: Constructing the assertion checker. Once the
assertions needed for a given reduction in fault escapes are
picked, the assertion checker is constructed by synthesizing
the conjunction of all the individual assertions. We
considered two different implementations for synthesizing the
assertion checker: a totally self-checking checker and a
self-exercising strongly code-disjoint checker. A dual-rail
implementation is used for synthesizing the self-checking
checker. For the self-exercising checker, during the test
phase, all the antecedents are forced to be true and the con-
sequents are forced to be false one after the other. This im-
plementation takes advantage of the fact that the assertion
checker is the conjunction of all the individual assertions to
obtain a low-overhead self-exercising checker.
5 Experimental results
The algorithm presented in this paper was evaluated on
five random control logic modules in the integer execution
unit of an industrial processor. The modules - module1,
module2, module3, module4 and module5 - are instanti-
ated in the execution unit of the processor. The details of
these modules are given in Table 3.
An in-house transient fault simulator was used for all the
fault simulations. The vectors used for fault simulation are
functional traces extracted when running programs on the
processor simulation model. 416 different functional traces
with a total of 703547 vectors are used. For each transient
fault to be injected a fault site was chosen randomly among
all the nets in a module and the value at the net during a
given cycle was corrupted. 5 transient faults were injected
in each cycle, and faults were observed at the primary
outputs of the execution unit. An implicit assumption here
is that the faults which propagate to the primary outputs of
the execution unit are going to affect the architectural state
of the processor.
The entire algorithm for extracting and picking
assertions is written in Perl. The program was run 5 times for
each module with target fault escape reductions of 50%,
75%, 90%, 95% and 99%. Synopsys Design Analyzer
was used to synthesize the modules. The assertion
checkers (both self-checking and self-exercising) for each point
were implemented and the area overheads were calculated.
The technology library used is the lsi_10k library distributed
along with Synopsys Design Compiler. For comparison
purposes, a previously proposed partial duplication technique
was also implemented on all five of the modules for the
given target fault escape reductions. Consistent with our
methodology, we considered only the faults which escape
instead of considering all the faults which propagate to out-
puts of combinational logic.
Table 4 shows the area overhead results for partial
duplication (PD), the proposed technique with dual rail
implementation (PT-D) and with self-exercising implementation
(PT-S) when achieving different fault escape reductions.
The average area overheads for different fault escape
targets are plotted in Figure 4. It can be seen that high
fault escape reductions can be obtained with a low area
overhead. On average, 50% fault escape reduction can
be obtained with just 3% overhead. This number increases
to 54% for PT-D and 42% for PT-S when the target fault
escape reduction is 99%. It can be seen from the figure that
compared to partial duplication technique, the average area
overhead of the proposed technique with dual rail imple-
mentation is always lower. Further area savings can be ob-
tained if just a self-exercising checker is needed. For a tar-
get fault escape reduction of 95%, dual rail implementation
implementation is 40% better.
The results for partial
duplication (PD in the table) are very low compared to the
results presented in the original paper. This difference
can be attributed to the differences in selection methodol-
ogy followed. In this paper, we performed fault simulations
using traces from real programs instead of using random
vectors at the inputs of combinational logic. We also
considered only the faults which escape to the primary outputs
of the execution unit instead of considering all the faults
which propagate to the outputs of the combinational logic.

Table 4: % Area overhead for different target fault escape reductions for partial duplication (PD), proposed technique as a
dual-rail checker (PT-D) and proposed technique as a self-exercising checker (PT-S)

Figure 4: Average area overhead over different fault escape targets
6 Conclusions
A new algorithm for detecting transient faults in the con-
trol logic of a processor with a low overhead has been pre-
sented. An assertion checker is automatically constructed
using the architectural traces of real programs. The checker
checks the outputs of a combinational circuit against a sub-
set of the truth table. The algorithm takes advantage of
the following properties of the control logic of a processor
to yield a low-overhead checker - asymmetry in the paths
which are exercised at run time and the asymmetry in the
propagativity of bit-flips in individual flops to the architec-
tural state of the processor. Fault simulation experiments
were run on five different random control logic modules in
an industrial processor. Results show that more than 95%
of all the faults which propagate to architectural states can
be detected with an average area overhead of just around
25%. This is more than 40% less than
previously proposed work for the same amount of fault
detection.
We would like to acknowledge the contribution
of Suriyaprakash Natarajan of Intel for early pattern-
distribution results on a couple of Intel test cases demon-
strating significant input vector bias.
D. Das and N. A. Touba. Synthesis of circuits with low-cost
concurrent error detection based on Bose-Lin codes. J.
Electron. Test., 15(1-2):145–155, 1999.
P. Drineas and Y. Makris. Non-intrusive design of concurrently
self-testable FSMs. In ATS ’02: Proceedings of the
11th Asian Test Symposium, pages 33–38, 2002.
N. K. Jha and S.-J. Wang. Design and synthesis of self-checking
VLSI circuits and systems. In ICCD ’91: Proceedings
of the 1991 IEEE International Conference on Computer
Design on VLSI in Computer & Processors, pages
P. Liden, P. Dahlgren, R. Johansson, and J. Karlsson. On
latching probability of particle induced transients in
combinational networks. In 24th Int. Symposium on Fault-Tolerant
Computing, pages 340–349, 1994.
K. Mohanram and N. Touba. Cost-effective approach for
reducing soft error failure rate in logic circuits. In Proceedings
of the International Test Conference, pages 893–901, 2003.
H. T. Nguyen and Y. Yagil. A systematic approach to SER
estimation and solutions. In Proceedings of IEEE International
Reliability Physics Symposium, pages 60–70, 2003.
M. Nicolaidis. Self-exercising checkers for unified built-in
self-test (UBIST). IEEE Transactions on CAD, 8(3):203–218,
M. Nicolaidis, R. O. Duarte, S. Manich, and J. Figueras.
Fault-secure parity prediction arithmetic operators. IEEE
Des. Test, 14(2):60–71, 1997.
P. Shivakumar et al. Modeling the effect of technology
trends on the soft error rate of combinational logic. In DSN
’02: Proceedings of the 2002 International Conference on
Dependable Systems and Networks, pages 389–398, 2002.
S. S. Mukherjee et al. A systematic methodology to
compute the architectural vulnerability factors for a
high-performance microprocessor. In MICRO 36: Proceedings
of the 36th annual IEEE/ACM International Symposium on
Microarchitecture, page 29, 2003.
F. F. Sellers, M.-Y. Hsiao, and L. W. Bearnson, editors. Error
Detection Logic for Digital Computers. McGraw-Hill Book
N. A. Touba and E. J. McCluskey. Logic synthesis of
multilevel circuits with concurrent error detection. IEEE
Transactions on CAD, 16(7):783–789, 1997.