
# Formal Verification of ECCs for Memories Using ACL2


## Abstract and Figures

Due to the ever-increasing toll of soft errors in memories, Error Correction Codes (ECCs) like Hamming and Reed-Solomon codes have been used to protect data in memories, in applications ranging from space to terrestrial workstations. In the past seven decades, most of the research has focused on providing better ECC strategies for data integrity in memories, but research on verification methodologies for the newer ECCs has not kept the same pace. As memory sizes keep increasing, exhaustive simulation-based testing of ECCs is no longer practical. Hence, formal verification, particularly theorem proving, provides an efficient, yet scarcely explored, alternative for ECC verification. We propose a framework, with extensible libraries, for the formal verification of ECCs using the ACL2 theorem prover. The framework is easy to use and particularly targets the needs of formally verified ECCs in memories. We also demonstrate the usefulness of the proposed framework by verifying two of the most commonly used ECCs, i.e., Hamming and convolutional codes. To illustrate that the ECCs verified using our formal framework are practically reliable, we utilized a formal record-based memory model to formally verify that the inherent properties of the ECCs, like Hamming distance, codeword decoding, and error detection/correction, remain consistent even when the ECC is implemented on the memory.
https://doi.org/10.1007/s10836-020-05904-2
Received: 12 April 2020 / Accepted: 2 September 2020
Keywords: Error Correction Codes (ECCs) · Memory soft errors · Hamming codes · Convolutional codes · Formal verification · Theorem proving · ACL2
## 1 Introduction
Soft errors are a type of error that does not cause permanent damage to semiconductor devices [56], yet leads to temporary faults in them. In particular, radiation-induced soft errors have been a major concern in semiconductor devices since the 1970s [12,60]. In a long chain of events, both the high-speed protons in cosmic rays and the alpha particles emitted during the decay of radioactive impurities in IC packaging material induce silicon-based semiconductor memories to change their logic states, hence resulting in soft errors [10,47].

Responsible Editor: V. D. Agrawal
Mahum Naseer
mnaseer.msee16seecs@seecs.edu.pk
Osman Hasan
osman.hasan@seecs.nust.edu.pk
1 School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology

Recent advancements in technology, including circuit miniaturization, voltage reduction, and increased circuit clock frequencies, have aggravated the problem of soft errors in memories [10,48]. The most obvious drawbacks of memory errors include the loss of correct data and the addition of faulty data into the memory. However, depending on the application or system using the memory, the severity of these memory errors can vary, as summarized in Fig. 1. In a LEON3 processor, a memory error may simply cause a result error, i.e., an erroneous output from an algorithm running on the system, or a system timeout, i.e., the termination of an application without any result [39]. Similarly, in a Xilinx FPGA, such errors may cause the system to halt [33].

Error Correction Codes (ECCs) [44] are used to counter memory errors by adding extra bits, often called parity or check bits, to the data bits in the memory. The parity bits are calculated from the available data bits, and in case of an error, the lost data is retrieved using these parity bits. Hence, ECCs are considered to be the most effective solution for memory errors [10], and since the introduction of Hamming
Published online: 26 September 2020
Journal of Electronic Testing (2020) 36:643–663
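
The parity-bit mechanism described in the introduction can be illustrated with a small example. The following is a minimal Python sketch of the textbook Hamming(7,4) code, not the paper's ACL2 formalization. The bit layout (parity bits at positions 1, 2, and 4) and all function names are assumptions chosen for illustration.

```python
from itertools import product


def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Assumed layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def syndrome(word):
    """Return 0 for a valid codeword, else the 1-indexed position of a single flipped bit."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s3 = word[3] ^ word[4] ^ word[5] ^ word[6]
    return s1 + 2 * s2 + 4 * s3


def decode(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    word = list(word)
    position = syndrome(word)
    if position:
        word[position - 1] ^= 1  # flip the erroneous bit back
    return [word[2], word[4], word[5], word[6]]


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance():
    """Minimum Hamming distance over all pairs of distinct codewords."""
    codewords = [encode(list(d)) for d in product([0, 1], repeat=4)]
    return min(
        hamming_distance(a, b)
        for i, a in enumerate(codewords)
        for b in codewords[i + 1:]
    )


if __name__ == "__main__":
    codeword = encode([1, 0, 1, 1])
    corrupted = list(codeword)
    corrupted[4] ^= 1  # simulate a soft error in one stored bit
    print(codeword, corrupted, decode(corrupted))
```

For a code this small, every data word and every single-bit error can be checked by enumeration. That enumeration grows quickly with word size, which is the kind of scaling problem that motivates theorem proving over exhaustive simulation.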
Chapter
The magic wand $$\mathbin{-\!\!*}$$ (also called separating implication) is a separation logic connective commonly used to specify properties of partial data structures, for instance during iterative traversals. A footprint of a magic wand formula $$A \mathbin{-\!\!*} B$$ is a state that, combined with any state in which A holds, yields a state in which B holds. The key challenge of proving a magic wand (also called packaging a wand) is to find such a footprint. Existing package algorithms either have a high annotation overhead or, as we show in this paper, are unsound. We present a formal framework that precisely characterises a wide design space of possible package algorithms applicable to a large class of separation logics. We prove in Isabelle/HOL that our formal framework is sound and complete, and use it to develop a novel package algorithm that offers competitive automation and is sound. Moreover, we present a novel, restricted definition of wands and prove in Isabelle/HOL that it is possible to soundly combine fractions of such wands, which is not the case for arbitrary wands. We have implemented our techniques for the Viper language, and demonstrate that they are effective in practice.
Chapter
Spot is a C++17 library for LTL and $$\omega$$-automata manipulation, with command-line utilities and Python bindings. This paper summarizes its evolution over the past six years, since the release of Spot 2.0, which was the first version to support $$\omega$$-automata with arbitrary acceptance conditions, and the last version presented at a conference. Since then, Spot has been extended with several features such as acceptance transformations, alternating automata, games, LTL synthesis, and more. We also shed some light on the data structure used to store automata. Artifact: https://zenodo.org/record/6521395.
Chapter
SMT solvers are highly complex pieces of software with performance, robustness, and correctness as key requirements. Complementing traditional testing techniques for these solvers with randomized stress testing has been shown to be quite effective. Recent work has showcased the value of input fuzzing for finding issues, but this approach typically does not comprehensively test a solver’s API. Previous work on model-based API fuzzing was tailored to a single solver and a small subset of SMT-LIB. We present Murxla, a comprehensive, modular, and highly extensible model-based API fuzzer for SMT solvers. Murxla randomly generates valid sequences of solver API calls based on a customizable API model, with full support for the semantics and features of SMT-LIB. It is solver-agnostic but extensible to allow for solver-specific testing and supports option fuzzing, cross-checking with other solvers, translation to SMT-LIBv2, and SMT-LIBv2 input fuzzing. Our evaluation confirms its efficacy in finding issues in multiple state-of-the-art SMT solvers.
Chapter
RIOT is a micro-kernel dedicated to IoT applications that adopts eBPF (extended Berkeley Packet Filters) to implement so-called femto-containers. As micro-controllers rarely feature hardware memory protection, the isolation of eBPF virtual machines (VM) is critical to ensure system integrity against potentially malicious programs. This paper shows how to directly derive, within the Coq proof assistant, the verified C implementation of an eBPF virtual machine from a Gallina specification. Leveraging the formal semantics of the CompCert C compiler, we obtain an end-to-end theorem stating that the C code of our VM inherits the safety and security properties of the Gallina specification. Our refinement methodology ensures that the isolation property of the specification holds in the verified C implementation. Preliminary experiments demonstrate satisfying performance.
Chapter
Most methods of data transmission and storage are prone to errors, leading to data loss. Forward erasure correction (FEC) is a method to allow data to be recovered in the presence of errors by encoding the data with redundant parity information determined by an error-correcting code. There are dozens of classes of such codes, many based on sophisticated mathematics, making them difficult to verify using automated tools. In this paper, we present a formal, machine-checked proof of a C implementation of FEC based on Reed-Solomon coding. The C code has been actively used in network defenses for over 25 years, but the algorithm it implements was partially unpublished, and it uses certain optimizations whose correctness was unknown even to the code’s authors. We use Coq’s Mathematical Components library to prove the algorithm’s correctness and the Verified Software Toolchain to prove that the C program correctly implements this algorithm, connecting both using a modular, well-encapsulated structure that could easily be used to verify a high-speed, hardware version of this FEC. This is the first end-to-end, formal proof of a real-world FEC implementation; we verified all previously unknown optimizations and found a latent bug in the code.
Chapter
Compositional synthesis relies on the discovery of assumptions, i.e., restrictions on the behavior of the remainder of the system that allow a component to realize its specification. In order to avoid losing valid solutions, these assumptions should be necessary conditions for realizability. However, because there are typically many different behaviors that realize the same specification, necessary behavioral restrictions often do not exist. In this paper, we introduce a new class of assumptions for compositional synthesis, which we call information flow assumptions. Such assumptions capture an essential aspect of distributed computing, because components often need to act upon information that is available only in other components. The presence of a certain flow of information is therefore often a necessary requirement, while the actual behavior that establishes the information flow is unconstrained. In contrast to behavioral assumptions, which are properties of individual computation traces, information flow assumptions are hyperproperties, i.e., properties of sets of traces. We present a method for the automatic derivation of information flow assumptions from a temporal logic specification of the system. We then provide a technique for the automatic synthesis of component implementations based on information flow assumptions. This provides a new compositional approach to the synthesis of distributed systems. We report on encouraging first experiments with the approach, carried out with the BoSyHyper synthesis tool.
Chapter
Workflow nets are a well-established mathematical formalism for the analysis of business processes arising from either modeling tools or process mining. The central decision problems for workflow nets are k-soundness, generalised soundness and structural soundness. Most existing tools focus on k-soundness. In this work, we propose novel scalable semi-procedures for generalised and structural soundness. This is achieved via integral and continuous Petri net reachability relaxations. We show that our approach is competitive against state-of-the-art tools.
Chapter
MoGym is an integrated toolbox enabling the training and verification of machine-learned decision-making agents based on formal models, for the purpose of sound use in the real world. Given a formal representation of a decision-making problem in the JANI format and a reach-avoid objective, MoGym (a) enables training a decision-making agent with respect to that objective directly on the model using reinforcement learning (RL) techniques, and (b) supports rigorous assessment of the quality of the induced decision-making agent by means of deep statistical model checking (DSMC). MoGym implements the standard interface for training environments established by OpenAI Gym, thereby connecting to the vast body of existing work in the RL community. In return, it makes accessible the large set of existing JANI model checking benchmarks to machine learning research. It thereby contributes an efficient feedback mechanism for improving, in particular, reinforcement learning algorithms. The connective part is implemented on top of Momba. For the DSMC quality assurance of the learned decision-making agents, a variant of the statistical model checker modes of the Modest Toolset is leveraged, which has been extended by two new resolution strategies for non-determinism when encountered during statistical evaluation. Keywords: Formal Methods · Statistical Model Checking · Reinforcement Learning
Chapter
In many synthesis problems, it can be essential to generate implementations which not only satisfy functional constraints but are also randomized to improve variety, robustness, or unpredictability. The recently-proposed framework of control improvisation (CI) provides techniques for the correct-by-construction synthesis of randomized systems subject to hard and soft constraints. However, prior work on CI has focused on qualitative specifications, whereas in robotic planning and other areas we often have quantitative quality metrics which can be traded against each other. For example, a designer of a patrolling security robot might want to know by how much the average patrol time needs to be increased in order to ensure that a particular aspect of the robot’s route is sufficiently diverse and hence unpredictable. In this paper, we enable this type of application by generalizing the CI problem to support quantitative soft constraints which bound the expected value of a given cost function, and randomness constraints which enforce diversity of the generated traces with respect to a given label function. We establish the basic theory of labelled quantitative CI problems, and develop efficient algorithms for solving them when the specifications are encoded by finite automata. We also provide an approximate improvisation algorithm based on constraint solving for any specifications encodable as Boolean formulas. We demonstrate the utility of our problem formulation and algorithms with experiments applying them to generate diverse near-optimal plans for robotic planning problems.
Chapter
In this paper, we present the first fully-automated expected amortised cost analysis of self-adjusting data structures, that is, of randomised splay trees, randomised splay heaps and randomised meldable heaps, which so far have only (semi-)manually been analysed in the literature. Our analysis is stated as a type-and-effect system for a first-order functional programming language with support for sampling over discrete distributions, non-deterministic choice and a ticking operator. The latter allows for the specification of fine-grained cost models. We state two soundness theorems based on two different, but strongly related, typing rules of ticking, which account differently for the cost of non-terminating computations. Finally, we provide a prototype implementation able to fully automatically analyse the aforementioned case studies.
Article
Error-correcting codes add redundancy to transmitted data to ensure reliable communication over noisy channels. Since they form the foundations of digital communication, their correctness is a matter of concern. To enable trustful verification of linear error-correcting codes, we have been carrying out a systematic formalization in the Coq proof-assistant. This formalization includes the material that one can expect of a university class on the topic: the formalization of well-known codes (Hamming, Reed–Solomon, Bose–Chaudhuri–Hocquenghem) and also a glimpse at modern coding theory. We demonstrate the usefulness of our formalization by extracting a verified decoder for low-density parity-check codes based on the sum-product algorithm. To achieve this formalization, we needed to develop a number of libraries on top of Coq’s Mathematical Components. Special care was taken to make them as reusable as possible so as to help implementers and researchers dealing with error-correcting codes in the future.
Conference Paper
Multiple bit upsets (MBUs) caused by high-energy radiation are the most common source of soft errors in static random-access memories (SRAMs) affecting multiple cells. Burst error correcting Hamming codes have most commonly been used to correct MBUs in SRAM cells since they have low redundancy and low decoder latency. But with technology scaling, the number of bits being affected increases, thus requiring an increase in the burst size that can be corrected. However, this is a problem because it increases the number of syndromes exponentially, thus increasing the decoder complexity exponentially as well. In this paper, a new burst error correcting code based on Hamming codes is proposed which allows much better scaling of decoder complexity as the burst size is increased. For larger burst sizes, it can provide significantly smaller and faster decoders than existing methods, thus providing higher reliability at an affordable cost. Moreover, there is frequently no increase in the number of check bits, or only a very minimal increase, in comparison with existing methods. A general construction and decoding methodology for the new codes is proposed. Experimental results are presented comparing the decoder complexity for the proposed codes with conventional burst error correcting Hamming codes, demonstrating the significant improvements that can be achieved.
Article
Due to the emergence of extremely high density memory along with the growing number of embedded memories, memory yield is an important issue. Memory self-repair using redundancies to increase the yield of memories is widely used. Because high density memories are vulnerable to soft errors, memory ECC (Error Correction Code) plays an important role in memory design. In this paper, methods to exploit spare columns including replaced defective columns are proposed to improve memory ECC. To utilize replaced defective columns, the defect information needs to be stored. Two approaches to store defect information are proposed - one is to use a spare column, and the other is to use a content-addressable-memory (CAM). Experimental results show that the proposed method can significantly enhance the ECC performance.
Article
Radiation effects cause several types of errors on memories including single event upsets (SEUs) or single event functional interrupts (SEFIs). Error correction codes (ECCs) are widely used to protect against those errors. For a number of reasons, there is a large interest in using double data rate type three (DDR-3) synchronous dynamic random-access (SDRAM) memories in space applications. Radiation testing results show that these memories will suffer both SEUs and SEFIs when used in space. Protection against a SEFI and an SEU is needed to achieve high reliability. In this paper, a method to protect 16-bit and 64-bit data word memories composed of 8-bit memory devices against a simultaneous SEFI and an SEU is presented. The scheme uses orthogonal Latin square (OLS) codes and can be activated when a SEFI occurs, using a conventional double error correction approach otherwise.
Conference Paper
By adding redundancy to transmitted data, error-correcting codes (ECCs) make it possible to communicate reliably over noisy channels. Minimizing redundancy and (de)coding time has driven much research, culminating with Low-Density Parity-Check (LDPC) codes. At first sight, ECCs may be considered as a trustful piece of computer systems because classical results are well-understood. But ECCs are also performance-critical so that new hardware calls for new implementations whose testing is always an issue. Moreover, research about ECCs is still flourishing with papers of ever-growing complexity. In order to provide means for implementers to perform verification and for researchers to firmly assess recent advances, we have been developing a formalization of ECCs using the SSReflect extension of the Coq proof-assistant. We report on the formalization of linear ECCs, duly illustrated with a theory about the celebrated Hamming codes and the verification of the sum-product algorithm for decoding LDPC codes.
Article
With technology scaling and complexity, better error detection and correction mechanisms within chips and systems are becoming increasingly important in order to provide sufficient protection against both soft and hard errors. Verifying the correctness of error detection circuits and ensuring they provide enough design coverage is a hard problem which usually involves a substantial amount of manual work. This problem is even more challenging in the presence of different design methodologies, such as with the inclusion of third-party IP blocks where functional descriptions of logic designs may not be available. This paper addresses the problem by proposing a completely automated RTL-based verification flow for error detection and correction circuits. Several related challenges are solved: first, the identification of potential error detection circuits in logic designs where no functional description or methodology hints are given; second, the identification of structures of the latches that are potentially protected by such error detection circuits; third, the use of formal verification to ensure that the implemented circuits for resiliency indeed detect all single-bit errors in the latches they are intended to cover. The approach is described with parity detection as an example, although it is extensible to other coding methods such as ECC and state orthogonality checking. Novel algorithms are given and results on industrial designs are presented.
Conference Paper
We present a formal approach to minimize the number of voters in triple-modular redundant sequential circuits. Our technique actually works on a single copy of the circuit and considers a user-defined fault model (under the form “at most 1 bit-flip every k clock cycles”). Verification-based voter minimization guarantees that the resulting circuit (i) is fault tolerant to the soft-errors defined by the fault model and (ii) is functionally equivalent to the initial one. Our approach operates at the logic level and takes into account the input and output interface specifications of the circuit. Its implementation makes use of graph traversal algorithms, fixed-point iterations, and BDDs. Experimental results on the ITC'99 benchmark suite indicate that our method significantly decreases the number of inserted voters which entails a hardware reduction of up to 55% and a clock frequency increase of up to 35% compared to full TMR. We address scalability issues arising from formal verification with approximations and assess their efficiency and precision.
Conference Paper
Redundancy techniques that use voting principles are often used to increase the reliability of systems by ensuring fault tolerance. In order to increase the efficiency of these redundancy strategies, we propose to exploit the inherent fault masking properties of software algorithms at the application level. An important step in early development stages is to choose, from a class of algorithms that achieve the same goal in different ways, one or more that should be executed redundantly. In order to evaluate the resilience of the algorithm variants, there is a great need for quantitative reasoning about the algorithms' fault tolerance in early design stages. Here, we propose an approach for analyzing the vulnerability of given algorithm variants to hardware faults in redundant designs by applying a model checker and fault injection modelling. The method is capable of automatically identifying all input and fault combinations that remain undetected by a voting system. This leads to a better understanding of algorithm-specific resilience characteristics.