Figure 2 - uploaded by Dominic P. Mulligan
Example instruction description, in vendor documentation and in Sail, showing the close correspondence between the two for execute and decode

Source publication
Conference Paper
Weakly consistent multiprocessors such as ARM and IBM POWER have been with us for decades, but their subtle programmer-visible concurrency behaviour remains challenging, both to implement and to use; the traditional architecture documentation, with its mix of prose and pseudocode, leaves much unclear. In this paper we show how a precise architectur...

Contexts in source publication

Context 1
... illustrate Sail with an example instruction description in Fig. 2 (stdu, one of the simplest of the several hundred instructions we consider). On the left is the vendor documentation, while on the right are Sail definitions of a clause of the abstract-syntax type ast of instructions (Stdu, with three bitvector fields), a clause of the decode function, pattern-matching 32-bit opcode values into that ...
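The decode step described in this context can be sketched in Python. This is an illustration only, not the Sail definition from Fig. 2: the names `Stdu`, `bits`, `decode`, and `encode_stdu` are hypothetical, and the field layout assumes the usual Power DS-form encoding of stdu (primary opcode 62 in bits 0-5, RS in 6-10, RA in 11-15, DS in 16-29, extended opcode 0b01 in bits 30-31, with bit 0 as the most significant bit).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stdu:
    """One clause of a hypothetical instruction AST, with three bit-vector fields."""
    rs: int  # 5-bit source register number
    ra: int  # 5-bit base register number
    ds: int  # 14-bit displacement field, kept unshifted and unsigned


def bits(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo of a 32-bit word, using Power numbering (bit 0 is the MSB)."""
    width = lo - hi + 1
    return (word >> (31 - lo)) & ((1 << width) - 1)


def decode(word: int) -> Optional[Stdu]:
    """Match a 32-bit opcode against the assumed stdu pattern; return None otherwise."""
    if bits(word, 0, 5) == 62 and bits(word, 30, 31) == 0b01:
        return Stdu(rs=bits(word, 6, 10), ra=bits(word, 11, 15), ds=bits(word, 16, 29))
    return None


def encode_stdu(rs: int, ra: int, ds: int) -> int:
    """Inverse of decode for stdu, used here only to build example words."""
    return (62 << 26) | (rs << 21) | (ra << 16) | (ds << 2) | 0b01
```

A round trip such as `decode(encode_stdu(3, 4, 5))` should give back `Stdu(rs=3, ra=4, ds=5)`, while a word with a different extended opcode falls through to `None`.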
Context 2
... and there is also simple effect polymorphism. To keep Sail definitions readable, we use type inference and some limited automatic coercions (between bits, bitvectors of length 1, and numbers; from bit vectors to unsigned numbers; and between constants and bit vectors), so very few type annotations are needed: none in the right-hand side of Fig. 2 except for the sizes of the instruction opcode fields in the decode ...
Context 3
... this lets us have definitions of decoding and behaviour that are simultaneously precise and close enough to the vendor pseudocode to be readable, as Fig. 2 ...
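To make "close to the vendor pseudocode" concrete, here is a minimal Python sketch of the behaviour of stdu. It is not the Sail execute clause from Fig. 2. It assumes the usual Power ISA pseudocode for this instruction (EA <- (RA) + EXTS(DS || 0b00); MEM(EA, 8) <- (RS); RA <- EA) and big-endian byte order; `exts`, `execute_stdu`, and the list-and-dict machine state are hypothetical names chosen for illustration.

```python
from typing import Dict, List

MASK64 = (1 << 64) - 1


def exts(value: int, width: int) -> int:
    """Sign-extend a width-bit value to a (possibly negative) Python int."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)


def execute_stdu(rs: int, ra: int, ds: int, gpr: List[int], mem: Dict[int, int]) -> None:
    """Store doubleword with update, following the assumed vendor pseudocode."""
    # EA <- (RA) + EXTS(DS || 0b00)
    ea = (gpr[ra] + exts(ds << 2, 16)) & MASK64
    # MEM(EA, 8) <- (RS), written byte by byte in big-endian order (an assumption)
    for offset, byte in enumerate(gpr[rs].to_bytes(8, "big")):
        mem[(ea + offset) & MASK64] = byte
    # RA <- EA
    gpr[ra] = ea
```

Each line of the body corresponds to one line of the assumed pseudocode, which is the kind of correspondence the figure is meant to show.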
Context 4
... tool has to deal with many irregularities in the XML to pull out the main blocks shown on the left of Fig. 2 (instruction name, form, mnemonic, binary representation, pseudocode, and list of special registers altered) and parse the pseudocode into a simple untyped grammar. The powerful type inference that Sail provides makes it simple to generate Sail code from ...
Context 5
... back at the Fig. 1 overview, Sections 2, 3, and 4 have described the left-hand block: our ISA model and its interface. We now describe the concurrency model, with respect to that of our starting point ...
Context 6
... the abstract micro-op state of an instruction is an element of: At present we treat instruction and data memory separately, not having investigated the interactions between concurrency and instruction-cache effects. Instruction fetches read values from a fixed instruction memory, decode them (if possible) using the Sail decode function, as in Fig. 2, and use the exhaustive interpreter to analyse their register footprint and potential next fetch addresses. The exhaustive interpreter is also used as necessary to re-analyse the possible future memory footprint of partially executed ...
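The fetch, decode, and footprint steps described in this context can be sketched as follows. This is a hand-written stand-in for a single instruction, not the exhaustive interpreter of the paper: `Footprint`, `field`, and `analyse` are hypothetical names, the only instruction recognised is stdu under the assumed DS-form encoding (primary opcode 62, extended opcode 0b01), and the footprint for stdu is written out by hand rather than computed from a Sail definition.

```python
from typing import Dict, FrozenSet, NamedTuple, Optional


class Footprint(NamedTuple):
    """Statically known facts about one fetched instruction (hypothetical shape)."""
    regs_in: FrozenSet[str]     # registers the instruction may read
    regs_out: FrozenSet[str]    # registers the instruction may write
    next_fetch: FrozenSet[int]  # possible addresses of the next instruction


def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo of a 32-bit word, with bit 0 as the most significant bit."""
    return (word >> (31 - lo)) & ((1 << (lo - hi + 1)) - 1)


def analyse(instr_mem: Dict[int, int], pc: int) -> Optional[Footprint]:
    """Fetch from a fixed instruction memory, decode if possible, and return a footprint."""
    word = instr_mem.get(pc)
    if word is None:
        return None  # nothing to fetch at this address
    if field(word, 0, 5) == 62 and field(word, 30, 31) == 0b01:  # stdu (assumed encoding)
        rs, ra = field(word, 6, 10), field(word, 11, 15)
        return Footprint(
            regs_in=frozenset({f"GPR{rs}", f"GPR{ra}"}),  # value stored and base address
            regs_out=frozenset({f"GPR{ra}"}),             # base register is updated
            next_fetch=frozenset({pc + 4}),               # no branch: fall through
        )
    return None  # undecodable in this one-instruction sketch
```

A branch instruction would contribute more than one address to `next_fetch`, which is why that field is a set rather than a single value.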

Similar publications

Article
Conventional homogeneous multicore processors are not able to provide the continued performance and energy improvement that we have expected from past endeavors. Heterogeneous architectures that feature specialized hardware accelerators are widely considered a promising paradigm for resolving this issue. Among different heterogeneous devices, FPGAs...

Citations

... Making relaxed-memory semantics exhaustively executable is essential for exploring their behaviour on examples [67,54,53,20,9,36,66,23,64,49,57]. Handling relaxed virtual memory brings several new challenges. ...
... There is extensive previous work on "user" relaxed-memory semantics of modern architectures, but very little extending this to cover systems aspects such as virtual memory. We build on the approaches established in "user" models for x86, IBM Power, Arm, and RISC-V, combining executable-as-test-oracle models, discussion with architects, and experimental testing [54,5,7,47,55,53,21,52,46,9,36,31,32,49,65]. ...
Preprint
Virtual memory is an essential mechanism for enforcing security boundaries, but its relaxed-memory concurrency semantics has not previously been investigated in detail. The concurrent systems code managing virtual memory has been left on an entirely informal basis, and OS and hypervisor verification has had to make major simplifying assumptions. We explore the design space for relaxed virtual memory semantics in the Armv8-A architecture, to support future system-software verification. We identify many design questions, in discussion with Arm; develop a test suite, including use cases from the pKVM production hypervisor under development by Google; delimit the design space with axiomatic-style concurrency models; prove that under simple stable configurations our architectural model collapses to previous "user" models; develop tooling to compute allowed behaviours in the model integrated with the full Armv8-A ISA semantics; and develop a hardware test harness. This lays out some of the main issues in relaxed virtual memory bringing these security-critical systems phenomena into the domain of programming-language semantics and verification with foundational architecture semantics. This document is an extended version of a paper in ESOP 2022, with additional explanation and examples in the main body, and appendices detailing our litmus tests, models, proofs, and test results.
... Architecture specifications have two main parts: the sequential and relaxed-memory concurrent aspects of instruction behaviour, each of which has been studied in previous work. For Armv8-A and RISC-V, Armstrong et al. have established full-scale sequential models in Sail [10,15], a domain-specific language for instruction-set architecture (ISA) specification, that are complete enough to boot real-world operating systems such as Linux. For Armv8-A this model is automatically derived from the authoritative Arm-internal specification [24], while for RISC-V it has been hand-written and adopted by RISC-V International. ...
... In general, existing ISA descriptions do not cover this aspect of the architecture well, as they are principally developed only to describe the sequential behaviour. Previous tools have either hand-coded dependency information, which is acceptable for cut-down ISA models but too laborious and error-prone at the scale of the ISA models we use, or used a heavyweight taint-tracking interpreter [15]. Our approach avoids both of these. ...
Chapter
Architecture specifications such as Armv8-A and RISC-V are the ultimate foundation for software verification and the correctness criteria for hardware verification. They should define the allowed sequential and relaxed-memory concurrency behaviour of programs, but hitherto there has been no integration of full-scale instruction-set architecture (ISA) semantics with axiomatic concurrency models, either in mathematics or in tools. These ISA semantics can be surprisingly large and intricate, e.g. 100k+ lines for Armv8-A. In this paper we present a tool, Isla, for computing the allowed behaviours of concurrent litmus tests with respect to full-scale ISA definitions, in Sail, and arbitrary axiomatic relaxed-memory concurrency models, in the Cat language. It is based on a generic symbolic engine for Sail ISA specifications, which should be valuable also for other verification tasks. We equip the tool with a web interface to make it widely accessible, and illustrate and evaluate it for Armv8-A and RISC-V. By using full-scale and authoritative ISA semantics, this lets one evaluate litmus tests using arbitrary user instructions with high confidence. Moreover, because these ISA specifications give detailed and validated definitions of the sequential aspects of systems functionality, as used by hypervisors and operating systems, e.g. instruction fetch, exceptions, and address translation, our tool provides a basis for developing concurrency semantics for these. We demonstrate this for the Armv8-A instruction-fetch model and self-modifying code examples of Simner et al.
... They show that the store buffer semantics of TSO and PSO corresponds to their semantics of "speculations". Gray and Flur et al. [13,22] have established axiomatic and operational models for TSO and their equivalence. Their work is also integrated with detailed instruction semantics for x86, IBM Power, ARM, MIPS, and RISC-V. ...
... In this sense, reasoning about our memory model is even simpler than in the memory abstraction introduced in [47]. It is possible to translate Gray and Flur et al.'s work [13,22] to Isabelle/HOL or Coq code. However, the resulting formal model would rely on the correctness of the translation tool such as Lem [34], which adds one more layer of complication for verification. ...
Article
The SPARC instruction set architecture (ISA) has been used in various processors in workstations, embedded systems, and in mission-critical industries such as aviation and space engineering. Hence, it is important to provide formal frameworks that facilitate the verification of hardware and software that run on or interface with these processors. In this work, we give the first formal model for multi-core SPARC ISA and Total Store Ordering (TSO) memory model in Isabelle/HOL. We present two levels of modelling for the ISA: The low-level ISA model, which is executable, covers many features specific to SPARC processors, such as delayed-write for control registers, windowed general registers, and more complex memory access. We have tested our model extensively against a LEON3 simulation board; the tests cover both single-step executions and sequential execution of programs. We also prove some important properties for our formal model, including a non-interference property for the LEON3 processor. The high-level ISA model is an abstraction of the low-level model and it provides an interface for memory operations in multi-core processors. On top of the high-level ISA model, we formalise two TSO memory models: one is an adaptation of the axiomatic SPARC TSO model (Sindhu et al. in Formal specification of memory models, Springer, Boston, 1992; SPARC in The SPARC architecture manual version 8, 1992. http://gaisler.com/doc/sparcv8.pdf), the other is a new operational TSO model which is suitable for verifying execution results. We prove that the operational model is sound and complete with respect to the axiomatic model. Finally, we give verification examples with two case studies drawn from the SPARCv9 manual.
... The two starting points for developing the mixed-size axiomatic model are the existing Flat model [147,148], an operational model with mixed-size support that is part of the rmem tool [149], and Arm's reference model [150,147,151], an axiomatic specification defined in herd [152], without mixed-size support. The two models are based on extensive past research on architectural concurrency for Armv8-A (and related Power), discussion with architects, and experimental hardware testing [147,148,153,154,152,155,156,157,158,149,159,160,161,162,163]. Our mixed-size axiomatic model generalises the reference axiomatic model to mixed-size programs in a way that aims to follow the Flat model's behaviour. Flat has been developed in collaboration with Arm and is extensively experimentally validated, although it is beyond the scope of this work to further investigate the correctness of Flat itself. ...
Thesis
WebAssembly is the first new programming language to be supported natively by all major Web browsers since JavaScript. It is designed to be a natural low-level compilation target for languages such as C, C++, and Rust, enabling programs written in these languages to be compiled and executed efficiently on the Web. WebAssembly’s specification is managed by the W3C WebAssembly Working Group (made up of representatives from a number of major tech companies). Uniquely, the language is specified by way of a full pen-and-paper formal semantics. This thesis describes a number of ways in which I have both helped to shape the specification of WebAssembly, and built upon it. By mechanising the WebAssembly formal semantics in Isabelle/HOL while it was being drafted, I discovered a number of errors in the specification, drove the adoption of official corrections, and provided the first type soundness proof for the corrected language. This thesis also details a verified type checker and interpreter, and a security type system extension for cryptography primitives, all of which have been mechanised as extensions of my initial WebAssembly mechanisation. A major component of the thesis is my work on the specification of shared memory concurrency in Web languages: correcting and verifying properties of JavaScript’s existing relaxed memory model, and defining the WebAssembly-specific extensions to the corrected model which have been adopted as the basis of WebAssembly’s official threads specification. A number of deficiencies in the original JavaScript model are detailed. Some errors have been corrected, with the verified fixes officially adopted into subsequent editions of the language specification. However one discovered deficiency is fundamental to the model, an instance of the well-known "thin-air problem". 
My work demonstrates the value of formalisation and mechanisation in industrial programming language design, not only in discovering and correcting specification errors, but also in building confidence both in the correctness of the language’s design and in the design of proposed extensions.
... Memory specifications define what values may be read in a concurrent system. Current evaluators rely on ad hoc algorithms [3,6,14] or satisfiability (SAT) solvers [40]. However, flaws in existing language memory specifications [5] (where one must account for executions introduced through aggressive optimisation) have led to a new class of memory specifications [22,20] that cannot be practically solved using existing ad hoc or SAT techniques. ...
Chapter
We present PrideMM, an efficient model checker for second-order logic enabled by recent breakthroughs in quantified satisfiability solvers. We argue that second-order logic sits at a sweet spot: constrained enough to enable practical solving, yet expressive enough to cover an important class of problems not amenable to (non-quantified) satisfiability solvers. To the best of our knowledge PrideMM is the first automated model checker for second-order logic formulae.
... There has been much work on formal MCM specifications of hardware ISAs in recent years [2,3,5,9,10,13,20,25,26,28,32,35,38]. There has also been much work on MCM verification, using a variety of approaches. ...
Preprint
Modern SoCs are heterogeneous parallel systems comprised of components developed by distinct teams and possibly even different vendors. The memory consistency model (MCM) of processors in such SoCs specifies the ordering rules which constrain the values that can be read by load instructions in parallel programs running on such systems. The implementation of required MCM orderings can span components which may be designed and implemented by many different teams. Ideally, each team would be able to specify the orderings enforced by their components independently and then connect them together when conducting MCM verification. However, no prior automated approach for formal hardware MCM verification provided this. To bring automated hardware MCM verification in line with the realities of the design process, we present RealityCheck, a methodology and tool for automated formal MCM verification of modular microarchitectural ordering specifications. RealityCheck allows users to specify their designs as a hierarchy of distinct modules connected to each other rather than a single flat specification. It can then automatically verify litmus test programs against these modular specifications. RealityCheck also provides support for abstraction, which enables scalable verification by breaking up the verification of the entire design into smaller verification problems. We present results for verifying litmus tests on 7 different designs using RealityCheck. These include in-order and out-of-order pipelines, a non-blocking cache, and a heterogeneous processor. Our case studies cover the TSO and RISC-V (RVWMO) weak memory models. RealityCheck is capable of verifying 98 RVWMO litmus tests in under 4 minutes each, and its capability for abstraction enables up to a 32.1% reduction in litmus test verification time for RVWMO.
... Finally, in the academic works that describe the operational semantics of machine instructions, we can single out various dialects of the nML language [24], as well as the languages ISDL [25], L3 [26], and SAIL [27]. The most interesting approach is implemented in L3 and SAIL: to describe both decoding and behavior of instructions, a dialect of a purely functional language is employed. ...
Article
Many binary code analysis tools rely on intermediate representation (IR) derived from a binary code, instead of working directly with machine instructions. In this paper, we first consider binary code analysis problems that benefit from IR and compile a list of requirements that the IR suitable for solving these problems should meet. Generally speaking, a universal binary analysis platform requires two principal components. The first component is a retargetable instruction decoder that utilizes external specifications to describe target instruction sets. External specifications facilitate maintenance and allow one to quickly implement support for new instruction sets. We analyze some of the most popular instruction set architectures (ISAs), including those used in microcontrollers, and from that compile a list of requirements for the retargetable decoder. We then overview existing multi-ISA decoders and propose our vision of a more generic approach, based on a multi-layer directed acyclic graph that describes the decoding process in universal terms. The second component of the analysis platform is the actual architecture-neutral IR. In this paper, we describe such IRs and propose Pivot 2, an IR that is low-level enough to be easily constructed from decoded machine instructions, also being easy to analyze. The main features of Pivot 2 are explicit side effects, SSA variables, simpler alternative to phi-functions, and extensible elementary operation set at the core. This IR also supports machines that have multiple memory address spaces. Finally, we propose a way to tie the decoder and the IR together to fit them to most of the binary code analysis tasks through abstract interpretation on top of the IR. 
The proposed scheme takes into account various aspects of target architectures that are overlooked in many other works, including pipeline specifics (handling of delay slots, hardware loop support, etc.), exception and interrupt management, and generic address space model, in which accesses may have arbitrary side effects due to memory-mapped devices or other non-trivial behavior of the memory system.
... We give an example of this on the ARM architecture, in Fig. 13. The execution, a previously studied ARM litmus test, was verified using the rmem tool [Gray et al. 2015], which can explore and visualise the possible relaxed behaviours of program fragments in various architectures. It can be viewed as an abstraction (for brevity) of the compiled code of Fig. 12, sufficient to depict its memory consistency properties. ...
Article
WebAssembly (Wasm) is a safe, portable virtual instruction set that can be hosted in a wide range of environments, such as a Web browser. It is a low-level language whose instructions are intended to compile directly to bare hardware. While the initial version of Wasm focussed on single-threaded computation, a recent proposal extends it with low-level support for multiple threads and atomic instructions for synchronised access to shared memory. To support the correct compilation of concurrent programs, it is necessary to give a suitable specification of its memory model. Wasm's language definition is based on a fully formalised specification that carefully avoids undefined behaviour. We present a substantial extension to this semantics, incorporating a relaxed memory model, along with a few proposed extensions. Wasm's memory model is unique in that its linear address space can be dynamically grown during execution, while all accesses are bounds-checked. This leads to the novel problem of specifying how observations about the size of the memory can propagate between threads. We argue that, considering desirable compilation schemes, we cannot give a sequentially consistent semantics to memory growth. We show that our model provides sequential consistency for data-race-free executions (SC-DRF). However, because Wasm is to run on the Web, we must also consider interoperability of its model with that of JavaScript. We show, by counter-example, that JavaScript's memory model is not SC-DRF, in contrast to what is claimed in its specification. We propose two axiomatic conditions that should be added to the JavaScript model to correct this difference. We also describe a prototype SMT-based litmus tool which acts as an oracle for our axiomatic model, visualising its behaviours, including memory resizing.
... However, it is a critical step toward building a trustworthy decoder that can decode binary streams into their machine instruction representations, enabling verification of program binaries (for that machine) in a subsequent phase. Second, the formalization of the targeted ISAs should be elegant, readable, and as close as possible to their machine-vendor-specific representations [11], [13], [14]. This can reduce verification costs. ...
... The type checker will automatically generate thousands of proof obligations. Moreover, UniVS7 will install our proof scripts, for example, Listing 6 (lines 13-16), to discharge them for the entire test suite atomically. ...
... It is usually invoked using a proof strategy (eval-formula in line 3 of Listing 7). Since members of the same class often share these symbol names, UniVS7 can reuse the same Proof-Lite script [7], [24], such as Listing 6 (lines 13-16), for all the class members. This allows discharging tens of thousands of test lemmas with only one proof command for each test suite. ...
... They show that the store buffer semantics of TSO and PSO corresponds to their semantics of "speculations". Gray and Flur et al. [13,7] have established axiomatic and operational models for TSO, and their equivalence. Their work is also integrated with detailed instruction semantics for x86, IBM Power, ARM, MIPS, and RISC-V. ...
... They formalise both the ISA and relaxed memory models such as x86-CC and x86-TSO in HOL and show the correspondence between different styles of memory models. It is possible to translate Gray and Flur et al.'s work [13,7] to Isabelle/HOL or Coq code. However, the resulting formal model would rely on the correctness of the translation tool such as Lem [22], which adds one more layer of complication in our verification tasks. ...
Preprint
SPARC processors have many applications in mission-critical industries such as aviation and space engineering. Hence, it is important to provide formal frameworks that facilitate the verification of hardware and software that run on or interface with these processors. This paper presents the first mechanised SPARC Total Store Ordering (TSO) memory model which operates on top of an abstract model of the SPARC Instruction Set Architecture (ISA) for multi-core processors. Both models are specified in the theorem prover Isabelle/HOL. We formalise two TSO memory models: one is an adaptation of the axiomatic SPARC TSO model, the other is a novel operational TSO model which is suitable for verifying execution results. We prove that the operational model is sound and complete with respect to the axiomatic model. Finally, we give verification examples with two case studies drawn from the SPARCv9 manual.