Figure 2 - uploaded by Dominic P. Mulligan
Example instruction description, in vendor documentation and in Sail, showing the close correspondence between the two for execute and decode


Source publication
Conference Paper
Weakly consistent multiprocessors such as ARM and IBM POWER have been with us for decades, but their subtle programmer-visible concurrency behaviour remains challenging, both to implement and to use; the traditional architecture documentation, with its mix of prose and pseudocode, leaves much unclear. In this paper we show how a precise architectur...

Contexts in source publication

Context 1
... illustrate Sail with an example instruction description in Fig. 2 (stdu, one of the simplest of the several hundred instructions we consider). On the left is the vendor documentation, while on the right are Sail definitions of a clause of the abstract-syntax type ast of instructions (Stdu, with three bitvector fields), a clause of the decode function, pattern-matching 32-bit opcode values into that ...
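The decode-then-execute structure described in this context can be sketched in Python. This is not the paper's Sail code: the names `Stdu`, `bits`, `decode`, `sign_extend`, and `execute` are hypothetical. The field layout (primary opcode 62, RS, RA, a 14-bit DS field, extended opcode 1) and the semantics (store RS at RA + EXTS(DS || 0b00), then write the address back to RA) are assumptions based on the Power ISA DS-form, not a transcription of Fig. 2.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class Stdu:
    """Hypothetical AST clause with three fields, mirroring the 'Stdu' clause."""
    rs: int  # 5-bit source register index
    ra: int  # 5-bit base register index
    ds: int  # raw 14-bit displacement field (unsigned)


def bits(word: int, start: int, length: int) -> int:
    """Extract `length` bits of a 32-bit word, starting at bit `start`,
    using the big-endian numbering of the vendor documentation (bit 0 = MSB)."""
    shift = 32 - start - length
    return (word >> shift) & ((1 << length) - 1)


def decode(word: int) -> Optional[Stdu]:
    """Pattern-match a 32-bit opcode value into an AST value (assumed DS-form layout)."""
    if bits(word, 0, 6) == 62 and bits(word, 30, 2) == 1:
        return Stdu(rs=bits(word, 6, 5), ra=bits(word, 11, 5), ds=bits(word, 16, 14))
    return None  # not an stdu encoding


def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def execute(instr: Stdu, gpr: List[int], mem: Dict[int, int]) -> None:
    """Assumed semantics: EA = (RA) + EXTS(DS || 0b00); MEM(EA, 8) = (RS); RA = EA.
    Memory is simplified to a dict from address to 64-bit value."""
    ea = (gpr[instr.ra] + sign_extend(instr.ds << 2, 16)) & MASK64
    mem[ea] = gpr[instr.rs]  # store uses the old value of RS
    gpr[instr.ra] = ea       # then RA is updated with the effective address


# Usage: 0xF821FFF1 is assumed to encode `stdu r1, -16(r1)`.
instr = decode(0xF821FFF1)
regs = [0] * 32
regs[1] = 0x1000
memory: Dict[int, int] = {}
if instr is not None:
    execute(instr, regs, memory)
```

The sketch omits validity checks such as rejecting RA = 0, and it does not model the effect annotations or the bitvector typing that Sail provides.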
Context 2
... and there is also simple effect polymorphism. To keep Sail definitions readable, we use type inference and some limited automatic coercions (between bits, bitvectors of length 1, and numbers; from bit vectors to unsigned numbers; and between constants and bit vectors), so very few type annotations are needed: none in the right-hand side of Fig. 2 except for the sizes of the instruction opcode fields in the decode ...
Context 3
... this lets us have definitions of decoding and behaviour that are simultaneously precise and close enough to the vendor pseudocode to be readable, as Fig. 2 ...
Context 4
... tool has to deal with many irregularities in the XML to pull out the main blocks shown on the left of Fig. 2 (instruction name, form, mnemonic, binary representation, pseudocode, and list of special registers altered) and parse the pseudocode into a simple untyped grammar. The powerful type inference that Sail provides makes it simple to generate Sail code from ...
Context 5
... back at the Fig. 1 overview, Sections 2, 3, and 4 have described the left-hand block: our ISA model and its interface. We now describe the concurrency model, with respect to that of our starting point ...
Context 6
... the abstract micro-op state of an instruction is an element of: At present we treat instruction and data memory separately, not having investigated the interactions between concurrency and instruction-cache effects. Instruction fetches read values from a fixed instruction memory, decode them (if possible) using the Sail decode function, as in Fig. 2, and use the exhaustive interpreter to analyse their register footprint and potential next fetch addresses. The exhaustive interpreter is also used as necessary to re-analyse the possible future memory footprint of partially executed ...

Similar publications

Article
Conventional homogeneous multicore processors are not able to provide the continued performance and energy improvement that we have expected from past endeavors. Heterogeneous architectures that feature specialized hardware accelerators are widely considered a promising paradigm for resolving this issue. Among different heterogeneous devices, FPGAs...

Citations

... One can visualise the state of a single core abstractly as a tree of partially and completely executed instances, as in Fig. 1 (top). Abstract-microarchitectural operational models use this abstraction [24,25,28,49,51,52]. We depict the retired (committed) FDX instances as solid dark green, and partially/tentatively executed in-flight instances as light green. ...
Preprint
To manage exceptions, software relies on a key architectural guarantee, precision: that exceptions appear to execute between instructions. However, this definition, dating back over 60 years, fundamentally assumes a sequential programmers model. Modern architectures such as Arm-A with programmer-observable relaxed behaviour make such a naive definition inadequate, and it is unclear exactly what guarantees programmers have on exception entry and exit. In this paper, we clarify the concepts needed to discuss exceptions in the relaxed-memory setting -- a key aspect of precisely specifying the architectural interface between hardware and software. We explore the basic relaxed behaviour across exception boundaries, and the semantics of external aborts, using Arm-A as a representative modern architecture. We identify an important problem, present yet unexplored for decades: pinning down what it means for exceptions to be precise in a relaxed setting. We describe key phenomena that any definition should account for. We develop an axiomatic model for Arm-A precise exceptions, tooling for axiomatic model execution, and a library of tests. Finally we explore the relaxed semantics of software-generated interrupts, as used in sophisticated programming patterns, and sketch how they too could be modelled.
... This has spurred significant interest in open architectures like RISC-V where the semantics can be defined all the way down to the hardware implementation. There has been promising work on building a machine interpretable formal semantics for RISC-V [45] (mentioned further in Section 4.1), but we still have a gap when it comes to semantics for other architectures in widespread use. There have been efforts to formalize x86 as early as 2004 with VeryPCC [123], as well as more recent work [28,49,38] in both the Coq and Isabelle/HOL proof assistants in support of proof-carrying code research, but these have limited usability in any production setting. ...
... • Sail [45] has been used to specify instruction set architectures (ISAs) and is the canonical representation for RISC-V as well as newer versions of ARM. While Sail can be used to generate Coq, the code is not usable with existing tools (such as CompCert) without further development. ...
Technical Report
In this report, we describe the current capabilities and research needs related to formal methods at the NNSA labs. In particular, we identify medium-term and long-term research gaps in programming languages, formalization efforts of complex systems, embedded systems verification, hardware verification, cybersecurity, formal methods usability, workflows, numerical methods, the use of formal methods for artificial intelligence (and its converse, artificial intelligence for formal methods), and collaboration opportunities and considerations on these topics. We conclude with a small number of exemplar research problems related to these topics.
... While HLS is a popular design paradigm and can provide significant engineering efficiency gains, it often produces low-performance RTL [1]. Contemporary work on ISA specification falls into two main categories: ad hoc specification of existing ISAs [19], [33] and frameworks which are more analogous to PEak for specifying ISAs such as SAIL [21], ILA [23], and ISA-Formal [34]. These systems use declarative descriptions of the semantics of instructions as state updates predicated on the bit-level representation of an instruction. ...
Preprint
Domain-specific languages for hardware can significantly enhance designer productivity, but sometimes at the cost of ease of verification. On the other hand, ISA specification languages are too static to be used during early stage design space exploration. We present PEak, an open-source hardware design and specification language, which aims to improve both design productivity and verification capability. PEak does this by providing a single source of truth for functional models, formal specifications, and RTL. PEak has been used in several academic projects, and PEak-generated RTL has been included in three fabricated hardware accelerators. In these projects, the formal capabilities of PEak were crucial for enabling both novel design space exploration techniques and automated compiler synthesis.
... For Armv8-A and RISC-V, there exist full-scale sequential models in Sail [1,2], a domain-specific language for instruction-set architecture (ISA) specification, that are complete enough to boot real-world operating systems such as Linux. For Armv8-A this model is automatically derived from the authoritative Arm-internal specification [3], while for RISC-V it has been hand-written, and adopted by RISC-V International. ...
... In general, existing ISA descriptions do not cover this aspect of the architecture well, as they are principally developed only to describe the sequential behaviour. Previous tools have either hand-coded dependency information, which is acceptable for cut-down ISA models but too laborious and error-prone at the scale of the ISA models we use, or used a heavyweight taint-tracking interpreter [2]. Our approach avoids both of these. ...
Article
Architecture specifications such as Armv8-A and RISC-V are the ultimate foundation for software verification and the correctness criteria for hardware verification. They should define the allowed sequential and relaxed-memory concurrency behaviour of programs, but hitherto there has been no integration of full-scale instruction-set architecture (ISA) semantics with axiomatic concurrency models, either in mathematics or in tools. These ISA semantics can be surprisingly large and intricate, e.g. 100k+ lines for Armv8-A. In this paper we present a tool, Isla, for computing the allowed behaviours of concurrent litmus tests with respect to full-scale ISA definitions, in the Sail language, and arbitrary axiomatic relaxed-memory concurrency models, in the Cat language. It is based on a generic symbolic engine for Sail ISA specifications. We equip the tool with a web interface to make it widely accessible, and illustrate and evaluate it for Armv8-A and RISC-V. The symbolic execution engine is valuable also for other verification tasks: it has been used in automated ISA test generation for the Arm Morello prototype architecture, extending Armv8-A with CHERI capabilities, and for Iris program-logic reasoning about binary code above the Armv8-A and RISC-V ISA specifications. By using full-scale and authoritative ISA semantics, Isla lets one evaluate litmus tests using arbitrary user instructions with high confidence. Moreover, because these ISA specifications give detailed and validated definitions of the sequential aspects of systems functionality, as used by hypervisors and operating systems, e.g. instruction fetch, exceptions, and address translation, our tool provides a basis for developing concurrency semantics for these. We demonstrate this for the Armv8-A instruction-fetch and virtual-memory models and examples of Simner et al.
... Making relaxed-memory semantics exhaustively executable is essential for exploring their behaviour on examples [67,54,53,20,9,36,66,23,64,49,57]. Handling relaxed virtual memory brings several new challenges. ...
... There is extensive previous work on "user" relaxed-memory semantics of modern architectures, but very little extending this to cover systems aspects such as virtual memory. We build on the approaches established in "user" models for x86, IBM Power, Arm, and RISC-V, combining executable-as-test-oracle models, discussion with architects, and experimental testing [54,5,7,47,55,53,21,52,46,9,36,31,32,49,65]. ...
Preprint
Virtual memory is an essential mechanism for enforcing security boundaries, but its relaxed-memory concurrency semantics has not previously been investigated in detail. The concurrent systems code managing virtual memory has been left on an entirely informal basis, and OS and hypervisor verification has had to make major simplifying assumptions. We explore the design space for relaxed virtual memory semantics in the Armv8-A architecture, to support future system-software verification. We identify many design questions, in discussion with Arm; develop a test suite, including use cases from the pKVM production hypervisor under development by Google; delimit the design space with axiomatic-style concurrency models; prove that under simple stable configurations our architectural model collapses to previous "user" models; develop tooling to compute allowed behaviours in the model integrated with the full Armv8-A ISA semantics; and develop a hardware test harness. This lays out some of the main issues in relaxed virtual memory bringing these security-critical systems phenomena into the domain of programming-language semantics and verification with foundational architecture semantics. This document is an extended version of a paper in ESOP 2022, with additional explanation and examples in the main body, and appendices detailing our litmus tests, models, proofs, and test results.
... Architecture specifications have two main parts: the sequential and relaxed-memory concurrent aspects of instruction behaviour, each of which have been studied in previous work. For Armv8-A and RISC-V, Armstrong et al. have established full-scale sequential models in Sail [10,15], a domain-specific language for instruction-set architecture (ISA) specification, that are complete enough to boot real-world operating systems such as Linux. For Armv8-A this model is automatically derived from the authoritative Arm-internal specification [24], while for RISC-V it has been hand-written and adopted by RISC-V International. ...
... In general, existing ISA descriptions do not cover this aspect of the architecture well, as they are principally developed only to describe the sequential behaviour. Previous tools have either hand-coded dependency information, which is acceptable for cut-down ISA models but too laborious and error-prone at the scale of the ISA models we use, or used a heavyweight taint-tracking interpreter [15]. Our approach avoids both of these. ...
Chapter
Architecture specifications such as Armv8-A and RISC-V are the ultimate foundation for software verification and the correctness criteria for hardware verification. They should define the allowed sequential and relaxed-memory concurrency behaviour of programs, but hitherto there has been no integration of full-scale instruction-set architecture (ISA) semantics with axiomatic concurrency models, either in mathematics or in tools. These ISA semantics can be surprisingly large and intricate, e.g. 100k+ lines for Armv8-A. In this paper we present a tool, Isla, for computing the allowed behaviours of concurrent litmus tests with respect to full-scale ISA definitions, in Sail, and arbitrary axiomatic relaxed-memory concurrency models, in the Cat language. It is based on a generic symbolic engine for Sail ISA specifications, which should be valuable also for other verification tasks. We equip the tool with a web interface to make it widely accessible, and illustrate and evaluate it for Armv8-A and RISC-V. By using full-scale and authoritative ISA semantics, this lets one evaluate litmus tests using arbitrary user instructions with high confidence. Moreover, because these ISA specifications give detailed and validated definitions of the sequential aspects of systems functionality, as used by hypervisors and operating systems, e.g. instruction fetch, exceptions, and address translation, our tool provides a basis for developing concurrency semantics for these. We demonstrate this for the Armv8-A instruction-fetch model and self-modifying code examples of Simner et al.
... They show that the store buffer semantics of TSO and PSO corresponds to their semantics of "speculations". Gray and Flur et al. [13,22] have established axiomatic and operational models for TSO and their equivalence. Their work is also integrated with detailed instruction semantics for x86, IBM Power, ARM, MIPS, and RISC-V. ...
... In this sense, reasoning about our memory model is even simpler than in the memory abstraction introduced in [47]. It is possible to translate Gray and Flur et al.'s work [13,22] to Isabelle/HOL or Coq code. However, the resulting formal model would rely on the correctness of the translation tool such as Lem [34], which adds one more layer of complication for verification. ...
Article
The SPARC instruction set architecture (ISA) has been used in various processors in workstations, embedded systems, and in mission-critical industries such as aviation and space engineering. Hence, it is important to provide formal frameworks that facilitate the verification of hardware and software that run on or interface with these processors. In this work, we give the first formal model for multi-core SPARC ISA and Total Store Ordering (TSO) memory model in Isabelle/HOL. We present two levels of modelling for the ISA: The low-level ISA model, which is executable, covers many features specific to SPARC processors, such as delayed-write for control registers, windowed general registers, and more complex memory access. We have tested our model extensively against a LEON3 simulation board, the test covers both single-step executions and sequential execution of programs. We also prove some important properties for our formal model, including a non-interference property for the LEON3 processor. The high-level ISA model is an abstraction of the low-level model and it provides an interface for memory operations in multi-core processors. On top of the high-level ISA model, we formalise two TSO memory models: one is an adaptation of the axiomatic SPARC TSO model (Sindhu et al. in Formal specification of memory models, Springer, Boston, 1992; SPARC in The SPARC architecture manual version 8, 1992. http://gaisler.com/doc/sparcv8.pdf), the other is a new operational TSO model which is suitable for verifying execution results. We prove that the operational model is sound and complete with respect to the axiomatic model. Finally, we give verification examples with two case studies drawn from the SPARCv9 manual.
... The two starting points for developing the mixed-size axiomatic model are the existing Flat model [147,148], an operational model with mixed-size support that is part of the rmem tool [149], and Arm's reference model [150,147,151], an axiomatic specification defined in herd [152], without mixed-size support. The two models are based on extensive past research on architectural concurrency for Armv8-A (and related Power), discussion with architects, and experimental hardware testing [147,148,153,154,152,155,156,157,158,149,159,160,161,162,163]. Our mixed-size axiomatic model generalises the reference axiomatic model to mixed-size programs in a way that aims to follow the Flat model's behaviour. Flat has been developed in collaboration with Arm and is extensively experimentally validated, although it is beyond the scope of this work to further investigate the correctness of Flat itself. ...
Thesis
WebAssembly is the first new programming language to be supported natively by all major Web browsers since JavaScript. It is designed to be a natural low-level compilation target for languages such as C, C++, and Rust, enabling programs written in these languages to be compiled and executed efficiently on the Web. WebAssembly’s specification is managed by the W3C WebAssembly Working Group (made up of representatives from a number of major tech companies). Uniquely, the language is specified by way of a full pen-and-paper formal semantics. This thesis describes a number of ways in which I have both helped to shape the specification of WebAssembly, and built upon it. By mechanising the WebAssembly formal semantics in Isabelle/HOL while it was being drafted, I discovered a number of errors in the specification, drove the adoption of official corrections, and provided the first type soundness proof for the corrected language. This thesis also details a verified type checker and interpreter, and a security type system extension for cryptography primitives, all of which have been mechanised as extensions of my initial WebAssembly mechanisation. A major component of the thesis is my work on the specification of shared memory concurrency in Web languages: correcting and verifying properties of JavaScript’s existing relaxed memory model, and defining the WebAssembly-specific extensions to the corrected model which have been adopted as the basis of WebAssembly’s official threads specification. A number of deficiencies in the original JavaScript model are detailed. Some errors have been corrected, with the verified fixes officially adopted into subsequent editions of the language specification. However one discovered deficiency is fundamental to the model, an instance of the well-known "thin-air problem". 
My work demonstrates the value of formalisation and mechanisation in industrial programming language design, not only in discovering and correcting specification errors, but also in building confidence both in the correctness of the language’s design and in the design of proposed extensions.
... Memory specifications define what values may be read in a concurrent system. Current evaluators rely on ad hoc algorithms [3,6,14] or satisfiability (SAT) solvers [40]. However, flaws in existing language memory specifications [5], where one must account for executions introduced through aggressive optimisation, have led to a new class of memory specifications [22,20] that cannot be practically solved using existing ad hoc or SAT techniques. ...
Chapter
We present PrideMM, an efficient model checker for second-order logic enabled by recent breakthroughs in quantified satisfiability solvers. We argue that second-order logic sits at a sweet spot: constrained enough to enable practical solving, yet expressive enough to cover an important class of problems not amenable to (non-quantified) satisfiability solvers. To the best of our knowledge PrideMM is the first automated model checker for second-order logic formulae.
... In 2008, for ARMv7, IBM POWER, and x86, this was poorly understood, and the architects regarded even their own prose specifications as inscrutable. Now, following extensive work by many people [36,37,19,18,22,8,31,45,7,46,48,35,6,2,47,13,1], ARMv8-A has a well-defined and simplified model as part of its specification [9, B2.3], including a prose transcription of a mathematical model [15], and an equivalence proof between operational and axiomatic presentations [36,37]; RISC-V has adopted a similar model [52]; and IBM POWER and x86 have well-established de-facto-standard models. All of these are experimentally validated against hardware, and supported by tools for exhaustively running tests [17,4]. ...
... Previous work on operational models for IBM POWER and Arm "usermode" concurrency [46,45,22,18,19,37] has shown, surprisingly, that as far as programmer-visible behaviour is concerned, one can abstract from almost all hardware implementation details of data memory (store queues, the cache hierarchy, the cache protocol, etc.). For ARMv8-A, following their 2018 shift to a multicopy-atomic architecture, one can do so completely: the Flat model of [37] has a shared flat memory, with a per-thread out-of-order thread subsystem, modelling pipeline effects, responsible for all observable relaxed behaviour. ...
Chapter
Computing relies on architecture specifications to decouple hardware and software development. Historically these have been prose documents, with all the problems that entails, but research over the last ten years has developed rigorous and executable-as-test-oracle specifications of mainstream architecture instruction sets and “user-mode” concurrency, clarifying architectures and bringing them into the scope of programming-language semantics and verification. However, the system semantics, of instruction-fetch and cache maintenance, exceptions and interrupts, and address translation, remains obscure, leaving us without a solid foundation for verification of security-critical systems software. In this paper we establish a robust model for one aspect of system semantics: instruction fetch and cache maintenance for ARMv8-A. Systems code relies on executing instructions that were written by data writes, e.g. in program loading, dynamic linking, JIT compilation, debugging, and OS configuration, but hardware implementations are often highly optimised, e.g. with instruction caches, linefill buffers, out-of-order fetching, branch prediction, and instruction prefetching, which can affect programmer-observable behaviour. It is essential, both for programming and verification, to abstract from such microarchitectural details as much as possible, but no more. We explore the key architecture design questions with a series of examples, discussed in detail with senior Arm staff; capture the architectural intent in operational and axiomatic semantic models, extending previous work on “user-mode” concurrency; make these models executable as test oracles for small examples; and experimentally validate them against hardware behaviour (finding a bug in one hardware device). We thereby bring these subtle issues into the mathematical domain, clarifying the architecture and enabling future work on system software verification.