Thomas Sewell’s research while affiliated with University of Cambridge and other places


Publications (28)


Figure 2: Overview of CakeML and Pancake compiler stack.
Figure 3: Driver verification workflow.
Verifying Device Drivers with Pancake
  • Preprint

January 2025 · 16 Reads

Junming Zhao · Alessandro Legnani · Tiana Tsang Ung · [...]

Device driver bugs are the leading cause of OS compromises, and their formal verification is therefore highly desirable. To the best of our knowledge, no realistic and performant driver has been verified for a non-trivial device. We propose Pancake, an imperative language for systems programming that features a well-defined and verification-friendly semantics. Leveraging the verified compiler backend of the CakeML functional language, we develop a compiler for Pancake that guarantees that the binary retains the semantics of the source code. Using automatic translation of Pancake to the Viper SMT front-end, we verify a performant driver for an Ethernet NIC.


Cakes That Bake Cakes: Dynamic Computation in CakeML

June 2023 · 6 Reads · 1 Citation

Proceedings of the ACM on Programming Languages

We have extended the verified CakeML compiler with a new language primitive, Eval, which permits evaluation of new CakeML syntax at runtime. This new implementation supports an ambitious form of compilation at runtime and dynamic execution, where the original and dynamically added code can share (higher-order) values and recursively call each other. This is, to our knowledge, the first verified run-time environment capable of supporting a standard LCF-style theorem prover design. Modifying the modern CakeML compiler pipeline and proofs to support a dynamic computation semantics was an extensive project. We review the design decisions, proof techniques, and proof engineering lessons from the project, and highlight some unexpected complications.


CN: Verifying Systems C Code with Separation-Logic Refinement Types

January 2023 · 25 Reads · 13 Citations

Proceedings of the ACM on Programming Languages

Despite significant progress in the verification of hypervisors, operating systems, and compilers, and in verification tooling, there exists a wide gap between the approaches used in verification projects and conventional development of systems software. We see two main challenges in bringing these closer together: verification handling the complexity of code and semantics of conventional systems software, and verification usability. We describe an experiment in verification tool design aimed at addressing some aspects of both: we design and implement CN, a separation-logic refinement type system for C systems software, aimed at predictable proof automation, based on a realistic semantics of ISO C. CN reduces refinement typing to decidable propositional logic reasoning, uses first-class resources to support pointer aliasing and pointer arithmetic, features resource inference for iterated separating conjunction, and uses a novel syntactic restriction of ghost variables in specifications to guarantee their successful inference. We implement CN and formalise key aspects of the type system, including a soundness proof of type checking. To demonstrate the usability of CN we use it to verify a substantial component of Google's pKVM hypervisor for Android.


Verified Security for the Morello Capability-enhanced Prototype Arm Architecture

March 2022 · 49 Reads · 20 Citations

Lecture Notes in Computer Science

Memory safety bugs continue to be a major source of security vulnerabilities in our critical infrastructure. The CHERI project has proposed extending conventional architectures with hardware-supported capabilities to enable fine-grained memory protection and scalable compartmentalisation, allowing historically memory-unsafe C and C++ to be adapted to deterministically mitigate large classes of vulnerabilities, while requiring only minor changes to existing system software sources. Arm is currently designing and building Morello, a CHERI-enabled prototype architecture, processor, SoC, and board, extending the high-performance Neoverse N1, to enable industrial evaluation of CHERI and pave the way for potential mass-market adoption. However, for such a major new security-oriented architecture feature, it is important to establish high confidence that it does provide the intended protections, and that cannot be done with conventional engineering techniques. In this paper we put the Morello architecture on a solid mathematical footing from the outset. We define the fundamental security property that Morello aims to provide, reachable capability monotonicity, and prove that the architecture definition satisfies it. This proof is mechanised in Isabelle/HOL, and applies to a translation of the official Arm specification of the Morello instruction-set architecture (ISA) into Isabelle. The main challenge is handling the complexity and scale of a production architecture: 62,000 lines of specification, translated to 210,000 lines of Isabelle. We do so by factoring the proof via a narrow abstraction capturing essential properties of arbitrary CHERI ISAs, expressed above a monadic intra-instruction semantics. We also develop a model-based test generator, which generates instruction-sequence tests that give good specification coverage, used in early testing of the Morello implementation and in Morello QEMU development, and we use Arm’s internal test suite to validate our model. 
This gives us machine-checked mathematical proofs of whole-ISA security properties of a full-scale industry architecture, at design-time. To the best of our knowledge, this is the first demonstration that that is feasible, and it significantly increases confidence in Morello.
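The monotonicity idea named in the abstract above can be illustrated with a toy model. This is only a sketch under loud assumptions: the `Capability` class, `derive`, and `is_subset` are hypothetical names invented here, not part of the Morello ISA, the Arm specification, or the Isabelle development, and the sketch covers a single derivation step rather than the whole-ISA reachability property the paper proves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Toy capability: a memory range plus a set of permissions (hypothetical model)."""
    base: int
    length: int
    perms: frozenset

    @property
    def top(self) -> int:
        return self.base + self.length


def derive(parent, base, length, perms):
    """Derive a child capability, rejecting any attempt to gain authority."""
    perms = frozenset(perms)
    if length < 0 or base < parent.base or base + length > parent.top:
        raise ValueError("bounds exceed parent capability")
    if not perms <= parent.perms:
        raise ValueError("permissions exceed parent capability")
    return Capability(base, length, perms)


def is_subset(child, parent):
    """True if child grants no more authority than parent."""
    return (parent.base <= child.base
            and child.top <= parent.top
            and child.perms <= parent.perms)
```

In this toy model every capability returned by `derive` satisfies `is_subset(child, parent)`, which is the single-step analogue of authority never increasing.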


Fig. 14. Constraint semantics for records.
Fig. 15. The rules of the subtyping relation viewed as a lattice.
Fig. 17. The value semantics evaluation rules.
Fig. 19. The update semantics evaluation rules concerning pointers.
Fig. 20. The value typing and update/value refinement rules.
Cogent: uniqueness types and certifying compilation

October 2021 · 39 Reads · 38 Citations

Journal of Functional Programming

This paper presents a framework aimed at significantly reducing the cost of proving functional correctness for low-level operating systems components. The framework is designed around a new functional programming language, Cogent. A central aspect of the language is its uniqueness type system, which eliminates the need for a trusted runtime or garbage collector while still guaranteeing memory safety, a crucial property for safety and security. Moreover, it allows us to assign two semantics to the language: The first semantics is imperative, suitable for efficient C code generation, and the second is purely functional, providing a user-friendly interface for equational reasoning and verification of higher-level correctness properties. The refinement theorem connecting the two semantics allows the compiler to produce a proof via translation validation certifying the correctness of the generated C code with respect to the semantics of the Cogent source program. We have demonstrated the effectiveness of our framework for implementation and for verification through two file system implementations.
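The relationship between the two semantics described in the abstract above can be sketched informally. The following is a minimal illustration, not Cogent code and not the paper's refinement theorem: the function names are hypothetical, and Python does not enforce uniqueness, so the sketch simply assumes that the heap object has exactly one reference.

```python
def put_value(record, field, value):
    """Value semantics: return a fresh record, leaving the input untouched."""
    updated = dict(record)
    updated[field] = value
    return updated


def put_update(heap, ptr, field, value):
    """Update semantics: destructively write through a pointer into the heap."""
    heap[ptr][field] = value
    return ptr


def abstract(heap, ptr):
    """Read the heap object at ptr back as a plain value."""
    return dict(heap[ptr])


def agree(record, field, value):
    """Check that the in-place update matches the functional result."""
    heap = {0: dict(record)}  # assumption: a single, unaliased reference
    ptr = put_update(heap, 0, field, value)
    return abstract(heap, ptr) == put_value(record, field, value)
```

Under the no-aliasing assumption, `agree` returns `True` for any field update, which is the intuition behind treating the destructive version as an implementation of the functional one.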



High-assurance timing analysis for a high-assurance real-time operating system

September 2017 · 114 Reads · 34 Citations

Real-Time Systems

Worst-case execution time (WCET) analysis of real-time code needs to be performed on the executable binary code for soundness. Obtaining tight WCET bounds requires determination of loop bounds and elimination of infeasible paths. The binary code, however, lacks information necessary to determine these bounds. This information is usually provided through manual intervention, or preserved in the binary by a specially modified compiler. We propose an alternative approach, using an existing translation-validation framework, to enable high-assurance, automatic determination of loop bounds and infeasible paths. We show that this approach automatically determines all loop bounds and many (possibly all) infeasible paths in the seL4 microkernel, as well as in standard WCET benchmarks which are in the language subset of our C parser. We also design and validate an improvement to the seL4 implementation, which permits a key part of the kernel’s API to be available to users in a mixed-criticality setting.


Refinement through restraint: bringing down the cost of verification

September 2016 · 24 Reads · 25 Citations

ACM SIGPLAN Notices

We present a framework aimed at significantly reducing the cost of verifying certain classes of systems software, such as file systems. Our framework allows for equational reasoning about systems code written in our new language, Cogent. Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. Linear types allow us to assign two semantics to the language: one imperative, suitable for efficient C code generation; and one functional, suitable for equational reasoning and verification. As Cogent is a restricted language, it is designed to easily interoperate with existing C functions and to connect to existing C verification frameworks. Our framework is based on certifying compilation: For a well-typed Cogent program, our compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly refines this embedding. Thus one can reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C. The compiler certificate is a series of language-level proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.


Refinement through restraint: bringing down the cost of verification

September 2016 · 20 Reads · 9 Citations

ACM SIGPLAN Notices



A Framework for the Automatic Formal Verification of Refinement from Cogent to C

August 2016 · 59 Reads · 20 Citations

Lecture Notes in Computer Science

Our language Cogent simplifies verification of systems software using a certifying compiler, which produces a proof that the generated C code is a refinement of the original Cogent program. Despite the fact that Cogent itself contains a number of refinement layers, the semantic gap between even the lowest level of Cogent semantics and the generated C code remains large. In this paper we close this gap with an automated refinement framework which validates the compiler’s code generation phase. This framework makes use of existing C verification tools and introduces a new technique to relate the type systems of Cogent and C.


Citations (24)


... Liu et al. [15] propose an approach inspired by scheduling languages, with proof obligations generated when a program is optimised, for automatic verification using Coq. The Cogent language [20] uses refinement proofs, to be verified in Isabelle/HOL. However, it does not separate algorithms from schedules. ...

Reference:

HaliVer: Deductive Verification and Scheduling Languages Join Forces
Cogent: uniqueness types and certifying compilation

Journal of Functional Programming

... Specifically, ( ) defines the set of concrete constructs that refine abstraction . The notion of refinement type in the recent works [59,64,65] corresponds to SL predicates in our theory. To emphasize this correspondence and to be intuitive, we introduce the notation ⦂ to abbreviate predicate application ( ), i.e., ⦂ ≜ ( ). ...

CN: Verifying Systems C Code with Separation-Logic Refinement Types
  • Citing Article
  • January 2023

Proceedings of the ACM on Programming Languages

... However, the results have been compelling: MSRC reported more than a two-thirds deterministic mitigation rate for memory-safety vulnerabilities with the deployment of CHERI's referential, spatial, and temporal memory safety. 3. Formal proof of architectural security properties: Formal modeling of the Morello and CHERI-MIPS ISAs has supported formal verification (machine-checked mathematical proof) that the ISAs enforce key properties, such as correctness of capability bounds comparison and isolation of arbitrary code by compartmentalization mechanisms, [12] and formal semantics for CHERI C has clarified its security properties. [13] 4. Penetration-testing exercises, ideally performed with a strong attacker awareness of the CHERI model so that attack strategies can take this into account: These exercises have primarily been performed externally and include an activity by MSRC to consider the impact of CHERI on WebKit JavaScriptCore (JSC) with CHERI-aware attackers as well as a DARPA-sponsored, crowdsourced penetration activity. ...

Verified Security for the Morello Capability-enhanced Prototype Arm Architecture

Lecture Notes in Computer Science

... Attempts have been made to achieve better systems programming languages by incorporating advanced language features that make certain safety properties hold by construction. For example, Cogent has a linear type discipline that prevents memory leaks [Amani et al., 2016], Rust's borrow checker enforces ownership and lifetimes [Klabnik and Nichols, 2017], and Cyclone incorporates garbage collection and ML-style polymorphism [Jim et al., 2002]. Such advanced features can eliminate whole classes of bugs, or at least reduce bug density, but at the cost of making the language semantics and implementation more complicated. ...

CoGENT: Verifying High-Assurance File System Implementations
  • Citing Conference Paper
  • March 2016

ACM SIGPLAN Notices

... Cogent [16] is a restricted purely functional language with a certifying compiler [16,21] designed to ease creating verified operating systems components [3]. It has a foreign function interface (FFI) that enables implementing parts of a system in C. Cogent's main restrictions are the purposeful lack of recursion or loops, which ensures totality, and its uniqueness type system, which enforces a uniqueness invariant that, among other benefits, guarantees memory safety. ...

Refinement through restraint: bringing down the cost of verification
  • Citing Conference Paper
  • September 2016

ACM SIGPLAN Notices

... Some require a garbage collector, e.g., CertiCoq and OEuf (from Gallina to C) or CakeML (from Standard ML to binary). The Cogent framework [29] (from Cogent to Isabelle/HOL and C) is verified but depends on calls to foreign C functions to perform loops, and Rupicola [28] (from Gallina to bedrock2, a C-like language) has only been tested for small algorithms. The end-to-end co-verification method proposed in the paper instead reuses the existing verification workflow and proof efforts of CertrBPF and CompCert to provide the first, fully verified and resource-efficient, hybrid virtual machine, HAVM. ...

A Framework for the Automatic Formal Verification of Refinement from Cogent to C
  • Citing Conference Paper
  • August 2016

Lecture Notes in Computer Science

... Cogent [Amani et al. 2016; O'Connor et al. 2021] is a domain-specific language equipped with a linear type system. The Cogent compiler produces: C code; a high-level Isabelle/HOL specification; and a proof of refinement from the former to the latter. ...

Cogent: Verifying High-Assurance File System Implementations
  • Citing Article
  • March 2016

ACM SIGARCH Computer Architecture News