At Amazon’s 2025 re:Invent conference, Amazon Web Services (AWS) announced the Nitro Isolation Engine (NIE), a software module tasked with providing resources to AWS clients while ensuring the security of customer data. AWS also announced the formal verification of the isolation engine’s correctness and security guarantees, using a proof assistant called Isabelle/HOL. As the first formally verified cloud hypervisor, NIE sets a new standard for cloud security.
A proof assistant is an automated tool that can help human users develop formal proofs — of mathematical theorems, the validity of hardware or software systems, or anything in-between. Several proof assistants are in common use, and we chose Isabelle/HOL because it struck the right balance among expressiveness, automation, proof readability, and scalability. So what do I mean by that?
Logical reasoning by computer
There is no fixed language of mathematics, but we can create languages for expressing mathematical reasoning, just as programming languages express computational tasks. And just as programming languages involve trade-offs between expressiveness and performance, mathematical languages involve trade-offs between expressiveness and ease of automation. Automation is vital because the construction of a formal proof is both time consuming and extremely tedious, analogous to constructing a ship in a bottle.
The most elementary mathematical language is Boolean logic, the world of the binary operators AND, OR, and NOT. Because this language is so simple, powerful automatic solvers exist for it. In 2016, Carnegie Mellon professor Marijn Heule — now an Amazon Scholar — and his colleagues encoded into Boolean logic an unsolved mathematical question, the Boolean Pythagorean Triples Problem, and used automatic solvers to help create the largest proof ever, 200 terabytes long.
A richer mathematical language called first-order logic allows us to talk about some domain of interest — the integers, say — and to define functions over that domain. And we can go beyond Boolean logic by including the quantifiers "for all" and "there exists" in assertions. In this sort of language, we can express statements such as "every prime number greater than two is odd". We can also prove the following theorem, due to Lewis Carroll: No ducks waltz; no officers ever decline to waltz; all my poultry are ducks. Hence, none of my poultry are officers.
However, most people prefer a still stronger mathematical language, where they can define types, as they do in programming. In higher-order logic, there are even function types, as found in functional programming languages such as Haskell. Higher-order logic is much richer than first-order logic, able to express statements such as Every set containing the number 1 and closed under addition contains all the positive integers. It appears to be rich enough to express most of mathematics.
The richest mathematical languages — called dependent-type theories — even allow types to take arbitrary values as parameters, e.g., T(i), where i is an integer. The best-known such languages are Lean and Rocq.
Powerful automatic theorem provers exist for first-order logic, but for higher-order logic and beyond, full automation is not available. This is the price of expressiveness. A proof assistant allows users to build proofs interactively, supported by partial automation and the possibility of coding their own proof searches. A proof assistant enforces strict compliance with the laws of logic, typically through a kernel architecture that gives only a limited portion of the code the right to create a theorem.
A proof assistant also supports the interactive development of a possibly huge formal-specification hierarchy. For example, verification of the Nitro Isolation Engine (NIE) rests on specifications of the architecture of the Graviton-5 processor, the Rust code of the hypercalls and their functional correctness, and the security properties that are to be proved. These take up much of the quarter of a million lines constituting the formal proof.
Higher-order logic is supported by HOL and HOL Light, two closely related proof assistants, and has been used to verify hardware designs, floating-point algorithms, and pure mathematics since the 1990s. AWS senior principal applied scientist John Harrison developed HOL Light, and he has used it to improve the performance of digital signatures on Amazon’s Graviton2 chip by up to 94%, by verifying an optimized version of the cryptographic algorithms. The code was delicate, and exhaustive testing is not feasible; only a formal verification of full functional correctness would do before the deployment of such critical software. But today we are interested in Isabelle/HOL.
Overview of Isabelle/HOL
The most visible difference between Isabelle/HOL and the other HOL systems — which are all based on higher-order logic — is its specification and proof language. With most proof assistants, users state what they want to prove, followed by lists of commands that replace the original goals with series of subgoals in a kind of whack-a-mole game. In Isabelle and to some extent Lean, the proof language allows desired intermediate goals to be written out explicitly, allowing a better controlled proof process and a more legible proof document. There are plenty of examples online.
Other notable features are as follows:
- a user-configurable parser, which allowed us to embed a significant fragment of the Rust language into our specifications;
- type classes for principled overloading, so that say + can be given its natural meaning, not just for a variety of numeric types but for machine words and in other appropriate contexts;
- locales, a lightweight module system allowing a hierarchy of specifications to be defined and interpreted in various ways, even within a proof;
- powerful built-in automation through simplification and backchaining proof search;
- sledgehammer: one-click access to even more powerful external automation;
- counterexample-finding tools, for identifying claims that are actually false;
- code generation from executable higher-order specifications, which we used to test conformance.
For the verification of NIE, we began by implementing a specialized language called separation logic on top of Isabelle/HOL. Separation logic is designed for verifying program code operating on shared resources. We coded our own proof automation and also used what was built in. We therefore could use separation logic but also plain higher-order logic when we wanted to. Isabelle turned out to be resilient enough and efficient enough to cope with the truly gigantic subgoals. It could run that quarter-million-line proof in half an hour using an off-the-shelf laptop.
Some applications of Isabelle/HOL
The single most impressive application of Isabelle prior to NIE is probably the verification of seL4, a widely used microkernel. This proof was also about a quarter of a million lines when first announced, although it is now much longer. The seL4 developers proved that the microkernel’s C implementation refined the abstract specification, yielding full functional correctness of the core operations. And they have observed no bugs in the verified parts of the code, although testing still plays a role in covering unverified parts and certain assumptions that cannot be formalized.
Isabelle was also used in the following projects:
- to formalize the semantics of the WebAssembly language, to identify errors and, in particular, to prove the soundness of its type system;
- to create a verification framework for the Cogent programming language;
- to prove the correctness of algorithms for conflict-free replicated data types, which are used for distributed editing;
- to formalize numerous results in pure mathematics;
- to verify cryptographic protocols at an abstract level.
Isabelle is free, open source, and available to download. It runs on all the main operating systems on any machine that has enough memory.