--- title: Formal Semantics of Yul in Dafny tags: Yul, EVM, Ethereum, Dafny, Semantics --- [Franck Cassez](https://franck44.github.io), Head of Research at [Mantle](https://www.mantle.xyz). — all views expressed are my own. [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/franckDownunder.svg?style=social&label=Follow%20%40FranckDownunder)](https://twitter.com/franckDownunder) --- # Formal and executable semantics of Yul in Dafny In this note I present a formal semantics of [Yul](https://docs.soliditylang.org/en/latest/yul.html) in the verification-friendly language [Dafny](https://github.com/dafny-lang/dafny). The Dafny verification engine can be used to _reason_ about Yul code and to provide some (machine-checkable) _guarantees_ about the code. The formal semantics and examples are available in the repository [Yul in Dafny](https://github.com/franck44/yul-dafny). ## What is Yul? [Yul](https://docs.soliditylang.org/en/latest/yul.html) is a programming language that can be used as an _intermediate representation (IR)_ for EVM bytecode: - it is (relatively) _easy_ to translate Yul into EVM bytecode; for instance the Solidity compiler supports the compilation from Yul to EVM bytecode, - Yul is a _structured_ programming language and sits at a higher-level than EVM bytecode: it does not provide the notion of _stack_^[For some reasons unknown to me there is still a `pop(x)` instruction in Yul. The documentation reads "discard value x". :man-shrugging:] or _gas_^[Yul has a `gas()` instruction and also a `pc()` instruction that do not have a meaning in Yul but only in EVM bytecode.], and is much easier to read/understand. As a result it can be an intermediate target for compilers that ultimately generate EVM bytecode. Optionally Yul code can be _optimised_ before the final compilation stage to EVM bytecode. Note that the use of IR is common in compiler infrastructures: for instance, [LLVM](https://llvm.org/docs/LangRef.html) is used as a backend for several compilers (C, C++, Haskell, Rust, ...). What caught my :eyes: in the Yul documentation is one of the [motivations](https://docs.soliditylang.org/en/latest/yul.html?vscodeBrowserReqId=1697058106244#motivation-and-high-level-description) for designing Yul: :::info 2. Control flow should be easy to understand to help in manual inspection, ==formal verification== and optimisation. ::: Very happy to see formal verification mentioned there :thumbsup:. ## What does the Yul formal verification landscape look like? I looked up for projects on _formal verification_ for Yul and could only find a handful:^[If I missed an important project, feel free to let me know.] - [Yul in ACL2](https://github.com/acl2/acl2/tree/master/books/kestrel/yul/language). Semantics of Yul in ACL2 framework. Last updated two months ago. - [Yul in Lean](https://medium.com/nethermind-eth/securing-warp-a-formal-specification-of-the-yul-ir-85bb3bf51c62#:~:text=The%20Yul%20language%20specification%20then,terms%20in%20the%20initial%20states.). A formal specificaiton of Yul in the [Lean](https://lean-lang.org) proof assistant, and a [git repo](https://github.com/NethermindEth/Yul-Specification), last update 2 years ago. - [Yul in Isabelle](https://github.com/mmalvarez/Yul-Isabelle/tree/master). A formal semantics of Yul in the proof assistant [Isabelle](https://isabelle.in.tum.de). Last update 2 years ago. - [Yul in K](https://github.com/ethereum/Yul-K). A formal semantics of Yul in the [K framework](https://kframework.org). Last update 4 years ago. The previous contributions share some common features: - the focus seems to be on defining a _formal semantics_ for Yul. This is understandable as the section [Formal Specification](https://docs.soliditylang.org/en/latest/yul.html?vscodeBrowserReqId=1697058106244#formal-specification) in the Yul documentation does not really provide a rigorous and formal semantics of the language. - the above mentioned formalisations are all _deep embeddings_ of the semantics of Yul: first the syntax of the language is defined, and then the semantics (operational, denotational) is defined. - as a result, the formal semantics proposed above have not been leveraged to perform formal verification of Yul code. I'll be happy to be wrong and would love to see some examples of Yul formal verification that I have possibly missed. - the projects are not updated often or not maintained any more. ## A shallow embedding of Yul in Dafny Recently, I have initiated and contributed to the [Dafny-EVM](https://github.com/Consensys/evm-dafny) project and we wrote about it in [1]. As Yul has a lot in common with the EVM bytecode, I thought it would be nice to re-use some of the libraries developed in the [Dafny-EVM](https://github.com/Consensys/evm-dafny), and the automated reasoning capabilities provided by Dafny. The first step to formally verifying Yul code is to have a _formal semantics_ of Yul. The second is to have a formal semantics that can be used to _automatically reason_ about the code. If the semantics of a Yul program is a Dafny program, then this second requirement is met. :::info It turns out that providing the semantics of Yul in Dafny can be done using a _shallow embedding_ of Yul into (the host language) Dafny. :bulb: ::: Yul has been designed to be be _flexible_ in the sense that _primitive functions_ or types can be customised. A customisation is a _Yul dialect_. At the moment there is only one official Yul dialect that uses _EVM opcodes_ as primitive functions and a single type `u256` (unsigned integers over 256 bits). In other words, Yul itself (non customised) is a sort of _template_ that defines the supported control structures like assignment `:=`, `if`, `forloop`, `switch`, `function` declarations etc but does not prescribe any primitive types or functions. These control structures are common to many languages and the semantics (in terms of control flow) of an `if` or `switch` or `loop` is usually the same in different languages, Yul, Dafny. To provide the semantics of Yul in Dafny, it suffices to write the semantics of the primitive functions, and then we can _re-use_ the control structure of Dafny to provide the semantics of the control flow constructs like`if`, `loop`, `function` etc. This is a _shallow embedding_ of Yul in Dafny as it re-uses the Dafny control structures and the scoping rules to define the semantics of Yul. I have written the semantics of Yul primitive functions in a [Dafny library](https://github.com/franck44/yul-dafny/tree/main/src/dafny/Yul). This library re-uses the core components of the [Dafny-EVM](https://github.com/Consensys/evm-dafny) and provides the functional semantics of the Yul [EVM Dialect](https://docs.soliditylang.org/en/latest/yul.html?vscodeBrowserReqId=1697058106244#evm-dialect). Here is the semantics of the Yul `mstore` given by the Dafny `MStore` function: ```rust= /** * Memory store. Store a u256 into memory. * * @param address The start address. * @param value Value to store. * @param s The state before store operation. * @returns s' such that s'.memory = s.memory except for * [address..address + 31] where value is stored. * * @note Memory is a word-addressable array of bytes. A u256 * is stored into 32 bytes from address to address + 31. */ function MStore(address: u256, value: u256, s: Executing): (s': State) ensures s'.EXECUTING? ensures s'.MemSize() > address as nat + 31 ensures s'.yul.context == s.yul.context ensures s'.yul.world == s.yul.world { // Make sure memory is large enough. var m' := Memory.ExpandMem(s.yul.memory, address as nat, 32); EXECUTING(s.yul.( memory := Memory.WriteUint256(m', address as nat, value))) } ``` It is a function from a Yul (`Executing`) state (and two `u256` parameters) to a Yul state `State`. The Dafny datatype `Executing` (line 13) captures a normal state (non-error, non-return) of the Yul machine. The state represents the current memory and storage states, and has also the `context` of the current transaction, like the `calldataload`. For instance for result of `MStore` returns a new state `s'` because it updates the memory component of the current state. ## A simple verification example As an illustration of how this works and how to formally verify Yul code, here is an example of a Yul function computing the maximum of two `u256` values. It could be for instance provided by a library. The Yul source code is: ```Yul= object "Runtime" { code { function max(x, y) -> result { result := x if lt(x, y) { result := y } } // Use the code to compute max of 8, 3 let x := 8 let y := 3 let z := max(x, y) mstore(0x40, z) // return the result to caller return(0x40, 32) } } ``` The primitive operators (in the EVM Dialect) are: `lt` (lower-than), `mstore` (memory storage) and `return` (return result in a data field to the caller.) The Yul Dafny library provides the semantics for these operators: `Lt` ([Dafny semantics here](https://github.com/franck44/yul-dafny/blob/27e790255ffdcf625d8992af39b112f3891a4c71/src/dafny/Yul/CommonSem.dfy#L66)), `MStore` ([Dafny semantics here](https://github.com/franck44/yul-dafny/blob/27e790255ffdcf625d8992af39b112f3891a4c71/src/dafny/Yul/CommonSem.dfy#L245)) and `Return` ([Dafny semantics here](https://github.com/franck44/yul-dafny/blob/27e790255ffdcf625d8992af39b112f3891a4c71/src/dafny/Yul/CommonSem.dfy#L320)). The rest of the semantics of the Yul code above (the control structures, variables' scopes) is provided by _translating_ the Yul control structures into Dafny like so: ```rust= // Verification helper /** A ghost (verification-only) function. Not part of the translation Yul to Dafny. */ ghost function maxu256(x: u256, y: u256): (r: u256) ensures r == x || r == y ensures r >= x && r >= y { if x <= y then y else x } // Translated code from Yul to Dafny /** Translation of the of the Yul code of max in Dafny. */ method Max(x: u256, y: u256) returns (result: u256) ensures result == maxu256(x, y) { result := x; if Lt(x, y) { result := y; } } /** Translation of Yul code of main in Dafny. */ method Main(s: Executing) returns (s': State) ensures s'.RETURNS? ensures ByteUtils.ReadUint256(s'.data, 0) == 8 { var x := 8; // let var y := 3; // let var z := Max(x, y); // funcall var s2 := MStore(0x40, z, s); // memory store return Return(0x40, 32, s2); // return result } ``` And we now have: :::success 1. [x] a ==formal and executable semantics of Yul== (executable via Yul :arrow_right: Dafny :arrow_right: execute Dafny code). 2. [x] the ==ability to reason== about the Yul code, the semantics of which is given in Dafny: in the example above, Dafny _verifies_ at compile-time that the post-conditions, `ensures` at lines 26--27, hold for **every input state `s`**, and `ensures` at line 16 hold for **every inputs `x` and `y`**. ::: The complete description of `max` code is available [here](https://github.com/franck44/yul-dafny/blob/main/src/dafny/yul-verif-examples/max/README.md). ## Next steps This is work in progress and there are a couple of things to improve, for instance: - implement a translator from Yul to Dafny, - add more Yul primitive functions to the Dafny Yul library, - use the verification technique described above on more complex examples. If you are interested in this kind of work, the [Yul in Dafny](https://github.com/franck44/yul-dafny) repository already contains more examples and in particular: - A simplified _verified_ version of an [ERC-20 token](https://github.com/franck44/yul-dafny/tree/main/src/dafny/yul-verif-examples/erc-20) in Yul, - [a proof of _equivalence_](https://github.com/franck44/yul-dafny/tree/main/src/dafny/yul-bytecode-verif/max) of Yul code and the corresponding EVM bytecode as generated by `solc` the Solidity compiler. ## References [1] Cassez, F., Fuller, J., Ghale, M.K., Pearce, D.J., Quiles, H.M.A. (2023). **Formal and Executable Semantics of the Ethereum Virtual Machine in Dafny**. In: Chechik, M., Katoen, JP., Leucker, M. (eds) _Formal Methods_. FM 2023. Lecture Notes in Computer Science, vol 14000. Springer, Cham. [DOI](https://doi.org/10.1007/978-3-031-27481-7_32) [^fn]: