Franck Cassez, Head of Research at Mantle. — all views expressed are my own.
Learn More →
In this note I present a formal semantics of Yul in the verification-friendly language Dafny. The Dafny verification engine can be used to reason about Yul code and to provide some (machine-checkable) guarantees about the code. The formal semantics and examples are available in the repository Yul in Dafny.
Yul is a programming language that can be used as an intermediate representation (IR) for EVM bytecode:
Optionally Yul code can be optimised before the final compilation stage to EVM bytecode. Note that the use of IR is common in compiler infrastructures: for instance, LLVM is used as a backend for several compilers (C, C++, Haskell, Rust, …).
What caught my
Very happy to see formal verification mentioned there
I looked up for projects on formal verification for Yul and could only find a handful:[3]
The previous contributions share some common features:
Recently, I have initiated and contributed to the Dafny-EVM project and we wrote about it in [1]. As Yul has a lot in common with the EVM bytecode, I thought it would be nice to re-use some of the libraries developed in the Dafny-EVM, and the automated reasoning capabilities provided by Dafny.
The first step to formally verifying Yul code is to have a formal semantics of Yul. The second is to have a formal semantics that can be used to automatically reason about the code. If the semantics of a Yul program is a Dafny program, then this second requirement is met.
It turns out that providing the semantics of Yul in Dafny can be done using a shallow embedding of Yul into (the host language) Dafny.
Yul has been designed to be be flexible in the sense that primitive functions or types can be customised. A customisation is a Yul dialect. At the moment there is only one official Yul dialect that uses EVM opcodes as primitive functions and a single type u256
(unsigned integers over 256 bits).
In other words, Yul itself (non customised) is a sort of template that defines the supported control structures like assignment :=
, if
, forloop
, switch
, function
declarations etc but does not prescribe any primitive types or functions. These control structures are common to many languages and the semantics (in terms of control flow) of an if
or switch
or loop
is usually the same in different languages, Yul, Dafny.
To provide the semantics of Yul in Dafny, it suffices to write the semantics of the primitive functions, and then we can re-use the control structure of Dafny to provide the semantics of the control flow constructs likeif
, loop
, function
etc.
This is a shallow embedding of Yul in Dafny as it re-uses the Dafny control structures and the scoping rules to define the semantics of Yul.
I have written the semantics of Yul primitive functions in a Dafny library. This library re-uses the core components of the Dafny-EVM and provides the functional semantics of the Yul EVM Dialect.
Here is the semantics of the Yul mstore
given by the Dafny MStore
function:
/**
* Memory store. Store a u256 into memory.
*
* @param address The start address.
* @param value Value to store.
* @param s The state before store operation.
* @returns s' such that s'.memory = s.memory except for
* [address..address + 31] where value is stored.
*
* @note Memory is a word-addressable array of bytes. A u256
* is stored into 32 bytes from address to address + 31.
*/
function MStore(address: u256, value: u256, s: Executing): (s': State)
ensures s'.EXECUTING?
ensures s'.MemSize() > address as nat + 31
ensures s'.yul.context == s.yul.context
ensures s'.yul.world == s.yul.world
{
// Make sure memory is large enough.
var m' := Memory.ExpandMem(s.yul.memory, address as nat, 32);
EXECUTING(s.yul.(
memory := Memory.WriteUint256(m', address as nat, value)))
}
It is a function from a Yul (Executing
) state (and two u256
parameters) to a Yul state State
.
The Dafny datatype Executing
(line 13) captures a normal state (non-error, non-return) of the Yul machine. The state represents the current memory and storage states, and has also the context
of the current transaction, like the calldataload
. For instance for result of MStore
returns a new state s'
because it updates the memory component of the current state.
As an illustration of how this works and how to formally verify Yul code, here is an example of a Yul function computing the maximum of two u256
values. It could be for instance provided by a library.
The Yul source code is:
object "Runtime" {
code {
function max(x, y) -> result
{
result := x
if lt(x, y) {
result := y
}
}
// Use the code to compute max of 8, 3
let x := 8
let y := 3
let z := max(x, y)
mstore(0x40, z)
// return the result to caller
return(0x40, 32)
}
}
The primitive operators (in the EVM Dialect) are: lt
(lower-than), mstore
(memory storage) and return
(return result in a data field to the caller.)
The Yul Dafny library provides the semantics for these operators: Lt
(Dafny semantics here), MStore
(Dafny semantics here) and Return
(Dafny semantics here).
The rest of the semantics of the Yul code above (the control structures, variables' scopes) is provided by translating the Yul control structures into Dafny like so:
// Verification helper
/** A ghost (verification-only) function.
Not part of the translation Yul to Dafny. */
ghost function maxu256(x: u256, y: u256): (r: u256)
ensures r == x || r == y
ensures r >= x && r >= y
{
if x <= y then y else x
}
// Translated code from Yul to Dafny
/** Translation of the of the Yul code of max in Dafny. */
method Max(x: u256, y: u256) returns (result: u256)
ensures result == maxu256(x, y)
{
result := x;
if Lt(x, y) {
result := y;
}
}
/** Translation of Yul code of main in Dafny. */
method Main(s: Executing) returns (s': State)
ensures s'.RETURNS?
ensures ByteUtils.ReadUint256(s'.data, 0) == 8
{
var x := 8; // let
var y := 3; // let
var z := Max(x, y); // funcall
var s2 := MStore(0x40, z, s); // memory store
return Return(0x40, 32, s2); // return result
}
And we now have:
ensures
at lines 26–27, hold for every input state s
, and ensures
at line 16 hold for every inputs x
and y
.The complete description of max
code is available here.
This is work in progress and there are a couple of things to improve, for instance:
If you are interested in this kind of work, the Yul in Dafny repository already contains more examples and in particular:
solc
the Solidity compiler.[1] Cassez, F., Fuller, J., Ghale, M.K., Pearce, D.J., Quiles, H.M.A. (2023). Formal and Executable Semantics of the Ethereum Virtual Machine in Dafny. In: Chechik, M., Katoen, JP., Leucker, M. (eds) Formal Methods. FM 2023. Lecture Notes in Computer Science, vol 14000. Springer, Cham. DOI
For some reasons unknown to me there is still a pop(x)
instruction in Yul. The documentation reads "discard value x".
Yul has a gas()
instruction and also a pc()
instruction that do not have a meaning in Yul but only in EVM bytecode. ↩︎
If I missed an important project, feel free to let me know. ↩︎