Executable EVM Semantics (s. 9 Ethereum Yellow Paper)

# Executable EVM Semantics (s. 9 Ethereum Yellow Paper) The objective of this piece is to explain in a more layman manner how we can interpret Section 9 of the Ethereum Yellow Paper into actual executable code. Some parts of the Yellow Paper are quite heavy with mathematical notation and formulas. Here, my goal is to focus on those formal definitions and convert them into intuitive explanations and practical functions, while also pointing out some important but silent assumptions in the Ethereum Virtual Machine (EVM). First, I will familiarize you with some basic assumptions I have made about you, the reader: 1. You are familiar with blockchain not only in the loose "crypto" sense but as an actual chain of blocks maintaining a distributed state. 2. You have a basic idea of what the EVM is and what happens during a transaction's lifecycle on Ethereum. 3. You are deeply interested in digging into the EVM execution layer from first principles and are comfortable with some technical detail. If these assumptions hold, we can proceed. Otherwise, you may want to first review what happens when you send a transaction from your wallet (e.g. MetaMask or TrustWallet) until it gets executed on-chain. That background will help in understanding the EVM internals described here. Now, lfg!!! ## EVM Data Structures: Stack, Memory, and Storage First, it helps to understand the three fundamental data structures that the EVM uses during execution: the stack, memory, and storage. The EVM is often described as a stack-machine, meaning most of its computation happens on a stack. But it also has a volatile memory for temporary data and a persistent storage for long-term data. * Stack: The EVM stack is a last-in, first-out (LIFO) structure that holds data values during execution. Every instruction (opcode) takes its input operands from the stack and often pushes results back onto it. Each stack item is a 256-bit word (32 bytes) in [size](https://www.degencode.com/p/low-level-evm-part-i-opcodes-and-the-stack). The stack has a maximum depth of 1024 [values](https://www.degencode.com/p/low-level-evm-part-i-opcodes-and-the-stack). If a program tries to push more than 1024 items, it will run into a stack overflow error; if it tries to pop an item from an empty stack, that's a stack underflow. These stack overflow/underflow conditions are among the errors that can halt execution (we'll discuss halting conditions later). For now, remember that the stack is the primary workspace of the EVM for runtime values. * Memory: The EVM's memory is a linear byte-array that is [volatile (temporary)](https://ethereum.github.io/yellowpaper/paper.pdf#:~:text=memory%20model%20is%20a%20simple,defined%20initially%20as%20zero). You can think of it as RAM for a smart contract execution. It starts zeroed out and is cleared after the [execution ends](https://www.evm.codes/about). Memory is byte-addressable i.e you can address a single byte and read or write to it, but certain operations read and write 32-byte (256-bit) words at a time. The memory can expand dynamically during execution; it has no fixed size limit except that expansion costs gas (which naturally limits how far you can go). If an opcode needs to read or write to an address beyond the current memory size, the memory will automatically enlarge to accommodate, and a memory expansion fee will be charged. Importantly, all memory addresses that haven't been written to yet are assumed to contain 0 by default. Memory is typically used for holding transient data such as function arguments, return values, or for assembling data to be hashed, etc., during contract execution. It’s accessible with opcodes like MLOAD (load 32 bytes from memory to stack) and MSTORE (store 32-byte value from stack into memory), etc. * Storage: Each account (smart contract) in Ethereum has a storage, which is a persistent key-value store mapping 32-byte keys to 32-byte values. You can imagine storage as a large, sparse array or dictionary that persists between transactions. Unlike memory, storage is non-volatile – it represents the long-term state of the contract. Storage is maintained as part of the global state on the blockchain (the world state). All storage slots are initialized to zero as well. Accessing and especially modifying storage is very expensive in terms of gas, because it results in writes to the persistent state that every node must keep. Only specialized opcodes can interact with storage: SLOAD reads a storage slot, and SSTORE writes a value to a storage slot. Storage writes often have additional gas costs or refunds to incentivize conserving space. **Code**: In addition to the three data areas above, the EVM execution also involves the code being executed. The smart contract's code (the bytecode) is stored in a separate read-only space (often called code storage or ROM) and is not mixed with the general memory or stack. In other words, the EVM is not a von Neumann architecture; you cannot arbitrarily modify your code at runtime as data. The code is immutable during execution, it can only be read, not written to. The program counter will step through this code. There are special opcodes to access code if needed (e.g. CODESIZE, CODECOPY to get the contract’s own code, or EXTCODECOPY to read another contract’s code), but you cannot self-modify the code. This is a design choice that simplifies the model and improves security. Now that we have these basics laid out, let's dive into how the EVM actually executes code using these components. The EVM Execution Model In the Ethereum Yellow Paper, Section 9 defines the Execution Model of the EVM. It describes how a given sequence of bytecode instructions is executed given an initial state and environment. The formal definition can be intimidating, but we can break it down into a more familiar algorithmic description. Let's outline the key components of the execution model and then describe the execution loop in simpler terms: Machine State (μ): At any point during execution, the EVM has a machine state denoted by μ in the Yellow Paper. The machine state is a tuple of six components: 𝜇 =(𝑔,𝑝𝑐,𝑚,𝑖,𝑠,𝑜). These are: the remaining gas 𝑔, the program counter 𝑝𝑐, the memory 𝑚, the active memory size 𝑖 (in words), the stack 𝑠, and the return data buffer 𝑜. In simpler terms: Gas (μg): how much gas is still available for the execution. Program Counter (μpc): the index of the next instruction in the code to be executed. Memory (μm): the current contents of EVM memory (this can grow as needed). Memory Size (μi): the highest word index of memory that has been accessed so far, essentially tracking the size of allocated memory in 32-byte chunks. Stack (μs): the current contents of the stack (LIFO data of 256-bit values). Returndata (μo): a buffer for return data. This holds the data returned by a sub-call or the current execution (set by instructions like RETURN or REVERT). It can be read by the calling context via RETURNDATASIZE and RETURNDATACOPY opcodes. System State (σ): This represents the global state outside the machine (account balances, contract storage contents, etc.). The system state σ is read and written when certain opcodes interact with accounts or storage (for example, SSTORE updates σ, SLOAD reads from σ, value transfers update balances in σ, etc.). Execution Environment (I): This is a set of contextual data for the execution, provided by whatever triggered this execution (could be a transaction or a call from another contract). The environment includes things like the address of the contract being executed, the caller’s address, the call value (ETH being sent), the input data (calldata), the gas price, the current block information, and whether the call is allowed to modify state (this last part is important for static calls). I won't enumerate all of these here, but it's good to know that opcodes like CALLER, CALLVALUE, CALLDATALOAD, COINBASE, TIMESTAMP, etc., pull information from this execution environment. With those definitions, the EVM execution can be described as an iterative cycle that processes one instruction at a time, updating the machine state. A simple pseudocode for the EVM execution loop might look like this: ```javascript // Initialize machine state (μ) and environment (I) pc = 0 // program counter starts at 0 gas = G_init // gas allotted for this execution (from tx or caller) stack = [] // empty stack memory = [] // empty byte array (expandable) mem_size = 0 // active memory size in words returndata = [] // empty return data buffer loop: if pc >= code_length: opcode = STOP // beyond code length, treat as STOP (halting) else: opcode = code[pc] // fetch the next byte as opcode // 1. ensure there is enough gas to execute this opcode cost = calculateGasCost(opcode, args,..., current state) if gas < cost: <halt execution with "out of gas" error> // Exceptional halt break loop // 2. ensure stack has enough items for this opcode (δ = number of items this opcode should pop, α = number it will push) if len(stack) < δ: <halt execution with "stack underflow" error> // Exceptional halt break loop if len(stack) - δ + α > 1024: <halt execution with "stack overflow" error> // Exceptional halt break loop // 3. subtract the gas cost for this opcode gas = gas - cost // 4. the behavior of the opcode switch (opcode): case PUSHx: // push the immediate bytes that follow this opcode into stack value = <next x bytes from code> stack.push(value) pc = pc + 1 + x // advance PC past the PUSH and its data case POP: stack.pop() pc = pc + 1 case ADD: a = stack.pop() b = stack.pop() sum = (a + b) mod 2^256 stack.push(sum) pc = pc + 1 case MLOAD: offset = stack.pop() // expand memory if needed to cover offset+32 bytes <possibly increase memory size and charge additional gas> value = read32Bytes(memory, offset) stack.push(value) pc = pc + 1 case MSTORE: offset = stack.pop() value = stack.pop() <expand memory if needed for offset+32 bytes (gas already accounted)> write32Bytes(memory, offset, value) pc = pc + 1 case JUMP: dest = stack.pop() if not isValidJumpDestination(dest): <halt with invalid jump error> // Exceptional halt break loop pc = dest // set PC to jump destination (do not auto-increment) case JUMPI: dest = stack.pop() condition = stack.pop() if condition != 0: if not isValidJumpDestination(dest): <halt with invalid jump error> break loop pc = dest else: pc = pc + 1 case SSTORE: index = stack.pop() newValue = stack.pop() <perform storage write (updates global state σ)> <charge additional gas: substantial cost, maybe refunds if clearing> pc = pc + 1 case STOP: <halt execution normally> // Normal halt break loop case RETURN: offset = stack.pop() size = stack.pop() returndata = memory[offset : offset+size] // collect return bytes <halt execution normally, with output> // Normal halt break loop case REVERT: offset = stack.pop() size = stack.pop() returndata = memory[offset : offset+size] <halt execution *without* applying state changes> // (Revert is a normal halt that undoes state) break loop case INVALID: // any undefined opcode <halt execution with invalid instruction error> break loop ... (and so on for all other opcodes) ... end switch // loop continues if not halted end loop // After loop: if halted normally, remaining gas may be refunded to caller, // and returndata (if any) is made available to caller. If halted by error, state changes are reverted. ``` The above pseudocode is a rough outline, but it covers the essence of EVM execution in a more familiar way. Let's walk through important aspects of this execution cycle: * Fetching Instructions: The program counter (PC) tells us which byte of code to execute next. If the PC goes past the end of the code array, the Yellow Paper specifies that the EVM should behave as if it encountered a STOP instruction. In other words, reaching the end of code triggers a normal halt. Normally, after executing an instruction, the PC moves to the next instruction. Most opcodes increment the PC automatically by 1 (since each opcode is one byte), but some opcodes alter the PC directly (jumps and calls) or have immediate data that follows the opcode byte (e.g., a PUSH instruction will jump over its immediate data). The EVM handles these details so that PC points to the right place on each iteration. * Gas as Execution Fuel: Gas is a first-class citizen in the EVM state – notice we track gas alongside other components of the machine state. The Yellow Paper calls gas “intrinsically bounding” computation, meaning the EVM is quasi-Turing complete; it would be Turing-complete except that you can’t run forever because you’ll run out of gas. Before executing any instruction, the EVM checks if there is enough gas left to cover the cost of that instruction (including any additional dynamic costs). If not, it triggers an out-of-gas (OOG) error, which is an exceptional halt – the execution stops and all state changes are reverted. Gas is deducted upfront for an instruction, and if the execution of that instruction completes successfully, any leftover gas (from what was deducted) is kept for the next instructions. If the instruction causes an error or halts, any gas paid is consumed anyway (except in the case of REVERT, which refunds the unused portion, but REVERT still costs the gas for the operations up to that point). * Stack Operation and Bounds: Each opcode defines how many items it pops from the stack and how many it pushes onto the stack. In the formal spec, these are given as δ (delta) and α (alpha) for each opcode. For example, the ADD instruction has δ=2 (it consumes two stack items) and α=1 (it pushes one result). The Yellow Paper notes that an opcode will first require its inputs (so those values must be on the stack or it’s an error), then it performs the calculation, and then pushes any results. We must always ensure the stack isn't underflowed or overflowed. If an opcode needs N items but the stack has fewer, that's a stack underflow – an immediate exception. Similarly, if executing an opcode would cause the stack to grow beyond 1024 elements, that's a stack overflow (the formal rule is current stack size − 𝛿 + 𝛼 ≤ 1024 must hold). Either of these conditions results in an exceptional halt (reverting state changes). * Executing Opcode Logic: Now, this is where it gets more interesting: executing what each opcode is supposed to do. In the pseudocode above, I sketched a few cases: * PUSH: This reads the next bytes from the code itself and pushes them as a single 32-byte value on the stack. For example, PUSH1 takes the next 1 byte as a literal and pushes it; PUSH32 takes the next 32 bytes. The PC has to jump over the immediate data so it points to the next actual opcode. * POP: Simple – removes the top item from the stack (δ=1, α=0). * ADD (0x01): Pops the top two 256-bit values from the stack, adds them, and pushes the 256-bit result. * MLOAD/MSTORE: These involve memory. MLOAD takes a byte offset from the stack and loads 32 bytes from memory at that offset into the stack. MSTORE takes an offset and a value, and stores the 32-byte value into memory starting at that offset. If the offset is beyond the current allocated memory size, the EVM will expand the memory. Memory expansion is where gas calculation comes in: the EVM will compute the new required size (rounded up to nearest 32-byte word) and charge gas for the additional memory. * JUMP (0x56): Pops a destination address from the stack and sets the PC to that address. However, not every byte is a valid place to jump to – it must exactly correspond to an opcode’s start (you can’t jump into the middle of a PUSH data, for example). The Yellow Paper defines a set 𝐷 of valid jump destinations (basically the positions of all JUMPDEST instructions in the code). If the destination is not in 𝐷, the EVM will trigger an invalid jump exception and halt. So, JUMP is a bit tricky: it’s only allowed to go to locations marked by a JUMPDEST opcode. * JUMPI (0x57): Similar to JUMP, but it’s conditional. It pops two values: a destination and a condition. If the condition is non-zero, it behaves like JUMP (and requires the destination to be a valid JUMPDEST). If the condition is zero, it does nothing (just moves to next instruction normally). In the case the condition is non-zero but the destination is invalid, it’s again an exception. * SSTORE (0x55): Pops a key and a value, then writes the value into the contract’s storage at that key. Storage operations have complex gas rules: roughly, writing a non-zero value to a slot that was zero costs a lot of gas (because you are using new storage), writing zero to a slot that was non-zero (clearing storage) refunds some gas (because you're freeing space), and writing to an already-used slot (setting a value to another non-zero value) has a cost somewhere in between. In code, an implementation of SSTORE would also have to mark that slot as dirty (for later writing to state), etc. For our purposes, it’s enough to know SSTORE can be one of the most expensive operations and that its gas cost is dynamic based on the circumstances. * CALL (0xF1): This is an example of an opcode that spawns a new execution context (a message call to another contract). CALL has several arguments: gas to forward, address to call, value in wei to send, and memory offsets for input and output. It will reduce the caller’s remaining gas by the amount forwarded (plus intrinsic costs) and initialize a new sub-EVM instance for the callee. The caller’s execution is paused until the callee returns. If the callee runs out of gas or throws, the CALL returns false; otherwise it returns true and you can retrieve return data. CALL is quite involved, but one key point is that the caller must decide how much gas to give to the callee – that gas is subtracted upfront. Any gas unused by the callee comes back to the caller (the EVM reconciles gas on return). This prevents infinite recursion because each nested call must be given a slice of gas from the parent. * Halting: The loop continues until a halting condition is met. There are normal halts and exceptional halts. A normal halt happens when execution reaches a STOP instruction, a RETURN (which halts and yields output), a SELFDESTRUCT (which halts after destroying the contract and possibly sending value), or a REVERT (which halts and signals to revert state changes, but is considered a controlled halt). In these cases, the execution ends gracefully. An exceptional halt occurs when something goes wrong: out-of-gas, stack overflow/underflow, invalid opcode, invalid jump, or other illegal operation. In an exceptional halt, the execution is immediately aborted and none of the changes made during this execution are applied – it’s as if the call/transaction never happened (except that gas is still consumed). The Yellow Paper explicitly states that on such errors, the machine halts and reverts any changes, leaving the state intact as it was before the execution, and reports the issue to whatever called it. (If it was a transaction, the transaction fails; if it was an internal call, the call returns failure.) This distinction is important: successful execution can modify contract storage, balances, etc., whereas a failed execution due to an exception should not modify any state. Originally I titled this "*Executable Semantics and Gas as a First-Class Mathematical Object*", Lol, but since this is getting quite unwiedly and long, we will look into gas in some other paper. WAGMI.