Safer Control Flow for the EVM

--- eip: 3779 title: Safer Control Flow for the EVM description: Ensure an essential level of safety for EVM code. status: Draft type: Standards Track category: Core author: Greg Colvin (@gcolvin), Greg Colvin <greg@colvin.org>, Brooklyn Zelenka (@expede), discussions-to: https://ethereum-magicians.org/t/eip-3779-safe-control-flow-for-the-evm/6975 created: 2021-08-30 requires: 2315, 3540, 3670, 3779, 4200 --- ## Abstract This EIP proposes to validate at initialization time that contracts will not underflow stack, jump to invalid locations, or execute invalid instructions. This further implies that subroutines are always called with and return the same number of argument and that looping will not change the stack height. Linear-time validation algorithms are required and one is provided. ## Motivation ### Safety For our purposes we define a safe EVM contract as one that cannot encounter an exceptional halting state. Unsafe code can attempt to overflow stack, underflow stack, execute invalid instructions, and jump to invalid locations. Unsafe contracts are exploits waiting to happen. From the standpoint of security it would be best if unsafe contracts were never placed on the blockchain. ### Validity We cannot guarantee complete safety -- we must allow valid contracts to overflow stack or run out of gas. But we guarantee that valid contracts cannot execute invalid instructions, jump to invalid locations, or underflow stack. To prevent denial of service attacks all control flow must be validated at initialization time, and in time and space linear in the size of the code. We present an algorithm for doing so below. #### Static Jumps and Subroutines The relative jumps of [EIP-4200](./eip-4200) and the simple subroutines of [EIP-2315](./eip-2315) provide a complete set of static control flow instructions: > `RJUMP` *offset* * Jumps to _IP+offset_. > `RJUMPI` *offset* * Jumps if the top of stack is non-zero. > `RJUMPSUB` *offset* * Pushes _IP+1_ on the return stack and jumps to _IP+offest_. > `RETURNSUB` * Pops an address off the return stack and jumps to it Note that each jump creates at most two paths of control through the code, such that the complexity of traversing the entire control-flow graph is linear in the size of the code. #### *Dynamic Jumps* `JUMP` or `JUMPI` also provide a complete set of control flow instructions. But they are an obstacle to proving validity in linear time -- jumps can be to any destination in the code, creating a quadratic "path explosion". We propose to deprecate `JUMP` and `JUMPI`. ### Performance Validating control flow at initialization time has potential performance advantages, as checking for invalid jumps and stack underflow does not need to be done at runtime. ## Specification This EIP does not introduce any new opcodes. Rather, it restricts the use of proposed control-flow instructions. These restrictions must be validated at contract initialization time – not at runtime – by the provided algorithm or its equivalent. This algorithm must take time and space linear in the size of the contract, so as not to be a denial of service vulnerability. This specification is entirely semantic -- it imposes no further syntax on bytecode. Ethereum Virtual Machine bytecode is just that -- a sequence of bytes that when executed causes a sequence of changes to the machine state. The safety we seek here is simply to not, as it were, jam up the gears. ### Validity > In theory, theory and practice are the same. In practice, they're not. -- Albert Einstein #### *Exceptional Halting States* The *execution* of each instruction is defined in the [Yellow Paper](https://ethereum.github.io/yellowpaper/paper.pdf) as a change to the EVM state that preserves the invariants of EVM state. At runtime, if the execution of an instruction would violate an invariant the EVM is in an exceptional halting state. The Yellow Paper defined five such states. 1. Insufficient gas 2. More than 1024 stack items 3. Insufficient stack items 4. Invalid jump destination 5. Invalid instruction *A program is safe iff no execution can lead to an exceptional halting state.* *We would like to consider EVM programs valid iff they are safe.* *In practice*, we must be able to validate *code* in linear time to avoid denial of service attacks. And we must support dynamically-priced instructions, loops, and recursion, which can use arbitrary amounts of gas and stack. Thus our validation algorithm cannot consider concrete computations -- it only performs a limited symbolic execution of the _code_. This means we will reject programs if we detect any invalid execution paths, even if those paths are not reachable at runtime. And we will count as valid programs that may not always produce correct results. We must check for the first two states at _runtime_: * `out of gas` and * stack overflow. The remaining three states we can check at *validation time*: * stack underflow, * invalid jump, and * invalid instruction. #### *Constraints on Valid Code* * Every instruction MUST be valid. * The `JUMP` and `JUMPI` instructions ARE NOT valid. * Every jump MUST be valid: * The `RJUMP`, `RJUMPI`, or `RJUMPSUB` instructions MUST NOT address immediate data. * The stacks MUST always be valid: * The number of items on the `data stack` MUST always be positive, and at most 1024. * The number of items on the `return stack`MUST always be positive, and at most 1024. * The data stack MUST be consistently aligned: * The number of items on the `data stack` between the current `stack pointer` and the `stack pointer` on entry to the most recent subroutine MUST be the same for each _execution_ of a _byte_code_. The `RJUMP`, `RJUMPI` and `RJUMPSUB` instructions take their destination as an immediate argument, which cannot change at runtime. Having constant destinations for all jumps means that all jump destinations can be validated at initialization time, not runtime. Requiring a consistently aligned`data stack` prevents stack underflow. It can also catch irreducible control flows and calls to subroutines with the wrong number of arguments. Taken together, these rules allow for EVM code to be validated by traversing its control-flow graph, in time and space linear in the size of the code, following each edge only once. ## Rationale If we are to be able to validate contracts in linear time we have two real choices. 1. Deprecate dynamic jumps. This is trivial. 2. Constrain dynamic jumps. This requires static analysis. Unfortunately, the second choice is a nonstarter: we cannot know whether a `JUMP` or `JUMPI` are subroutine calls in linear time. We must deprecate them. There are of course other useful safety properties that could be so validated, but we do not consider them here. ## Backwards Compatibility These changes affect the semantics of EVM code – the use of `JUMP`, `JUMPI` are deprecated, such that some *code* that would otherwise run correctly will nonetheless be invalid EVM *code*. In practice this will not be a problem, as this change applpies only to EOF code. ## Reference Implementation The following is a pseudo-Go implementation of an algorithm for predicating code validity. An equivalent algorithm must be run at initialization time. This algorithm performs a symbolic execution of the program that recursively traverses the _code_, emulating its control flow and stack use and checking for violations of the rules above. It runs in time equal to `O(vertices + edges)` in the program's control-flow graph, where edges represent control flow and the vertices represent _basic blocks_ -- thus the algorithm takes time proportional to the size of the _code_. It uses recursion to maintain a stack of continuations for conditional jumps, the size of which is at most proportional to the size of the _code_ *Note: All valid code has a control-flow graph that can be traversed in time and space linear in the length of the code, while avoiding stack underflow and misalignment. That means that some other static analyses and code transformations that might otherwise require quadratic time can also be written to run in linear time, including linear-time complilers and streaming machine code generators. ### Validation Function For simplicity's sake we assume that _jumpdest analysis_ has been done and that we have some helper functions. * `is_valid_instruction(pc)` returns true if `pc` points at a valid instruction. * `is_valid_jumpdest(dest)` returns true if `dest` is a valid jumpdest. * `immediate_data(pc)` returns the immediate data for the instruction at `pc`. * `advance_pc(pc)` advances to and returns the next `pc`, skipping any immediate data. * `removed_items(pc,sp)` returns the number of items removed from the `data_stack` by the instruction at `pc`. * `add_items(pc,sp)` returns the number of items added to the `data_stack` by the instruction at `pc`. ``` // emulated machine state var code [code_len]byte var avail_items [code_len]int var return_stack []int var data_stack []uint256 // return nil or an error func validate(pc := 0, sp := 0, bp := 0, rp := 0) int, error { for pc < code_len { if !is_valid_instruction(pc) { return pc, invalid_instruction } // if available items on stack for `pc` are non-zero // we have been here before // so return to break cycle if avail_items[pc] != 0 { // invalid if available items not the same if avail_items[pc] != sp - bp { return invalid_stack } } avail_items[pc] = sp - bp if avail_items[pc] < 0 { return stack_underflow } switch code[pc] { // successful termination case STOP: return nil case RETURN: return nil case SELFDESTRUCT: return nil case REVERT: return nil case INVALID: return pc, invalid_instruction // track constants pushed on data stack case PUSH1 <= code[pc] && code[pc] <= PUSH32 { sp++ if (sp > 1023) { return pc, stack_overflow } data_stack[sp] = immediate_data(pc) pc = advance_pc(pc) continue case RJUMP: // check for valid jump destination if !is_valid_jumpdest(jumpdest) { return pc, invalid_destination } // reset pc to destination of jump pc += immediate_data(pc) case RJUMPI: // recurse to validate true side of conditional jumpdest = pc + immediate_data(pc) if !is_valid_jumpdest(pc + jumpdest) { return pc, invalid_destination } max_left, err = validate(jumpdest, sp, bp, rp) if err { return pc, err } // recurse to validate false side of conditional pc = advance_pc(pc) max_right, err = validate(pc, sp, bp, rp) if err { return pc, err } return nil case RJUMPSUB: // check for valid jump destination jumpdest = imm_data(pc) if !is_valid_jumpdest(pc + jumpdest) { return pc, invalid_destination } // will enter subroutine at destination bp = sp // push return address and reset pc to destination rp++ return_stack = append(return_stack, pc + 1) pc += jumpdest case RETURNSUB: // will return to subroutine at destination bp = sp // pop return address and check for valid destination pc = return_stack[--rp] return_stack = return_stack[:rp] if !code[pc - 1] == JUMPSUB { return pc, invalid_destination data_stack[sp++] = immediate_data(pc) pc = advance_pc(pc) default: pc = advance_pc(pc) // apply other instructions to stack pointer sp -= removed_items(pc, sp) if (sp < 0) { return pc, stack_underflow } sp += added_items(pc, sp) } } // successful termination return nil } ``` ## Security Considerations This EIP is intended to improve security by ensuring a higher level of safety for EVM code deployed on the blockchain. ## Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).