
# Erigon: RISCV Executable Proof Sourcing

Mentor - *Mark Holt*

Interested Permissioned Fellow - [Harsh Pratap Singh](https://harsh-ps-2003.bearblog.dev/)

#### Background and Motivation

Ethereum’s core execution engine currently relies on the 256-bit EVM, but scaling and verifiability needs are driving a shift toward industry-standard VMs for its proof set, such as RISC-V based zkVMs, because of their simpler instruction sets and suitability for cryptographic proving. [Vitalik has proposed replacing EVM with RISC-V based execution layer as part of a long-term scaling strategy](https://ethereum-magicians.org/t/long-term-l1-execution-layer-proposal-replace-the-evm-with-risc-v/23617). [The Ethereum Foundation's formal verification effort for RISC-V zkVMs further validates this architectural direction](https://verified-zkevm.org/).

* Current ZK-EVM implementations spend 50-100x more cycles proving EVM operations compared to native RISC-V execution. Benchmarks show RISC-V could reduce zk-SNARK proof generation times by 50-100x for certain operations
* Most ZK provers already compile EVM bytecode to RISC-V internally. Direct RISC-V execution eliminates this translation layer, reducing complexity and attack surfaces
* RISC-V's reduced instruction set (RV32IM) enables better JIT compilation and hardware acceleration compared to EVM's 256-bit stack-based architecture

At present, most provers of EVM code operate by compiling an EVM implementation into a lower-level ISA, then proving the correct operation of the resultant code on that ISA - either Assembly (e.g. CairoVM) or Opcode (e.g. Polygon zkEVM) based. The EVM, which has its own complex instruction set and semantics, is typically implemented in a higher-level language which is then compiled down to a simpler, well-understood ISA like RISC-V or a custom proving-oriented ISA such as Miden Assembly.
The prover then generates proofs that the execution of this compiled code (i.e., the EVM implementation running a given transaction or contract) is correct according to the ISA's rules. Given the availability of the various components necessary to run a standalone EVM, this has been relatively quick to deliver and provides an operating prover. However, this has a number of drawbacks:

* *Proof Bloat* - The most significant drawback is that compiling an entire EVM implementation into the instruction set architecture of a zkVM leads to proving a much larger set of executable instructions than what is strictly necessary to execute a simple set of EVM transactions within a block. We are verifying operations that might not even be invoked by the specific transactions being processed. This increases the computational cost and time required for proving.
* *Dependency Overhead* - When you compile an entire EVM implementation, you're not just compiling the core EVM logic. You're also bringing in all the underlying libraries, system calls, and other components that the EVM implementation itself relies on. Each of these dependencies then needs to be included in the proof, adding complexity and increasing the proof size.
* *Complexity* - Maintaining equivalence between EVM and ISA implementations is challenging.

The EVM's 256-bit word size is specifically chosen to facilitate Keccak-256 hash operations and elliptic curve computations that are central to Ethereum's cryptographic security model. This design choice means that operations like ADD, MUL, and SUB are atomic at the 256-bit level, with the underlying implementation handling the full precision internally. RISC-V [RV32IM operates on 32-bit registers, creating an 8:1 mapping challenge when emulating EVM operations. Unlike architectures with dedicated carry flags, RISC-V deliberately omits condition codes to simplify out-of-order execution, forcing multi-precision arithmetic to detect carries with extra compare instructions such as `sltu`](https://stackoverflow.com/questions/70999565/why-does-risc-v-not-have-an-instruction-to-calculate-carry-out). This architectural decision significantly complicates the implementation of efficient multi-precision operations.

[Research demonstrates that native execution can provide up to 197x cycle count improvements compared to EVM-based approaches](https://arxiv.org/pdf/2504.14809v5), which means that direct ISA execution offers substantial efficiency gains. [With a similar aim in mind, the ZKM team chose the harder path - MIPS32r2 instead of RISC-V](https://www.zkm.io/blog/why-zkm-chose-mips32r2-over-risc-v-for-zkmips).

## Proposed Solution - Inline Transpilation

Rather than replacing Erigon's core EVM with a new ISA-based execution module (an invasive and complex alternative), our preferred approach is to add an inline transpilation process to Erigon's existing execution module. This allows us to generate an executable equivalent, suitable for proving, without a complete rewrite. We are interested in using the Erigon execution component with instruction-level hooks to explore the practicality of the transpilation approach, generating an ISA-compatible executable object inline with the execution process.

Our solution introduces inline transpilation within Erigon's execution client, generating RISC-V executables that precisely capture transaction execution paths. The primary goal is to have an executable equivalent which can be used by the proving system, allowing a zkVM to prove the EVM execution. We'll transform the operations of the EVM into instructions that a RISC-V based zkVM can understand and prove.

The key to achieving this is instrumenting Erigon's interpreter with instruction-level hooks.
These hooks will be embedded directly within Erigon's EVM execution logic. As the EVM processes each instruction, these hooks will generate, in real time, a RISC-V compatible executable object (an ELF file) or a trace that can be converted into one. This approach creates an "executable equivalent" of the EVM's operations, suitable for a zkVM to generate proofs. So, it simultaneously:

* Executes EVM bytecode normally
* Emits equivalent RISC-V instructions

This creates a 1:1 correspondence between EVM execution and RISC-V program behavior.

```mermaid
graph TD
    A[Erigon EVM] -->|Execution Trace| B(Transpiler)
    B --> C[RISC-V ELF]
    C --> D[zkVM Prover]
    D --> E[Proof]
```

* By generating the executable object inline with the execution process, it potentially avoids the overhead of a separate, offline compilation step for the entire EVM implementation, which was identified as a drawback in the previous discussion. This PoC could pave the way for a more gradual integration into the Ethereum ecosystem.
* Inline transpilation minimizes disruption by avoiding a complete rewrite or replacement of Erigon's core EVM execution logic. Instead, it adds a new output path.
* Instruction-level hooks can potentially create a more direct and optimized mapping from EVM opcodes to target ISA instructions, leading to a smaller and more efficient proof than proving a full EVM implementation. The inline generation allows for potential optimizations where common EVM patterns could be translated into more efficient RISC-V sequences.

Here's a detailed technical comparison of both alternatives across multiple dimensions:

| Aspect | Inline Transpilation via Instruction-Level Hooks (Preferred) | Fully ISA-Based Execution Module (Alternative) |
|---|---|---|
| **Integration with Erigon** | ✅ Non-invasive: Hooks can be added to Erigon's interpreter (`vm/interpreter.go`) with minimal disruption. | ❌ Invasive: Requires replacing the core EVM with a custom-built RISC-V-based execution engine. |
| **Disruption to Ethereum Execution Clients** | ✅ Minimal: Preserves current client behavior. Adds a parallel output (zk-compatible trace or executable). | ❌ High: Swaps out EVM core. Risks client compatibility and audit complexity. |
| **Implementation Complexity** | ⚠️ Moderate: Requires accurate EVM-to-ISA mapping at instruction level. Ensuring fidelity and determinism is non-trivial. | ❌ High: Rebuilds entire EVM execution semantics in a new ISA-compatible format. Maintaining exact bytecode semantics is harder. |
| **Proof Performance** | ✅ Optimizable: Allows real-time EVM-to-RISC-V conversion. Can inline common patterns (e.g., `DUP+ADD`) into efficient RISC-V equivalents. | ⚠️ Depends: Direct RISC-V execution may be efficient, but proving full RISC-V programs is still expensive in zkVMs. |
| **Proof Soundness** | ✅ Maintains canonical semantics: Since Erigon's existing EVM is used, behavior is easier to verify and match. | ⚠️ Harder to audit: Must ensure fidelity across all corner cases in EVM semantics. Increased risk of consensus mismatch. |
| **Proving System Compatibility** | ✅ Composable: Output trace or RISC-V binary can plug into zkVMs like RISC Zero, SP1, or Jolt. | ✅ Fully compatible: zkVM can directly prove the ISA module. But costlier due to full program verification. |
| **Development Timeline** | ✅ Shorter: Build hooks incrementally, starting with a subset of opcodes. | ❌ Longer: Requires building an entire EVM reimplementation in a target ISA. |
| **Tooling Reuse** | ✅ Reuses Erigon’s tracing, precompiles, and RPC infra. | ❌ Limited reuse: Entire infra around opcode instrumentation, debugging, etc., may need to be ported or replaced. |
| **Maintenance Overhead** | ✅ Low: Hooks follow upstream changes in Erigon. | ❌ High: Keeping EVM and ISA-version in sync requires ongoing effort. |
| **Example Precedents** | Similar to Geth+ZKTrace, SP1's `wasm2riscv` transpilation, and Starknet’s in-situ Cairo generation. | ZK-EVMs like Scroll, Polygon zkEVM, and Taiko re-implement full EVM circuits (though not directly via ISA). |

#### Simplifying Assumptions

Some simplifying assumptions could, for example, be:

* **All state is provided in the execution model as constants.** All state is provided as constants in `.rodata`. This means the prover doesn't need to reason about dynamic state reads or writes to a Merkle Patricia Trie. The initial state is given as input, and any state changes during execution would be reflected in the output. This is a common simplification in initial zk-rollups or proof systems to focus on the execution logic itself. A full solution would need to handle dynamic state lookups and Merkle tree updates within the zkVM, which is a significant cryptographic challenge. But we can excuse this for now, given this is a PoC.
* **Precompiles can be treated as idempotent functions which check the input values and return a constant output.** This simplifies the handling of complex precompiled contracts and other heavy cryptographic operations (e.g., the KECCAK256 opcode or the ECADD precompile). Instead of proving the internal logic of these operations, the assumption is that their output is deterministic and depends only on their input, and that the correctness of this output can be assumed or verified externally by the zkVM's optimized circuits.
* **Output value and side effects are pushed to a static constant output function or similar for runtime checking**. All public outputs and side effects (e.g., logs, state diffs) are pushed to a designated, static memory region (`.data` or `.bss`) for runtime checking. This simplifies how the prover verifies the final state and any emitted events. This may require a target-supplied library function. Instead of the zkVM directly manipulating a complex output structure, all outputs are marshaled into a simple, verifiable format. The zkVM needs a predefined way to capture the output of the execution that can then be proven to be correct.
This could involve writing to a specific memory region that the prover inspects.
* **Gas Accounting Omission**: For this initial PoC, gas accounting is omitted. While every EVM opcode has a gas cost, tracking and decrementing gas can be complex to translate. The intention for the PoC is to generate proofs without explicit gas cost verification, assuming a static gas environment or that the zkVM handles overall execution limits. Future iterations could integrate gas metering by storing a gas counter in a RISC-V register and subtracting each EVM opcode's cost.
* **Chunking Strategy**: A consistent little-endian 32-bit layout is maintained for all 256-bit values to simplify arithmetic and memory mapping. In reality, each 256-bit EVM addition requires a [carry chain](https://people.ece.ubc.ca/stevew/515/handouts/arith.pdf) across eight 32-bit RISC-V words. A proper 256-bit addition needs roughly 30 or more RISC-V arithmetic instructions (plus loads and stores), not the 6-8 outlined in this doc. The two-word case below uses `a0`/`b0` and `a1`/`b1` as placeholder operand names; note that the carry out of the high word has to account for both the word addition and the incoming carry.

```
add  t0, a0, b0   # Add low words
sltu t1, t0, a0   # Detect carry (t1 = 1 if carry occurred)
add  t2, a1, b1   # Add high words
sltu t3, t2, a1   # Carry from a1 + b1
add  t2, t2, t1   # Add carry from previous operation
sltu t4, t2, t1   # Carry from adding the carry-in
or   t3, t3, t4   # Carry out of this word
```

[This pattern must be repeated for each 32-bit segment, creating substantial instruction overhead in real-world use](https://forums.sifive.com/t/best-way-to-do-multi-precision-integer-compares-on-risc-v/484).
* **PoC Goals**: Focus on proving correctness and feasibility over performance. Full EVM compatibility (e.g., gas model, dynamic memory expansion, precompile internals) can be deferred.
* **zkVM Integration**: `ecall` hooks are designed to interface with external circuits or proof gadgets for precompiles, I/O, and possibly `SSTORE` (state diffs).

The intention is that the produced output can be assembled and executed by an existing zkVM to prove the process. One option under consideration is to use the Taiko prover framework for this, but *part of the project would be to identify prover candidates to test and integrate with*.

## Implementation Details

The core idea is to hook into Erigon’s existing Go-based EVM interpreter loop, which executes bytecode via a simple fetch-decode loop using a 256-entry jump table. Our proposal is to simultaneously emit RISC-V instructions for each executed EVM opcode.

### Interpreter hooks in Erigon’s VM:

Hooks are embedded within Erigon's EVM interpretation logic. When the EVM executes, say, an ADD opcode, the corresponding hook is triggered and translates the EVM operation into one or more equivalent RISC-V instructions. This translation happens dynamically as the EVM code is executed. The generated RISC-V instructions are collected and assembled into a RISC-V executable object (an ELF file) or a trace format convertible to one. This ELF file represents the exact sequence of operations performed by the EVM for the specific transaction(s) being processed, and is later fed to the zkVM to prove the execution.

### Export Interface

We will provide a Go API in the transpiler for handling each transaction:

* `BeginTrace(txHash)` - Initializes the code buffer and locks in input constants (copies necessary state into `.rodata`).
* `EmitForOpcode(...)` - Called per opcode as described below.
* `EndTrace()` - Appends any epilogue (e.g., code to move results into `output_buf`, then `ECALL` or exit) and finalizes sections.
* `GetELF() []byte` - Returns the assembled ELF bytes (or writes them to disk).
* `GetOutput() []byte` - For testing, reads back the `output_buf` region to compare with expected outputs.

The key is that each transaction yields a unique RISC-V program (`guest.elf`) capturing its execution trace. This binary can then be fed into the chosen zkVM for proving.
### Adding Inline Transpiler

This PoC implements inline transpilation in the EVM interpreter loop of Erigon, whereby RISC-V instructions are emitted for each EVM opcode as it is interpreted. This allows us to trace EVM execution and generate a parallel RISC-V code stream, potentially enabling ahead-of-time binary generation or analysis.

[Erigon’s EVM interpreter](https://github.com/erigontech/erigon/blob/ab3040b2097127ab753ebcb0be3bec41ccd5ab01/core/vm/interpreter.go#L136) uses a program counter and an [opcode jump table](). In each loop iteration, it [fetches the opcode](https://github.com/erigontech/erigon/blob/43082dbe181327872ac5282aa0e76a3eccd8186d/core/vm/interpreter.go#L303) to [execute the corresponding handler](https://github.com/erigontech/erigon/blob/43082dbe181327872ac5282aa0e76a3eccd8186d/core/vm/interpreter.go#L374) using `JumpTable`, a fixed-size array of 256 handler entries, where each handler implements the function signature:

```go!
func(pc *uint64, in *EVMInterpreter, scope *ScopeContext) ([]byte, error)
```

Each handler can inspect and modify the EVM stack, memory, and execution scope. We propose a transparent hook at the dispatch point (right after `in.jt[op]` is retrieved and before `execute()` is called), allowing us to emit RISC-V instructions inline without modifying opcode semantics.

```go!
op := code[pc]       // opcode fetched from bytecode
handler := in.jt[op] // opcode dispatched via jump table
handler.execute(...) // our hook wraps this call
```

We do this by wrapping the handlers in the `JumpTable` with transpiler logic during interpreter initialization. For example, during EVM initialization:

```go!
// **PoC Modification:** Initialize the transpiler and wrap the jump table.
// This simplified initialization assumes we have access to initial state and tx input here.
// In a real scenario, obtaining the initial state would be more complex.
// We also need a flag to enable/disable transpilation. Let's assume a field in config.
enableTranspilation := in.cfg.EnableTranspilation // Assuming a config flag
var wrappedJumpTable JumpTable
originalJumpTable := in.jt // Store the original jump table

if enableTranspilation {
	// Copy the jump table to wrap it
	wrappedJumpTable = *copyJumpTable(originalJumpTable)
	in.jt = &wrappedJumpTable // Use the wrapped table during this Run call

	// Initialize the transpiler
	// **PoC Simplification:** Pass dummy data for initial state.
	in.translator = transpiler.NewTranslator(nil, input)

	// Wrap each opcode's execute function
	for i := range wrappedJumpTable {
		opSpec := wrappedJumpTable[i]
		if opSpec != nil {
			originalExecuteFunc := opSpec.execute
			opcode := OpCode(i) // Capture the opcode value
			opSpec.execute = func(pc *uint64, interpreter *EVMInterpreter, callContext *ScopeContext) ([]byte, error) {
				// Call the transpiler's emit function.
				// The translator instance is available via the interpreter.
				if interpreter.translator != nil { // Double check for safety
					interpreter.translator.EmitForOpcode(opcode, callContext)
				}
				// Call the original opcode execution function
				return originalExecuteFunc(pc, interpreter, callContext)
			}
		}
	}
}
```

The plan is to implement the hooking mechanism as described:

* *Create a `Translator` struct* in `vm/riscv_transpiler`: This struct will hold the RISC-V code buffer, manage the ELF layout (text, rodata, bss/data), and provide methods for emitting RISC-V instructions. It will also handle the input constants (`.rodata`) and the output buffer (`.bss`/`.data`).

```go!
// Translator is responsible for transpiling EVM opcodes to RISC-V instructions.
// This is a simplified structure for the PoC.
type Translator struct {
	// code stores the generated RISC-V assembly instructions (as strings for simplicity)
	code []string
	// rodata stores the constant input: initial state and transaction inputs (as bytes)
	rodata []byte
	// data stores the mutable data section and output buffer (as bytes)
	data []byte
	// stackOffset tracks the emulated RISC-V stack pointer relative to the start of
	// the data section. We'll manage a separate stack region within 'data'.
	stackOffset uint64
	// outputOffset tracks the start of the output buffer within the data section
	outputOffset uint64
	// pcMap maps an EVM program counter to a RISC-V label
	pcMap        map[uint64]string // EVM PC -> RISC-V Label
	labelCounter int               // Counter for generating unique RISC-V labels
}
```

* *Implement `BeginTrace(txHash)`*: This function will initialize the translator and receive the initial state information needed for the transaction (account data, transaction inputs). It will format this data and store it in an internal buffer representing the `.rodata` section. It will also initialize the code buffer (`.text`) and set up the emulated RISC-V stack pointer and memory layout.
* *Implement `EmitForOpcode(opcode, ctx)`*: This is the core transpilation logic. This function will be called from the hook. It will take the EVM opcode and the `vm.ScopeContext` as input. Based on the opcode and the current state of the EVM stack and memory (accessible via the context), it will generate the appropriate sequence of RISC-V RV32IM assembly instructions (as strings for now), which will be appended to the translator's code buffer. This will involve mapping EVM stack operations to RISC-V register/stack operations, memory operations to RISC-V load/store, arithmetic/logical operations to RISC-V equivalents, and control flow (JUMP/JUMPI) to RISC-V branches.
* *Implement `EndTrace()`*: This function will be called after the EVM execution for a transaction finishes.
It will append the RISC-V instructions to write the final state of the output buffer to a known location and an instruction to halt (ECALL with a specific code or similar).

```go!
// EndTrace appends epilogue and finalizes sections (PoC simplified).
func (t *Translator) EndTrace() {
	t.Emit("    # End of execution")
	t.Emit("    ECALL") // Ensure an exit instruction is present

	t.Emit(".data")
	t.Emit("data_output_base:")
	t.Emit(fmt.Sprintf("    .space %d", len(t.data))) // Reserve space for data/output
	t.Emit("data_end:") // Label for the end of the data section, used to set initial SP

	t.Emit(".section .rodata")
	t.Emit("rodata_state_base:") // Label for the start of the rodata state area
	// In a real implementation, format and emit the rodata bytes here.
	// For the PoC the bytes are only stored in NewTranslator, so we just reserve
	// space and provide a label to reference it.
	t.Emit(fmt.Sprintf("    .space %d", len(t.rodata))) // Reserve space for rodata
	t.Emit("") // Newline at the end
}
```

* *Implement `GetELF() []byte`*: This function will take the accumulated RISC-V instructions and data and assemble them into a standard RISC-V ELF file in an output buffer. This might involve using a Go library for ELF generation or writing assembly to a file and invoking an external assembler (like `riscv64-unknown-elf-gcc`).

```go!
// GetELF simulates getting the ELF bytes by just returning the assembly for now.
// In a real implementation, this would involve ELF encoding or calling an assembler.
func (t *Translator) GetELF() []byte {
	assembly := ""
	for _, instr := range t.code {
		assembly += instr + "\n" // One instruction per line
	}
	// Append rodata and data sections as comments or raw bytes (simplified)
	assembly += "# .rodata section (simplified)\n"
	// In a real implementation, serialize t.rodata into assembly directives (.byte, .word, etc.)
	assembly += "# .data section (simplified)\n"
	// In a real implementation, serialize t.data into assembly directives (.zero, .space, etc.)
	return []byte(assembly) // Return assembly as byte slice for simulation
}
```

* *Implement `GetOutput() []byte`*: Reads the data from the designated output buffer region within the generated ELF.

```go!
// GetOutput simulates getting the output bytes.
// In a real implementation, this would read the designated output buffer region from the generated ELF.
func (t *Translator) GetOutput() []byte {
	// PoC simplification: just return the entire data buffer content
	return t.data
}
```

* *Modify the `EVMInterpreter` initialization*: In the `NewEVMInterpreter` function (or wherever the `JumpTable` is finalized before execution), iterate through the `JumpTable` and wrap the original execute function for each opcode with a new function that calls `EmitForOpcode` before calling the original execute function. The wrapping should happen after the instruction set for the current fork and any EIPs are enabled, but before the Run loop is entered. The `copyJumpTable` function suggests a pattern for safely modifying the jump table without affecting others. We should create a new jump table based on the active one and modify that copy.

```go!
type EVMInterpreter struct {
	*VM
	jt    *JumpTable // EVM instruction table
	depth int
	// Add a Translator instance for RISC-V transpilation (PoC)
	translator *transpiler.Translator
}
```

This ensures every opcode execution in Erigon also calls our transpiler. We will keep Erigon’s existing stack and memory updates for correctness of the EVM execution, while separately accumulating the RISC-V code. The hooks stay compatible with Erigon, with no opcode or gas cost deviations. Because we assume state is constant (read-only) for the PoC, we don’t need to reconcile state changes during execution; we only need to verify them later via the zkVM output.
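
The wrapping pattern can be exercised in isolation. The sketch below uses stand-in types (`operation`, `jumpTable`, `recorder`) instead of Erigon's real `JumpTable` and `EVMInterpreter`; these names and the toy `int` stack are assumptions chosen only to show that each wrapped handler records its opcode before delegating to the original, and that the original table is left unmodified.

```go
package main

import "fmt"

// operation and jumpTable are simplified stand-ins for Erigon's types (assumption).
type operation struct {
	execute func(stack *[]int) error
}

type jumpTable [256]*operation

// recorder plays the role of the transpiler: it only logs which opcodes ran.
type recorder struct{ seen []byte }

func (r *recorder) emit(op byte) { r.seen = append(r.seen, op) }

// wrap returns a new table whose handlers call r.emit before the original handler.
func wrap(jt *jumpTable, r *recorder) *jumpTable {
	var out jumpTable
	for i, spec := range jt {
		if spec == nil {
			continue // unused opcode slots stay nil
		}
		orig := spec.execute
		op := byte(i) // fresh variable per iteration, captured by the closure
		out[i] = &operation{execute: func(stack *[]int) error {
			r.emit(op)
			return orig(stack)
		}}
	}
	return &out
}

func main() {
	var jt jumpTable
	// 0x01 stands for ADD on a toy int stack.
	jt[0x01] = &operation{execute: func(s *[]int) error {
		st := *s
		if len(st) < 2 {
			return fmt.Errorf("stack underflow")
		}
		*s = append(st[:len(st)-2], st[len(st)-2]+st[len(st)-1])
		return nil
	}}

	rec := &recorder{}
	wrapped := wrap(&jt, rec)

	stack := []int{2, 3}
	if err := wrapped[0x01].execute(&stack); err != nil {
		panic(err)
	}
	fmt.Println(stack, rec.seen) // prints "[5] [1]"
}
```

Building a fresh table rather than mutating entries in place means other interpreter instances sharing the original table would not see the hook.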
This hook approach leaves Erigon’s architecture largely untouched (only adding instrumentation) and avoids reimplementing the entire EVM semantics.

### Inline Transpilation to RISC-V:

Erigon's opcodes rely on shared state via `ScopeContext`, including stack, memory, and gas tracking. For correctness, the transpiler must:

* Emulate stack behavior (push/pop)
* Track memory accesses (for MLOAD/MSTORE)
* Respect control flow boundaries (JUMPI, STOP, etc.)

In our PoC, we start with stack-only opcodes (e.g., ADD, PUSH, POP) and incrementally add support for memory and control flow.

This task is about compiling or interpreting EVM bytecode into RV32IM (32-bit RISC-V with integer and multiplication/division support) assembly. Since the EVM is a stack-based machine with a 256-bit word size, and RV32IM is a register-based 32-bit ISA, the compilation involves non-trivial state modeling and transformation. The translator will map each EVM opcode (and its operands) to one or more RV32IM instructions.

* *Arithmetic (ADD, SUB, MUL, etc.)* - Generate equivalent RISC-V instructions operating on an emulated stack. The `vm.ADD` case of the `EmitForOpcode` listing in this section shows one such sequence.
* *Logical (AND, OR, XOR)* - Similarly use RISC-V bitwise operations.
* *Stack operations (PUSH, POP)* - For PUSH1, decrease sp and store the immediate; for POP, increment sp.
* *Control flow (JUMP, JUMPI)* - Emit conditional branches (`BEQ`/`BNE`), or `JAL` for unconditional jumps, to implement EVM jumps and PC updates.
* *Memory/Storage (MLOAD/MSTORE/SLOAD/SSTORE)* - Use `LW`/`SW` with addresses. Storage reads (`SLOAD`) will load from a known `.rodata` region; storage writes (`SSTORE`) will record to a designated output area (since actual state isn’t updated in-protocol for the PoC).
* *Precompiles and heavy cryptographic opcodes (e.g., KECCAK256, ECADD)* - Insert a placeholder “call” or assume a special gadget.
In zkVMs like SP1, heavy cryptographic opcodes are handled by optimized circuits; our PoC similarly treats them as idempotent black boxes, potentially emitting an `ECALL` as a placeholder, with a designated register value indicating the precompile ID.

```go!
// EmitForOpcode translates a single EVM opcode to RISC-V instructions.
// This is a simplified implementation for the PoC focusing on structure.
// Note: ctx.Contract.PC() is an assumed accessor for the current EVM PC.
func (t *Translator) EmitForOpcode(op vm.OpCode, ctx *vm.ScopeContext) {
	// Add a label for the current EVM PC for jump destinations
	label := t.getLabel(uint64(*ctx.Contract.PC())) // Use the *actual* PC value
	t.Emit(fmt.Sprintf("%s:", label))

	switch op {
	case vm.ADD:
		// EVM ADD: pop a, pop b, push a+b
		// RISC-V emulation: LW from stack, LW from stack, ADD, SW to stack, adjust SP
		t.Emit("    # EVM ADD")
		t.Emit("    LW t1, 0(sp)")    // Load top (a)
		t.Emit("    ADDI sp, sp, 4")  // Pop a (assuming 32-bit words for now)
		t.Emit("    LW t2, 0(sp)")    // Load next (b)
		t.Emit("    ADDI sp, sp, 4")  // Pop b
		t.Emit("    ADD t3, t1, t2")  // Compute sum
		t.Emit("    ADDI sp, sp, -4") // Push result (a+b)
		t.Emit("    SW t3, 0(sp)")

	case vm.PUSH1:
		// EVM PUSH1 <byte>: push the byte that follows the opcode in the contract code.
		// The hook runs before the original handler (`opPush1`), which advances the PC
		// past the immediate, so here we only peek at code[pc+1].
		pc := *ctx.Contract.PC()
		if int(pc)+1 >= len(ctx.Contract.Code) {
			// Handle error: code out of bounds (should be caught by EVM)
			t.Emit("    # Error: PUSH1 out of bounds")
			t.Emit("    ECALL # indicate error") // Use ecall for simplicity
			return
		}
		immediate := ctx.Contract.Code[pc+1]
		t.Emit(fmt.Sprintf("    # EVM PUSH1 0x%02x", immediate))
		t.Emit(fmt.Sprintf("    LI t0, %d", immediate)) // Load immediate
		t.Emit("    ADDI sp, sp, -4")                   // Push immediate (assuming 32-bit words)
		t.Emit("    SW t0, 0(sp)")

	case vm.MLOAD:
		// EVM MLOAD: pop offset, push memory[offset:offset+32]
		// RISC-V emulation: pop offset, LW from emulated memory region, push result.
		t.Emit("    # EVM MLOAD")
		t.Emit("    LW t1, 0(sp)") // Pop offset (assuming 32-bit offset for simplicity)
		t.Emit("    ADDI sp, sp, 4")
		// Add base address of emulated memory to offset
		t.Emit("    LA t2, memory_base") // Load base address of emulated memory
		t.Emit("    ADD t3, t2, t1")     // Calculate memory address
		t.Emit("    LW t4, 0(t3)")       // Load word from memory (assuming 32-bit word)
		t.Emit("    ADDI sp, sp, -4")    // Push result
		t.Emit("    SW t4, 0(sp)")

	case vm.SLOAD:
		// EVM SLOAD: pop key, push storage[key]
		// RISC-V emulation: pop key, load from rodata section using key as index.
		// This is a major simplification for the PoC. Actual SLOAD needs to handle
		// 256-bit keys and values. We assume keys are 32-bit offsets into our
		// simplified rodata state area.
		t.Emit("    # EVM SLOAD (PoC simplified)")
		t.Emit("    LW t1, 0(sp)") // Pop key (assuming 32-bit key/offset)
		t.Emit("    ADDI sp, sp, 4")
		// Add base address of rodata state area to key (offset)
		t.Emit("    LA t2, rodata_state_base") // Load base address of rodata state
		t.Emit("    ADD t3, t2, t1")           // Calculate address in rodata
		t.Emit("    LW t4, 0(t3)")             // Load word from rodata (assuming 32-bit value)
		t.Emit("    ADDI sp, sp, -4")          // Push result
		t.Emit("    SW t4, 0(sp)")

	case vm.SSTORE:
		// EVM SSTORE: pop value, pop key, storage[key] = value
		// RISC-V emulation: pop value, pop key, store to data section output area.
		// Major simplification for the PoC. Actual SSTORE needs to handle 256-bit
		// keys and values. We assume keys are 32-bit offsets into our simplified
		// data output area.
		t.Emit("    # EVM SSTORE (PoC simplified)")
		t.Emit("    LW t1, 0(sp)") // Pop value (assuming 32-bit value)
		t.Emit("    ADDI sp, sp, 4")
		t.Emit("    LW t2, 0(sp)") // Pop key (assuming 32-bit key/offset)
		t.Emit("    ADDI sp, sp, 4")
		// Add base address of data output area to key (offset)
		t.Emit("    LA t3, data_output_base") // Load base address of data output area
		t.Emit("    ADD t4, t3, t2")          // Calculate address in data
		t.Emit("    SW t1, 0(t4)")            // Store value (assuming 32-bit value)

	// Add cases for other opcodes...

	case vm.STOP:
		t.Emit("    # EVM STOP")
		t.Emit("    ECALL") // Use ECALL to indicate end of execution for the PoC

	case vm.RETURN:
		t.Emit("    # EVM RETURN")
		// In a real scenario, need to handle memory offset/size from stack
		// and copy return data to the output buffer. For PoC, just exit.
		t.Emit("    ECALL") // Use ECALL to indicate end of execution for the PoC

	// Handle other opcodes as needed for the subset of transactions
	default:
		// For unsupported opcodes in the PoC, emit a placeholder or error
		t.Emit(fmt.Sprintf("    # EVM Opcode 0x%02x (Unsupported in PoC)", op))
		// Optionally, emit an ECALL to signal an unimplemented opcode
		// t.Emit("    ECALL")
	}
}
```

Each `Emit` call appends the instruction encoding to an output buffer. (In practice, this would involve encoding binary words or assembly lines.) Because this is done inline during EVM execution, the RISC-V trace is built sequentially as the EVM runs. Handling 256-bit EVM words will require splitting into multiple registers or memory. For a PoC, we can operate on 32-bit or 64-bit chunks. The key is that every EVM step corresponds to a fixed RISC-V sequence, ensuring the final RISC-V program faithfully simulates the EVM execution.

#### RISC-V Object Layout:

We will emit a standard RISC-V ELF (RV32IM) image with the following layout:

* `.text` (code): Contains the translated instructions in order, starting with an entry label. An epilogue will be appended to write outputs and exit after all opcodes are executed.
* `.rodata`: Holds constant inputs. All pre-filled state (account balances, storage slots, transaction calldata, etc.), encoded in little-endian words, will be placed here. The RISC-V code will use `LW` to read from these constants.
* `.bss`/`.data`: A region for dynamic data and outputs. A fixed output buffer will be reserved.
* Stack - The RISC-V `sp` will be set to point into a suitable region (or we can manage our own stack pointer register in `.bss`).

This layout matches what SP1 and RISC Zero expect: a self-contained ELF with code and data sections. After emitting the RISC-V text and data, we will either use a binary encoding library to write an ELF header or dump assembly and invoke a RISC-V assembler (e.g., `riscv64-unknown-elf-gcc -march=rv32im -mabi=ilp32`) to produce `guest.elf`.
The result is a complete RISC-V executable whose execution will mirror the EVM’s behavior on that transaction.

### EVM to RISC-V Translation

| **EVM Aspect** | **EVM Characteristics** | **RISC-V Representation (RV32IM)** | **Translation Details & PoC Simplifications** |
|---|---|---|---|
| **Stack** | - 1024-entry stack of 256-bit words.<br>- LIFO semantics.<br>- Used for all computations and argument passing. | - Emulated via contiguous memory in `.bss` section.<br>- Stack Pointer (SP) in `s0` or a dedicated GPR.<br>- Each 256-bit word is stored as 8 × 32-bit words (little-endian). | - Stack operations (`PUSHn`, `POP`, `DUPn`, `SWAPn`) compiled into multiple `LW`/`SW` sequences with `ADDI` for address calculation.<br>- SP updated by ±32 (bytes) for each 256-bit value.<br>- **PoC**: Initial version may limit to ≤128 stack entries to ease implementation. |
| **Memory** | - Byte-addressable, zero-initialized.<br>- Dynamic size, accessed via `MLOAD`, `MSTORE`, etc. | - Emulated via fixed-size `.data`/`.bss` segment.<br>- Base register (e.g., `s1`) holds memory base pointer.<br>- Accessed in 32-bit word chunks. | - Memory layout aligned to 32-bit boundaries (byte-level, unaligned accesses are ignored).<br>- `MSTORE`/`MLOAD` translated into loops of 8 `SW`/`LW` ops.<br>- **PoC**: No memory expansion logic (gas-wise or otherwise). Assume fixed-size (e.g., 4 KB). |
| **Storage** | - Key-value persistent storage.<br>- Keys and values are 256-bit.<br>- Backed by Merkle Patricia Trie. | - `SLOAD`: Read from `.rodata` section (constant inputs).<br>- `SSTORE`: Write to `.data` or `.bss` section marked as output buffer. | - **PoC**: Skip full MPT emulation.<br>- `SLOAD` uses a lookup into pre-filled memory.<br>- `SSTORE` writes generate diffs as part of zk proof output.<br>- No hashing of keys/values or database interaction. |
| **Control Flow** | - Unstructured control via `JUMP`, `JUMPI`, `JUMPDEST`.<br>- PC-based execution model. | - Mapped to RISC-V labels and `BEQ`, `BNE`, `JAL`, `JALR`.<br>- `JUMP`/`JUMPI` become branch/jump to label computed via jump table or direct mapping. | - **PoC**: Transpiler assigns labels to valid jump destinations (`JUMPDEST`).<br>- Jumps checked for validity as per EVM rules.<br>- PC emulated in software via instruction pointer variable when needed. |
| **256-bit Words** | - Native to EVM.<br>- All arithmetic/logical operations work on 256-bit values.<br>- Signed/unsigned distinction. | - Emulated via 8 × 32-bit words (little-endian).<br>- Uses arrays of registers or memory chunks for operations.<br>- Carry/borrow logic needed. | - Arithmetic (`ADD`, `SUB`, `MUL`, etc.) requires looped operations across chunks.<br>- Carry propagated manually across registers.<br>- **PoC**: Start with `ADD`, `SUB`, `AND`, `OR` on 64-bit (2 chunk) values for faster prototyping. |
| **Gas Model** | - Every opcode has a deterministic gas cost.<br>- Tracks computation cost. | - Emulated via instruction counter variable or predefined gas budget.<br>- Accumulated in a dedicated register (e.g., `s2`). | - **PoC**: Gas accounting can be omitted or faked for now.<br>- Later versions may insert gas metering before each instruction or basic block. |
| **Precompiles** | - Hardcoded contracts at addresses from `0x01` upward (`0x01` to `0x09` before Cancun), e.g., ECRECOVER, SHA256, BN256 add/pairing. | - Represented as `ecall` instructions with arguments passed via memory/registers.<br>- Offloaded to zkVM host. | - Inputs/outputs passed via predefined memory locations.<br>- `ecall` signals zkVM to execute native implementation of the precompile.<br>- **PoC**: Precompiles not emulated in RV32IM; treated as host syscalls. |
| **CALL / CREATE** | - Dynamic invocation of other contracts or creation of new contracts. | - Skipped or abstracted in PoC.<br>- Could be modeled as host-side syscalls in future. | - **PoC**: Skip `CALL`, `CREATE`, `DELEGATECALL`, etc.<br>- Alternatively, trap to host with call metadata via `ecall`. |
| **Program Counter (PC)** | - Implicit in EVM: points to current bytecode offset. | - Emulated via explicit variable (e.g., `pc`) or inferred from label execution. | - **PoC**: Linearized instruction sequence can avoid dynamic PC emulation.<br>- For `JUMP`, simulate PC updates via label targets. |
| **Instruction Decoding** | - Each opcode is 1 byte, may have immediate data (e.g., PUSH1 to PUSH32). | - Compiler handles static decoding.<br>- No runtime decoding logic needed for transpilation. | - Transpiler reads bytecode and emits corresponding RISC-V sequence.<br>- Immediate values from `PUSH` are compiled as data literals. |
| **Halting** | - `STOP`, `RETURN`, `REVERT`, `INVALID`, `SELFDESTRUCT`. | - Translated to `ecall` with reason code or direct return from function. | - `STOP`/`RETURN` → `ret` or `ecall` with result.<br>- `REVERT` → `ecall` with error code.<br>- **PoC**: Use host traps for all exits. |

---

The proposed method of wrapping Erigon's jump table during interpreter initialization has potential race conditions and memory safety issues. Copying and modifying the jump table on the fly may interfere with Erigon's concurrent execution model, and the modifications have to be cleaned up reliably afterwards.

### zkVM Integration:

[Taiko is a Type 1 ZK-EVM](https://taiko.mirror.xyz/j6KgY8zbGTlTnHRFGW6ZLVPuT0IV0_KmgowgStpA0K4), which means it aims to be fully Ethereum equivalent (it can reuse Ethereum’s state transition logic without modifications).
Their existing prover framework and focus on RISC-V based zkVMs (like RISC Zero, which they heavily use) make them a strong candidate for integration and testing. Taiko uses a RISC-V proof pipeline in its Raiko system, targeting Ethereum equivalence. Their engineers have already integrated SP1 and RISC Zero into their stack. We would align with Taiko by emitting our RISC-V code in the format Raiko expects. Taiko’s approach is to cross-compile Ethereum client code to RISC-V (e.g., Reth to SP1 targets), and they note that “the RISC-V based pipeline and test harness in Raiko can be reused across all zkVM platforms.” We can do the same with Erigon - run Erigon transaction-by-transaction, inline-transpiling to RISC-V, and then feed that code into the Raiko multi-prover. Taiko also replaces certain EVM precompiles with their SP1 circuits; since our model already treats precompiles as black-boxes, we would leverage Taiko’s existing precompile circuits for those operations. The net effect is that our RISC-V trace would slot into Taiko’s Type-1 pipeline just as if it were another Ethereum client’s trace. 

Here's a consolidated comparison of Taiko, RISC Zero, and SP1 across various technical dimensions:

| Feature | Taiko | RISC Zero | SP1 |
| :--- | :--- | :--- | :--- |
| **System Type** | Type-1 zkEVM | General-purpose zkVM | General-purpose zkVM |
| **Goal** | Ethereum bytecode equivalence | Generic provable computation | High-speed recursive computation |
| **ISA Target** | Ethereum EVM bytecode (full equivalence) | RISC-V RV32IM | RISC-V RV32IM |
| **Proof System** | zkEVM + RISC Zero under the hood (Groth16/STARK) | Custom AIR → recursive STARK with a Groth16 wrapper | Plonky3-based STARK with recursive aggregation and a Groth16/PLONK wrapper |
| **Recursion Support** | Yes, via RISC Zero's recursion | ✅ Built-in (`risc0-recursion`) | ✅ Native recursive aggregation |
| **Input/Output Model** | EVM-level trace + public calldata | Host-guest syscall bridge (`env::commit`, `env::read`) | Flat memory segments for input/output + zero-copy memory mapping |
| **Integration Complexity** | ✅ Easy (if matching EVM behavior) | ✅ Moderate (emit ELF via hooks) | ✅ Moderate (same as RISC0, but tighter control) |
| **Modifiability / Low-Level Control** | ❌ Very limited, canonical zkEVM | ✅ High, full zkVM access | ✅ High, modular Plonky3 backend and zkRISC |
| **Proof Size / Speed** | Medium (controlled by full EVM semantics + zk stack) | Moderate (~MBs, depending on cycles) | Smaller (recursion compresses proofs well) |
| **Ecosystem & Tooling** | ✅ Taiko, Ethereum L2 native | ✅ Used by Bonsai, Zeth, zkBitcoin | ⚠️ Newer but rapidly gaining traction (Succinct, Layer N) |
| **Ideal Fit For** | Production zkEVM rollups | Plug-and-play zkVM workflows, backend-agnostic proofs | Research-heavy zkVMs, zkDSLs, low-level systems optimization |
| **Input Format** | Ethereum bytecode + auxiliary witness | RISC-V ELF binary | RISC-V ELF binary |
| **Output Format** | zkEVM receipt (EVM-equivalent transition verified) | Receipt with journal, SHA-hashed memory | Public input segment |
| **Witness Compression** | Limited to RISC0-based compression | Yes (SHA-256 trace trees) | Yes (lookup tables + recursion) |
| **FFI / Host Calls** | Abstracted away by zkEVM layer | Syscall bridge (custom `env::commit`) | Memory-mapped input/output, easier |
| **Precompile Modeling** | Reused from Taiko EVM circuits | Needs custom modeling or mocked ECALLs | Modular: circuits as coprocessors |
| **Public Input Semantics** | Defined by zkEVM circuits | Journal, receipt | `public_input` segment in memory |
| **Prover Tooling** | Integrated with zkEVM tooling only | Full Rust SDK, Bonsai API | Modular Rust core, rapid evolution |
| **Ecosystem Maturity** | Production-ready zkEVM, limited customization | Most mature general-purpose zkVM | Most flexible zkVM, but younger |
| **Optimization Control** | ❌ Not exposed: proving logic owned by Taiko | ✅ Full control over program and circuit | ✅ Full control, low-level optimizable |
| **Use** | You want to validate EVM execution for L2 scaling | You want easy-to-integrate RISC-V zkVM | You want modular zkVM optimized for recursion/rollups |

Our project's immediate output is a RISC-V ELF, not raw EVM bytecode to be fed into a zkEVM directly. Taiko uses RISC-V zkVMs (like RISC Zero and SP1) under the hood as part of its proving pipeline. So, while we might eventually leverage Taiko's broader framework for the overall zkEVM goal, our specific task of generating the RISC-V executable aligns more directly with RISC Zero or SP1 as the immediate target zkVM for proving that RISC-V execution. We can then feed the proofs from RISC Zero/SP1 into Taiko's system.

Our goal is an efficient, provable Ethereum execution path that balances implementation effort and proving overhead.

## Security and Performance Considerations

The primary security consideration with this approach is ensuring that the transpilation process is correct and faithful to the original EVM execution. An error in transpilation could lead to a RISC-V program that behaves differently from the EVM, making the proof generated by the zkVM invalid or misleading regarding the actual EVM state transition.

* *Correctness of Translation*: Every EVM opcode must be translated into a sequence of RISC-V instructions that precisely replicates its effect on the stack, memory, and program counter. Special attention is needed for complex operations, error conditions, and gas accounting (though gas isn't being proven directly in the PoC, the execution path depends on it). We need to ensure that jump destinations match EVM execution exactly and validate that all memory accesses stay within bounds.
* *Control Flow*: Jumps and conditional jumps must be translated accurately to RISC-V branches and labels to ensure the correct execution path is followed.
* *Precompiles*: Treating precompiles as black boxes in the PoC requires assuming their correctness. In a full solution, the precompile logic would also need to be provable, potentially using specific zk-SNARK gadgets or pre-proven circuits. For the PoC, ensure the ECALL or placeholder mechanism is clearly defined and understood by the target zkVM.

We also need to take care of performance:

* *Performance* - The inline transpilation adds overhead to the EVM execution. The performance impact needs to be measured. The efficiency of the `EmitForOpcode` function is critical.
* *Memory Usage* - Storing the generated RISC-V code and the `.rodata` section in memory during execution will increase memory consumption. This needs to be managed, potentially by writing to a temporary file for larger transactions.
* *Assembler Dependency* - If an external assembler is used to generate the ELF, this adds an external dependency to the build or execution process. Using a pure Go ELF generation library would avoid this.
* *Hot Paths* - We can identify common EVM patterns (e.g., stack operations) for RISC-V optimization and group related opcodes where possible.

The hooking mechanism itself, by wrapping the existing, tested opcode implementations, minimizes the risk of introducing subtle bugs in the core EVM logic. The risk lies mainly in the transpilation logic itself. Thorough testing with a wide range of EVM bytecode is crucial.

Emitting RISC-V per opcode in the interpretation loop incurs runtime overhead. This tracing model is akin to a naive JIT and may slow down EVM execution.

Future Optimization:

* Basic Block Grouping: Group linear opcode sequences into basic blocks and emit RISC-V for the entire block.
* Code Caching: Memoize previously emitted instruction sequences for frequently used opcode paths.

### Real World Challenges

* GNU MP Library Performance Impact - The GNU Multiple Precision Arithmetic Library (GMP), which implements highly optimized multi-precision arithmetic, demonstrates the real-world impact of RISC-V's missing carry flag. [Recent benchmarks show that RISC-V implementations of GMP are 2-3 times slower than equivalent ARM implementations due to the lack of carry flag support](https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html).
* Instruction Set Extension Research - Recent academic research has focused on developing RISC-V instruction set extensions specifically to address multi-precision arithmetic inefficiencies. [These studies propose adding dedicated multiply-add instructions and carry propagation support, acknowledging that the base RISC-V instruction set is inadequate for efficient multi-precision operations](https://dl.acm.org/doi/pdf/10.1145/3649329.3657347).

#### Multiplication Complexity

For the PoC, it would be better to ignore the complexity of 256-bit multiplication, which requires implementing the full multiplication algorithm across multiple 32-bit segments.
Otherwise, a single EVM `MUL` instruction translates to:

* 64 individual 32-bit multiply operations (8×8 partial products) for the full 512-bit product; 36 are enough when only the low 256 bits are kept, as `MUL` requires
* Complex carry and overflow handling across all intermediate results
* Approximately 200-300 RISC-V instructions for a complete implementation

#### Division and Modulo Operations

256-bit division presents even greater challenges, requiring implementation of multi-precision division algorithms that can consume thousands of RISC-V instructions for a single EVM operation. For the PoC, we can handle these operations with simplified translations on the same reduced operand widths used elsewhere.

#### Little-Endian Assumptions

For the PoC, we assume a consistent little-endian 32-bit layout, which oversimplifies memory management. The EVM itself treats words as big-endian when reading and writing memory and calldata, and cryptographic operations depend on that byte order, so the translation layer must handle:

* Endianness conversion between EVM and RISC-V representations
* Memory alignment requirements for efficient RISC-V operations
* Cache-friendly data layout for multi-word operations

#### Stack Emulation Overhead

The proposed stack emulation using RISC-V memory operations fails to account for the performance implications of constant memory access patterns. Each EVM stack operation translates to multiple memory accesses, creating significant performance overhead compared to native register operations.

### Development Roadmap

*This would be a PoC that would deal with enough instructions to process a subset of viable smart contract based transactions.*

This seems to be a never-ending project!

## Conclusion

The design is modular - Erigon’s interpreter only needs small hooks, the transpiler code is a separate module, and the RISC-V output fits into existing zkVM systems.

This approach provides a practical path to RISC-V-based EVM proofs by:

* Leveraging Erigon's battle-tested interpreter
* Minimizing proof overhead through precise execution tracing
* Maintaining compatibility with multiple zkVM targets

[zkMIPS implementations show 6x-19x performance improvements through architectural optimizations](https://medium.com/@ProjectZKM/zkmips-1-0-production-ready-performance-optimized-and-open-for-developers-7d6508cea03f), while native RISC-V execution eliminates interpretation overhead that contributes to current zkEVM inefficiencies. The inline transpilation approach should capture similar benefits by generating minimal instruction sequences that directly correspond to executed EVM operations.
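
As a closing illustration of the carry handling raised under Real World Challenges, here is a minimal Go sketch of a 256-bit `ADD` over eight little-endian 32-bit limbs. It is not part of the transpiler; `Add256` and `sltu` are hypothetical names. Each statement in the loop is written to correspond to one RV32IM instruction, on the assumption that the transpiler would emit a similar sequence because no carry flag is available.

```go
package main

// sltu mirrors the RISC-V SLTU instruction: 1 if x < y (unsigned), else 0.
func sltu(x, y uint32) uint32 {
	if x < y {
		return 1
	}
	return 0
}

// Add256 adds two 256-bit values stored as eight little-endian 32-bit limbs
// (limb 0 is least significant). It returns the sum modulo 2^256 and the
// final carry, which the EVM ADD opcode discards.
func Add256(a, b [8]uint32) (sum [8]uint32, carry uint32) {
	for i := 0; i < 8; i++ {
		s := a[i] + b[i]    // ADD: wraps modulo 2^32
		c1 := sltu(s, a[i]) // SLTU: the add wrapped iff the result is below an operand
		t := s + carry      // ADD: fold in the carry from the previous limb
		c2 := sltu(t, s)    // SLTU: detect a wrap caused by the incoming carry
		sum[i] = t
		carry = c1 | c2 // OR: at most one of c1 and c2 can be set
	}
	return sum, carry
}
```

Calling `Add256` with every limb of `a` set to `0xFFFFFFFF` and `b` equal to 1 ripples the carry through all eight limbs. Each limb costs about five ALU instructions on top of its loads and stores, which is the overhead a carry flag would otherwise absorb.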