# EPF Final Report: Erigon RISC-V Executable Proof Sourcing
## TLDR
- Researched and implemented a different approach to ZK block verification that uses execution hooks for transpilation, with a working POC.
- The POC demonstrates an almost 100x speedup in proof generation over traditional zkVM approaches, proving mainnet blocks in minutes rather than hours on CPUs. See [benchmarks](#benchmarks) for some caveats.
### Links
Here are some useful links related to the project.
- [The original project plan](https://github.com/eth-protocol-fellows/cohort-six/blob/f8f6806741f28d4dbcd42b4a1ec7e91ab0589fc2/projects/erigon_riscv_proof_sourcing.md)
- [The project repository](https://github.com/2xic/erigon-risc-v-executable-proof-sourcing)
- [All the project updates](https://hackmd.io/@2xic/Hyg4L1LgZg)
## Project abstract
The existing zkVM implementations for block proving are built on top of a full EVM implementation that gets compiled into a target ISA (usually RISC-V). `debug_executionWitness` is used so that the EVM can execute the block statelessly, but still with the full EVM runtime. This research project investigated the practicality of an inline transpilation step from EVM instructions to equivalent RISC-V instructions using execution hooks. The hypothesis is that this could allow faster proof generation with much less overhead, since we only need to prove the computational results, not the execution of the EVM runtime environment itself.
At a high level, the flow is as follows with our approach:
```mermaid
graph LR
T[Custom Erigon Tracer]
T --> TM[Transpilation Module]
TM --> ASM[RISC-V Assembly]
ASM --> ELF[ELF Binary]
ELF --> VM[OpenVM]
VM --> STARK[STARK Proof]
subgraph Erigon ["Erigon Node"]
DB[(State DB)]
EVM[EVM Execution Engine]
end
DB -.->|State access| T
EVM -.->|Execution hooks| T
```
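To make the tracer-to-transpiler step concrete, here is a minimal sketch of the idea with hypothetical names (`asmEmitter`, `onOpcode` are illustrative, not the actual project API): a hook fires once per executed EVM opcode and appends an equivalent RISC-V instruction. Real EVM words are 256-bit, so a production transpiler emits multi-instruction sequences per opcode; here each opcode maps to a single placeholder instruction for illustration.

```go
package main

import "fmt"

// asmEmitter collects RISC-V assembly lines as EVM execution is observed.
type asmEmitter struct {
	out []string
}

// onOpcode mimics a tracer hook observing one executed EVM instruction
// and emitting a (grossly simplified) RISC-V equivalent.
func (e *asmEmitter) onOpcode(op byte) {
	switch op {
	case 0x01: // EVM ADD
		e.out = append(e.out, "add a0, a0, a1")
	case 0x02: // EVM MUL
		e.out = append(e.out, "mul a0, a0, a1")
	case 0x60: // EVM PUSH1 (immediate handling omitted)
		e.out = append(e.out, "li a1, 0")
	default:
		e.out = append(e.out, fmt.Sprintf("# unhandled opcode 0x%02x", op))
	}
}

func main() {
	e := &asmEmitter{}
	// Bytecode fragment: PUSH1, PUSH1, ADD (operands elided).
	for _, op := range []byte{0x60, 0x60, 0x01} {
		e.onOpcode(op)
	}
	for _, line := range e.out {
		fmt.Println(line)
	}
}
```

The key property this sketch illustrates is that the emitted program only has to reproduce the computational result of the traced execution, not the EVM's stack and memory layout.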
## Status of the project
We have written a transpilation module which extends Erigon and runs through the tracer interface; this allowed easier iteration and development. It runs on real-world blocks, as the [benchmarks](#benchmarks) below show, with a significant reduction in proof-generation time. That said, we haven't yet implemented proper state transitions between blocks or gas validation, which means it cannot currently be used for chain validation in a secure way. Other caveats exist as well, such as not proving direct transfers (since there is no smart contract invocation), and the code hasn't been audited or tested thoroughly, so it should not be used in production. However, we do have a functional POC that can in practice generate proofs for entire blocks. One use case where the proof of concept might already be useful is transaction tracing proofs.
## Benchmarks
The benchmark numbers below are a best-effort estimate. The benchmark setup and reproducibility steps are documented [here](https://github.com/2xic/erigon-risc-v-executable-proof-sourcing/blob/main/docs/benchmarking.md). The biggest disclaimer is that our implementation is still a POC, while the other implementations can already be used in production. Some caution should therefore be applied when reading our numbers, but given how much lower our overhead is compared to the existing implementations, we still have a lot of room to grow without it affecting the conclusion.
We tested on commit [5adb3113](https://github.com/2xic/erigon-risc-v-executable-proof-sourcing/tree/5adb3113) on a machine with an AMD EPYC 7401P 24-core CPU and 16 Samsung M393A4K40CB2-CTD 32 GB RAM modules. We did **not** test with GPUs, mainly due to time constraints.
### Block 23174550
The block contains 186 transactions and uses 19,131,263 gas in total. Our approach had an average total speedup of 97x and an average cycle reduction of 219x.
| Implementation | Backend | Total Time (s) | Proof Time (s) | Total Cycles | Syscalls |
| ---------------------- | --------- | -------------- | -------------- | ------------ | -------- |
| **Our Implementation** | OpenVM | **387** | **113** | ~2.7M* | N/A* |
| **RSP** | SP1 | 172,986 | 17,298 | ~376M | 459,799 |
| **ETHREX** | SP1 | 326,743 | 32,585 | ~844M | 312,742 |
| **ZETH** | RISC Zero | 628,167 | 62,816 | ~552M | N/A* |
*For our implementation, 1 cycle is counted as 1 instruction. This is not entirely accurate, as some instructions require multiple cycles; the reason for this approximation is that, at the time of writing, OpenVM does [not implement](https://docs.openvm.dev/specs/reference/rust-frontend#openvm-kernels) `sys_cycle_count` yet.
*N/A: the metric is not directly exposed, or we weren't able to measure it at this time.
*Average speedup calculated as $\frac{1}{3} \sum_{i=1}^{3} \frac{T_i}{T_{\text{ours}}}$, where $T_i$ is the time of the $i$-th other implementation.
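As a sanity check, applying the same averaging to the cycle column of the table above reproduces the stated 219x cycle reduction (a quick sketch using the approximate cycle counts from the table, in millions):

```go
package main

import "fmt"

func main() {
	// Approximate cycle counts from the block 23174550 table, in millions.
	ours := 2.7
	others := []float64{376, 844, 552} // RSP, ETHREX, ZETH
	sum := 0.0
	for _, c := range others {
		sum += c / ours
	}
	avg := sum / float64(len(others))
	fmt.Printf("average cycle reduction: %.0fx\n", avg) // prints "average cycle reduction: 219x"
}
```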
### Block 23791194
The block contains 48 transactions and uses 8,611,284 gas in total. Our approach had an average total speedup of 65x and an average cycle reduction of 106x.
| Implementation | Backend | Total Time (s) | Proof Time (s) | Total Cycles | Syscalls |
| ---------------------- | --------- | -------------- | -------------- | ------------ | -------- |
| **Our Implementation** | OpenVM | **249** | **107** | ~2.2M* | N/A* |
| **RSP** | SP1 | 7,060 | 7,060 | ~159M | 83,731 |
| **ETHREX** | SP1 | 12,698 | 12,659 | ~323M | 89,513 |
| **ZETH** | RISC Zero | 25,178 | 25,178 | ~219M | N/A* |
_(same clarifications on cycles and N/A as for the block above)_
## Current impact and future of the project
While the immediate real-world impact is uncertain, this research showcases that other directions for ZK proof generation might be worth exploring. To the best of my knowledge this research direction hadn't been tried before, and now we have tried it. I think it might also be useful for some reflection on what a sufficient block proof might look like. Given how big of a speedup this approach provides, it is very much runnable on normal higher-end desktops and could allow more decentralized proof generation. Current approaches need hundreds of GPUs[1] to prove a block in real time, but our approach won't require that much compute. <!-- One big takeaway I have from this project is that the **computational results matter, not necessarily the full semantic equivalence**; the RISC-V implementation doesn't need the full intermediate representations of the EVM, like the memory layout for instance. -->
The future of the project is a bit unclear. I had a lot of fun working on it, and it was a great way for me to gain more knowledge about the zkVM proving space and stateless execution, so in that sense the project has served its purpose for me. The problem space was interesting, but at the same time not exactly the kind of problem I _thought_ I would enjoy working on. That said, I'll keep building in this space, just at a different layer, but still close to the EVM.
_[1] [SP1](https://blog.succinct.xyz/sp1-hypercube/) used 160 GPUs for realtime proving. Pico recently did it with only [64 GPUs](https://blog.brevis.network/2025/10/15/pico-prism-99-6-real-time-proving-for-45m-gas-ethereum-blocks-on-consumer-hardware/)._
## Self evaluation and takeaways
In terms of the goals we set back [when starting](https://github.com/eth-protocol-fellows/cohort-six/blob/f8f6806741f28d4dbcd42b4a1ec7e91ab0589fc2/projects/erigon_riscv_proof_sourcing.md#goal-of-the-project), we achieved most of them, except removing some of the simplifying assumptions. However, my thinking evolved on this point. For instance, I realized (with tips from my mentor) around the middle of the project that we might not need to cover the full semantic meaning of the instructions, which makes some of the simplifications acceptable: the important thing is the computational result, not necessarily the intermediate representation.
### What went well
I made sure to always provide a weekly update on GitHub, even when the week was less productive. Writing weekly updates also helped improve my technical writing and communication skills. Having a more open project without as much mind share might be intimidating, but I think I handled it very well. It made me think deeper about questions like "what is a sufficient proof?" and "do instructions need to behave exactly the same within the transpiled system?", as there isn't really a well-defined specification for this project proposal, unlike some of the other EPF projects.
I opened issues in various projects as I discovered bugs/issues in [OpenVM](https://github.com/openvm-org/openvm/issues/1816), [Erigon](https://github.com/erigontech/erigon/issues/17824), and [Reth](https://github.com/paradigmxyz/reth/issues/19244). I expanded my horizons and improved my knowledge of the ZK space by learning about various ZK provers and correcting misconceptions about ZK proofs, for instance that proof size is not actually constant. I even wrote a blog post on [zkVMs](https://2xic.xyz/blog/zkvm.html) as a way to understand them and their building blocks better. I learned more about native rollups (as the [idea](https://ethresear.ch/t/native-rollups-superpowers-from-l1-execution/21517) shares some connections with mine) and explored multiple different client implementations. I think the idea of using [Unicorn as a point of reference](https://github.com/2xic/erigon-risc-v-executable-proof-sourcing/blob/main/docs/testing_setup.md) for testing against a real EVM implementation was very clever and saved me a lot of time. Certain meta-skills also improved: my planning and execution, my debugging, and my thinking about how to write good tools that allow for faster iteration cycles. In our case, the debug transpilation binary allowed us to quickly detect bad transpilation logic that caused issues on the prover side.
I also learned more about Erigon's internals: the snapshots, the DB architecture, and Ottersync, to name some. I was surprised to see an execution client also working on the consensus layer with their embedded [Caplin client](https://docs.erigon.tech/fundamentals/caplin). One thing I wasn't as aware of before doing this project is that the entire architecture is optimized for RPCs, which is a different target from traditional node implementations.
### Challenges and Lessons Learned
I wish I had prioritized upstream contributions earlier, like implementing `debug_executionWitness` in Erigon, to get some changes merged. Not for the bragging rights, but to be more a part of the development process and get feedback from other protocol engineers. The research project ended up being a bit more of me in isolation than I had anticipated. In retrospect, upstream contributions should likely also have been one of the goals in the original roadmap for the fellowship.
I initially thought I would have more time to work on the project each week, but my employer was not as friendly to the idea of me working less to focus on this, so I ended up putting in long nights and weekends. Maybe I would have been able to make more contributions if this had been done differently. Conflicting work meetings after August made me miss the second half of the live standups and office hours on Jitsi, though I did watch the recordings afterward.
Communication with my mentor was not as easy as I thought it would be. I understood he was busy and didn't want to disturb him, so there ended up being less back-and-forth feedback during the second half of the fellowship than I had hoped. **He did give me some very good feedback early to midway through the fellowship. Maybe I should have been more proactive in reaching out, or we could have agreed on a structure earlier.** I mentioned this to Josh during my midway check-in and decided to just try to make the best of it.
### Overall
There are things I wish I had done differently, but all in all, I'm very happy with the journey I have been on over these months. I grew and learned a lot and plan to continue working closer to the protocol.
## Feedback for EPF
I like the way project proposals are done and that some of them are a bit more research-focused. It enabled me to do some research outside of the core objective, but still with a north-star goal to build towards. On the flip side, I learned that picking a project idea with less mind share can be risky, as client teams might not have as much incentive to help, depending on how "aspirational" the idea is. Proposals that have stood the test of time are more likely to have mind share from other client teams, which makes it easier to discuss ideas across teams and also less risky to take on. When the next cohort of fellows picks projects, this might be worth keeping in mind.
Weekly updates are a good forcing function for weekly contributions, and I also enjoyed being able to read what other fellows were up to. With the standups, I liked the concept of the breakout rooms for engaging with other fellows, but a few times I had audio issues in Jitsi or landed in breakout rooms where people didn't talk. The times I was in a breakout room without any issues, the experience was really good.
I also got a lot of enjoyment out of the office hours. The one with [Giulio](https://www.youtube.com/watch?v=z_Ef0SF-N8w) (unbiased, I swear, I haven't even interacted with him directly) was super good, getting technical and sharing good life advice. Honorable mentions go to [Potuz](https://www.youtube.com/watch?v=-N6mry6w7Jk) and [Barnabe](https://www.youtube.com/watch?v=P9eNQJHyf2g). Allowing questions to be submitted through GitHub issues was also very convenient for me, as I couldn't always join live.
All in all, I don't have any big issues to complain about. I think Josh and Mario did a great job with the coordination and organization of the program. I'm very happy that I was able to be part of the fellowship.
## Acknowledgements
Thanks to Mark Holt for the original project proposal and for the good discussions and pointers he provided during the fellowship. Thanks to Josh and Mario for running a well-organized program and giving me this opportunity. Finally, thanks to the Ethereum Foundation for running the fellowship.
## Resources
I shared multiple external resources on the [GitHub repository](https://github.com/2xic/erigon-risc-v-executable-proof-sourcing?tab=readme-ov-file#resources) with information I found useful while working on this project.