RISCV target - VF

# On the RISCV target

## Motivation

Defining the minimum RISCV target triple for zkEVMs has numerous second-order effects on client diversity and the complexity of the Ethereum protocol. We should therefore consider the long-term implications of our decision, to avoid optimizing ourselves into a local minimum. Of course, we need to consider the trade-offs; optimising only for the long term may be too disruptive in the short term for a system that is already in production.

## Summary

In this document, we split the discussion into two categories:

- *Extension support*: Which RISCV extensions should we support?
- *Environment support*: What can we assume about the environment that the program runs in (i.e. Linux support, zkVM precompiles)?

*Extension support*

On extension support, in practice, the minimal instruction set needed for the STF is RV32IM. In order to support the higher level languages that the EL is written in, the extensions in RVA20U64 would be the most appropriate target. This would add ~100 new instructions, roughly half of them floating point instructions and the other half compressed instructions. Although the STF itself does not use floating point, the runtime (including garbage collector and scheduler) will.

Of interest, RVA20U64 also includes the Zicclsm extension, which requires support for misaligned loads and stores. This is typically not supported by zkVMs and is more expensive.

*Environment support*

For the STF to be efficient, we need to assume the existence of zkVM precompiles.

On Linux support, the STF does not require it. However, some ELs are written in languages that assume an operating system like Linux is available. In order for these ELs to be proven, these assumptions need to be handled either inside of the zkVM or outside of the zkVM via a translation layer (WASM, toolchain modifications, etc). Handling them has two disadvantages: additional complexity and performance degradation.
We argue that this should not be incurred by the zkVM and provide alternative solutions (with tradeoffs) on how to handle these outside of the zkVM.

# Systems Programming languages

In this section, we outline what extensions and environment assumptions are needed in order to implement the state transition function (STF) in a systems programming language.

----------------

# Extension Support

## RV32I

RV32I is the absolute minimal Instruction Set Architecture (ISA) needed in order to support the execution layer's (EL) STF. However, it is limited and will not support features such as trap handling and reading cycles.

**Tradeoffs**

*Advantages*

- zkEVMs will target the absolute minimum number of instructions needed.
- The work needed to formally verify a zkVM is kept at its minimum.

*Disadvantages*

- Multiplication and division procedures will be slow without the `M` extension.
- EL teams will need to target bare metal RISCV.

On the last point, we are assuming this is table stakes for systems programming languages. For example, in Rust, one needs to ensure that the program compiles under _core_ / _no_std_.

> Note: Technically, there is also RV32E; however, this is not as widely supported as RV32I at the time of writing and is inconsequential to the discussion.

## RV32IM

While the STF can be written in RV32I, multiplication and division will be slow, as they need to be implemented using shifts and additions. The `M` extension adds efficient support for multiplication and division.

**Tradeoffs**

*Advantages*

- Operations on the U256 type, which is used heavily in the EVM, will be faster.

*Disadvantages*

- The zkVM will need to support an additional extension.
- EL teams will need to target bare metal RISCV.

This, in practice, is the **minimal target** needed in order to prove the STF.

Why not support only this minimal target then? The main reason is that only a subset of languages, usually dubbed systems programming languages, are able to compile to this target.
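
To make the cost of missing `M` support concrete, the following is a minimal sketch of a shift-and-add multiply, i.e. the kind of software routine that has to stand in for a single multiply instruction when only RV32I is available. The name `mul_shift_add` and the step counter are illustrative assumptions and are not taken from any toolchain; Python is used only for readability.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits, as on RV32


def mul_shift_add(a: int, b: int) -> tuple[int, int]:
    """Return (low 32 bits of a * b, loop iterations used).

    Only shifts, additions and bit tests are used, which is all RV32I offers.
    """
    a &= MASK32
    b &= MASK32
    product = 0
    steps = 0
    while b:
        if b & 1:  # lowest bit of the multiplier is set: add the shifted multiplicand
            product = (product + a) & MASK32
        a = (a << 1) & MASK32  # next power-of-two multiple of the multiplicand
        b >>= 1  # consume one bit of the multiplier
        steps += 1
    return product, steps


print(mul_shift_add(6, 7))  # (42, 3)
```

The loop runs once per significant bit of the multiplier, so up to 32 times for a 32-bit operand, and each iteration maps to several RV32I instructions. A U256 multiplication is built from many such word-sized multiplies, which is why the `M` extension matters for the EVM.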
> Note: RV64IM is also a valid target here and some zkVMs may choose to target 64 bits for performance reasons.

# Environment Support

Systems programming languages do not add any additional assumptions to the environment. For completeness, we briefly note the mechanisms being used to accelerate common expensive functions in the STF.

## zkVM Precompiles

While one can implement the STF using only RV32IM, there are certain functions like keccak and sha256 that are called many times and incur a high cost when proving. zkVMs typically expose accelerated versions of these functions, and there are currently two common ways in which the STF can request these:

- Via ECALL
- Via Control and Status Registers (CSR)

This, however, is an implementation detail of the zkVM and one that will not be exposed to the STF. The STF will define a set of C functions that the zkVM will link to. See [here](https://github.com/eth-act/zkvm-standards/tree/main/standards/c-interface-accelerators).

# High level languages

In this section, we outline what extensions and environment assumptions are needed in order to implement the STF in a high level programming language.

---------------------------

Ethereum's EL is written in a myriad of languages. High level languages like Java and C#:

- **Runtime bloat:** Have richer runtimes, due to additional features like garbage collection and scheduling.
- **Extension**: Tend to be less configurable and will compile to the most general RISCV target possible.
- **OS dependency:** Depend on an operating system (OS) for programs to run correctly.

# Extension Support

| Language | Target ISA | Reference |
| --- | --- | --- |
| **Java (GraalVM)** | RV64IMACFD | [source](https://github.com/oracle/graal/blob/1d34199ccb381a441e997fe03a7b8f2ecf1f444c/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/util/CPUTypeRISCV64.java#L61) |
| **Go** | RVA20U64 | [source](https://github.com/golang/go/blob/385dc33250336081c0c630938c3efede481eff76/src/cmd/go/internal/help/helpdoc.go#L977) |
| **.NET** | RV64IMAFDC + Zicsr + Zifencei + Zbb + Zba | [source](https://github.com/dotnet/runtime/blob/e8812e7419db9137f20b990786a53ed71e27e11e/src/coreclr/jit/instrsriscv64.h#L23) |

*Table 1:* The minimum ISA required by each of the three languages that currently do not support bare metal targets upstream.

To support all of the above languages without modifications to their toolchains, we take the union of all the target ISAs. This in turn means `RVA20U64 + Zbb + Zba + Zifencei`. For the specifications of RVA20U64 see [here](https://github.com/riscv/riscv-profiles/blob/main/src/profiles.adoc#rva20u64-mandatory-extensions).
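
The union can be checked mechanically. The sketch below encodes the three target ISAs from Table 1 as sets of extension names and computes which extensions are needed on top of RVA20U64. The set contents for RVA20U64 follow the mandatory extensions of the profile linked above; the variable and function names are illustrative assumptions.

```python
# Mandatory extensions of the RVA20U64 profile.
RVA20U64 = frozenset({
    "I", "M", "A", "F", "D", "C",
    "Zicsr", "Zicntr", "Ziccif", "Ziccrse", "Ziccamoa", "Za128rs", "Zicclsm",
})

# Target ISAs as listed in Table 1 (single-letter extensions split out).
TARGETS = {
    "Java (GraalVM)": frozenset("IMACFD"),
    "Go": RVA20U64,
    ".NET": frozenset("IMAFDC") | {"Zicsr", "Zifencei", "Zbb", "Zba"},
}


def required_extensions(targets):
    """Union of every extension that at least one language requires."""
    return frozenset().union(*targets.values())


union = required_extensions(TARGETS)
beyond_profile = sorted(union - RVA20U64)
print(beyond_profile)  # ['Zba', 'Zbb', 'Zifencei']
```

The three extensions outside the profile all come from the .NET target, which is why the combined target is written as `RVA20U64 + Zbb + Zba + Zifencei`.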

We list out the extensions in RVA20U64 to be explicit:

- **I** (Integer base operations)
- **M** (Integer multiplication and division)
- **A** (Atomic instructions)
- **F** (Single-precision floating point instructions)
- **D** (Double-precision floating point instructions)
- **C** (Compressed instructions)
- **Zicsr** (CSR instructions)
- **Zicntr** (Basic counters)
- **Ziccif** (Main memory regions with both the cacheability and coherence PMAs must support instruction fetch, and any instruction fetches of naturally aligned power-of-2 sizes up to min(ILEN,XLEN) (i.e., 32 bits for RVA20) are atomic)
- **Ziccrse** (Main memory regions with both the cacheability and coherence PMAs must support RsrvEventual)
- **Ziccamoa** (Main memory regions with both the cacheability and coherence PMAs must support AMOArithmetic)
- **Za128rs** (Reservation sets must be contiguous, naturally aligned, and at most 128 bytes in size)
- **Zicclsm** (Misaligned loads and stores to main memory regions with both the cacheability and coherence PMAs must be supported)

This adds roughly 100 new instructions, of which approximately 50 come from the floating point extensions (`F`/`D`) and 50 from the compressed extension (`C`).

**Compressed instructions**

Compressed instructions are great for decreasing the code size and the memory bandwidth needed for instruction fetching. The amount of work added to support compressed instructions is localized to the decoder, since every compressed instruction expands to an equivalent uncompressed instruction. Since modern compilers care about smaller binaries, they will automatically emit the compressed instruction variant whenever possible.

**Floating point instructions**

Although the STF will not use floating point, we have seen partial usage of floating point instructions in the runtimes of higher level languages.
The most common case is floating point loads and stores for [context switching](https://github.com/golang/go/blob/16705b962ecf35314cd6791348d7a4b00422639a/src/runtime/cgo/gcc_riscv64.S#L64), but they also appear inside of the [garbage collector](https://github.com/dotnet/runtime/blob/69d8087eecff2a99b122ac80c5ec2a93adc7293d/src/coreclr/gc/gc.cpp#L448).

The choice to support all instructions in the floating point extensions is for practical purposes, where otherwise:

- Each language will need to modify its dependencies and/or the runtime functions that require floating point instructions, updating them each time a dependency or the runtime changes.
- Or, we take the union of all of the floating point instructions that need to be supported, and every time a new floating point instruction is needed, all zkVMs will need to be updated to support it.

*Supporting floating point using microcode*

Floating point instructions following IEEE 754 are complex. There are many instructions, and many of them behave differently depending on the rounding mode, with various edge cases around normalization.

One common way to support floats would be to precompile a softfloat library like Berkeley SoftFloat into RISCV bytecode; whenever a floating point instruction is executed, the zkVM jumps to the appropriate function in the softfloat library. This will be slow compared to implementing the floating point operations like any other instruction via circuit modifications; however, we don't expect floats to be a significant part of program execution.

*Supporting floating point using exceptions*

The other common way to support floating point instructions is to install a trap handler that will trap on any floating point instruction and emulate the floating point instructions, plus registers, in software.
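
Either approach first has to recognise that an instruction belongs to the floating point extensions. The sketch below classifies a 32-bit instruction word by its major opcode (bits 6:0), using the opcode values from the RISC-V base opcode map. The names `fp_group` and `on_illegal_instruction`, and the `emulate` callback, are hypothetical and only illustrate the dispatch step; this is not how any particular zkVM implements it.

```python
# Major opcodes (instruction bits [6:0]) used by the F and D extensions.
FP_MAJOR_OPCODES = {
    0b0000111: "LOAD-FP",   # flw / fld
    0b0100111: "STORE-FP",  # fsw / fsd
    0b1000011: "MADD",      # fmadd.s / fmadd.d
    0b1000111: "MSUB",      # fmsub.s / fmsub.d
    0b1001011: "NMSUB",     # fnmsub.s / fnmsub.d
    0b1001111: "NMADD",     # fnmadd.s / fnmadd.d
    0b1010011: "OP-FP",     # fadd, fsub, fmul, fdiv, fcvt, fmv, ...
}


def fp_group(instr: int):
    """Return the floating point opcode group of a 32-bit instruction, or None."""
    return FP_MAJOR_OPCODES.get(instr & 0x7F)


def on_illegal_instruction(instr: int, emulate):
    """Hypothetical trap entry point: hand FP instructions to a software emulator."""
    group = fp_group(instr)
    if group is None:
        raise ValueError(f"not a floating point instruction: {instr:#010x}")
    return emulate(group, instr)


FADD_S = 0x0020F053  # fadd.s f0, f1, f2 (dynamic rounding mode)
NOP = 0x00000013     # addi x0, x0, 0
print(fp_group(FADD_S), fp_group(NOP))  # OP-FP None
```

This only covers the 32-bit encodings. Compressed floating point loads and stores, and accesses to the floating point CSRs, would need the same treatment in a complete handler.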

> TODO(Make this clearer -- also in order to pass riscof we may need to copy in floating point registers into this)

> TODO(maybe summarize here and move longer thread on floats into appendix)

**Atomic instructions**

Since zkVMs run in a single threaded environment, fence instructions are NOOPs and the atomicity guarantees of most atomic instructions are trivially satisfied, so they reduce to ordinary memory operations.

**Zicclsm**

Many zkVMs do not support unaligned accesses and simply trap. This extension forces the zkVM to handle unaligned accesses, leading to an increase in circuit logic and a decrease in performance, since an unaligned access will need multiple memory operations.

**Tradeoffs**

*Advantages*

- Higher level languages that the EL has been implemented in will be fully supported, modulo the environment requirements.

*Disadvantages*

- The zkVM now needs to account for the extra instruction sets. This may also affect performance, since the circuit complexity may increase due to the extra instructions.

# Environment Support

## Runtime Bloat

Richer runtimes tend to only be a factor in terms of performance and in most cases won't affect the complexity of the zkVM. However, since the runtime cannot/shouldn't be modified by ELs, we cannot assume that instructions the STF does not require will never be used. An example of this is floating point operations being used during garbage collection.

## Operating System

Programs written for bare metal targets, i.e. not assuming an operating system, are able to access the hardware directly. This means that, for example, they are able to read and write to memory directly.

High level languages (HLL) tend to assume that the program will be restricted in what it can do and will be run on top of an operating system. It is the operating system that is able to access the hardware directly. Without loss of generality, we can take this to mean Linux.
Linux has an API that programs written in higher level languages use to request privileged operations, like memory allocations, fetching randomness, creating threads etc. The functions in this API are called system calls (syscalls). When the STF is compiled using a HLL, it will emit these syscalls using the ECALL instruction.

There are [hundreds of syscalls](https://man7.org/linux/man-pages/man2/syscalls.2.html), so implementing each of these manually in a zkVM would not be feasible. There are currently two ways to support syscalls:

- **Partial syscall emulation:** Figure out the syscalls that are (currently) being called by the STF and support those. Whenever a new syscall is needed, each zkVM will need to be updated to support it.
- **Linux kernel emulation:** Boot Linux in the zkVM, and then run the STF as a process on top of it. All syscalls will be supported because the Linux kernel is inside of the zkVM in supervisor mode.

### Partial syscall emulation

As previously noted, Linux syscalls are the API for calling the Linux operating system. If we no longer have the operating system, which is the case with partial syscall emulation, we must satisfy the explicit and/or implicit assumptions that the syscalls rely on. Examples of these are:

*Memory management unit*

Linux applications assume an isolated virtual address space, with multiple non-contiguous segments (text, bss, heap, stack and memory-mapped regions). zkVMs currently assume a flat and shared address space, so emulating the virtual address space will require a deterministic mapping layer for address translation. A memory management unit will handle translations from virtual addresses to physical addresses, while also handling page permissions and page faults.

*Privilege levels*

Syscalls implicitly assume a difference in privilege levels between the user program and the operating system that the program is running on top of.
The STF is said to run in user mode and Linux is said to run in kernel/supervisor mode. There are many reasons for having this separation, isolation and protection being the main motivators. A malicious user program does not have access to other user programs, and it must ask the _trusted_ operating system for access to the hardware. For our use case, we will only ever run a single program at a time, unlike a general-purpose computer that runs multiple programs at once.

*Threading and synchronization support*

Many runtimes expect concurrency primitives and preemption via common syscalls like clone and futex. Since zkVMs are by default deterministic, this would essentially require modeling concurrent interleaving deterministically, plus Thread Local Storage (TLS) emulation.

> TODO: Add more information on supporting threading -- note that although some runtimes can force single threaded mode, the runtime may launch threads and in some cases, this may require TLS being emulated unless taken out

> TODO: Add note on the fact that we need to take the union of every language

> TODO: Instead of an appendix, perhaps we can add the currently investigated linux syscalls here

**Tradeoffs**

*Advantages*

- High level languages can be proven using upstream compilers.

*Disadvantages*

- Since not all syscalls will be supported, zkVMs will need to update their supported syscalls whenever any of the supported languages update their dependencies/runtime and a new syscall is used.
- Linux syscalls have a simple API; however, their internal implementations are complex, see [mmap as an example](https://github.com/torvalds/linux/blob/17d85f33a83b84e7d36bc3356614ae06c90e7a08/mm/mmap.c#L339). For this reason, zkVMs that have added support for Linux syscalls have partial implementations of some functions.
- Linux syscalls are not a free abstraction, so they will incur performance costs compared to targeting bare metal.
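
As a rough illustration of partial syscall emulation, the sketch below dispatches an ECALL using the RISC-V Linux convention (syscall number in `a7`, arguments in `a0` onwards, result returned in `a0`) and supports only three syscalls. The `Guest` class, the handler names and the memory layout are assumptions made for this example and are deliberately simplified; they do not describe any existing zkVM.

```python
EBADF, ENOSYS = 9, 38  # Linux errno values, returned negated in a0


class Guest:
    """Toy guest state: flat memory, a program break and captured output."""

    def __init__(self, memory_size=0x10000, heap_start=0x8000):
        self.memory = bytearray(memory_size)
        self.heap_start = heap_start
        self.program_break = heap_start
        self.output = bytearray()
        self.exit_code = None


def sys_write(guest, fd, buf, count):
    if fd not in (1, 2):  # only stdout and stderr exist in this sketch
        return -EBADF
    data = guest.memory[buf:buf + count]
    guest.output += data
    return len(data)


def sys_exit(guest, code, *_):
    guest.exit_code = code
    return 0


def sys_brk(guest, addr, *_):
    # Like the raw Linux syscall: move the break if the request is valid,
    # and always return the (possibly unchanged) current break.
    if guest.heap_start <= addr <= len(guest.memory):
        guest.program_break = addr
    return guest.program_break


# RISC-V Linux syscall numbers for the supported subset.
SYSCALLS = {64: sys_write, 93: sys_exit, 214: sys_brk}


def handle_ecall(guest, regs):
    """Dispatch on a7; unsupported syscalls fail with -ENOSYS."""
    handler = SYSCALLS.get(regs["a7"])
    if handler is None:
        regs["a0"] = -ENOSYS
    else:
        regs["a0"] = handler(guest, regs["a0"], regs["a1"], regs["a2"])
    return regs["a0"]
```

Every syscall outside the table, such as `futex` (98) or `clone` (220), returns `-ENOSYS` here. This is the brittleness described in the disadvantages: the table has to grow whenever a runtime starts relying on a syscall that is not yet covered.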

### Linux kernel emulation

The idea here is that we boot the Linux kernel in the zkVM in supervisor mode, and then run the STF on top of this in user mode. All Linux syscalls will be supported because the kernel itself is running in the zkVM. The zkVM circuit would need to provide all of the peripherals that the operating system requires, like timers for preemption.

**Tradeoffs**

*Advantages*

- For supporting Linux syscalls, this is the most robust solution. If the user program uses a new Linux syscall, then as long as the user program is compatible with the version of Linux being run in the zkVM, the syscall will be supported.

*Disadvantages*

- This is overkill when the goal is to run a single, specific program, and it adds unnecessary overhead.

Note that this requires full support of the [ratified privileged architecture](https://docs.riscv.org/reference/isa/priv/priv-index.html).

# High level languages - Alternative solutions

In this section, we discuss alternative solutions for supporting HLLs that do not involve adding support for Linux syscalls into the zkVM.

----------

For HLLs, the complexity of supporting Linux syscalls needs to live somewhere. Above, we discussed placing it in the zkVM, which is already an inherently complex component. However, this is not the only strategy. Below we describe four alternatives where the zkVM does not need to directly support Linux syscalls:

- **WASM:** Compile the higher level language to WASM-WASI.
- **Toolchain modifications:** Modify the toolchain/compiler to not emit Linux syscalls.
- **Code rewrite:** Reimplement the STF in a language that does not require Linux syscalls.
- **LIBC modification:** Patch the functions that call syscalls in libc.

All four strategies share the benefits that:

- The complexity of Linux syscalls is not seen by each zkVM.
- Languages that don't require Linux syscalls are not affected by the increased circuit complexity that supporting them would bring.
Or, alternatively, zkVM teams do not need to maintain two circuits: one for the bare metal languages and one for the high level languages.

### WASM

The idea here is to compile to WASM-WASI, which still assumes some operating system; however, the WASM toolchain allows you to fill in the implementation for the operating system (WASI SDK methods). This is a common practice.

**Tradeoffs**

*Advantages*

- One can use WASM-WASI to eliminate the need for Linux syscalls.

*Disadvantages*

- There may be performance degradation when compiling to WASM first and then transpiling to RISCV.
- Java (GraalVM) assumes the WASMGC proposal, which adds a significant amount of complexity and as of 2025 is poorly supported by WASM tools.

### Toolchain modifications

The idea here is to modify the runtime and compiler for the given language to eliminate Linux syscalls.

**Tradeoffs**

*Advantage*

- One has full control over what RISCV code gets emitted.

*Disadvantage*

- Maintaining a toolchain for a language requires compiler expertise, can introduce bugs, and can cause issues when one needs to update the compiler to a new upstream version.

### Code rewrite

Here one would rewrite the STF in a new language that does support bare metal, while still having the higher level language call the STF in the new language via bindings, so that one does not need to maintain two EVM implementations.

**Tradeoffs**

*Advantages*

- One is able to use the upstream compiler.
- The overhead outside of the STF can be reduced to its minimum.

*Disadvantages*

- This is additional work and, in the short term, may use up resources that would otherwise have gone towards the current and next hardfork.

Note that implementing just the STF is simpler than maintaining an EL, because components like networking are not needed.

### LIBC modification

Most languages do not call Linux syscalls directly; instead they use a library called libc, which in turn will call Linux syscalls. One could instead patch this library to not call Linux syscalls.
This approach is similar in spirit to the WASM approach.

*Advantages*

- No changes to the zkVM are needed.

*Disadvantages*

- Runtimes are often bound to and tested with a specific libc implementation. Given that there are multiple languages with different libc requirements, this is not a feasible solution. We also note that Golang will call syscalls directly instead of via libc, so this solution is also not complete.

# Conclusion (Draft)

To recap, we draw two tables, one for the extensions and another for the environment.

## Extension

| **Target** | **Description** | **zkVM Complexity** |
| --- | --- | --- |
| **RV32IM** | Minimal set of instructions for supporting the STF | Minimal |
| **RVA20U64 + Zbb + Zba + Zifencei** | Full set of instructions needed to support all EL languages | High |

RV32IM is the minimal ISA needed, while the extensions in RVA20U64 + Zbb + Zba + Zifencei would support C#, Java and Golang.

## Environment

| **Environment Model** | **Privilege Levels** | **Syscall Coverage** | **Implementation Cost (in zkVM)** | **Performance** | **Maintenance Complexity for EL devs** |
| --- | --- | --- | --- | --- | --- |
| **Bare-metal (no OS)** | M-mode only | None | Low | **Fast** | **Low** (once working, static) |
| **Partial syscall emulation** | M-mode + software “user mode” | Partial (e.g. `write`, `brk`, `mmap`, `futex`, `clone`) | Moderate | **Slow–Moderate** | **High** (updates when runtimes add syscalls) |
| **Partial via WASI** | M-mode only | Complete (WASI subset) | Low | **Moderate (can be slower than partial)** | **Moderate** (WASI evolves slowly) |
| **Toolchain modification** | M-mode only | None | Low–Moderate | **Fast** | **High** (must track compiler/runtime updates) |
| **Code rewrite (STF only)** | M-mode only | None | Low | **Fast** | **Medium** (one-time rewrite, then stable) |
| **Linux-in-VM** | M/S/U (or M-mode semantics) | Complete (400+ syscalls) | High | **Slow** | **Low–Medium** (depends on kernel version, but rarely changes) |

Each variant listed above comes with its own tradeoffs. Any variant that targets Linux syscalls will incur a performance cost compared to targeting bare metal, where the most robust approach would be full Linux kernel emulation. Partial Linux support still has performance degradation; in addition, we inherit the brittleness of needing to support new syscalls that are added in *any of the higher level languages supported*.

The medium term approach would be to support the current languages without adding the extra complexity to the zkVM. This may not be possible for all languages, implying a rewrite of the STF.

# Appendix

TODO: Add documents on what syscalls are getting called from go and c# with links to Marcins and Tanish's work