# From Wasm to rWasm: The 5,000× Leap Toward Proving Performance

![ChatGPT Image Oct 10, 2025, 02_34_21 PM](https://hackmd.io/_uploads/SJIvkdUale.png)

### Introduction - Why Another VM?

Blockchains already have plenty of virtual machines. The EVM stands out due to its compatibility. Wasm excels with its tools. RISC-V focuses on proof systems. Yet none of these was created specifically for zero-knowledge from the beginning. Each was developed for a different purpose, and their design choices now complicate the goal of making execution provable.

**rWasm** (short for *reduced WebAssembly*) started with a simple idea: keep the structured elegance of Wasm, remove what ZK doesn’t need, and rebuild the internals for better performance and proof capability.

It isn’t a fork of Wasm, and it isn’t another gimmicky zkVM. It’s a production intermediate representation used in Fluent’s Blended Execution model, where different execution environments (EVM, SVM, or custom runtimes) merge into one provable trace.

rWasm is crafted for the realities of zero-knowledge execution, focusing on strict determinism, complete WebAssembly compatibility, compact traces, and true performance on modern hardware.

---

### Core Design Philosophy

The main principles behind rWasm are simplicity, determinism, and VM composability, all while keeping full WebAssembly compatibility.

**1. Simplicity:** Keep only what’s measurable and provable. We removed dynamic typing, unnecessary metadata, floating-point operations, and unclear traps. Each instruction has a fixed, predictable impact on the stack and memory.

**2. Determinism:** Execution is bit-exact. It uses pure integer arithmetic and fixed 32-bit pointers, and all system calls are explicit. There is no hidden state on the host side, no randomness, and no implicit side effects.

**3. Composability:** rWasm serves as the universal intermediate representation in Fluent’s Blended Execution system.
EVM and Solana-style SVM code can be compiled down to rWasm. The same instruction trace can be executed natively or proven inside a zkVM without losing meaning.

---

### How It Works - Under the Hood

rWasm maintains the structure of Wasm but completely rewrites its semantics.

**Stack model**

Execution relies on a 32-bit slot stack. i64 operations take up two slots. This allows for simple stack-height tracking and enables provers to verify execution without dynamic analysis. Stack frames are static, with no runtime expansion.

**Memory model**

Memory is linear and page-based, with allocation pools that start at zero. Each memory area includes guard pages that detect out-of-bounds accesses without the need for runtime checks. The memory pointer space is fixed to 4 GB, sufficient for all smart contract workloads yet compact enough for proof systems.

**Instruction set**

rWasm keeps about 40% of Wasm’s instruction set (excluding atomic and SIMD extensions), but re-encodes it for constant-time behavior and simpler metering. Arithmetic operations are field-friendly and fully deterministic. Control flow uses direct jump offsets instead of structured block stacks, simplifying trace generation.

**Syscalls and built-ins**

System functions, like hashing, curve arithmetic, and input/output, are handled through explicit syscalls. Each syscall can run natively or be delegated to a ZK circuit.

This makes rWasm a natural link between regular execution and proof generation. The same binary can function in both contexts, using different backends.

---

### The ZK Perspective - Trace-Friendly by Design

Most existing VMs are not suited for zero-knowledge systems. They produce large traces filled with unnecessary state. rWasm was designed differently, as a *trace-first* execution engine.

* Each instruction results in a fixed number of trace rows. There is no dynamic branching or variable columns.
* All values are 32-bit limbs, which align neatly with field elements for proof systems.
* Control flow is simplified to deterministic offsets, avoiding an overload of conditional constraints.

The outcome is a trace that is up to 5,000× smaller than the EVM’s and still 5–6× smaller than a similar RISC-V proof trace. Provers require less time, less memory, and fewer constraints per step. Even for 256-bit operations, rWasm is 4–6× faster than the EVM.

This is not just theory. Fluent’s zkVM directly uses rWasm traces, with benchmarked proving times in the seconds range for realistic contract workloads.

---

### Performance - Benchmarks

Performance is where rWasm transitions from research to practical application, delivering impressive benchmark results.

* **32-bit stack operations:** 20–22× faster than the EVM, 2× faster than Wasmi.

  ![image](https://hackmd.io/_uploads/rJCGnMBTxg.png)

* **256-bit arithmetic:** 5× faster than the EVM, and still beats Wasmi.

  ![image](https://hackmd.io/_uploads/S1FNnfHalg.png)

The interpreter achieves these results by staying close to the hardware, featuring a small instruction set, fast dispatching, reusable memory allocation, and 64-bit opcode optimizations.

P.S.: *In these benchmarks, we aim to maintain consistent instantiation conditions across all VMs, though not all optimizations are supported by each. The comparison with Wasmtime isn’t entirely practical, as store initialization is very costly for workloads with low computational intensity. However, we wanted to simulate a realistic blockchain scenario, where all applications operate under identical execution conditions. Additionally, Wasmi doesn’t support IR serialization, but for the sake of comparison, we assumed serialization was enabled in our benchmarks.*

The rWasm binary structure is tailored for blockchain execution, where quick loading and execution of account bytecode are essential. Therefore, we focus on module loading speed.

![image](https://hackmd.io/_uploads/r1rL3fSTgx.png)

For long-running applications, like precompiled system contracts, rWasm can switch back to Wasmi or Wasmtime execution because it retains the original compilation hints. This allows rWasm to be compiled into register-based IR or native machine code using the Cranelift compiler. This feature makes rWasm a universal execution VM that combines the best qualities of major execution systems.

---

### Comparison - Architectural Contrast

rWasm shares a heritage with Wasm and RISC-V, but it diverges where those designs struggle with proof requirements. The difference lies not in syntax, but in how each VM translates execution into verifiable state transitions.

**EVM**

The EVM’s 256-bit stack and variable gas rules make it straightforward to reason about at the contract level but poor for zero-knowledge proving. Each operation changes variable-width words, and each opcode has hidden side effects (memory resize, refund counters, etc.). To prove an EVM trace, a prover must recreate the entire interpreter state, including gas management and stack transitions, resulting in a large, complex circuit.

**Wasm**

Wasm is structured, but its flexibility becomes an issue. It permits dynamic typing, variable stack sizes, and non-deterministic host functions. While this is beneficial for compilers, it poses problems for proof systems. Every instruction can yield a different number of constraints, and the execution trace can’t be represented as a fixed-width table. Wasm also uses bytecode formats that prioritize compression over verifiability.

**RISC-V**

RISC-V’s simplicity makes it provable, but at the expense of efficiency. It mimics physical hardware, so even basic program logic creates large traces. Every register read and memory access is explicitly modeled. Circuits grow in line with CPU instructions, not logical operations. It serves well for general zkVMs, but is excessive for smart contract execution.

**rWasm**

rWasm combines the predictability of RISC-V with the compactness of Wasm. It maintains structured control flow but fixes stack width (32-bit slots), removes type polymorphism, and opts for deterministic syscalls rather than opaque host imports. Each opcode corresponds to a constant number of constraints, and all stack and memory events are reported as structured read/write pairs. This structure makes the trace field-friendly, circuit-friendly, and easy to merge across VMs.

In summary: EVM is too erratic, Wasm is overly dynamic, RISC-V is too low-level, and rWasm is *just right*: it runs like Wasm, proves like RISC-V, and **blends** like nothing else.

---

### Blending and Trace Injection

The biggest advancement comes from blending, rWasm’s ability to merge multiple execution layers into one verifiable trace.

Traditional proving pipelines treat each VM as a separate unit: Wasm operates inside the prover, SP1 logs every instruction, and only then do you create a proof. While this works, it is extremely inefficient, because every bytecode action must be recomputed inside the proof circuit.

rWasm changes that model. It uses trace injection, which allows pre-verified traces from Wasm execution to be directly included in the rWasm trace. Instead of re-proving basic operations, we leverage deterministic sub-traces that were verified during the Wasm→rWasm compilation phase.

This significantly reduces the proving workload. In direct comparisons with SP1+Wasm, blended rWasm achieves between 50× and 10,000× trace reduction, depending on the workload’s makeup and syscall frequency. The reasoning is simple: the prover does not re-simulate; it reuses verified computations within one unified trace.

Blending changes the proving approach from “simulate everything” to “verify once, compose forever”. It’s the method that enables Fluent’s multi-VM vision to be not only possible but efficient enough to matter.

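
The “verify once, compose forever” idea can be illustrated with a toy accounting model. The Python sketch below is not Fluent’s implementation: the `Step` and `VerifiedSubTrace` types, the unit cost per instruction, the `reuse_cost` parameter, and the sample numbers are all assumptions made up for illustration. It only shows why charging a flat cost for an already verified sub-trace is cheaper than re-simulating every instruction inside the proof.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Step:
    """An ordinary instruction the prover must constrain directly."""
    opcode: str
    cost: int = 1  # assumed unit of prover work per instruction


@dataclass(frozen=True)
class VerifiedSubTrace:
    """A sub-trace that was already verified elsewhere (hypothetical type)."""
    label: str
    inner_cost: int  # work needed if it were re-simulated step by step


Segment = Union[Step, VerifiedSubTrace]


def resimulated_cost(trace: List[Segment]) -> int:
    """'Simulate everything': every sub-trace is recomputed in the proof."""
    return sum(s.cost if isinstance(s, Step) else s.inner_cost for s in trace)


def blended_cost(trace: List[Segment], reuse_cost: int = 1) -> int:
    """'Verify once, compose forever': each verified sub-trace is charged
    a flat, assumed reuse cost instead of its full inner cost."""
    return sum(s.cost if isinstance(s, Step) else reuse_cost for s in trace)


# Made-up workload: two plain instructions around two reusable sub-traces.
trace: List[Segment] = [
    Step("i32.const"),
    VerifiedSubTrace("hash", inner_cost=300),
    Step("i32.add"),
    VerifiedSubTrace("curve_op", inner_cost=120),
]

print(resimulated_cost(trace))  # 422
print(blended_cost(trace))      # 4
```

In this model the saving grows with the share of work that sits inside reusable sub-traces, which matches the remark above that the reduction depends on the workload’s makeup and syscall frequency. The numbers are arbitrary and are not benchmark results.
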

---

### Blended Execution - The Bigger Picture

rWasm is not just a standalone VM; it’s the heart of **Fluent’s Blended Execution** model.

In Fluent, multiple environments (EVM, Solana SVM, or custom runtimes) are compiled into rWasm IR, executed by the same runtime, and proven under one unified trace format. Because of the AOT compilation to machine code, these runtimes achieve near-native execution speed while leveraging rWasm for efficient trace proving.

This is how Fluent accomplishes what other L2s cannot: interoperability not only at the asset level but also at the *execution* level. A contract made for Ethereum can engage with one created for Solana, all within a single ZK-verifiable state transition.

Blended Execution transforms the VM into the protocol’s common language. rWasm serves as that language.

---

### Conclusion - A New Baseline for ZK VMs

rWasm demonstrates that zero-knowledge execution does not have to sacrifice performance. By simplifying Wasm to its provable essence, it becomes faster to execute and much easier to prove.

It retains the developer experience of Wasm and the compatibility of current toolchains, while introducing an execution model designed from the ground up for generating constraints.

In the larger context, rWasm is more than just a VM. It sends a clear message: *determinism and simplicity are the true factors for scaling in the ZK era*. This is where blending begins to matter, and quietly changes everything.

---

rWasm is open source as part of **Fluentbase** and currently powers the Fluent Testnet. Benchmarks, SDKs, and documentation are available at [fluent.xyz](https://fluent.xyz).