Author: akashin
Created on: 14 September 2023
Status: Reviewed
Related issue: https://github.com/near/nearcore/issues/9514
We're building a zero-knowledge prover for the execution of WASM programs and are considering using Cranelift as a compiler backend from WASM to a ZK-specific ISA.
We want to understand whether this tech stack is the right choice for the ZK WASM project over the next 6 months.
A zero-knowledge proof is a method by which one party (the prover) can prove to another party (the verifier) that a given statement is true, without conveying to the verifier any information beyond the mere fact of the statement's truth.
There have been a lot of recent exciting developments (e.g. zk-SNARKs) that provide tools to prove statements about execution of general computer programs.
For the purpose of this discussion, a zero-knowledge prover is a program that takes as input a tuple:
and produces a short proof (100s of bytes) that the verifier algorithm will accept only if* executing the function with f_index from the given WASM module with arguments args will return return_value.
* In reality, the proof is probabilistic, with a very low probability of failure.
This might sound like magic, but the catch is that the prover is very slow: generating a proof for the statement takes anywhere from 100x to 10000x the program's running time. Once the proof is ready, however, anyone can verify it cheaply and non-interactively.
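To make the overhead concrete, here is a minimal sketch that turns the 100x and 10000x bounds quoted above into a wall-clock range. The function and constant names are hypothetical and chosen only for illustration; the two multipliers are the only values taken from the text.

```rust
use std::time::Duration;

// Overhead bounds quoted in the text; real provers may fall outside them.
const MIN_OVERHEAD: u32 = 100;
const MAX_OVERHEAD: u32 = 10_000;

/// Returns the (best case, worst case) proving time for a program
/// whose plain execution takes `run_time`.
fn proving_time_range(run_time: Duration) -> (Duration, Duration) {
    (run_time * MIN_OVERHEAD, run_time * MAX_OVERHEAD)
}
```

Under these bounds, a program that runs for 10 ms would take between 1 second and 100 seconds to prove.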
At a high level, we're looking for a tech stack that would allow us to translate WASM modules into a DSL called zk Assembly (ZK ASM) (see an example below).
The quality of the translation is very important, as some ZK ASM programs are much more efficient to prove than others. We care about both backend-independent compiler optimizations and optimizations during lowering.
We do not care about compilation speed and are ready to trade it off for higher-quality output: the intended use cases will repeatedly prove the execution of a small set of programs, so the compilation cost is paid only once and will likely be negligible compared to the proving costs.
There are some notable differences from the usual ISAs:
In the future, we also plan to prove the execution of Rust programs, so a direct conversion from Rust to ZK ASM that bypasses WASM would be a plus.
Our current approach to building a ZK prover for WASM programs is based on the Cranelift tech stack. Concretely, we implement a new Cranelift ISA based on Riscv64 which allows us to build ZK circuits through the following pipeline: WASM -> Cranelift IR -> ZK ASM.
The architecture that we are targeting is a purpose-built ZK microprocessor, which is programmed in the ZK ASM language.
We've implemented support for a few basic WASM opcodes and conditionals:
Our immediate goal is to support the WASM MVP instruction set and pass the corresponding tests from the WASM test suite.
Right now we keep the code in a fork of wasmtime.
This is fine for short-term experimentation, but it introduces maintenance overhead when rebasing on upstream and might also complicate crate publishing. What are the alternatives? What would it take to upstream the new backend?
Right now we use MachBuffer::put_data, which doesn't seem like a great fit, as it expects binary data.
ZK ASM only has textual and JSON representations, but no binary one (as there is no need to store it efficiently on disk or feed it to a hardware processor). Consequently, jumps take a label or a line number as an argument.
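As an illustration of what a text-oriented alternative to a byte buffer could look like, here is a minimal sketch of an emitter that stores lines of text and resolves jump labels to line numbers at the end. It is not part of Cranelift and not our current implementation: `TextEmitter`, its methods, and the `JMP`, `NOP`, and `HALT` mnemonics are all hypothetical placeholders rather than real ZK ASM syntax.

```rust
use std::collections::HashMap;

/// One emitted item: a finished line of text, or a jump whose target
/// label is resolved only when the program is finalized.
enum Item {
    Line(String),
    Jump(String), // name of the target label
}

/// Hypothetical text emitter; not a Cranelift API.
#[derive(Default)]
struct TextEmitter {
    items: Vec<Item>,
    // Maps a label to the 0-based line number of the next emitted item.
    labels: HashMap<String, usize>,
}

impl TextEmitter {
    /// Appends one instruction line verbatim.
    fn emit(&mut self, line: &str) {
        self.items.push(Item::Line(line.to_string()));
    }

    /// Binds `name` to the line that will be emitted next.
    fn bind_label(&mut self, name: &str) {
        self.labels.insert(name.to_string(), self.items.len());
    }

    /// Appends a jump to `label`; forward references are allowed.
    fn jump(&mut self, label: &str) {
        self.items.push(Item::Jump(label.to_string()));
    }

    /// Resolves every jump to a line number and returns the final text.
    fn finish(&self) -> Result<Vec<String>, String> {
        self.items
            .iter()
            .map(|item| match item {
                Item::Line(line) => Ok(line.clone()),
                Item::Jump(label) => self
                    .labels
                    .get(label)
                    .map(|target| format!("JMP {}", target))
                    .ok_or_else(|| format!("unbound label: {}", label)),
            })
            .collect()
    }
}
```

Because every item is exactly one line, label resolution needs no byte offsets or fixups: a label is just the index of the next line. A real backend would also have to decide whether to print symbolic labels instead of line numbers, which this sketch leaves open.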
We've built a prototype for this approach: https://github.com/akashin/zkwasm/
Pros:
Cons:
Pros:
Cons:
We've discussed this approach in https://near.zulipchat.com/#narrow/stream/295306-pagoda.2Fcontract-runtime/topic/llvm.20backend.20for.20zk/near/389232792
… But for more niche virtual ISAs, we'd want to consider the use-cases against the maintenance cost: e.g. are we enabling one very specific user or is this something that could enable a bunch of folks to experiment and make use of Cranelift in new ways.
The intention is to have a general library that any user of WASM can use to generate ZK proofs. We expect this to be applicable to a multitude of use-cases.
Another thing to consider with less-common ISAs is what tooling and documentation exists: for all our ports currently, we can use qemu to test, we can build ELF binaries of Wasmtime for linux/<CPU> and test them, we can debug, there are ISA manuals. It'd be good to understand where all of that stands with zkASM as well.
There is already a decent amount of tooling around ZK ASM:
We will most likely extend this tooling with a Rust implementation or Rust bindings to make testing in Rust-based projects simpler.
add.wasm:
add.zkasm:
Note that assert_eq is translated into an ASSERT call. This is a convention in our translator, as WASM doesn't have a native assert opcode.
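The convention can be sketched as a name-based special case during lowering. This is a simplified illustration, not our translator's code: it assumes assert_eq reaches the translator as a call to a function identified by name, and `lower_call`, the `CALL` fallback, and the exact text of the emitted lines are hypothetical.

```rust
/// Hypothetical lowering of a WASM call, keyed on the callee's name.
/// The emitted strings are placeholders, not verified ZK ASM syntax.
fn lower_call(callee: &str) -> String {
    match callee {
        // WASM has no assert opcode, so the translator recognizes this
        // callee by name and emits an ASSERT instead of a regular call.
        "assert_eq" => "ASSERT".to_string(),
        // Every other callee is lowered as an ordinary call.
        other => format!("CALL {}", other),
    }
}
```

Keying on the name keeps the WASM input valid for any standard toolchain, at the cost of reserving the assert_eq name for the translator.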