# Gnark on browsers using WASM
<small>**Authors:** Vocdoni Team ([@p4u](https://github.com/p4u), [@mvdan](https://github.com/mvdan) & [@lucasmenendez](https://github.com/lucasmenendez)) with the help of OpenAI for writing the document</small>.
The goal of this proof of concept is to validate whether the [Gnark](https://github.com/ConsenSys/gnark) zkSNARK framework can be used in a browser to generate proofs.
## Context
At **Vocdoni**, we employ a Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zkSNARK) circuit to ensure that voters, identified by their respective public keys, can cast their votes without disclosing their identities. Moreover, a unique identifier (nullifier) is provided to identify each participant, preventing double voting. The census comprises a Merkle tree that includes the identities of voters and the respective voting power, or weight, assigned to each voter.
Consequently, the proof generation must take place on the client side, which typically occurs within the voter's web browser.
## Motivation
Gnark was created and is currently maintained by ConsenSys, a leading blockchain software technology company. Gnark is an open-source library that facilitates the creation and verification of zk-SNARK proofs, enabling privacy-preserving and efficient computation on the blockchain.
The motivation for exploring Gnark as an alternative to Circom for the Vocdoni project is driven by several factors that are expected to enhance the overall capabilities of the platform.
1. Integration with Go: Gnark is built using the Go programming language, which aligns well with the existing technology stack employed by Vocdoni. This compatibility simplifies the integration process and reduces potential friction or compatibility issues that may arise from using different programming languages.
2. Flexibility: Gnark offers a more modular design, which allows for greater flexibility in implementing various cryptographic primitives and protocols. This adaptability enables the Vocdoni Project to customize and optimize the zkSNARK implementation according to the platform's specific needs, ultimately improving the system's efficiency and security.
3. Community Support: The Gnark framework has an active and growing developer community, which can provide valuable insights, support, and updates to the Vocdoni Project. This engagement ensures that the project remains up-to-date with the latest advancements and best practices in the field of zkSNARKs, contributing to a more robust and secure voting platform.
4. Unit Testing Capabilities: Gnark provides the ability to write unit tests for circuits, enabling developers to thoroughly test and verify the correct functionality of the implemented cryptographic primitives and protocols.
5. Support for Cyclic Curve Groups and zkSNARK Recursivity: Gnark supports a variety of cyclic curve groups, which enables the implementation of recursive zkSNARKs. Recursive zkSNARKs allow for the composition of multiple proofs into a single proof. This feature is particularly beneficial for the Vocdoni Project, as it enhances the scalability and efficiency of the voting platform, allowing for more complex voting schemes and a larger number of users without compromising on performance or security.
6. Unified Codebase for Backend and Frontend: Gnark's potential compatibility with WebAssembly (WASM) allows us to maintain a single codebase for both backend and frontend components. By leveraging WASM, the project can run the same Go code in the browser as well as on the backend. This unified codebase approach simplifies development and maintenance and reduces the likelihood of inconsistencies between the frontend and backend.
## Steps for the Proof of Concept
1. Port our existing Circom circuit to Gnark
2. Implement missing cryptographic primitives from Circom to Gnark (Poseidon hash and Sparse Merkle Tree verifier)
3. Develop a Proof of Concept (PoC) service to generate a proof that can be compiled into WebAssembly (for browser use)
4. Evaluate the performance of this service by generating proofs in browsers for Plonk and Groth16 over bn254
## Resources
We used the following open-source repositories:
- [gnark](https://github.com/ConsenSys/gnark) & [gnark-crypto](https://github.com/ConsenSys/gnark-crypto) for zkSnark proof generation
- [TinyGo](https://github.com/tinygo-org/tinygo) with LLVM to compile the prover service into WebAssembly
All of them have been forked by Vocdoni in order to perform the modifications for achieving the PoC requirements.
- Vocdoni's [gnark](https://github.com/vocdoni/gnark) and [gnark-crypto](https://github.com/vocdoni/gnark-crypto) forks
- Vocdoni's [tinyGo](https://github.com/vocdoni/tinygo) fork
**All the source code we implemented** for performing this PoC (including Makefiles) [can be found here](https://github.com/vocdoni/gnark-prover-tinygo).
The online version of the benchmark tool (for Groth16 and bn254) that includes the WASM based frontend is published at https://vocdoni.github.io/gnark-prover-tinygo
The Plonk version can be found at https://vocdoni.github.io/gnark-prover-tinygo/index_plonk.html (it currently does not work due to a memory consumption issue; we rolled back some optimizations that allowed it to work in the past).
Vocdoni's circuit, once compiled, has the following number of constraints:
- Plonk: 113210
- Groth16: 48395
See [here](https://github.com/vocdoni/gnark-prover-tinygo/tree/main/circuits/zkcensus) our circuit implementation.
Finally, this was the [initial Gnark prover](https://github.com/vocdoni/gnark-wasm-prover) we built by stripping out much of the code (currently not being used) and minimizing parallelism and memory consumption.
## Go vs TinyGo
Before concentrating on TinyGo, we conducted tests using the native Go compiler. However, the results were far from ideal. The WebAssembly (WASM) binary produced was large, approximately 6 MiB (using compression), and its performance was notably poor.
On the other hand, TinyGo demonstrated significant improvements, particularly when utilizing optimization flags. The resulting WASM binary was considerably smaller, around 2.8 MiB.
In more detail, we chose TinyGo because:
1. Smaller binary size: TinyGo is designed to optimize for smaller binary sizes, making the resulting WASM files significantly smaller compared to those generated by the native Go compiler. Smaller binaries lead to faster loading times and reduced bandwidth usage.
2. Reduced memory usage: TinyGo focuses on optimizing memory usage, both in terms of the runtime memory footprint and the garbage collection process. This leads to more efficient memory usage, which is especially beneficial for running WASM applications in constrained environments, such as browsers or embedded devices.
3. Better WASM compatibility: TinyGo's LLVM-based compiler backend provides better compatibility with various WebAssembly features and the overall WASM ecosystem. This can help to ensure that the compiled WASM binary works well across different platforms and browsers.
The drawback, however, is that TinyGo does not support all Go packages and is subject to certain limitations.
## WASM or WASI
WebAssembly System Interface (WASI): WASI is a modular system interface for WebAssembly. It provides a set of standardized APIs that allow WebAssembly modules to interact with the host environment (e.g., the operating system) in a secure and sandboxed manner. WASI aims to enable WebAssembly applications to run consistently and securely across various platforms while providing access to essential system resources such as files, network connections, and more. While WASM focuses on providing a low-level virtual machine for code execution, WASI focuses on defining the interfaces that enable WebAssembly applications to interact with the outside world.
TinyGo supports both WASM and WASI; however, WASI appears to offer better support for Go libraries, as it provides improved access to host resources. Despite this advantage, WASI requires a more complex setup within the browser environment.
We conducted some tests with WASI, but it did not yield better performance. As a result, we chose to use WASM, as it is currently more standardized and easier to integrate with existing systems.
## Problems found
We encountered various challenges during the process, as listed below:
1. Compiling the current Gnark code into LLVM using TinyGo
2. Limitations of the TinyGo Reflect package
3. Slow deserialization of data structures in the browser
4. Absence of parallelization for WebAssembly
5. Memory constraints in web browsers
Nevertheless, we were able to devise (partial) solutions for each of these issues.
## Lessons learned
#### Compiling with TinyGo
The current TinyGo release does not successfully compile the Gnark source code. The primary issue lies with the `Reflect` package, which is not fully implemented. Fortunately, TinyGo is under active development, and by utilizing the current development branch, along with some pending Pull Requests and quick fixes, compilation becomes possible.
Our TinyGo version capable of compiling Gnark [can be found here](https://github.com/vocdoni/tinygo).
However, we identified three issues that could not be resolved at the TinyGo level, necessitating changes to the Gnark code:
1. Cbor serialization is not functional, so we replaced it with Gob.
2. The hints registry requires Reflect support for [AssignableTo with interface](https://github.com/vocdoni/tinygo/commit/83900cbbfb3ea967948908facb2cf2e8ba6e164c), which is not supported. Consequently, we had to refactor the hints registry.
3. For some reason, the bw6-761 curve fails to compile due to a `math/big` issue. We removed it to resolve this.
#### TinyGo Optimization Level
Regarding the optimization levels of TinyGo (using the `-opt=` flag), level 1 significantly improves performance and is the final setting we chose. Optimization level 2 causes a runtime panic that is challenging to debug. However, it would be ideal to make level 2 functional at some point, as it could potentially offer even greater performance improvements.
#### TinyGo and Memory Limits
This is arguably the **most critical finding**. The WebAssembly standard dictates that an application must allocate a specific amount of initial and maximum memory. Based on our research, memory expansion beyond the initial allocation does not function properly (whether due to browser limitations, OS restrictions, or TinyGo issues). In our tests, this process took an excessive amount of time (more than 30 minutes) or did not work at all.
In our initial attempts to use TinyGo, the zkSNARK proof was never completed, as the process halted midway through generation due to insufficient initial memory allocation.
To address this issue, we increased the initial memory allocation to the maximum allowed (4 GiB for the current WASM standard, but the in-development WASM64 standard would permit more).
After making this adjustment, the proofs were correctly computed. However, it is essential to determine the appropriate amount of initial memory to prevent devices with limited RAM (such as smartphones) from running out of memory. For Groth16, we found that 2 GiB of initial memory provided the right balance (anything less was insufficient). When the initial memory is not enough, execution hangs indefinitely without any error message.
These parameters can be set in the `targets/wasm.json` file of TinyGo.
```json
"ldflags": [
"--initial-memory=2147483648",
"--max-memory=4294967296",
"-zstack-size=16384"
],
```
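WebAssembly linear memory is measured in 64 KiB pages, so it can help to sanity-check the byte values above before editing the target file. The following is a minimal sketch; the helper name `pagesFor` is ours and is not part of TinyGo or the linker.

```go
package main

import "fmt"

// wasmPageSize is the fixed WebAssembly linear-memory page size (64 KiB).
const wasmPageSize = 64 * 1024

// pagesFor (hypothetical helper) converts a byte count into whole
// WebAssembly pages and reports whether the value is page-aligned.
func pagesFor(size uint64) (pages uint64, aligned bool) {
	return size / wasmPageSize, size%wasmPageSize == 0
}

func main() {
	// Values taken from the ldflags shown above.
	flags := []struct {
		name string
		size uint64
	}{
		{"--initial-memory", 2147483648},
		{"--max-memory", 4294967296},
	}
	for _, f := range flags {
		pages, aligned := pagesFor(f.size)
		fmt.Printf("%s=%d: %d GiB, %d pages, page-aligned=%v\n",
			f.name, f.size, f.size>>30, pages, aligned)
	}
}
```

The two values correspond to 2 GiB (32768 pages) and 4 GiB (65536 pages), the latter being the ceiling of a 32-bit address space.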
#### Deserialization
Deserializing artifacts can be challenging in a browser environment due to their large size and complex data structures.
We discovered that the most effective approach is to use `//go:embed <artifactFile>`, which incorporates the artifacts into the WASM binary file. While this increases the file size, it simplifies management and accelerates access. See an [example here](https://github.com/vocdoni/gnark-prover-tinygo/blob/main/wasm/g16/main.go#L13).
Initially, we encountered issues when deserializing the Groth16 Proving Key, as it demanded a significant amount of memory, and the process would occasionally fail to complete. We resolved this problem by employing **WriteRawTo()**, which bypasses compression on the proving key. As a result, the output file is slightly larger, but the performance has improved 100-fold. See the implementation [here](https://github.com/vocdoni/gnark-prover-tinygo/blob/main/cmd/compiler/groth16.go#L47).
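A likely reason raw encoding loads so much faster is elliptic-curve point compression: a compressed point stores only the x coordinate plus a sign bit, so the decoder has to recompute y with a modular square root for every point. The sketch below illustrates that trade-off with the standard library's P-256 curve as a stand-in. It uses neither bn254 nor Gnark's serializer, and we are assuming this is the kind of compression that is skipped.

```go
package main

import (
	"crypto/elliptic"
	"fmt"
)

// encodings returns the compressed and uncompressed encodings of the
// P-256 base point (a stand-in for the points stored in a proving key).
func encodings() (compressed, raw []byte) {
	c := elliptic.P256()
	p := c.Params()
	return elliptic.MarshalCompressed(c, p.Gx, p.Gy), elliptic.Marshal(c, p.Gx, p.Gy)
}

// sameAfterDecoding decodes both encodings and checks they yield the same point.
// Decoding the compressed form must recover y from x, which is the extra
// work a raw encoding avoids.
func sameAfterDecoding(compressed, raw []byte) bool {
	c := elliptic.P256()
	x1, y1 := elliptic.UnmarshalCompressed(c, compressed)
	x2, y2 := elliptic.Unmarshal(c, raw)
	if x1 == nil || x2 == nil {
		return false
	}
	return x1.Cmp(x2) == 0 && y1.Cmp(y2) == 0
}

func main() {
	compressed, raw := encodings()
	fmt.Println("compressed bytes:", len(compressed))
	fmt.Println("raw bytes:", len(raw))
	fmt.Println("same point:", sameAfterDecoding(compressed, raw))
}
```

The raw form is larger on disk but needs no per-point arithmetic when read back, which matters when a key holds a very large number of points and runs single-threaded in the browser.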
## Proposals for Gnark
In order to make the current Gnark code more performant and friendly to browsers and LLVM, we propose the following:
#### Improve code modularity on elliptic curves
At present, the Gnark code design suffers from a lack of modularity and insufficient use of Go Interfaces. Importing a single curve, such as bn254, results in the importation of all available curves. Additionally, it is challenging to utilize specific portions of the code without inadvertently importing numerous packages and dependencies that are not required. This issue can lead to larger binaries, slower init times, and a higher likelihood of running into TinyGo bugs.
We acknowledge that addressing this issue is a complex task that affects much of the existing code. As a temporary workaround, we propose modifying the current code generator to enable third parties to fork the code and easily add or remove elliptic curves and constraint protocols. This approach would provide a more flexible and adaptable solution until a more comprehensive refactoring can be undertaken.
We managed to [make some changes](https://github.com/vocdoni/gnark/commit/f286ca8b1e575cbcda3a0af19fc4823a73ca1ab7) to the Gnark code generator to partially accomplish this objective, but there is still pending work to do.
#### Remove non-essential imports
A repository with fewer dependencies is typically more secure and easier to maintain for several reasons.
First, it reduces the attack surface risk that comes with third-party libraries. Second, using and updating TinyGo becomes easier, as third party libraries tend to have worse support. Third, we often get smaller binaries, as standard library packages tend to be reused.
In particular, the serialization library **cbor** presents challenges. We believe that it does not offer significant benefits compared to the standard package gob. Moreover, its use renders the code incompatible with the current version of TinyGo.
We replaced cbor with gob [in this commit](https://github.com/vocdoni/gnark/commit/89e6d390904c7e68e548485751af6d12b8172731).
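For readers unfamiliar with `encoding/gob`, a round trip looks roughly like the sketch below. The `artifact` type and its fields are hypothetical placeholders, not Gnark's actual data structures.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// artifact is a hypothetical stand-in for a serializable prover structure.
// gob only encodes exported fields.
type artifact struct {
	Curve       string
	Constraints int
	Coeffs      [][]byte
}

// encode serializes an artifact with the standard library gob encoder.
func encode(a artifact) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(a); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode reads an artifact back from gob-encoded bytes.
func decode(data []byte) (artifact, error) {
	var a artifact
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&a)
	return a, err
}

func main() {
	in := artifact{Curve: "bn254", Constraints: 48395, Coeffs: [][]byte{{1, 2}, {3}}}
	data, err := encode(in)
	if err != nil {
		panic(err)
	}
	out, err := decode(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, decoded: %+v\n", len(data), out)
}
```

Because gob ships with the standard library, it adds no third-party dependency to the build.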
#### Refactor on the hints registry
We needed to modify the hints registry to ensure compatibility with TinyGo.
The current Hints identifier relies on the reflect package to extract the name of the package and function where the hint is defined. Regrettably, this reflect capability is not yet supported in TinyGo. Moreover, we contend that depending on the module path and package name is not an optimal approach for defining a unique identifier, as it is rather static and can lead to compatibility issues with third-party forks, such as ours.
To address this, we altered the approach for the hint identifier by introducing a new field called "name." This field is now required for every hint function to be registered. This change offers a more flexible and adaptable solution, reducing potential compatibility problems and making it easier to manage and maintain the hint functions within multiple codebases.
The commit introducing this change [can be found here.](https://github.com/vocdoni/gnark/commit/951aa9e8cdd23c3cc4c44afdcc396d06681bcf96)
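The idea of a name-keyed registry can be sketched as follows. This is not the code from the commit: `Registry`, `NamedHint`, `HintFn`, and `runHint` are hypothetical names chosen for illustration, and the hint signature only loosely mirrors the shape of a Gnark hint.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// HintFn is a hypothetical hint shape: it fills outputs from inputs modulo mod.
type HintFn func(mod *big.Int, inputs, outputs []*big.Int) error

// NamedHint pairs a hint with the explicit name that identifies it,
// so no reflection on package or function names is needed.
type NamedHint struct {
	Name string
	Fn   HintFn
}

// Registry stores hints keyed by their explicit name.
type Registry struct{ hints map[string]HintFn }

func NewRegistry() *Registry { return &Registry{hints: make(map[string]HintFn)} }

// Register rejects empty and duplicate names.
func (r *Registry) Register(h NamedHint) error {
	if h.Name == "" {
		return errors.New("hint name is required")
	}
	if _, dup := r.hints[h.Name]; dup {
		return fmt.Errorf("hint %q already registered", h.Name)
	}
	r.hints[h.Name] = h.Fn
	return nil
}

// Lookup returns the hint registered under name, if any.
func (r *Registry) Lookup(name string) (HintFn, bool) {
	fn, ok := r.hints[name]
	return fn, ok
}

// inverseHint writes the modular inverse of inputs[0] into outputs[0].
func inverseHint(mod *big.Int, inputs, outputs []*big.Int) error {
	if outputs[0].ModInverse(inputs[0], mod) == nil {
		return errors.New("input is not invertible")
	}
	return nil
}

// runHint is a small convenience wrapper for single-input, single-output hints.
func runHint(fn HintFn, mod, in int64) (int64, error) {
	out := []*big.Int{new(big.Int)}
	err := fn(big.NewInt(mod), []*big.Int{big.NewInt(in)}, out)
	return out[0].Int64(), err
}

func main() {
	reg := NewRegistry()
	if err := reg.Register(NamedHint{Name: "inverse", Fn: inverseHint}); err != nil {
		panic(err)
	}
	fn, _ := reg.Lookup("inverse")
	inv, err := runHint(fn, 7, 3)
	fmt.Println(inv, err)
}
```

With an explicit name, the identifier stays stable when the module path changes, which is the situation a fork runs into.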
#### Add TinyGo specific build files
Executing software in a browser using WebAssembly (WASM) presents different requirements compared to running it natively. The primary constraint is memory usage; minimizing it is crucial for achieving better performance in browsers.
Currently, parallelization is not possible in WASM, and attempting it may result in increased resource consumption, especially memory, since extra goroutines add CPU and memory overhead that brings no benefits when WASM limits us to one thread.
Adopting best practices, such as reusing variables and minimizing stored data structures in memory, can lead to better browser integration and performance since allocations seem to be much more expensive on WASM.
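As an illustration of variable reuse, the sketch below accumulates a sum of squares modulo `m` while reusing a single scratch `big.Int`, rather than allocating a new value on each iteration. The function and helper names are ours and the computation is only an example, not code from gnark-crypto.

```go
package main

import (
	"fmt"
	"math/big"
)

// sumOfSquares computes (x_0^2 + x_1^2 + ...) mod m, reusing one scratch
// value across iterations instead of allocating a fresh big.Int each time.
func sumOfSquares(xs []*big.Int, m *big.Int) *big.Int {
	acc := new(big.Int)
	scratch := new(big.Int) // reused on every iteration
	for _, x := range xs {
		scratch.Mul(x, x)
		scratch.Mod(scratch, m)
		acc.Add(acc, scratch)
		acc.Mod(acc, m)
	}
	return acc
}

// ints is a small helper that builds a slice of big.Int from int64 values.
func ints(vals ...int64) []*big.Int {
	out := make([]*big.Int, len(vals))
	for i, v := range vals {
		out[i] = big.NewInt(v)
	}
	return out
}

func main() {
	fmt.Println(sumOfSquares(ints(2, 3, 4), big.NewInt(7)))
}
```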
On the other hand, when running code natively on computers, parallelization and memory allocation play a significant role in enhancing performance.
To address these different requirements, we propose creating specific files (those most relevant in this context) that utilize the build tag `// +build tinygo` or `//go:build wasm` to implement some of the existing operations optimized for browser execution. This approach allows for tailored solutions that cater to the unique performance demands of both browser and native environments.
[On this commit](https://github.com/vocdoni/gnark-crypto/commit/3e72368bec7e34a7d90672e60aa0977c005d621a) you can see a change we did to avoid parallelization on gnark-crypto.
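The same idea can be sketched in a single file by choosing the worker count at run time. Note that this swaps the proposed build tags for a `runtime.GOARCH` check so the example stays self-contained; `workers` and `execute` are hypothetical helpers, not gnark-crypto functions.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// workers returns how many goroutines to use: one on wasm, where extra
// goroutines only add overhead, and one per CPU elsewhere.
func workers(goarch string) int {
	if goarch == "wasm" {
		return 1
	}
	return runtime.NumCPU()
}

// execute runs work over the range [0, n), split into at most nbWorkers chunks.
// With a single worker it calls work directly and spawns no goroutines.
func execute(n, nbWorkers int, work func(start, end int)) {
	if nbWorkers <= 1 || n <= 1 {
		work(0, n)
		return
	}
	if nbWorkers > n {
		nbWorkers = n
	}
	chunk := (n + nbWorkers - 1) / nbWorkers
	var wg sync.WaitGroup
	for start := 0; start < n; start += chunk {
		end := start + chunk
		if end > n {
			end = n
		}
		wg.Add(1)
		go func(s, e int) {
			defer wg.Done()
			work(s, e)
		}(start, end)
	}
	wg.Wait()
}

func main() {
	out := make([]int, 10)
	execute(len(out), workers(runtime.GOARCH), func(start, end int) {
		for i := start; i < end; i++ {
			out[i] = i * i // each chunk writes a disjoint index range
		}
	})
	fmt.Println(out)
}
```

A build-tagged file pair would make the same choice at compile time, with the benefit that the parallel code path is not compiled into the WASM binary at all.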
## Benchmarks
+ URL: https://vocdoni.github.io/gnark-prover-tinygo/
+ Curve: bn254
+ Protocol: Groth16
+ Circuit constraints: 48395
+ WASM compilation: `tinygo build -target=wasm -no-debug -opt=1 -scheduler=asyncify`
| Computer brand/model | Browser | RAM | Time elapsed (ms) |
| -------- | -------- | -------- | -------- |
| Thinkpad X1/Gen10 | Brave | 32 GiB | 9425 |
| Pixel 4a | Fennec | 6 GiB | 23640 |
| LGGram | Firefox | 32 GiB | 17297 |
| Thinkpad P14s (AMD 5800u) | Firefox 112 | 32 GiB | 11300 |
| Thinkpad X1/Gen10 | Firefox | 32 GiB | 7404 |
| Pixel 4a | Firefox 111 | 6 GiB | 21900 |
| Samsung Galaxy A32 | Brave | 4 GiB | 30919 |
| Apple Macbook Air M2 | Chrome | 16 GiB | 9208 |
| Apple Mac mini M1 | Chrome | 16 GiB | 11151 |
| iPhone 12 iOS 16.3.1 | Safari | 4 GiB | out of memory |
| OnePlus Nord 2 | Chrome | 8 GiB | 11454 |
| Pixel 4 5G | Firefox | 6 GiB | 20504 |
| Pixel 4 5G | stock Chrome | 6 GiB | does not work |
| Apple Macbook Pro (mid 2015) | Chrome | 16 GiB | 18679 |
| Thinkpad x13 Gen3 i7-1270P | Firefox | 32 GiB | 6364 |
| Thinkpad x13 Gen3 i7-1270P | Chromium | 32 GiB | 8045 |
| Thinkpad x13 Gen3 i7-1270P | qutebrowser (QtWebEngine) | 32 GiB | 6696 |
| Pixel 6 | Chromium | 8 GiB | 10767 |
| Thinkpad X1/Gen10 | Brave | 32 GiB | 7861 |
| Pixel 4a | Vanadium | 6 GiB | 22294 |
| Apple Macbook Air M1 | Firefox | 8 GiB | 6651 |
| Apple Macbook Air M1 | Chrome | 8 GiB | 6114 |