Technical Report: Proof Generation & Verification Benchmark of `bn256` Implementations in the SP1 zkVM

### **Technical Report: End-to-End ZK Proof Benchmark of bn256 Implementations in the SP1 zkVM** This report extends the previous benchmark analyses to cover the entire Zero-Knowledge proof lifecycle, from execution to verification. It evaluates the performance of three distinct `bn256` libraries within the SP1 zkVM, measuring the impact of cryptographic library choice and specialized precompiles on the resource-intensive proving and verification stages. **Date:** August 8, 2025 **Platform:** SP1 zkVM (CPU Prover) **Hardware:** Apple MacBook Pro (M3 Max) ### **Objective** The goal is to benchmark the end-to-end performance of ZK applications built with different underlying cryptographic libraries. This analysis moves beyond execution cycle counts to measure the real-world time required for: * **Proof Generation:** The computationally expensive process of creating a cryptographic proof of the guest program's execution. * **Verification:** The process of validating the generated proof. This provides a comprehensive view of how library optimization and zkVM precompiles affect the practical viability and developer experience of building with Zero-Knowledge proofs. ### **Methodology** The experiment used the same tripartite Diffie-Hellman key exchange guest program as the previous benchmarks to ensure a consistent computational workload. The SP1 toolchain was used to execute each guest program, generate a proof of that execution, and subsequently verify the proof. #### **Experimental Configurations** The five configurations from the updated execution benchmark were used: 1. **bn-pairing:** Standard `substrate_bn` crate, no precompiles. 2. **bigint-pairing:** `substrate_bn` modified to use `crypto-bigint`, no precompiles. 3. **ark-pairing:** Uses the `ark_bn254` crate from the `arkworks` ecosystem, no precompiles. 4. **bn-pairing-patched:** `substrate_bn` with the specialized `bn` precompile enabled. 5. **bigint-pairing-patched:** `crypto-bigint` version with the generic `bigint` precompile enabled. 6. **ark-pairing-patched**: This is Identical to the `ark-pairing` guest program, but the bigint operation swapped with `sp1::mul_mod` precompiles where is can be applied, this should cut do the execution cycle count. ### **Results** The total time for proof generation and verification was recorded for each configuration. The results, derived from the prover and verifier logs, are summarized below. **Shorter times indicate better performance.** | Configuration | Base Crate | Precompile Enabled | Cycle Count | Proof Generation Time | Verification Time | Total Time | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | | `bigint-pairing` | `crypto-bigint` | No | 1,523,558,068 | ~4 hr 50 min | ~2 min 29 sec | ~4 hr 52 min | | `bn-pairing` | `substrate_bn` | No | 1,105,498,339 | ~4 hr 56 min | ~1 min 52 sec | ~4 hr 58 min | | `bigint-pairing-patched` | `crypto-bigint` | Yes (`bigint`) | 518,877,400 | ~3 hr 16 min | ~1 min 19 sec | ~3 hr 17 min | | `ark-pairing-patched` | `ark_bn254` | **Yes** | **422,122,898** | **~1 hr 23 min** | **~44 sec** | **~1 hr 24 min** | | `ark-pairing` | `ark_bn254` | **No** | **428,207,591** | **~1 hr 17 min** | **~44 sec** | **~1 hr 18 min** | | `bn-pairing-patched` | `substrate_bn` | Yes (`bn`) | **40,014,404** | **~40 min 47 sec** | **~21 sec** | **~41 min 8 sec** 🏆 | ### **Analysis & Discussion** This end-to-end analysis confirms and amplifies the conclusions from the execution benchmarks, revealing the profound impact of both library choice and precompiles on prover performance. #### **Without Precompiles: Prohibitively Expensive** Both `bn-pairing` and `bigint-pairing` required an extremely long time to prove, taking nearly **5 hours** each. These times are prohibitively long for almost any practical application or development cycle. Interestingly, `bigint-pairing`, despite having ~38% more execution cycles, proved slightly faster, suggesting that the raw instruction count is not the only factor influencing prover performance; the structure of the execution trace also plays a role. #### **The Impact of Optimizations: A Tale of Two Strategies** The introduction of `arkworks` and SP1 precompiles showcases two powerful but distinct optimization strategies. * **"Fat" `bn` Precompile (The Winner 🏆):** The `bn-pairing-patched` configuration delivered a transformative result. It reduced the total time from nearly 5 hours to just **~41 minutes**. This represents a staggering **~7.2x performance improvement** over its non-precompiled version. This is the gold standard, demonstrating that a specialized, high-level precompile that accelerates the entire cryptographic protocol is the ultimate performance unlock. * **Optimized Library (`arkworks`) 🤯:** The `ark-pairing` result is the most significant new finding. With **no precompiles**, it generated a proof in just **~1 hour and 17 minutes**. This is over **3.8x faster** than the other non-precompiled libraries. This proves that a well-engineered, algorithmically-optimized cryptographic library can drastically reduce the proving burden on its own. #### **Key Insight: Optimized Library vs. Generic Precompile** The most crucial comparison is between `ark-pairing` and `bigint-pairing-patched`. * The `ark-pairing` program (**no precompile**) was approximately **2.7x faster to prove** than `bigint-pairing-patched` (**with a generic `bigint` precompile**). This result is unequivocal: a superior, general-purpose library is far more valuable than a less-optimized library augmented with only low-level, generic precompiles. Accelerating just the basic integer math is not enough; the high-level optimizations within the `arkworks` library had a much larger impact on reducing the overall complexity of the execution trace for the prover. ### **Conclusion** This end-to-end analysis proves that both specialized precompiles and highly-optimized libraries are essential for the practical application of Zero-Knowledge proofs. 1. **Specialized Precompiles Are Mandatory for Peak Performance:** For complex cryptographic workloads like `bn256` pairings, the SP1 `bn` precompile is not just an optimization—it's what makes the technology feasible on consumer hardware. It reduces proof generation from a multi-hour commitment to under an hour, crossing a critical threshold for usability. 2. **A High-Quality Library is a "Game-Changer":** In the absence of a specialized, high-level precompile, the choice of cryptographic library is the single most important factor. The `arkworks` library provided a massive performance boost that **surpassed even the gains from a generic, low-level precompile.** For developers building in the SP1 zkVM, the recommendation is clear: prioritize the use of specialized precompiles like `sp1-patches/bn` whenever possible. If one is not available for your specific use case, selecting a modern, highly-optimized library like `arkworks` is the next most critical step to ensure manageable and efficient proof generation.