# PVSS transcript verification

## Design study

- Note: Calldata cost will be reduced by EIP-4844, which could significantly reduce on-chain verification cost
- Note: [EIP-2537 ("bls12-381 EIP")](https://eips.ethereum.org/EIPS/eip-2537) will **not** take effect after the Shanghai fork. The timeline for this EIP is unknown.

### Vanilla Solidity, on-chain verification

- bls12-381 addition written in Solidity
  - Note that [precompiles on Ethereum are not activated yet](https://eips.ethereum.org/EIPS/eip-2537#proposed-addresses-table)
- Simply post all transcripts on-chain, then add (aggregate) and verify them on-chain
- Pros
  - Simple
- Cons
  - Costly computation
    - Calldata cost
    - Computing cost
- Initial research
  - No Solidity implementation for bls12-381 found
  - An addition implementation will require 2-limb uint256 numbers (256 bits is not enough for the 381-bit coordinates of group elements)

```mermaid
sequenceDiagram
    autonumber
    loop For n validators
        validator->>dkg.sol: submit(transcript)
        dkg.sol-->dkg.sol: aggregate = aggregate + transcript
    end
    validator->>dkg.sol: verifyAggregate()
    dkg.sol-->>validator: ok
```

### ZK coprocessor (Risc0), off-chain verification

- Variant 1
  - Post transcript data on-chain
  - Compute and verify transcript aggregations off-chain, then verify the resulting proof on-chain
  - Trading increased latency (proving takes time) for a constant gas cost
    - Would involve a zk coprocessor (like Risc0)
    - Constant in security parameters, instead of linear
  - Pros
    - Lower computing (gas) cost
  - Cons
    - Calldata cost remains high
    - Still have to pay for proof verification

```mermaid
sequenceDiagram
    autonumber
    loop For n validators
        validator->>dkg.sol: submit(transcript)
    end
    validator->>dkg.sol: verifyAggregate()
    dkg.sol->>dkg.sol: emitNewJobEvent()
    worker->>worker: listenForEvents()
    worker->>worker: transcripts = recoverTranscriptsFromCalldata()
    worker->>zkvm: aggregateAndVerify(transcripts)
    zkvm-->>worker: (proof)
    worker->>dkg.sol: submitProof(proof)
    dkg.sol->>dkg.sol: verifyProof(proof)
```

- Variant 2
  - Just like Variant 1, but we post transcript hashes instead of complete transcripts on-chain
  - We side-load actual transcripts into a prover off-chain, and we verify integrity of transcript data (vs on-chain hash) in addition to aggregation etc.
  - Pros
    - Reduced calldata cost and computing cost
  - Cons
    - Requires an ability to side-load data to prover runtime
    - Paying for proof

Off-chain verification could be computed using scaling solutions. We could further divide them into ones with optimistic assumptions (Cartesi rollups with interactive fraud proofs) or zk-verifiable computations (Risc0, see their Bonsai offering).

```mermaid
sequenceDiagram
    autonumber
    loop For n validators
        validator->>dkg.sol: submit(transcriptHash)
        validator->>sidechannel: submit(transcript)
    end
    validator->>dkg.sol: verifyAggregate()
    dkg.sol->>dkg.sol: emitNewJobEvent()
    worker->>worker: listenForEvents()
    worker->>sidechannel: getTranscripts()
    sidechannel-->>worker: transcripts
    worker->>zkvm: aggregateAndVerify(transcripts)
    zkvm-->>worker: (proof)
    worker->>dkg.sol: submitProof(proof)
    dkg.sol->>dkg.sol: verifyProof(proof)
```

#### bls12-381 G1 addition in Risc0

X-axis - number of G1 elements
Y-axis - time in seconds

![](https://i.imgur.com/nwzAfpH.png)

#### To Do

- [ ] Benchmark proof verification gas cost
- [ ] Add pairing to zkvm benchmark

### Optimistic verification

- A generic protocol for optimistic verification
  - Instead of DKG verification, Participants post hashes of their aggregate
  - During some time window, a Challenger may create a Challenge against a particular hash
    - That essentially means challenging the subset of Participants that submitted a certain aggregate, the Challenged Participants
  - Once the Challenge is created, the Challenge Period is started
    - During this period, at least one of the Challenged Participants has to post a proof of transcript correctness
    - If the proof checks out, Challenged Participants may continue as usual, and the rest of the
Participants are slashed
    - If the proof doesn't check out, Challenged Participants are slashed, and the rest of the Participants continue as usual
- The challenge can be settled by providing a proof
  - Either by opening a polynomial directly on-chain
  - Or providing a proof of a polynomial opening and verifying the proof on-chain
  - Depends on which one is cheaper

### KZG commitments

- Use a temporary [Powers of Tau set of parameters for BLS12-381 from Ethereum KZG ceremony](https://seq.ceremony.ethereum.org/info/current_state)
- Various implementations of KZG10 (bls12-381) in Rust
  - https://github.com/sifraitech/rust-kzg
  - https://github.com/ralexstokes/kzg
- No Solidity implementation for bls12-381 curve, and no precompiles activated (yet) for EVM on Ethereum
  - This is a blocker for deploying this into prod

![](https://i.imgur.com/AOcdBDS.png)

- Verifier is on-chain, Prover is off-chain

Notes on [KZG commitment scheme and its use in Ferveo](https://hackmd.io/KVxrZGRORomdaAGgPLsVcg)

#### Gas cost

- According to [this Solidity implementation](https://github.com/weijiekoh/libkzg#solidity-contract) (note that it supports the BN254 curve, not bls12-381):

| Coeffs/Points | `commit()` Gas | `verifyMulti()` Gas | Total Gas | Cost (USD) @ Ethereum | Cost (USD) @ Matic |
|-|-|-|-|-|-|
| 1 | 34,929 | 193,084 | 228,013 | 7.7520 | 0.0397 |
| 8 | 22,889 | 405,237 | 428,126 | 14.5554 | 0.0746 |
| 16 | 223,476 | 760,973* | 984,449 | 33.4693 | 0.1715 |
| 32 | 424,786 | 1,521,946* | 1,946,732 | 66.1850 | 0.3392 |
| 64 | 827,980 | 3,311,057* | 4,139,037 | 140.7190 | 0.7212 |

*Extrapolated from benchmark data

- Using gas cost data from [this calculator](https://www.cryptoneur.xyz/gas-fees-calculator):
  - 15th March
  - ETH @ 1699.9 USD
  - MATIC @ 1.21 USD
- Expecting gas cost to be 40-60% higher for bls12-381

#### Notes on KZG implementations

- Goals
  - Replace some of the Ferveo checks ("#3 check") with opening of a polynomial commitment
  - Use KZG impl.
with bls12-381 support
    - No such impl. in Solidity
    - Precompiles are not activated yet
  - Use Ethereum's KZG trusted setup
  - Use batched commitments

## Naive optimistic DKG

- In this design we are trading off robustness for maximum simplicity
- The basics are similar to optimistic verification
  - The difference is that the challenge period is replaced with a failover
- Let's see an example:
  - Nodes are exchanging transcripts and verifying them locally (p2p verification)
  - An honest node receives a bad transcript and triggers a failover
  - A failover, unlike a challenge period, can't be recovered from and results in a DKG ritual failure
- So in this naive approach there is no on-chain verification of transcripts.
  - We are assuming that we're dealing with an honest majority of nodes
- What if a malicious node triggers a failover?
  - This and other issues related to quality of the cohort, slashing, trust assumptions etc. may be conveniently ignored if we assume honest supermajority/absolute honesty of nodes
  - This assumption makes sense in the context of "staker slots". See below:
    - We may have a permissioned cohort of nodes for the purposes of an early launch with this type of design. It would later be replaced with a more robust solution.
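
The failover behaviour described above can be sketched as a tiny state machine. This is only an illustration under assumptions: `RitualState`, `Event` and `step` are hypothetical names that do not exist in Ferveo or in any contract, and the sketch ignores who is allowed to report an event.

```rust
/// Hypothetical ritual states for the naive design (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RitualState {
    /// Nodes exchange transcripts and verify them locally (p2p).
    Sharing,
    /// Every node accepted every transcript; terminal.
    Finalized,
    /// Some node triggered a failover; terminal, no recovery.
    Failed,
}

/// Hypothetical events that nodes can report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    AllTranscriptsAccepted,
    FailoverTriggered,
}

/// Advance the ritual by one event.
/// A single failover report is enough to fail the ritual, which is exactly
/// why a malicious node can abort it under this design.
pub fn step(state: RitualState, event: Event) -> RitualState {
    match (state, event) {
        (RitualState::Sharing, Event::AllTranscriptsAccepted) => RitualState::Finalized,
        (RitualState::Sharing, Event::FailoverTriggered) => RitualState::Failed,
        // Terminal states absorb every further event.
        (terminal, _) => terminal,
    }
}
```

Unlike a challenge period, there is no transition out of `Failed`, so the only remedy in this sketch is to start a new ritual.
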
## Trade-off spaces in DKG verification design

- Eager vs lazy
  - Eager - produces verification result "instantly" on-chain
  - Lazy - produces verification result on-chain after some delay, within some time window
- Off-chain vs on-chain
  - Off-chain - less gas intensive, may produce a significant latency
  - On-chain - more gas intensive, has no latency overhead

|-|Eager|Lazy|
|-|-|-|
|On-chain|Vanilla Solidity implementation|Optimistic verification|
|Off-chain|-|ZK coprocessor|

|-|High-trust|Low-trust|
|-|-|-|
|On-chain|Eager verification|Optimistic verification|
|Off-chain|Trustless 3rd party|p2p verification|

## Alternative designs

- Running your own L1

## Local (client-side) verification

- Client downloads transcript data from the DA layer and performs a local transcript verification
- Approach 1
  - We're assuming no code changes in Ferveo, we're just exposing existing logic to the client-side API
  - `PubliclyVerifiableSS` contains `{ coeffs: Vec<E::G1Affine>, shares: Vec<E::G2Affine>, sigma: E::G2Affine }`, let's call it "transcript"
  - A transcript aggregate is just a bunch of transcripts added together (collapsed into a single `PubliclyVerifiableSS` instance)
  - So we can store transcripts on-chain and the client should be able to download them, create an aggregate, and run all the checks that a server would usually do:
    - `verify_optimistic`
      - Uses PVSS transcript
    - `verify_full`
      - Uses PVSS transcripts (coeffs)
      - Uses DKG (domain (for FFT), encrypted shares, validator PKs)
      - The goal is to compute this pairing check: `E::pairing(dkg.pvss_params.g, *y_i) == E::pairing(a_i, ek_i)`
    - `verify_aggregation`
      - Uses PVSS transcripts (coeffs)
      - Uses DKG params to call `verify_full`
  - To perform `verify_full`, we also need public keys of all validators involved
  - What else do we need?
    - We need to prepare commitments by doing `dkg.domain.fft_in_place(&mut commitment)`. So we need a `domain`, which we can instantiate on the client side if we know `shares_num`.
  - In theory, we just need to have the aggregate on-chain instead of posting all of the transcripts
    - Does that offer sufficient security?
    - On a completely different note: then, how do we say which transcript was bad and which was not?
  - Raw notes:
    - "But local verification hints at another development. We've been talking about the "fail-over" or other "error-modes" for the ritual, where an honest node rings an alarm if verification fails on their end. Maybe we could use client-side verification failure as another signal to fail-over, i.e. we need at least one node and client to raise the alarm at the same time."
    - "If we formulate the "fail-over condition" as having "at least one honest node signal AND one client signal", then I'd argue we don't need a separate period for the client: If at least one honest node signals an issue, we can "mark" DKG ritual as "challenged" after the secret-sharing period (the "transcript posting" period?). And then, the client acts as a final arbiter and can trigger a fail-over. Arguably, the client should always perform local verification - there's a marginal possibility that there is no single honest node in the cohort. In that case, the client may decide not to use the DKG but they will not get their money back and it's not possible to slash bad nodes. Which is pretty bad, but without a trustless arbiter this is the trade-off that we're given. Side note: Since we don't have slashing etc. at the moment, the fail-over is only relevant for auditability of nodes. And so since there are no economic incentives, certain assumptions about the honesty of the nodes or the client are different."
- Can we do better than Approach 1 using KZG?

## User flows

Which user/staker flows require on-chain transcript verification or some other gas-intense operation?

- DKG ritual initialization
  - Requires a transcript verification
- DKG key recovery
  - Requires 1..n off-chain computations (one per recovered key)
  - Usually followed up by a single key refreshing
- DKG key refreshing
  - Requires a transcript verification
- Stake unbonding
  - Requires a key recovery to replace a node in each of its active rituals
  - Should be followed by key refreshing to revoke staker from their DKGs

## Proposed design for the PoC

- An optimistic verification using KZG
- Short protocol description:
  - Nodes are verifying transcripts p2p
  - An honest node receives a bad transcript and starts a dispute period
  - Nodes are placing commitments on-chain
  - Optimistic verification period is started
  - Any node can challenge a commitment and then submit a proof
  - The proof in question is either a polynomial opening on-chain or a proof of the opening computed off-chain and verified on-chain, whichever is cheaper
- In this protocol, KZG commitments replace Feldman's commitments in Ferveo PVSS scheme

```mermaid
sequenceDiagram
    Note right of validator: p2p verification
    autonumber
    loop For n validators
        validator->>validator: transcript = deal()
        validator->>sidechannel: submit(transcript)
        validator->>validator: verify(transcript)
    end
    Note right of validator: validator triggers a dispute
    validator ->> dkg.sol: dispute()
    autonumber
    loop For n validators
        validator->>dkg.sol: commitment = commit(coefficients)
    end
    validator->>dkg.sol: settle()
    alt verify commitments on-chain
        validator->>dkg.sol: verifyMulti(commitment, proof)
    else verify commitments off-chain
        worker->>sidechannel: getTranscripts()
        sidechannel-->>worker: transcripts
        worker->>zkvm: proof = prove(verifyMulti(commitment, proof))
        zkvm-->>worker: (proof)
        worker->>dkg.sol: submitProof(proof)
        dkg.sol->>dkg.sol: verifyProof(proof)
    end
    dkg.sol-->>validator: (ok)
    Note right of validator: dispute is settled
```

### Notes on this design

- Replacing (eager) verification problem with an optimistic (lazy) verification and
a data availability (DA) problem
  - Nodes are verifying each other (p2p) with an on-chain dispute resolution
  - Honest majority assumption
- DA for the protocol ("sidechannel")
  - p2p for the green path
    - Nodes share and verify transcripts
  - Distributed storage for red paths
    - Dispute resolution requires transcripts availability
  - Leads for DA
    - IPFS, Arweave, Filecoin, Kyve (Arweave), Chia
    - Celestia, Polygon Avail
    - [Notes on DA](https://hackmd.io/2ghVcdCgQQCFU9kIKRbzWA)
- Proof verification cost
  - Verifier gas cost vs fee structure of Bonsai (Risc0 rollup architecture)

### The proposed research roadmap

In order of priorities:

- Risc0
  - Should be the easiest and the quickest way to get it done
  - Cost structure unknown, requires a prover
- Solidity
  - Useful for benchmarks
  - Run it on Polygon or zkSync Era, or any other EVM-compatible rollup/zkvm
  - Probably still too costly for EVM, but not a lot of technical risk
- zk-rollups (EVM incompatible)
  - Require bls12-381 impl in a specific SC language (like Cairo)
- Halo2
  - Difficult, requires deployment of aggregation infra

### To Do

- [ ] How to reuse Ethereum's ceremony?
  - [Notes](https://hackmd.io/KkuNdDiETvK_iDfjUfz6Ow)
  - [x] Sequencer client for Rust
  - [ ] **Reusing trusted setup with KZG10 or one of its variants**
    - [x] Polynomial degree mismatch issue research
      - Fixed by enforcing polynomial degree (SonicKZG10, MarlinKZG10 and some fortified variants of KZG10)
- [ ] Benchmark on-chain opening vs off-chain opening verification
  - [ ] Estimate the on-chain verification cost (Solidity, bls12-381)
    - Maybe we don't have to implement it - guesstimate it using other implementations
- [ ] Research using coprocessor (Risc0, Axiom, etc.)
  - **Risc0**
    - [ ] Consuming results from Bonsai rollup
      - Nodes/users can read data from Bonsai just like from any other side-channel
      - [ ] Sketch a design for that; we need to sketch the coprocessor architecture as a whole anyway
    - [ ] Verify that we can sideload data into prover (coprocessor) without posting it on-chain; [Risc0 Discord](https://discord.com/channels/953703904086994974/975108547589324910/1090170351490703420)
    - [ ] Some issues with arkworks crates (`ark_serialize@0.3.0`); [Risc0 Discord](https://discord.com/channels/953703904086994974/975108547589324910/1090170351490703420)
      - [x] Tried implementing serialization by hand, runs out of memory (64GB)
      - [ ] Use zero-copy implementation instead
    - **EC curve support and BigNumber support are not here yet, so computing a KZG opening is not possible yet**
      - Waiting for PRs to be merged:
        - [#466](https://github.com/risc0/risc0/pull/466)
- [ ] Research in-circuit verification of KZG opening
  - Complicated, tech heavy approach
  - [ ] **Halo2**
    - Need a Solidity verifier for Halo2 proofs. There is some work done by [Scroll](https://github.com/scroll-tech/halo2-snark-aggregator) - we need to use the [PSE Halo2 fork](https://github.com/privacy-scaling-explorations/halo2) that uses KZG* instead of IPA;
      - *note that this is not the in-circuit verification case - here we simply mean the SRS, or the trusted setup
    - Similarly, [Axiom has released](https://www.axiom.xyz/blog/open-source) their aggregation & verification tools for [Halo2](https://github.com/axiom-crypto/)
      - It's based on the aforementioned work by PSE
    - I was told that the work by Scroll is not finished yet and their Halo2 aggregation doesn't work yet. IDK what the Axiom status is.
      - Someone else told me it works, so I guess I just need to try it for myself and see.
    - **Axiom (`axiom-crypto`) released a promising set of tools to work with it, so I'm going to be looking at that**
      - **They actually supported someone who implemented KZG opening, but with bn254**
      - The author expressed interest in supporting development for bls12-381 version
- Shallow research on other approaches
  - ZK-rollups
    - ZkSync Era ([1](https://twitter.com/krzKaczor/status/1641454252504797195), [2](https://twitter.com/_bfarmer/status/1641499616523812866))
    - Makes sense to try all EVM-compatible zk rollups etc. ([any type should work](https://vitalik.ca/general/2022/08/04/zkevm.html))
  - Starkware (Cairo)
    - I have a lead for [someone](https://github.com/keep-starknet-strange/garaga) who may be interested in that
    - https://twitter.com/apolynya/status/1644164719475826689
  - [Succinct](https://docs.telepathy.xyz/protocol/circuits) has broken some ground on BLS12-381, although they are primarily interested in [signature aggregation](https://github.com/succinctlabs/telepathy-circuits)
    - [ ] Assess usefulness
- [ ] Add optimistic verification design to the DKG design doc
  - [ ] Describe in detail how KZG commitments are used to replace Feldman's commitments
    - Focus on ensuring feasibility
    - [ ] Use this description to update "Proposed design for the PoC"
  - [ ] Add pessimistic case handling
  - [ ] Add a sequence diagram for the complete DKG process, including sharing aggregate with the user
    - How does that happen, p2p?
  - [ ] Note any blank spaces in the design and add questions to ask later
- [ ] Design p2p verification in Ferveo
  - What do nodes verify? `verify_full`?
- [ ] Sketch a protocol for on-chain commitment and verification that could replace Ferveo check #3
- **Researching KZG implementations**
  - Using arkworks implementations
  - [x] Create a stand-alone KZG example
    - [x] With a single commitment
    - [x] With batched commitments
  - [ ] Perform a trusted setup
    - [x] Locally generated
    - [ ] Using Ethereum's KZG ceremony
  - [ ] Use KZG to commit PVSS transcript material on-chain
  - [ ] Verify KZG commitment
    - [ ] Off-chain
    - [ ] On-chain
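
The "Verify KZG commitment" items above rest on one algebraic identity: `p(z) = y` exactly when `p(x) - y = q(x) * (x - z)` for some polynomial `q`. The sketch below checks that identity over a toy prime field, with no elliptic curve, trusted setup or pairing involved, so it is not KZG itself and not the arkworks API. The modulus `P` and the helper names `eval`, `quotient` and `opening_holds` are assumptions made for illustration only.

```rust
/// Toy prime modulus (assumption for illustration; NOT the bls12-381 scalar field).
const P: u64 = 1_000_000_007;

// Field helpers; all inputs are assumed to be already reduced mod P.
fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

/// Evaluate a polynomial (coefficients in ascending degree) with Horner's rule.
pub fn eval(coeffs: &[u64], x: u64) -> u64 {
    coeffs.iter().rev().fold(0, |acc, &c| add(mul(acc, x), c))
}

/// Quotient q(x) = (p(x) - p(z)) / (x - z), computed by synthetic division.
/// In KZG, the opening proof is a commitment to this quotient.
pub fn quotient(coeffs: &[u64], z: u64) -> Vec<u64> {
    let n = coeffs.len();
    if n < 2 {
        return Vec::new();
    }
    let mut q = vec![0u64; n - 1];
    let mut carry = 0u64;
    for i in (1..n).rev() {
        carry = add(coeffs[i], mul(carry, z));
        q[i - 1] = carry;
    }
    q
}

/// Check p(s) - y == q(s) * (s - z) at a point s.
/// In real KZG, s is the secret from the trusted setup and this equality is
/// checked on group elements with a pairing; here it is checked in the clear.
pub fn opening_holds(coeffs: &[u64], q: &[u64], z: u64, y: u64, s: u64) -> bool {
    sub(eval(coeffs, s), y) == mul(eval(q, s), sub(s, z))
}
```

For example, with `p(x) = 3 + 2x^2` and `z = 5`, the claimed value `y = 53` passes the check while `y = 54` fails it, which is the property an on-chain or off-chain opening verifier relies on.
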