The Pipeline of ZKP (Scroll talk notes)

> Talk: [Scroll workshop at Jessy's hacker house](https://drive.google.com/file/d/12-e1g8Ad7q0avIOge-NELNBaDlpmk0TV/view) > Author of this post: [0xSachinK](https://twitter.com/0xSachinK) > If you're joining from Twitter, please note that this is not meant to be a complete, polished article. The writing is concise, with few stop words, and is designed to serve as quick reference notes for the second part of the above mentioned Scroll talk. Modern modular proving systems can be broken down into Frontend and Backend. This post attempts to explain all the frontend and backend constructions available to us. And then goes onto explaining how these are stacked together to create our most commonly used proving systems today. **Frontend** Computation ----- *Arithmetization* -----> Constraint System **Backend** Constraint system ---- *Information-theoretic compiler* -----> Proof system ---- *cryptographic compiler* ----> Argument # Frontend 4 different types of frontend arithmetizations - R1CS (linear combination) - Plonkish (custom gate + lookup + permutation) - AIR (transition constraint + boundary constraint) - CCS (Customizable Constraint System - R1CS + Plonkish) ## Arithmetization ### R1CS - Addition is free - Each multiplication is one constraint - z = [1, x, w] is the witness vector - Az * Bz = Cz - where * is Hadamard product (Basically elementwise multiplication) - A, B, C are fixed matrix for the circuit - Dimensions of A, B, C are same - Dimensions of Az, Bz, Cz are same - Az is A matrix multiplied by witness vector z, same for Bz and Cz - Number of rows in the matrices equals the number of constraints in the circuit - Each row, encodes the following constraint: - (a1*w1 + a2*w2 + ... + an*wn) * (b1*w1 + b2*w2 + ... + bn*wn) = (c1*w1 + c2*w2 + ... + cn*wn) - Proves that - Given x - I know w - Such that the above holds ![](https://hackmd.io/_uploads/Sy9gbrco2.jpg) ![](https://hackmd.io/_uploads/By8q4Yii3.png) ### Plonkish - Custom gate + lookup + permutation - Witness: (w1, w2, w3, ... wn) - Not just one dimension like R1CS, but layout is in two dimensions (a table) - Can define arbitrary shape of relationsip over cells in the table - And define arbitrary number of multiplications in each custom gate - Custom gate: (w_1 + w_2) * (w_k+1 + 2*w_2) * (w1) = 1 - Every new format of custom gate will introduce some overhead - Lookups: - (w1, w2) belongs to (T1, T2) - Replaces range check using lookup - Permutation: - Unlike R1CS, you can't access multiple witness values in the same dimension. - Need to use permutations to link different witness values. - w1 = wk-1 = w2k - Ql * a + Qr * b + Qa * c + Qm * a * b + Qc = x - All are only vectors - Permutation is used to link vectors a, b, c - Proves that: - Given x - I know a, b, c - Such that the above holds ![](https://hackmd.io/_uploads/rJ82xBcsn.jpg) ![](https://hackmd.io/_uploads/ryUhVYjoh.png) #### Plonkish Vs R1CS: - R1CS additions is free; In plonkish you need to define custom gates for it. - R1CS witness vector is smaller; But in plonkish they are longer because we need to permute different variables - Plonkish has custom gates and can define gates with more than degree 2; R1CS is limited to degree 2 - **For** **use cases** - R1CS is good for linear combination and general - Plonkish/AIR is more uniform and customized - For example, you massively reuse the same operation then turn that into a custom gate ![](https://hackmd.io/_uploads/Hy8JZB5s3.jpg) # Backend **Backend** Constraint system ---- *Information-theoretic compiler (Poly IOP)* -----> Proof system ---- *cryptographic compiler* ----> Argument > There are other schemes in the cryptographic compiler, but Polynomial commitment scheme (PCS) is the one that that affects the metrics that we care about the most. ## Polynomial IOP (Interactive oracle proofs) - Abstract. No concrete implementaiton. Just a model of how prover and verifier interact with each other. - There is an oracle that prover sends a polynomial to and verifier queries that oracle - Polynomial IOP because proofs sent from the prover to the verifier are related to polynomials - ChatGPT: what is a polynomial IOP in the context of zksnarks? - In essence, the prover commits to a polynomial and provides the verifier with a way to make queries about this polynomial.T - The prover can make multiple such commitments over the course of the interaction. - The verifier can't directly access these polynomials, but they can make queries to them, and the prover has to respond consistently. - Concrete replacements to implement a polynomial IOP and make it real, we replace abstract concepts with cryptographic primitives - Example, for randomness, we use **fiat shamir transformation**, we generate randomness by hashing previous transcripts - For Polynomials, we replace the imaginary oracle with **polynomial commitment schemes** (PCS) ![](https://hackmd.io/_uploads/B18V8Ysjh.png) ## Polynomial commitment schemes PCS influences concrecre properties a lot - Trusted setup, security assumption - Prover efficiency, Proof size, verifier efficiency ### KZG - Universal trusted setup - Security releies on Discrete logarithm problem (not quantum secure?) - Prover is doing MSM (Multi scalar multiplication) - MSM invovles multiplying each of a set of group elements by a respective scalar ([*alpha*]G), and then summing up the results - Verifier is doing pairing - Pairing is a bilinear operaiton that can be performed on two group elements. - Small proof ### FRI-Based - Used in STARKs - No trusted setup - Doesn't rely on elliptic curves; Only relies on hash collisions - ECs are limited to choice of large fields generally 254 bit long - Hash functions are more relaxed - Prover is doing hashes & FFTs - FFTs have very irregular memory requirements - Verifier is doing hashes - Large proofs ### IPA (Inner product argument) - No trusted setup - Relies on discrete log problem for security - Prover is doing MSM - Verifier is doing MSM - Medium size proof ### Multlinear PCS (for sum-check based constructions) - Spartan, Nova etc. Became famous 5 months ago. - Doesn't require elliptic curve. Hence relax choice of fields. - Linear time prover. Need very few field operations. - Has large proof. ## Examples of zk protocols #### Commonly used | Proving System | Arithmetization | Polynomial IOP | PCS | |----------------|-----------------|----------------|--------------| | Halo2 (Zcash) | Plonkish | Plonk IOP | IPA | | Halo2 (PSE fork)| Plonkish | Plonk IOP | KZG | | Plonky2 | Plonkish | Plonk IOP | FRI | | STARK | AIR | STARK IOP | FRI | - Groth 16 is another commonly used zkp. It is a non-modular proving system based on linear PCP and it can't be broken down into above modules. - STARK vs SNARK - Main difference is just different PCS. - STARK uses FRI and SNARK uses non-FRI techniques. #### New protocols based on multilinear PCS Of course, here is the updated table including Spartan and Hyperplonk: | Proving System | Arithmetization | Polynomial IOP | PCS | |-------------------|-----------------|----------------|-----------------| | Spartan | R1CS | Spartan IOP | IPA derived | | Hyperplonk | Plonkish | Hyperplonk IOP | KZG/FRI derived | - Nova, SuperNova, HyperNova, Protostar etc. ## Another view to look at different zk protocols > Note: I still don't get this part completely. So take everything I have written below with a bag of salt. Generic idea: - Prover knows some witness information - Prover commits to the witness polynomials & sends to verifier - Verifier send prover some random challenge - Prover then commits to some combination of those witness & sends to verifier - Verifier opens all the polynomials ### Plonk / Stark - Prover commits to A(x), B(x), C(x) and D(x) witness vecotrs - Verifier sends randomness - Prover commits to Z(x) witness vector - Prover commits to vanishing polynomial h(x) - Verifier opens all polynomials ![](https://hackmd.io/_uploads/Syz0Xqojh.png) ### HyperPlonk - Based on sumcheck protocol - Prover commits to A(x0, x1, ...), B(x0, x1, ...), C(x0, x1, ...) and D(x0, x1, ...) - Replaces PLONK's univariate polynomials with multivariate polynomials - Benefits: - Linear time prover - Avoids FFTs. To multiply two univariate polynomials you need FFTs. Whereas, multivariate polynomials can be multiplied without FFTs. - Verifier sends randomness - - Prover commits to Z(x0, x1, ...), ... witness vector - Prover commits to vanishing polynomial h(x) - Verifier opens all polynomials ![](https://hackmd.io/_uploads/BJnBLcoi2.png) ### Spartan / Superspartan - Prover commits to only Z(x0, x1, ...); cause R1CS - Prover then commits that Az * Bz - Cz = 0 satisfies - Prover then commits that Ax was generated correctly - Then offline memory check for opening those multilinear polynomials ![](https://hackmd.io/_uploads/S1MGL5jih.png) ### Folding - Instead of running the whole process m times, the idea behind folding is you that you can separate first part and second part. - All the A's, B's and C's (aka witness) commitments are folded together by random linear combining together. - Later, opening, which is also quite extensive, is only done once. ![](https://hackmd.io/_uploads/BJGED5ssh.png) ## Open Questions - What is spartan IOP? - In Plonkish arithmetization: Every new format of custom gate will introduce some overhead. What does this mean?