# Nova

Nova is a prover system. The proof produced by a Nova prover is not a SNARK; it is not succinct. The interesting property of Nova is that "folding" is baked into the proving system.

### Intro to Recursion

The main use case of recursion is IVC (Incrementally Verifiable Computation).

Incremental Computation

![](https://hackmd.io/_uploads/SJGZuUb22.jpg)

- $F$ is a function that repeats itself
- $z_0$ is the starting input
- $z_1$ is the output of $F$ that is also fed into the next step of the computation
- $w_i$ is the extra witness added to the function that is used, together with $z_i$, to create the next $z_{i+1}$

Incrementally **Verifiable** Computation

![](https://hackmd.io/_uploads/HkDDF8-nn.jpg)

At each step you can generate a $π_{i}$: a succinct proof that provides a computational guarantee of the correct execution of the computation up to step $i$. The cool part is that if you want to generate $π_2$ you don't have to regenerate the whole chain; you can simply add a step to $π_1$.

### SNARK-Based IVC

![](https://hackmd.io/_uploads/SJRG28W3n.jpg)

- Note that you can also add $w_{i}$ to the previous construction.

Regarding size, $π_{i}$ should remain constant size at every step; it shouldn't grow with the number of steps performed in the IVC. The SNARK verifier may be, for example, a pairing check in the case of Groth16 or, similar to the Snark Verifier Library, a Halo2 verifier. The SNARK verifier box is also known as the **recursion overhead**, and it can be very expensive. For example, for Groth16 verification inside a SNARK, the pairing is very expensive, so it is a huge recursion overhead! Ideally, you want $π_{i}$ to be constant size, or at least to grow more slowly than $n$, where $n$ is the size of your computation.

### Nova IVC

Nova can be seen as machinery that allows you to do IVC.
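The incremental-computation picture above can be sketched in a few lines. This is a toy illustration only: `F` here is an arbitrary made-up step function, and no proofs are produced.

```python
# Toy sketch of incremental computation (the IVC picture without proofs).
# F is an illustrative step function, not anything prescribed by Nova.

def F(z, w):
    # One step: combine the running state z with the extra witness w.
    return z * z + w

def run_ivc(z0, witnesses):
    # z_{i+1} = F(z_i, w_i); a real IVC would also emit a proof pi_i per step.
    z = z0
    trace = [z]
    for w in witnesses:
        z = F(z, w)
        trace.append(z)
    return trace

print(run_ivc(2, [1, 0, 3]))  # [2, 5, 25, 628]
```

In a real IVC scheme each step would additionally output a proof $π_i$ attesting to the whole prefix of the trace.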
To get to Nova, we'll first take a step back.

#### BLS Signatures

BLS signatures are an elliptic-curve-based signature algorithm. Their main advantage shows up in systems that use aggregate signatures. The key mathematical operation for BLS signatures is a bilinear pairing.

- [BLS Signature Aggregation: Under the Hood](https://mirror.xyz/0x6afeB3d9E380787e7D0a17Fc3CA764Bb885014FA/D3g-4UPRLkAnug-p6AZYfjgXWo-psaTulyu3SaL35vg)
- [Pairings or bilinear maps](https://alinush.github.io/2022/12/31/pairings-or-bilinear-maps.html)

The bilinear property guarantees that $e(P^{a}, Q^{b}) = e(P, Q)^{ab}$, where $a$ and $b$ are scalars, $P$ belongs to $G_1$, $Q$ belongs to $G_2$, and $e$ is the bilinear pairing operation.

![](https://hackmd.io/_uploads/rJP5dgGTn.jpg)

You can, in a sense, move scalars between points and create new operations. A practical explanation of bilinear pairings can be found in paragraph [3.6.1 - Why and How zk-SNARK Works](https://arxiv.org/pdf/1906.07221.pdf). Here are the tricks that can be performed with pairings:

![](https://hackmd.io/_uploads/HyUSo-G63.png)

Reminders:

- Scalar multiplication of a point is a composition of point additions. For example, a public key is the result of applying a scalar multiplication to the base point by the secret value.
- The bilinear property ensures that exponentiation in one group corresponds to multiplication in another, enabling many cryptographic protocols.

Core operations of BLS:

- **keys**: $sk = α$; $pk = g^{α}$
- **signature**: $∂ = H(m)^{α}$, where $m$ is the message and $H(m)$ is a point on the curve
- **verification**: $e(∂, g) = e(H(m), g)^{α} = e(H(m), g^{α}) = e(H(m), pk)$

The use of bilinear pairings also allows for efficient aggregation. First, let's take a step back and define homomorphic commitments. A commitment means committing to a certain value while keeping it hidden, with the ability to reveal the committed value later.
A homomorphic commitment scheme has the additional property that you can perform certain operations on the commitments, and these operations correspond to operations on the hidden values. For example, if $C(a)$ is the commitment to value $a$ and $C(b)$ is the commitment to value $b$, and the scheme is homomorphic, we can compute $C(c)$ where $c = a + b$ starting only from the sealed commitments.

In the context of BLS signatures, the homomorphism is achieved using pairings, and it allows for efficient signature aggregation: any number of signatures on the same message can be verified in constant time, namely the time it takes to verify 2 pairings. In particular, a signature can be seen as a commitment to a tuple $(m, sk)$.

Let's start by aggregating the public keys. We need an aggregated public key to verify an aggregated signature. We get it simply via point addition within the elliptic curve group. Note that an aggregated public key is mathematically indistinguishable from a non-aggregated public key.

$pk_{agg} = pk_{1} + pk_{2} + ... + pk_{n} = g^{sk_{1}} + g^{sk_{2}} + ... + g^{sk_{n}} = g^{sk_{1} + sk_{2} + ... + sk_{n}}$

So now let's perform the verification against this aggregated public key:

$e(H(m), pk_{agg})$
$= e(H(m), pk_{1} + pk_{2} + ... + pk_{n})$
$= e(H(m), g^{sk_{1} + sk_{2} + ... + sk_{n}})$
$= e(H(m), g)^{sk_{1} + sk_{2} + ... + sk_{n}}$
$= e({H(m)}^{sk_{1} + sk_{2} + ... + sk_{n}}, g)$
$= e(∂_{1} + ∂_{2} + ... + ∂_{n}, g)$
$= e(∂_{agg}, g)$

Note that BLS signatures are efficient for verifying many signatures, but are significantly slower than other algorithms such as ECDSA when it comes to single-signature verification.

**The core thing to notice here is that this efficient aggregation is possible because pairing groups are homomorphic.** No matter how many signatures you want to verify, you can do so by performing 2 pairing checks (constant verification time). This is possible thanks to homomorphic commitments.
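The aggregation algebra above can be checked numerically "in the exponent". This is only a toy sketch in a multiplicative group mod a small prime: real BLS verification uses a pairing instead of knowing the secret keys, and all constants here are illustrative.

```python
# Toy check of BLS-style aggregation in a multiplicative group mod p.
# We "cheat" by using the secret keys to confirm the identities; a real
# verifier would use a pairing. All constants are illustrative.

p = 101                     # small prime; group is Z_p^*
g = 2                       # toy generator
h = 37                      # stand-in for the hashed message H(m)

sks = [5, 11, 23]           # secret keys
pks = [pow(g, sk, p) for sk in sks]
sigs = [pow(h, sk, p) for sk in sks]

# Aggregation is just the group operation (multiplication here,
# point addition on an elliptic curve).
pk_agg = 1
for pk in pks:
    pk_agg = (pk_agg * pk) % p
sig_agg = 1
for s in sigs:
    sig_agg = (sig_agg * s) % p

sk_sum = sum(sks)
assert pk_agg == pow(g, sk_sum, p)    # pk_agg = g^(sk1+...+skn)
assert sig_agg == pow(h, sk_sum, p)   # sig_agg = H(m)^(sk1+...+skn)
```

The two assertions are exactly the identities used in the derivation above: aggregating keys or signatures corresponds to summing the secret exponents.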
In particular, additively homomorphic commitments.

The main goal now is to extend this idea of constant-time verification to any general function, not only BLS signatures. In that case the idea of aggregation corresponds to many runs of the same function. You can start to see how this fits the idea of IVC, in which each run is one step of the same function. Carrying over this analogy, you would be able to show that you performed a whole IVC correctly by just showing the last element of the IVC.

#### Simple Relation: Dot Product

The goal here is to generate a succinct proof of having performed a valid dot-product operation. The core tricks used here are:

- Random Linear Combinations (RLC)
- Homomorphic Commitments

RLC works by starting with a set of values/vectors $v_{1}, v_{2}, ..., v_{n}$ plus a source of randomness. You compose the values with the randomness in a linear combination, in which the randomness is used as the coefficients of the values.

![](https://hackmd.io/_uploads/HkygbEGah.jpg)

![](https://hackmd.io/_uploads/SkpSX4G6n.jpg)

Step B4 is possible thanks to additively homomorphic commitments: given 2 (or more) commitments, performing addition and scalar multiplication on the commitments preserves the same algebraic relations as if the operations had been performed on the uncommitted values. In the example, this means $\overline{z_{3}}$ cannot be derived from a $z_{3}$ that doesn't satisfy the relation $az_{3} = z_{1} + rz_{2}$.

The nice thing is that the commitment to each vector is constant size. It is almost as if we generated a succinct proof for showing dot products. Going back to the previous analogy, we now have a bag of dot products, and we can convert it into one simple succinct (aggregate) proof.

Dot products are not sufficient to do interesting things, though. To work with SNARKs you at least want the ability to do addition and multiplication; this lets you cover NP-class problems.
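The folding step can be sketched concretely. Assuming a Pedersen-style vector commitment $C(v) = g_1^{v_1} \cdots g_n^{v_n} \bmod p$ (additively homomorphic, shown here without blinding, with toy constants), the folded commitment can be derived from the two sealed commitments alone:

```python
# Toy demo of folding two vectors z1, z2 with randomness r under an
# additively homomorphic vector commitment. Pedersen-style, without
# blinding; all constants are illustrative, not a secure instantiation.

p = 101
gens = [3, 7, 11]  # one toy generator per vector coordinate

def commit(v):
    c = 1
    for g, x in zip(gens, v):
        c = (c * pow(g, x, p)) % p
    return c

z1 = [2, 5, 1]
z2 = [4, 0, 3]
r = 9                                      # verifier randomness

z3 = [a + r * b for a, b in zip(z1, z2)]   # folded witness z3 = z1 + r*z2

# Homomorphism: C(z1) * C(z2)^r == C(z1 + r*z2), computed only from the
# commitments, never from the opened vectors.
lhs = (commit(z1) * pow(commit(z2), r, p)) % p
assert lhs == commit(z3)
```

The assertion is the commitment-level counterpart of the relation the verifier wants: a prover cannot open the folded commitment to a $z_3$ that breaks the linear relation.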
With addition and multiplication available, you can express any NP relation.

#### Getting to Turing Completeness and NP

Let's start with [R1CS](https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649). The R1CS constraint system allows us to generate constraints for any type of NP relation.

![](https://hackmd.io/_uploads/B1gDiEfan.jpg)

![](https://hackmd.io/_uploads/rkPM6VzT2.jpg)

The computation is transformed into an R1CS structure at compilation time:

```
A
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 1, 0, 0, 1, 0]
[5, 0, 0, 0, 0, 1]

B
[0, 1, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]

C
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 1, 0]
[0, 0, 0, 0, 0, 1]
[0, 0, 1, 0, 0, 0]
```

The witness is provided by the prover and should satisfy each gate: `[1, 3, 35, 9, 27, 30]`

Given the previous goal, we could try to do the same thing we did for dot products with R1CS constraints of the form $Az * Bz = Cz$, where $z$ is the witness and $A, B, C$ are square matrices. If we applied RLC directly, the math would produce cross terms:

$A(z_{1} + rz_{2}) * B(z_{1} + rz_{2}) = Az_{1} * Bz_{1} + r(Az_{1} * Bz_{2} + Az_{2} * Bz_{1}) + r^{2}(Az_{2} * Bz_{2})$

so the folded version would end up containing both $z_1$ and $z_2$, and the proof would need to carry more data than necessary. Instead, we define a slightly different version of R1CS, called **relaxed R1CS**.

### Finally Getting to Nova

You can think of Nova as a pre-processing step for SNARKs. The Nova prover doesn't generate a SNARK; it works in combination with a general-purpose SNARK system. You basically get the best of both worlds: very fast proving time and very cheap and fast verification (given by the succinctness of the proof).

Nova as a proving system relies on the same principles as BLS signatures. BLS signatures let you easily aggregate a big set of signatures, as long as they all sign the same message, and you only have to (cheaply) verify the aggregate. Following the metaphor, each BLS signature is a statement: "I control the private key behind a public key and I signed a specific message."
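Going back to the R1CS example, a quick sanity check that the example witness satisfies every gate $(Az) * (Bz) = (Cz)$. The witness layout `[one, x, out, sym_1, y, sym_2]` is an assumption taken from Vitalik's QAP post linked above.

```python
# Check that the example witness satisfies every R1CS gate
# (A.z) * (B.z) == (C.z), using the matrices from the section above.
# Assumed witness layout (from the linked QAP post): [one, x, out, sym_1, y, sym_2].

A = [[0,1,0,0,0,0], [0,0,0,1,0,0], [0,1,0,0,1,0], [5,0,0,0,0,1]]
B = [[0,1,0,0,0,0], [0,1,0,0,0,0], [1,0,0,0,0,0], [1,0,0,0,0,0]]
C = [[0,0,0,1,0,0], [0,0,0,0,1,0], [0,0,0,0,0,1], [0,0,1,0,0,0]]
z = [1, 3, 35, 9, 27, 30]

def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))

for a_row, b_row, c_row in zip(A, B, C):
    assert dot(a_row, z) * dot(b_row, z) == dot(c_row, z)
print("witness satisfies all", len(A), "constraints")
```

For instance, the first gate reads $x \cdot x = sym_1$, i.e. $3 \cdot 3 = 9$.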
With Nova we can extend this to any type of NP statement, but the NP statement needs to be the same for each incremental step. With Nova, rather than having to verify each statement individually, you can fold them together and then only verify the last folded instance. In particular, you verify the last folded instance inside a SNARK to leverage the succinctness of the proof. This way we feed less work to the SNARK prover. The constraint of Nova (compared to STARKs, for example, and analogously to BLS signatures) is that every incremental statement has to have the same structure.

The concept of Nova is also closely related to VDFs (Verifiable Delay Functions). These functions are a concept in computer science introduced to simulate time. They are functions that require sequential steps (always equal in terms of computation to be performed), and these steps cannot be parallelized. The "verifiable" part is important because we need a verifier to be able to check that these delay steps were performed correctly without going through each of them again. For example, say a verifier wants to bind the prover to an action that takes 5 minutes to perform. We want a function that bounds the prover to some time delay (it needs to be sequential and not parallelizable), and we want the verifier to be able to verify its correct execution without having to go through the 5 minutes of computation all over again.

Let's start with an example of a proto-VDF. If we are working in a finite field, we can take the fifth root of a number and do this sequentially. Finding the fifth root of a number means finding which number, when raised to the power of five, gives you the original number. Let's say that finding the fifth root of a number in a finite field takes 100 operations; the proving part then takes 100 operations.
The cool part is that verifying this operation was performed correctly takes only 4 operations, namely multiplying the result by itself four times!

![](https://hackmd.io/_uploads/S1Uk-uS5h.png)

Now we want to make such a VDF even faster to verify; in particular, we don't want the verification time to be linear in the number of steps. To do so we compress the verification inside a SNARK. Of course, proving the "finding the fifth root" computation in the SNARK would be very expensive. Instead, we prove the computation in the other direction (from right to left) inside the SNARK: we take batches of these fifth-root calculations and fold them together such that at the end we only have to verify a single batch.

The concept of Nova is very relevant to the EVM. The role of the EVM is to take a batch of transactions, apply them to its state and, eventually, update its root hash. The role of a node is to verify the validity of the transactions applied to the state of the EVM. This process is distributed across different nodes to reach consensus. A SNARK proof system helps here because, together with applying a batch of transactions to the state of the EVM, we also generate a zkSNARK proof π that this computation was performed according to the rules. This proof can be verified in a few milliseconds rather than naively re-running all the computation. In such a scenario, each node only has to verify the proof to reach consensus. In the case of an L2 rollup, the verification of such a proof can actually be performed programmatically inside a smart contract on L1.

Since the EVM has a fixed structure that works in fixed cycles, we can use Nova's folding technique to pre-process the transactions (namely, verify their correct execution) and eventually feed them into a zkSNARK. This is very similar to what Polygon Hermez does.
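The fifth-root proto-VDF from above can be sketched in a finite field. The field size is an illustrative toy value; fifth roots are unique exactly when 5 is coprime to $p-1$ (and `pow(5, -1, p - 1)` needs Python 3.8+).

```python
# Toy sketch of the fifth-root proto-VDF: the prover extracts a fifth root
# (the "slow" direction), the verifier raises the result back to the fifth
# power (4 multiplications). p is a toy prime, chosen so gcd(5, p-1) = 1.

p = 23                         # toy prime; gcd(5, 22) = 1
inv5 = pow(5, -1, p - 1)       # exponent that computes fifth roots (Python 3.8+)

def prove(x):
    # "Slow" direction: extract the fifth root of x in F_p.
    return pow(x, inv5, p)

def verify(x, root):
    # "Fast" direction: four multiplications.
    return pow(root, 5, p) == x

x = 18
root = prove(x)
assert verify(x, root)
```

In a real VDF the prover would chain many such steps sequentially; folding then lets the verifier check the whole chain without redoing it.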
The only difference is that they use a STARK prover for the pre-processing phase. With Nova, we don't really need to process each transaction inside a SNARK. Instead, we can leverage the folding property of Nova to reduce the size of the statement that we are trying to prove and, eventually, verify this statement inside a SNARK.

Tools at your disposal:

- SNARKs: very expensive in terms of proving; they give you compression of the proof size.
- Nova: not very expensive in terms of proving, thanks to the folding property.

You get the best of both worlds:

- Fast proving thanks to Nova folding
- Cheap verification thanks to the succinct proof generated by the SNARK

Why is Nova so cheap compared to other SNARK solutions?

- There are no FFTs (operations used to build polynomial commitments), only multi-exponentiations. FFTs are a much harder computation to perform and consume a lot of memory on your machine. If we want to prove the entire state transition of an EVM, it is unfeasible to generate such a proof on a laptop because of rigid memory constraints. When generating Nova proofs we don't have such memory consumption, so we might be able to generate such a proof even on our own device!
- Since we don't need FFTs, our choice of elliptic curves isn't limited (how is this possible?). These curves also don't need to be pairing-friendly.
- For example, you can use the secp curve inside Nova! You can actually tune in the curve that you prefer! What you need is a cycle of elliptic curves. For secp you get its dual friend, secq. These two curves combined form a cycle that we can use in Nova.

### What is a cycle of elliptic curves

An elliptic curve has a base field `Fq` and a scalar field `Fp`. The base field is the field in which the points of the curve live; the scalar field is the field from which the scalars for operations such as addition and multiplication are taken.
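For the secp/secq pair mentioned above, the relationship can be made concrete. The two constants below are the standard secp256k1 base-field prime and group order; secq256k1 is, by construction, the curve with the two fields swapped (a small probabilistic primality check is included just to sanity-check the constants).

```python
# secp256k1's base field prime p and scalar field (group order) n.
# secq256k1 is defined with the two swapped, which is the cycle property.

import random

p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def is_probable_prime(m, rounds=16):
    # Miller-Rabin probabilistic primality test.
    if m in (2, 3):
        return True
    if m < 2 or m % 2 == 0:
        return False
    d, s = m - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, m - 1)
        x = pow(a, d, m)
        if x in (1, m - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, m)
            if x == m - 1:
                break
        else:
            return False
    return True

assert is_probable_prime(p) and is_probable_prime(n)

secp256k1 = {"base_field": p, "scalar_field": n}
secq256k1 = {"base_field": n, "scalar_field": p}   # fields swapped

# Cycle condition: each curve's base field is the other's scalar field.
assert secp256k1["base_field"] == secq256k1["scalar_field"]
assert secp256k1["scalar_field"] == secq256k1["base_field"]
```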
A cycle of elliptic curves is when you can use 2 elliptic curves such that the base field of one curve is the scalar field of the other (and vice versa).

### Recursion and Folding

In general, we can define recursion as the ability to verify a SNARK proof inside another SNARK proof. Recursion and folding are not the same thing! When you fold pairs of statements together in order to generate a folded proof to be verified efficiently, you also need to verify that the folding was done properly! This can actually be done very cheaply: you can do it as you go along in your proof-generation circuit. In order to verify that the folding was done correctly you work in the base field, but the constraint system uses the scalar field!

The takeaway here is that we are not really compressing and folding the Nova proofs all at once; rather, we have a recursive chain of foldings, where at each step we check that the statement to be proven is correct **and** that the folding in the previous step was performed correctly.

![](https://hackmd.io/_uploads/ryfUDYr92.png)

- The black line represents the running instance; this is where the progressively folded statement is accumulated.
- The running instance starts empty.
- At step 1, the running instance and the result of the function F are folded together to produce the next iteration of the running instance.
- At step 2, it is also necessary to prove that the folding happened properly. At this step F is augmented to F', which also verifies that the folding performed in the previous step was done correctly.
- This also applies to step 3.
- Eventually, the π generated after step 3 tells us that the repeated function F was performed correctly **and** that the folding at each step was performed correctly.

### Performance

Several times faster than PLONK and Groth16 proving systems.
That's because:

- There are no FFTs.
- The relation between proving time and the number of constraints $n$ has a constant factor of 2: you basically have to do 2 multi-exponentiations per constraint (is that correct?).
- You don't have to use a pairing-friendly curve!
- The recursion overhead (namely, proving that the folding was performed correctly) is very small!
- It provides a constant-size proof at every step.

### What can be achieved

The holy grail of SNARKs is recursion:

- Split a large computation/statement into smaller chunks
- Delegate the computation of such smaller proofs: distributed provers

You basically unlock parallelism and distributed proving! In general, Nova (and other folding schemes) gives us **IVC**, namely that at every step the full computation up until that point can be verified with a constant-size proof. At every step you prove:

- that the incremental statement is true
- that the folding of the previous accumulator into the new accumulator was performed correctly

The moment at which you verify the proof is delayed until the very end.

### Other advantages of Nova

- There's no trusted setup! It's a completely transparent proving system.

### Resources

- [ZK Whiteboard Sessions w/ Justin Drake](https://www.youtube.com/watch?v=SwonTtOQzAk)
- [Nova Scotia - GitHub](https://github.com/nalinbhardwaj/Nova-Scotia). Uses Nova as a prover system for an R1CS-compiled circuit (and the related witness) generated using circom.
- [Workshop - Nova Scotia](https://youtu.be/N6RW_YhLMNw)

### High-level questions about Nova

- Do we have a Halo2 Nova verifier?
- Could you build a zkEVM/VM using Nova as a pre-processor? Or do you need the flexibility of STARKs?
- What makes it possible to perform this whole thing without a trusted setup? Is this because there are no polynomial commitments involved?
- What is ParaNova?
- Can you explain the intuition behind the need for cycles of curves? And why don't these curves have to be ZK-friendly?
### Working with Nova Scotia - notes

- Should test it with the C++ prover; proving should be even faster.
- I don't see a particular benefit in further compressing the proof; the verification doesn't seem to shrink much to me!

### Working with Nova Scotia - questions

- The first step, creating the public params for the circuit artifacts, is slow. Although this has to be done only once, so it doesn't need to be executed at proving time.
- What is the difference between the cycles of curves used, Pasta versus bn254, in terms of performance?
- What's the difference between the primary and secondary circuit?
- What is the second 0 value added there?
- Is there a way to incrementally add a new step to the circuit? It might be delayed in time; all the private inputs are not known now.