In this note we describe Halo, a recent zero-knowledge proof system. The goal is to walk through the scheme step by step and explain the cryptographic primitives at play.
We look at the problem of proving a very long sequence of statements.
For instance, a blockchain-rollup operator needs to accumulate as many transactions as possible before providing a proof of correct execution of all those transactions to the blockchain. The proof has to be succinct and (ideally) verified in constant time, no matter the number of operations in the sequence.
Halo relies mainly on the Pedersen commitment scheme, along with an opening technique borrowed from Bulletproofs (with a small modification from its original description).
Let $p$ and $q$ be prime numbers, and let $E$ be an elliptic curve defined over $\mathbb{F}_q$ such that $E(\mathbb{F}_q)$ contains a group of rational points of order $p$. Let $d$ be an integer and $(G_i)_{i=0,\dots,d}$ a list of points in this group.
Let $f(X) = \sum_{i=0}^{d} f_i X^i$ be a polynomial with coefficients in $\mathbb{F}_p$.
The Pedersen commitment to $f$ is $C(f) = \sum_{i=0}^{d} f_i G_i$.
Binding property: suppose $f$ and $g$ are different polynomials with $C(f) = C(g)$.
Since the $G_i$ are in a cyclic group, they can be written as $G_i = x_i G$, where $G$ is a generator and the $x_i$ are in $\mathbb{F}_p$. $C(f) = C(g)$ implies that $\sum_i (f_i - g_i) x_i = 0$, so $(x_i)_i$ is orthogonal to the nonzero vector $(f_i - g_i)_i$, and therefore belongs to a hyperplane of $\mathbb{F}_p^{d+1}$. The probability that $(x_i)_i$ lands in this hyperplane by luck is $1/p$.

Hiding property: the version that we described is not hiding. To be hiding, $C(f)$ is masked by adding a random component $rR$, where $R$ is a random point on the curve and $r$ is a random number. We stick to the non-hiding version to keep notations light.
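To make this concrete, here is a toy Python sketch. It stands in for the elliptic-curve group with the order-$q$ subgroup of squares modulo a safe prime (so point addition becomes modular multiplication, and scalar multiplication becomes exponentiation); the names `P`, `q`, `g` and `commit` are ours, not Halo's.

```python
import random

# Toy stand-in for the elliptic-curve group: the subgroup of squares
# modulo a safe prime P = 2*q + 1, which has prime order q.
P, q, g = 467, 233, 4           # g = 2^2 generates the order-q subgroup

rng = random.Random(0)
d = 7                           # degree bound of committed polynomials
G = [pow(g, rng.randrange(1, q), P) for _ in range(d + 1)]  # the "points" G_i

def commit(f):
    """Pedersen commitment sum_i f_i * G_i, written multiplicatively here."""
    c = 1
    for fi, Gi in zip(f, G):
        c = c * pow(Gi, fi % q, P) % P
    return c

f1 = [3, 1, 4, 1, 5, 9, 2, 6]   # coefficients, low degree first
f2 = [2, 7, 1, 8, 2, 8, 1, 8]
c1, c2 = commit(f1), commit(f2)

assert c1 == commit(f1)         # deterministic (this is the non-hiding version)
assert pow(c1, q, P) == 1       # commitments land in the order-q subgroup

# homomorphic: C(f + g) = C(f) + C(g), written multiplicatively
fsum = [(a + b) % q for a, b in zip(f1, f2)]
assert commit(fsum) == c1 * c2 % P
```

Binding holds for the same reason as above: two distinct coefficient vectors collide only if the (unknown) discrete logs of the $G_i$ land in a specific hyperplane, which happens with probability $1/q$ in this toy group.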
There is a way to open a Pedersen commitment (adapted from a more general technique in Bulletproofs) with communication complexity between the prover and the verifier that is logarithmic in the degree of the polynomial.
We suppose that the number of coefficients of the committed polynomial is a power of $2$, equal to $2^k$.
If $\mathbf{f} = (f_i)$ are the coefficients of a polynomial and $\mathbf{G} = (G_i)$ the points used for computing Pedersen commitments, we write $\langle \mathbf{f}, \mathbf{G} \rangle$ for $\sum_i f_i G_i$.
We also add the subscripts $L$ and $R$ to indicate that we take the left part or the right part (of size $2^{k-1}$) of a list of elements.
Opening a Pedersen commitment at $z$:

The prover claims that $f(z) = v$, where $C = \langle \mathbf{f}, \mathbf{G} \rangle$ is the commitment. Let $\mathbf{b} = (1, z, z^2, \dots, z^{2^k - 1})$, so that $f(z) = \langle \mathbf{f}, \mathbf{b} \rangle$.

Completeness: the prover first sends the cross terms $L_1 = \langle \mathbf{f}_R, \mathbf{G}_L \rangle$, $R_1 = \langle \mathbf{f}_L, \mathbf{G}_R \rangle$, $l_1 = \langle \mathbf{f}_R, \mathbf{b}_L \rangle$ and $r_1 = \langle \mathbf{f}_L, \mathbf{b}_R \rangle$. After receiving a random challenge $x_1$, the prover divides the size of the vectors by $2$, by folding them like this:

$\mathbf{f}^{(1)} = \mathbf{f}_L + x_1 \mathbf{f}_R, \quad \mathbf{G}^{(1)} = \mathbf{G}_L + x_1^{-1} \mathbf{G}_R, \quad \mathbf{b}^{(1)} = \mathbf{b}_L + x_1^{-1} \mathbf{b}_R.$

Observe that

$\langle \mathbf{f}^{(1)}, \mathbf{G}^{(1)} \rangle = C + x_1 L_1 + x_1^{-1} R_1 \quad \text{and} \quad \langle \mathbf{f}^{(1)}, \mathbf{b}^{(1)} \rangle = v + x_1 l_1 + x_1^{-1} r_1.$

If the procedure stops here, the prover sends $\mathbf{f}^{(1)}$, of size $2^{k-1}$, to the verifier; the verifier folds $\mathbf{G}$ and $\mathbf{b}$ as the prover did and checks both equalities above.

The full procedure is a repetition of this step $k$ times. The fully folded versions of $\mathbf{f}$, $\mathbf{G}$ and $\mathbf{b}$ are respectively single elements $f^{(k)}$, $G^{(k)}$ and $b^{(k)}$, and the verifier checks that $f^{(k)} G^{(k)} = C + \sum_{j=1}^{k} \left( x_j L_j + x_j^{-1} R_j \right)$ and $f^{(k)} b^{(k)} = v + \sum_{j=1}^{k} \left( x_j l_j + x_j^{-1} r_j \right)$.
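The whole folding argument can be sketched end to end in the same toy group as before (modular multiplication standing in for point addition). This is an illustration of the structure, not an implementation of Halo; all names are ours.

```python
import random

P, q, g = 467, 233, 4      # toy group: squares mod a safe prime, order q

rng = random.Random(1)
k = 3
n = 2 ** k                 # number of coefficients
G = [pow(g, rng.randrange(1, q), P) for _ in range(n)]

def group_ip(a, H):        # <a, H> = prod H_i^{a_i} mod P
    c = 1
    for ai, Hi in zip(a, H):
        c = c * pow(Hi, ai % q, P) % P
    return c

def scalar_ip(a, b):
    return sum(ai * bi for ai, bi in zip(a, b)) % q

f = [rng.randrange(q) for _ in range(n)]
z = 5
b = [pow(z, i, q) for i in range(n)]
C = group_ip(f, G)
v = scalar_ip(f, b)        # the claimed value f(z)

# --- k folding rounds; cross terms are sent BEFORE each challenge ---
transcript = []
ff, GG, bb = f[:], G[:], b[:]
while len(ff) > 1:
    h = len(ff) // 2
    fL, fR, GL, GR, bL, bR = ff[:h], ff[h:], GG[:h], GG[h:], bb[:h], bb[h:]
    L, R = group_ip(fR, GL), group_ip(fL, GR)
    l, r = scalar_ip(fR, bL), scalar_ip(fL, bR)
    x = rng.randrange(1, q)                   # verifier's challenge
    xinv = pow(x, -1, q)
    transcript.append((L, R, l, r, x))
    ff = [(a + x * c) % q for a, c in zip(fL, fR)]
    GG = [gl * pow(gr, xinv, P) % P for gl, gr in zip(GL, GR)]
    bb = [(bl + xinv * br) % q for bl, br in zip(bL, bR)]

# --- final checks (a real verifier recomputes GG and bb from the x_j) ---
lhs_group = pow(GG[0], ff[0], P)
rhs_group, rhs_scalar = C, v
for (L, R, l, r, x) in transcript:
    xinv = pow(x, -1, q)
    rhs_group = rhs_group * pow(L, x, P) % P * pow(R, xinv, P) % P
    rhs_scalar = (rhs_scalar + x * l + xinv * r) % q
assert lhs_group == rhs_group
assert ff[0] * bb[0] % q == rhs_scalar
```

Each round halves the vectors and adds two group elements and two scalars to the transcript, hence the logarithmic communication.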
Soundness: the soundness comes from the fact that the cross terms $L_j, R_j, l_j, r_j$ are sent prior to receiving the challenges $x_j$.
If the claimed value is false, call it $\tilde{v} \neq f(z)$; the verifier starts with $(C, \tilde{v})$. To make the proof work, during the first round the prover sends $L_1, R_1, l_1, r_1$, computed correctly, and after receiving $x_1$ he hopes that $\tilde{v} + x_1 l_1 + x_1^{-1} r_1 = \langle \mathbf{f}^{(1)}, \mathbf{b}^{(1)} \rangle$, where $\mathbf{f}^{(1)}, \mathbf{b}^{(1)}$ are the correctly folded versions of $\mathbf{f}, \mathbf{b}$. If this happens, the prover continues with the correct values and he is saved, since the folded claims will coincide after this event. The probability of such a collision is of order $1/p$, from the binding property of the Pedersen commitment. Since the prover can hope for such a collision at each round, he has roughly $k/p$ chances to win, which is close to none.
It is important to note that Pedersen commitments are homomorphic for $+$: we have $C(f + g) = C(f) + C(g)$.
This property allows a prover to prove a batch opening of several commitments at one point with one proof:
Let $C_1, \dots, C_n$ be a list of Pedersen commitments. To prove to a verifier that the commitments open to $v_1, \dots, v_n$ at $z$, the verifier sends a random challenge $\gamma$, and the prover produces a single opening proof for the combined commitment $\sum_i \gamma^{i-1} C_i$ at $z$, with claimed value $\sum_i \gamma^{i-1} v_i$.
The completeness follows from the homomorphic property of the Pedersen commitment scheme. The soundness comes from the fact that the commitments and purported values are sent prior to receiving $\gamma$, so if one of the $v_i$ is wrong, the prover has only a negligible chance to win, from the soundness of the Bulletproofs opening argument.
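A sketch of this batching step in the same toy setting, assuming the multiplicative stand-in group from before; `gamma` plays the role of the verifier's challenge:

```python
import random

P, q, g = 467, 233, 4      # toy group: squares mod a safe prime, order q
rng = random.Random(2)
d = 3
G = [pow(g, rng.randrange(1, q), P) for _ in range(d + 1)]

def commit(f):
    c = 1
    for fi, Gi in zip(f, G):
        c = c * pow(Gi, fi % q, P) % P
    return c

polys = [[rng.randrange(q) for _ in range(d + 1)] for _ in range(3)]
z = 7
C = [commit(f) for f in polys]                                   # sent first
v = [sum(fi * pow(z, i, q) for i, fi in enumerate(f)) % q for f in polys]

# verifier sends gamma only AFTER seeing the C_i and v_i
gamma = rng.randrange(1, q)

combined_C, combined_v = 1, 0
combined_f = [0] * (d + 1)
for j, f in enumerate(polys):
    w = pow(gamma, j, q)
    combined_C = combined_C * pow(C[j], w, P) % P    # homomorphic combination
    combined_f = [(a + w * fi) % q for a, fi in zip(combined_f, f)]
    combined_v = (combined_v + w * v[j]) % q

# a single opening proof for (combined_C, combined_v) at z now suffices
assert commit(combined_f) == combined_C
assert sum(fi * pow(z, i, q) for i, fi in enumerate(combined_f)) % q == combined_v
```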
A SNARK (Succinct Non-interactive Argument of Knowledge) allows a prover to convince a verifier that he knows $w$ (private) such that $F(w) = x$, where $x$ is public (think of $F$ as a hash function for instance). The prover doesn't provide $w$, and verifying a proof is much cheaper than computing $F$.
We will use the terms "circuit" and "constraint system" interchangeably.
We specify a useful notation: given two circuits $C_1$ and $C_2$, we write $C_1 \wedge C_2$ for the circuit that is satisfied exactly when both $C_1$ and $C_2$ are satisfied, with the inputs concatenated accordingly.

Example: suppose one has circuits $C_1, C_2$ and associated public/private inputs $(x_1, w_1), (x_2, w_2)$ such that $C_1(x_1, w_1) = 1$ and $C_2(x_2, w_2) = 1$. We create the proof attesting to this fact like this: $\pi = P(C_1 \wedge C_2, (x_1, x_2), (w_1, w_2))$. Verifying both claims is done by running $V(C_1 \wedge C_2, (x_1, x_2), \pi)$.
We can now give an informal definition of what is meant by a SNARK in this post:
We suppose that a SNARK is a tuple of algorithms $(P, V)$, where $\pi = P(C, x, w)$ produces a proof and $V(C, x, \pi)$ accepts or rejects it, with the following properties: completeness (an honest proof is always accepted), soundness (a proof of a false statement is accepted only with negligible probability), and succinctness (verification is much cheaper than re-executing the circuit).
The third point means that $V$ is cheap to run as long as the public input $x$ is small.
For example, suppose $C$ is a circuit and we have associated public/private inputs $(x, w)$ such that $C(x, w) = 1$. If $\pi = P(C, x, w)$, then checking $V(C, x, \pi) = 1$ and checking $C(x, w) = 1$ directly both attest that $(x, w)$ satisfies the circuit; but running $V$ takes a few seconds and is very cheap in memory, while evaluating the circuit directly may take a long computational time and, more importantly, is extremely memory greedy.
Given a SNARK as defined in the previous section, say we have a large set of circuits $C_1, \dots, C_n$ with associated inputs $(x_i, w_i)$, and we want to output a proof that $C_i(x_i, w_i) = 1$ for all $i$.
The first solution is to feed all the circuits at once and compute $\pi = P(C_1 \wedge \dots \wedge C_n, (x_1, \dots, x_n), (w_1, \dots, w_n))$.
The issue is that $P$ is bounded in memory, so brute-force proving all the statements at once will eventually exhaust the memory of $P$. Moreover, the time complexity grows linearly with the cumulated number of constraints of all circuits $C_i$, so proving a gigantic circuit needs big computational power (typically it cannot be run on a laptop).
We know that $V$ is an algorithm that is cheap to run. The idea is to express this algorithm as an R1CS, the language supported by our SNARK, and see $V$ as a particular satisfiability problem (constraining $V$ so that it returns $1$) to be fed to the prover algorithm $P$. If one has a proof $\pi$, a circuit $C$ and an associated public input $x$ such that $V(C, x, \pi)$ returns $1$, then $P(V, (C, x, \pi))$ (there are no private inputs) outputs a proof attesting that $V(C, x, \pi) = 1$; otherwise said, it is a proof attesting that a proof is correct.
An important observation is that, by assumption, $V$ does not take a lot of memory when fed to $P$, since $V$ is cheap and the memory consumption of $P$ grows linearly with the size of the circuit it is fed.
Having observed this, the idea is to handle the circuits sequentially, outputting at each step a proof that $C_i(x_i, w_i) = 1$ and that the proof of the previous statement is correct. More precisely, if $\pi_{i-1}$ is a proof of the previous statement, with associated public input $z_{i-1}$, then at step $i$, we run $\pi_i = P(C_i \wedge V, (x_i, z_{i-1}, \pi_{i-1}), w_i)$.
The newly generated proof attests that $C_i(x_i, w_i) = 1$ and that $V$ accepts $\pi_{i-1}$. So we have a way to build a proof that not only attests the validity of a circuit, but also attests recursively that all previous proofs are correct. It is an example of a Proof Carrying Data (PCD) infrastructure. There are several constructions of PCD; the most general one replaces the sequence of circuits with an acyclic graph. Here we stick to the sequential version.
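The recursion bookkeeping can be sketched as follows. Important caveat: `P` and `V` below are a mock built from a keyed hash, not a real SNARK; they only mimic the $(P, V)$ interface (witnesses are omitted) so that the shape of the loop is visible.

```python
import hashlib

SECRET = b"toy-proving-key"   # stands in for the real proving machinery

def P(circuit, public_input):
    """Mock prover: a tag binding the circuit and its public input."""
    msg = repr((circuit, public_input)).encode()
    return hashlib.sha256(SECRET + msg).hexdigest()

def V(circuit, public_input, proof):
    """Mock verifier (a real V would not need SECRET; this is only a mock)."""
    return proof == P(circuit, public_input)

circuit, pub, proof = None, None, None
for i in range(1, 5):
    if proof is None:
        circuit, pub = "C_1", ("x_1",)
    else:
        # step i proves: "C_i is satisfied AND V accepts the previous proof"
        assert V(circuit, pub, proof)
        circuit, pub = f"C_{i} and V", (f"x_{i}", pub, proof)
    proof = P(circuit, pub)

# verifying the last proof transitively vouches for the whole chain
assert V(circuit, pub, proof)
```

Note how the public input of step $i$ embeds the public input and proof of step $i-1$; this is exactly the growth problem addressed next.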
There is a last problem that arises when doing this: when accumulating proofs, the list of public inputs grows with the number of proofs. Here is an illustration of that, starting with $\pi_1 = P(C_1, x_1, w_1)$:
We see that the number of public inputs increases at each step; for instance, verifying $\pi_3$ requires running $V$ on a public input containing $x_3$ together with all the previous public inputs and proofs.
Since the CPU time and memory consumption of $V$ grow linearly with the size of the public input, this will quickly become problematic. The solution is to hash all the public inputs, and add a circuit checking that hashing the public inputs gives the expected hash result $h$. The hash $h$ becomes the public input, and the former public inputs become private inputs. We name this circuit $CH$ for "Correct Hash": $CH(h, x)$ returns $1$ if $hash(x) = h$.
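A sketch of the effect of the $CH$ circuit on public-input size, using `sha256` as a stand-in for the snark-friendly hash actually used inside circuits; `H` and `CH` are hypothetical helper names:

```python
import hashlib

def H(data):
    """Stand-in hash; real schemes use a circuit-friendly hash function."""
    return hashlib.sha256(repr(data).encode()).hexdigest()

def CH(h, private_inputs):
    """The 'Correct Hash' circuit: returns 1 iff H(private_inputs) == h."""
    return 1 if H(private_inputs) == h else 0

# Without hashing, the public input of step i contains all previous ones:
pub = ("x_1",)
naive_sizes = []
for i in range(2, 6):
    pub = (f"x_{i}",) + pub
    naive_sizes.append(len(pub))

# With CH, the public input is a single digest at every step; the x_i
# themselves move to the private inputs.
private = ("x_1",)
h = H(private)
hashed_sizes = []
for i in range(2, 6):
    private = (f"x_{i}",) + private
    h = H(private)
    assert CH(h, private) == 1
    hashed_sizes.append(1)            # one public input: the digest h

assert naive_sizes == [2, 3, 4, 5]    # grows with the chain length
assert hashed_sizes == [1, 1, 1, 1]   # constant, whatever the length
```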
Here is the final construction:
We see that, with this construction, the number of public inputs does not increase anymore: it only depends on the number of public inputs of the $i$-th circuit.
Halo is the application of the construction described in the previous section to a version of PLONK in which the Kate commitment scheme has been replaced by the Pedersen commitment scheme, along with the Bulletproofs opening argument.
Let's recall roughly the workflow of PLONK between a prover and a verifier, where the proof is in fact a list of commitments:
For the construction described in the previous section to work, the verification procedure needs to be cheap (at most logarithmic in the size of the circuit).
The computation of the fully folded basis $G^{(k)}$ does not match this property, since its cost is linear in the size of $\mathbf{G}$, that is, linear in the size of the circuit.
To circumvent that, we outsource the computation of to the prover:
The verifier now no longer has to compute $G^{(k)}$, but the trade-off is that for the final checks, he does not know whether the claimed $G^{(k)}$ is correct!
To check that $G^{(k)}$ is correct, a solution would be to ask the prover to open it, but an opening proof yields yet another Pedersen commitment for the verifier to compute… But remember that we prove a sequence of circuits, and the prover's job is exactly to provide opening proofs. So this check is forwarded to the next proof, where $G^{(k)}$ and the challenges $(x_j)$ become public inputs. The workflow between the prover and the verifier in the next proof becomes, at step $i$:
The computations of the folded bases $G^{(k)}$ are therefore forwarded from proof to proof. The soundness comes from the fact that the prover does not know the challenges in advance, so if he sends a random $G^{(k)}$ he has very little chance of passing the check of the opening proof.
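A sketch of the forwarding idea in the toy group from before: step $i$'s verifier accepts a claimed folded basis without checking it, and the check (linear in the size of the basis) is performed by the next step's circuit. All names are ours.

```python
import random

P, q, g = 467, 233, 4      # toy group: squares mod a safe prime, order q
rng = random.Random(3)
n = 8
G = [pow(g, rng.randrange(1, q), P) for _ in range(n)]

def fold_base(xs):
    """The expensive, linear-in-n computation the verifier wants to avoid."""
    GG = G[:]
    for x in xs:
        xinv = pow(x, -1, q)
        h = len(GG) // 2
        GG = [a * pow(b, xinv, P) % P for a, b in zip(GG[:h], GG[h:])]
    return GG[0]

challenges = [rng.randrange(1, q) for _ in range(3)]
claimed = fold_base(challenges)          # honest prover's claim G^(k)

# step i's verifier skips this computation entirely; (challenges, claimed)
# become public inputs, and step i+1's circuit performs the check:
def next_step_circuit(xs, Gk):
    return fold_base(xs) == Gk

assert next_step_circuit(challenges, claimed)
assert not next_step_circuit(challenges, claimed * g % P)  # wrong claim fails
```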
With the construction from the previous section and this forwarding system, one can obtain a chain of proofs of arbitrary length, in which the verification of the $n$-th step guarantees that all the previous proofs were correct!
If we use Fiat-Shamir, we can write a compact version of the scheme:
A circuit in PLONK reasons about variables which live in a field $\mathbb{F}_p$. If one wants to deal with variables in another field $\mathbb{F}_q$, one needs to emulate $\mathbb{F}_q$ arithmetic in a language where variables are in $\mathbb{F}_p$, which is extremely inefficient.
However, from the previous section, we saw that a proof consists of two kinds of objects: commitments, which are points of $E$ with coordinates in $\mathbb{F}_q$, and scalars (evaluations and challenges), which live in $\mathbb{F}_p$.

So a proof is split into components living respectively in $\mathbb{F}_p$ and $\mathbb{F}_q$.
And the verifier's job consists of two kinds of operations: group operations on the commitments, which are arithmetic in $\mathbb{F}_q$, and field operations on the scalars, which are arithmetic in $\mathbb{F}_p$.

So $V$ is split into components reasoning respectively in $\mathbb{F}_p$ and $\mathbb{F}_q$.
So we see that writing $V$ in a language which does not support $\mathbb{F}_q$ operations is problematic.
The solution is to find another curve $E'$, defined over $\mathbb{F}_p$, which contains a group of points of order $q$, and to bounce back and forth between the two curves, by carefully separating the proofs and verification circuits into their respective fields.
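To see that such pairs of curves exist, here is a brute-force search for a toy cycle over tiny primes (real systems use large, carefully chosen cycles; the tiny curves found here are of course insecure):

```python
# Search for a toy "cycle": a curve E1 over F_q whose group has prime
# order p, and a curve E2 over F_p whose group has order q.

def curve_order(a, b, q):
    """Naive point count of y^2 = x^3 + a*x + b over F_q (plus infinity)."""
    if (4 * a**3 + 27 * b**2) % q == 0:
        return None                      # singular curve, skip
    sq = {}
    for y in range(q):                   # how many y's give each square
        sq[y * y % q] = sq.get(y * y % q, 0) + 1
    count = 1                            # the point at infinity
    for x in range(q):
        count += sq.get((x**3 + a * x + b) % q, 0)
    return count

def is_prime(n):
    return n > 1 and all(n % i for i in range(2, int(n**0.5) + 1))

def find_cycle(limit=50):
    primes = [n for n in range(5, limit) if is_prime(n)]
    for q in primes:
        for a1 in range(q):
            for b1 in range(q):
                p = curve_order(a1, b1, q)
                if p is None or not is_prime(p) or p == q:
                    continue
                for a2 in range(p):      # now look for the way back
                    for b2 in range(p):
                        if curve_order(a2, b2, p) == q:
                            return q, (a1, b1), p, (a2, b2)
    return None

cycle = find_cycle()
q, E1, p, E2 = cycle
assert curve_order(*E1, q) == p and curve_order(*E2, p) == q
```

The two curves found play the roles of $E$ (over $\mathbb{F}_q$, group of order $p$) and $E'$ (over $\mathbb{F}_p$, group of order $q$): statements about scalars of one curve are circuits over the base field of the other.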
Here is a step-by-step description of a full sequence, where we omit the inputs to keep notations light, and write $P(C_i)$ for generating the proof of circuit $C_i$:
Author: Thomas Piellard.
I'm part of a research team at ConsenSys. If you are interested in our work (fast finite field arithmetic, elliptic curve pairings, and zero-knowledge proofs), give us a shout.