Special thanks to Dankrad Feist, Karl Floersch and Hsiao-wei Wang for feedback and review.
Perhaps the most powerful cryptographic technology to come out of the last decade is general-purpose succinct zero knowledge proofs, usually called zk-SNARKs ("zero knowledge succinct arguments of knowledge"). A zk-SNARK allows you to generate a proof that some computation has some particular output, in such a way that the proof can be verified extremely quickly even if the underlying computation takes a very long time to run. The "ZK" ("zero knowledge") part adds an additional feature: the proof can keep some of the inputs to the computation hidden.
For example, you can make a proof for the statement "I know a secret number such that if you take the word 'cow', add the number to the end, and SHA256 hash it 100 million times, the output starts with 0x57d00485aa". The verifier can verify the proof far more quickly than it would take for them to run 100 million hashes themselves, and the proof would also not reveal what the secret number is.
In the context of blockchains, this has two very powerful applications:

1. Scalability: if a block takes a long time to verify, one person can verify it and generate a proof, and everyone else can just verify the proof instead, which is much faster.

2. Privacy: you can prove that you have the right to perform some action (eg. that you own an asset and have not yet spent it) without revealing which asset it is or where it came from.
But zk-SNARKs are quite complex; indeed, as recently as 2014-17 they were still frequently called "moon math". The good news is that since then, the protocols have become simpler and our understanding of them has become much better. This post will try to explain how zk-SNARKs work, in a way that should be understandable to someone with a medium level of understanding of mathematics.
Note that we will focus on scalability; for reasons that will be explained later on, in most of these protocols adding privacy is a surprisingly easy footnote.
Let us take the example that we started with: we have a number (we can encode "cow" followed by the secret input as an integer), we take the SHA256 hash of that number, then we do that again another 99,999,999 times, we get the output, and we check what its starting digits are. This is a huge computation.
A "succinct" proof is one where both the size of the proof and the time required to verify it grow much less slowly than the computation to be verified. If we want a "succinct" proof, we cannot require the verifier to do some work per round of hashing (because then the verification time would be proportional to the computation). Instead, the verifier must somehow check the whole computation without peeking into each individual piece of the computation.
One natural technique is random sampling: how about we just have the verifier peek into the computation in 500 different places, check that those parts are correct, and if all 500 checks pass then assume that the rest of the computation must with high probability be fine, too?
Such a procedure could even be turned into a non-interactive proof using the Fiat-Shamir heuristic: the prover computes a Merkle root of the computation, uses the Merkle root to pseudorandomly choose 500 indices, and provides the 500 corresponding Merkle branches of the data. The key idea is that the prover does not know the hash until they have already "committed to" the data. If a malicious prover tries to fudge the data after learning which indices are going to be checked, that would change the Merkle root, which would result in a new set of random indices, which would require fudging the data again… trapping the malicious prover in an endless cycle.
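As a sketch of what this index selection might look like in code (the Merkle tree construction is omitted, and the 32-byte root below is a stand-in):

```python
import hashlib

def sample_indices(merkle_root: bytes, num_samples: int, domain_size: int) -> list[int]:
    # Fiat-Shamir: the indices to spot-check are derived from the commitment
    # itself, so the prover is locked into the data before learning which
    # positions will be inspected.
    indices = []
    counter = 0
    while len(indices) < num_samples:
        h = hashlib.sha256(merkle_root + counter.to_bytes(4, "big")).digest()
        indices.append(int.from_bytes(h, "big") % domain_size)
        counter += 1
    return indices

# eg. 500 spot-checks over a million-step computation transcript
print(sample_indices(b"\x12" * 32, 500, 1_000_000)[:5])
```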
But unfortunately there is a fatal flaw in naively applying random sampling to spot-check a computation in this way: computation is inherently fragile. If a malicious prover flips one bit somewhere in the middle of a computation, they can make it give a completely different result, and a random sampling verifier would almost never find out.
If tasked with the problem of coming up with a zk-SNARK protocol, many people would make their way to this point and then get stuck and give up. How can a verifier possibly check every single piece of the computation, without looking at each piece of the computation individually? But it turns out that there is a clever solution.
Polynomials are a special class of algebraic expressions of the form:

c_0 + c_1*x + c_2*x^2 + ... + c_d*x^d

i.e. they are a sum of any (finite!) number of terms of the form c*x^k. For example, x + 5, x^4, and x^3 + 3x^2 + 3x + 1 are all polynomials, and so is a far longer expression whose hundreds of coefficients spell out the first 816 digits of tau.
There are many things that are fascinating about polynomials. But here we are going to zoom in on a particular one: polynomials are a single mathematical object that can contain an unbounded amount of information (think of them as a list of integers and this is obvious). The fourth example above contained 816 digits of tau, and one can easily imagine a polynomial that contains far more.
Furthermore, a single equation between polynomials can represent an unbounded number of equations between numbers. For example, consider the equation P(x) + Q(x) = R(x). If this equation is true, then it's also true that:

P(0) + Q(0) = R(0)
P(1) + Q(1) = R(1)
P(2) + Q(2) = R(2)
P(3) + Q(3) = R(3)
And so on for every possible coordinate. You can even construct polynomials to deliberately represent sets of numbers so you can check many equations all at once. For example, suppose that you wanted to check:
12 + 1 = 13
10 + 8 = 18
15 + 8 = 23
15 + 13 = 28
You can use a procedure called Lagrange interpolation to construct a polynomial P(x) that gives (12, 10, 15, 15) as outputs at some specific set of coordinates (eg. (0, 1, 2, 3)), another polynomial Q(x) that gives (1, 8, 8, 13) as outputs on those same coordinates, and so forth. In fact, here are the polynomials:

P(x) = -2x^3 + (19/2)x^2 - (19/2)x + 12
Q(x) = 2x^3 - (19/2)x^2 + (29/2)x + 1
R(x) = 5x + 13
Checking the equation P(x) + Q(x) = R(x) with these three polynomials checks all four of the above equations at the same time.
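Here is a small Python sketch that reconstructs these polynomials by Lagrange interpolation and confirms that the single identity really does stand in for all four numeric equations; the check at x = 10 shows it even holds off the original coordinates:

```python
from fractions import Fraction

def lagrange_interpolate(xs, ys):
    # Returns a function evaluating the unique degree < len(xs) polynomial
    # that passes through the given (x, y) points.
    def evaluate(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = Fraction(yi)
            for j, xj in enumerate(xs):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return evaluate

xs = [0, 1, 2, 3]
P = lagrange_interpolate(xs, [12, 10, 15, 15])
Q = lagrange_interpolate(xs, [1, 8, 8, 13])
R = lagrange_interpolate(xs, [13, 18, 23, 28])

# One polynomial identity captures all four numeric equations at once:
assert all(P(x) + Q(x) == R(x) for x in xs)
# And since both sides have degree < 4, agreeing on 4 points means they
# agree as polynomials, i.e. everywhere:
assert P(10) + Q(10) == R(10)
```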
You can even check relationships between a large number of adjacent evaluations of the same polynomial using a simple polynomial equation. This is slightly more advanced. Suppose that you want to check that, for a given polynomial F, F(x + 2) = F(x) + F(x + 1) within the integer range {0, 1 ... 98} (so if you also check F(0) = F(1) = 1, then F(100) would be the 100th Fibonacci number).
As polynomials, F(x + 2) - F(x + 1) - F(x) would not be exactly zero, as it could take arbitrary values outside the range x = {0, 1 ... 98}. But we can do something clever. In general, there is a rule that if a polynomial P is zero across some set S = {x_1, x_2 ... x_n}, then it can be expressed as P(x) = Z(x) * H(x), where Z(x) = (x - x_1) * (x - x_2) * ... * (x - x_n) and H(x) is also a polynomial. In other words, any polynomial that equals zero across some set is a (polynomial) multiple of the simplest (lowest-degree) polynomial that equals zero across that same set.

Why is this the case? It is a nice corollary of polynomial long division: the factor theorem. We know that, when dividing P(x) by Z(x), we get a quotient Q(x) and a remainder R(x) satisfying P(x) = Z(x) * Q(x) + R(x), where the degree of R(x) is strictly less than that of Z(x). Since P is zero at every point in S, R must also be zero at every point in S; but a nonzero polynomial of degree less than n cannot be zero at n different points, so R must be the zero polynomial, and hence P(x) = Z(x) * Q(x) exactly.
Going back to our example: if we have a polynomial F that encodes Fibonacci numbers (so that F(x + 2) = F(x) + F(x + 1) holds across x = {0, 1 ... 98}), then I can convince you that F actually satisfies this condition by giving you the polynomial

H(x) = (F(x + 2) - F(x + 1) - F(x)) / Z(x)

where Z(x) = (x - 0) * (x - 1) * ... * (x - 98). You can calculate Z(x) yourself (ideally you would have it precomputed), check the equation F(x + 2) - F(x + 1) - F(x) = Z(x) * H(x), and if the check passes, then F(x) satisfies the condition: the division could only have come out clean, with no remainder, if the left side really is zero across all of {0, 1 ... 98}.
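Here is a sketch of this mechanism using sympy, on a scaled-down instance (eight Fibonacci values instead of one hundred). The prover's H comes out of an exact polynomial division; any cheating would show up as a nonzero remainder:

```python
from sympy import symbols, interpolate, div, expand, prod

x = symbols("x")

# Interpolate F so that F(0), F(1) ... F(7) are the first eight Fibonacci numbers.
fib = [1, 1, 2, 3, 5, 8, 13, 21]
F = interpolate(list(enumerate(fib)), x)

# D(x) = F(x + 2) - F(x + 1) - F(x) is zero at x = 0, 1 ... 5 ...
D = expand(F.subs(x, x + 2) - F.subs(x, x + 1) - F)

# ... so by the factor theorem it is divisible by Z(x) = x(x-1)...(x-5).
Z = prod(x - i for i in range(6))
H, remainder = div(D, Z, x)
assert remainder == 0  # the prover publishes H; the verifier checks D = Z * H
```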
Now, step back and notice what we did here. We converted a 100-step-long computation (computing the 100th Fibonacci number) into a single equation with polynomials. Of course, proving the N'th Fibonacci number is not an especially useful task, especially since Fibonacci numbers have a closed form. But you can use exactly the same basic technique, just with some extra polynomials and some more complicated equations, to encode arbitrary computations with an arbitrarily large number of steps.
Now, if only there was a way to verify equations with polynomials that's much faster than checking each coefficient…
And once again, it turns out that there is an answer: polynomial commitments. A polynomial commitment is best viewed as a special way to "hash" a polynomial, where the hash has the additional property that you can check equations between polynomials by checking equations between their hashes. Different polynomial commitment schemes have different properties in terms of exactly what kinds of equations you can check.
Here are some common examples of things you can do with various polynomial commitment schemes (we use com(P) to mean "the commitment to the polynomial P"):

Add them: given com(P), com(Q) and com(R), check whether P + Q = R
Multiply them: given com(P), com(Q) and com(R), check whether P * Q = R
Evaluate at a point: given com(P), a coordinate z, a claimed output a and a supplemental proof (or "witness"), verify that P(z) = a
It's worth noting that these primitives can be constructed from each other. If you can add and multiply, then you can evaluate: to prove that P(z) = a, the prover constructs Q(x) = (P(x) - a) / (x - z); such a polynomial Q can only exist if P(z) = a (this is the factor theorem again), so the verifier simply checks the equation Q(x) * (x - z) = P(x) - a against the commitments.
And if you can evaluate, you can do all kinds of checks. This is because there is a mathematical theorem that says, approximately, that if some equation involving some polynomials holds true at a randomly selected coordinate, then it almost certainly holds true for the polynomials as a whole. So if all we have is a mechanism to prove evaluations, we can check eg. our equation F(x + 2) - F(x + 1) - F(x) = Z(x) * H(x) from before: the verifier picks a random r, the prover proves the evaluations F(r + 2), F(r + 1), F(r) and H(r), the verifier computes Z(r) themselves, and then checks that the provided numbers actually satisfy the equation.
As I alluded to earlier, we can make this non-interactive using the Fiat-Shamir heuristic: the prover can compute r themselves by setting r = hash(com(F), com(H)) (where hash is any cryptographic hash function; it does not need any special properties). The prover cannot "cheat" by picking F and H that "fit" at that particular r but not elsewhere, because they do not know r at the time that they are picking F and H!
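A minimal sketch of deriving r this way (the two 32-byte "commitments" and the particular prime modulus are placeholders for illustration):

```python
import hashlib

def fiat_shamir_challenge(com_F: bytes, com_H: bytes, modulus: int) -> int:
    # r is a deterministic function of the commitments, so the prover is
    # already locked into F and H before r exists; there is no way to
    # tailor them to the challenge.
    digest = hashlib.sha256(com_F + com_H).digest()
    return int.from_bytes(digest, "big") % modulus

# Both prover and verifier compute the same r from the same commitments.
r = fiat_shamir_challenge(b"\x01" * 32, b"\x02" * 32, 2**64 - 59)
print(r)
```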
There are three major schemes that are widely used at the moment: bulletproofs, Kate and FRI.
To be honest, they're not that simple. There's a reason why all this math did not really take off until 2015 or so.
In my opinion, the easiest one to understand fully is FRI (Kate is easier if you're willing to accept elliptic curve pairings as a "black box", but pairings are really complicated, so altogether I find FRI simpler).
Here is how a simplified version of FRI works (the real protocol has many tricks and optimizations that are missing here for simplicity). Suppose that you have a polynomial P with degree < n. The commitment to P is the Merkle root of a set of evaluations of P at some pre-selected set of coordinates (eg. {0, 1 ... 8n - 1}, though this is not the most efficient choice). Now we need to add something extra, to prove that this set of evaluations actually represents a polynomial of degree < n.
Let Q be the polynomial containing only the even coefficients of P, and R be the polynomial containing only the odd coefficients of P. So if P(x) = x^4 + 4x^3 + 6x^2 + 4x + 1, then Q(x) = x^2 + 6x + 1 and R(x) = 4x + 4 (notice how the degrees get "collapsed down": the coefficient of x^(2k) or x^(2k+1) in P becomes the coefficient of x^k in Q or R).

Notice that P(x) = Q(x^2) + x * R(x^2) (if this isn't immediately obvious to you, stop and stare at the example above until it is).
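A small Python sketch of this even/odd split, using the example polynomial above, that confirms the identity numerically:

```python
def split_even_odd(coeffs):
    # coeffs lists coefficients from lowest degree up; even-degree
    # coefficients form Q, odd-degree coefficients form R.
    return coeffs[0::2], coeffs[1::2]

def evaluate(coeffs, x):
    return sum(c * x**i for i, c in enumerate(coeffs))

P = [1, 4, 6, 4, 1]              # P(x) = x^4 + 4x^3 + 6x^2 + 4x + 1
Q, R = split_even_odd(P)         # Q(x) = x^2 + 6x + 1, R(x) = 4x + 4

# The identity P(x) = Q(x^2) + x * R(x^2) holds at every point:
for x in range(10):
    assert evaluate(P, x) == evaluate(Q, x**2) + x * evaluate(R, x**2)
```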
We ask the prover to provide Merkle roots for Q(x) and R(x).

We pseudorandomly sample a large set of indices (using the already-provided Merkle roots as the seed for the randomness, as before), and ask the prover to provide the Merkle branches for P, Q and R at those indices, so that at each sampled coordinate x we can check that P(x) = Q(x^2) + x * R(x^2) actually holds.

If we do enough checks, then we can be convinced that the "expected" values of Q and R (that is, the values implied by the evaluations of P) differ from the provided values in at most, say, 1% of cases.
Notice that Q and R both have degree < n/2, so we have converted a claim about one polynomial of degree < n into claims about polynomials of half the degree. From here, we simply repeat the game: to avoid the number of polynomials doubling at every round, we ask "Bob" (the verifier) for a random value r and continue with the single combined polynomial S(x) = Q(x) + r * R(x), halving the degree at each round until it becomes small enough to check directly.
As in the previous examples, "Bob" here is an abstraction, useful for cryptographers to mentally reason about the protocol. In reality, Alice is generating the entire proof herself, and to prevent her from cheating we use Fiat-Shamir: we choose each randomly sampled coordinate or r value based on the hash of the data generated in the proof up until that point.
A full "FRI commitment" to
Each step in the process can introduce a bit of "error", but if you add enough checks, then the total error will be low enough that you can prove that the committed values agree with some polynomial of degree < n in, at the very least, the great majority of positions.
Also, you can check carefully that the total number and size of the objects in the FRI commitment is logarithmic in the degree, so for large polynomials, the commitment really is much smaller than the polynomial itself.
To check equations between different polynomial commitments of this type, simply randomly select many indices, ask the prover for Merkle branches, and verify that the values actually line up at those positions.
The above description is a highly inefficient protocol; there is a whole host of algebraic tricks that can increase its efficiency by a factor of something like a hundred, and you need these tricks if you want a protocol that is actually viable for, say, use inside a blockchain transaction. In particular, for example, Q and R are not actually necessary, because if you choose your evaluation points cleverly, you can reconstruct the evaluations of Q and R that you need directly from evaluations of P itself. But these details are beyond the scope of this post.
In the descriptions above, there was a hidden assumption: that each individual "evaluation" of a polynomial was small. But when we are dealing with polynomials that are big, this is clearly not true. If we take our example from above, the polynomial whose coefficients contain 816 digits of tau, then evaluating it at anything but the tiniest inputs produces outputs that are thousands of digits long. So we introduce one more tool: modular arithmetic.
We redefine all of our arithmetic operations as follows. We pick some prime "modulus" p; the % operator means "take the remainder of": 15 % 7 = 1, 53 % 10 = 3, etc. We then redefine:

x + y → (x + y) % p
x - y → (x - y) % p
x * y → (x * y) % p
x^y → (x^y) % p
The above rules are all self-consistent. For example, if p = 7, then:

5 + 3 = 1 (as 8 % 7 = 1)
5 - 3 = 2
5 * 3 = 1 (as 15 % 7 = 1)
5 / 3 = 4 (see below for how division works; the point is that 3 * 4 = 12 and 12 % 7 = 5, so dividing 5 by 3 gets us back to 4)
More complex identities such as the distributive law also hold: (2 + 4) * 3 and 2 * 3 + 4 * 3 both evaluate to 4. Even formulas like (a^2 - b^2) = (a - b) * (a + b) continue to be true in this new kind of arithmetic.
Division is the hardest part; we can't use regular division because we want the values to always remain integers, and regular division often gives non-integer results (as in the case of 5/3). Fortunately, Fermat's little theorem comes to the rescue: it says that for any nonzero x < p, x^(p - 1) % p = 1. This implies that x^(p - 2) is a multiplicative inverse of x (because x * x^(p - 2) = x^(p - 1), which leaves a remainder of 1), so dividing by x can be done by multiplying by x^(p - 2) % p.
With modular math we've created an entirely new system of arithmetic, and it's self-consistent in all the same ways traditional arithmetic is self-consistent. Hence, we can talk about all of the same kinds of structures over this field, including polynomials, that we talk about in "regular math". Cryptographers love working in modular math (or, more generally, "finite fields") because there is a bound on the size of a number that can arise as a result of any modular math calculation - no matter what you do, the values will not "escape" the set {0, 1, 2 ... p - 1}.
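In Python, the whole system fits in a few lines; here is a sketch using the p = 7 example from above (pow(x, p - 2, p) is Python's built-in modular exponentiation):

```python
p = 7  # a small prime modulus, for illustration

def add(x, y): return (x + y) % p
def mul(x, y): return (x * y) % p
def inv(x):    return pow(x, p - 2, p)  # Fermat's little theorem
def div(x, y): return mul(x, inv(y))

assert add(5, 3) == 1   # 8 % 7
assert mul(5, 3) == 1   # 15 % 7
assert div(5, 3) == 4   # because mul(4, 3) == 5
assert mul(add(2, 4), 3) == add(mul(2, 3), mul(4, 3))  # distributive law
```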
Let's say we want to prove that, for some polynomial P, every evaluation of P over the positions we care about is within the range {0, 1 ... 2^64 - 1} (remember: in modular arithmetic nothing stops values from being enormous, so constraints like this have to be proven explicitly).

We can construct a proof for this with the following polynomial equations (assuming for simplicity that the modulus p is much larger than 2^64). We introduce an auxiliary polynomial B that builds up each evaluation of P bit by bit over 64 steps, and we check three things: that each starting value of B is 0 or 1, that each subsequent step either doubles the running value or doubles it and adds 1, and that after 64 steps the running value equals the corresponding evaluation of P.

The latter two statements can be restated as "pure" polynomial equations as follows (in this context Z1(x) and Z2(x) are the lowest-degree polynomials that equal zero over the sets of positions where each condition needs to hold, and H1 and H2 are auxiliary polynomials supplied by the prover):

(B(x + 1) - 2 * B(x)) * (B(x + 1) - 2 * B(x) - 1) = Z1(x) * H1(x)
B(64x + 63) - P(x) = Z2(x) * H2(x)
The idea is that successive evaluations of B build the number up bit by bit. For example, if the value being built is 13, the successive partial results would be 1, 3, 6, 13: in binary, 1 is 1, 3 is 11, 6 is 110, 13 is 1101; notice how each step appends one more binary digit, by either doubling the previous value or doubling it and adding 1. After 64 such steps, the only values that can possibly have been reached are exactly those in the range {0, 1 ... 2^64 - 1}.
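A small sketch of this build-up process (the helper name bit_accumulate is made up for illustration):

```python
def bit_accumulate(value: int, num_bits: int) -> list[int]:
    # Partial results of building `value` up one binary digit at a time,
    # starting from the most significant bit.
    bits = [(value >> i) & 1 for i in range(num_bits - 1, -1, -1)]
    partials, acc = [], 0
    for b in bits:
        acc = 2 * acc + b  # double, and optionally add 1
        partials.append(acc)
    return partials

print(bit_accumulate(13, num_bits=4))  # [1, 3, 6, 13]
# Every consecutive pair (prev, cur) satisfies cur - 2*prev in {0, 1},
# and num_bits steps can only ever reach values below 2**num_bits.
```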
But there is a problem: how do we know that the commitments to B(x) and the H(x) polynomials are really commitments to polynomials, and to polynomials of the correct maximum degree? Fortunately, this is exactly the problem that polynomial commitments are designed to solve: FRI is itself a proof that a set of committed values comes from a polynomial of bounded degree, and other schemes (eg. Kate) only allow committing to polynomials of bounded degree in the first place.
Now, what about privacy? There is some good news: these proofs are small, yet they make statements about a large amount of data and computation. So in general, the proof will very often simply not be big enough to leak more than a little bit of information. But can we go from "only a little bit" to "zero"? Fortunately, we can.
Here, one fairly general trick is to add some "fudge factors" into the polynomials. When we choose P, we can add a small multiple of Z(x) to it (that is, set P'(x) = P(x) + Z(x) * E(x) for some random E(x)). This does not affect the statement being proven (P' evaluates to the same values as P on the positions where the computation is being checked, since Z is zero there), but it can add enough randomness to the commitments and evaluations to make any remaining information leakage negligible.