A really cool event is ongoing now in the Zero-Knowledge community - the ZK-Hack (details: https://zkhack.dev). Every week a workshop about one of the ZK technologies being developed today is given (details). The first one, given on 26/10/2021 was an introductory workshop about the field of Zero Knowledge Proofs with a great historical introduction.
The event is ongoing for the following seven weeks, with each week a new puzzle is published after the workshop of that week is given.
In this post I'll share a write-up of the first puzzle (link) I was solving with Elichai and Shalev.
So before the puzzle we are given introductory material to two topics in cryptography:
We are also given a referral to the documentation of arkworks library.
The challenge is available on github, here.
First let's talk briefly on those two cryptographic topics.
BLS is a signature scheme, named after Boneh, Lynn and Shacham. It is based on pairings which is a unique algebraic construct based on bilinear maps. Before anyone panics I'll explain what that means.
Definition: Let be two groups. A function is a bilinear-map if:
- For any scalar and for any and for we have:
- For any we have:
- For any we have:
In a sense, it just means a scalar can be moved freely between the any of the arguments or even out of the function evaluation to the output. If you're really interested how the magic happens I'd recommend reading about Weil-Pairings.
So the BLS signatures are signature schemes that provide the following the functions of KeyGen, Sign and Verify. Let's see how those are performed, please notice that BLS Signatures have multiple variants, I'll be discussing a simplified one that is sufficient to understand the problem.
Setup - Let be a generator of group of prime order , and a bilinear map.
This is all nice but in real life scenarios our message to sign is an arbitrary stream of bits and not a group element. Can we find a way to map arbitrary bit-streams into group elements? The answer is yes, and is composed typically of two steps.
It is important that along the way we create as little bias as possible to retain the security of the signature scheme.
One such hash-to-curve technique Pedersen Hashes, which is our next subject.
As mentioned, using Pedersen Hashes we can map the output of a hash function to a group element. Since those groups are typically elliptic curve points, we will use "group element" and "curve point" interchangeably in this section.
So, the Pedersen hash scheme setup is based on a set of group elements , which we assume we don't know the discrete-log of each with respect to any other . In other words, for each pair () we can't efficiently find a value such that .
Given an -bit output of of a hash function (each is a single bit), the value of the pedersen hash of is . By that we get an element in the group.
In the challenge we are given 256 messages () and those messages' signatures signed by some unknown private key () which its corresponding public-key is given as well ().
Each message is signed using the BLS signature scheme where messages are mapped to group elements by first employing a blake2s hash and then using pedersen hash on its output.
It's important to mention that the output of blake2s is 256-bits wide.
The group elements are picked arbitrarily.
Notice that the prime size of the group in our challenge is .
Therefore the private key is a number in (between and ).
We are told that these signatures were published and someone managed to sign some previously unsigned message, and we're asked how can this be done? The solution is in the form of signing ourselves our username as a proof to show we know how can this be done, this basically means that Existential-Unforgeability isn't a property of the signature scheme in this puzzle.
If you haven't tried yet tackling this challenge, I highly encourage you to do so, this is the best way to really understand what happens and to get a better grasp of the underlying concepts who take part in the challenge.
Either way, let's see how this can be solved. We begin with a few notations to make the solution simpler.
As for notation, we will denote the blake2s of a message using . We will also denote the bit of using . Therefore, the pedersen-hash of each message is:
Thus, we can view each blake2s output as a column-vector of 1's and 0's:
So, we can also define basic arithmetic of blake2s hashes using vector arithmetics. For two message we have:
Where addition is over ( is the size of the group ).
For a scalar and for some message we define the scalar multiplication as:
The solution is based on the fact that the signature itself is linear. In the end of the day, we are signing (i.e. multiplying our private key by) a group elemenet. We'll show that the signature of the sum of two group elememts is the sum of the signatures. Let's see what does it mean:
If we take two messages and blake2s them their pedersen hashes are:
Thus, their signatures are:
It's very easy to tell that the signature of is:
Not only that, but the pedersen hashing is also linear!
Given the vector arithmetics defined previously for the blake2s outputs we can tell that first summing the blake2s outputs and then performing the pedersen hash transformations or first doing the pedersen transformation for each blake2s output and then adding the results yields the same output. In algebtraic terms it means that:
Ok, with these two linearity properties, we are ready to solve the puzzle!
We have message we want to sign, we compute its blake2s hash .
If we are able to find constants in such that:
Then we can generate signature for by following both linearity properties.
Remember that is a vector equation, holding for each entry in the vector . Which means that for all :
Let's see why holds:
Great, so now the only question is - how can we find those constants ?
Well, let's look at again, as we already said this is a vector-equation, so it's actually a linear equation system with 256 linear equations (since the vectors are of size 256) and with 256 variables ().
We can express this system of linear equations in matrix nontation:
Such that the variable of the equation is:
The vector is the bits of the blake hash of our target message :
And the matrix is the bits of the blake hash of all messages . Each message has its own column:
The solution to the equation is:
Luckily enough, matrix is invertible so can easily sign the message following our previous equation:
In this section we give the basic tools we used to solve the problem in code. While the input data and the whole rust program is written in rust, we have written our solution in the sage programming language.
Sage is a mathematical-oriented programming language with many built-in tools for computation in the fields of algebra, statistics, combinatorics, graph theory and more. It may look similar to python to some readers.
We have also used some rust code to preprocess the input to the sage program and postprocess its output.
First we took the messages blake2s hashes from the rust file using the following functions:
The write_msgs_to_file
function takes a vector of blake2b hashes of the input messages and writes each hash as a string of 0s and 1s representing the binary form of the hash.
Each hash is written in a separate line to the output file.
This file will be passed to our sage code.
Our sage code is quite short and powerful.
First we read the input and parse it as a list of lists (therefore - a matrix) of 0s and 1s, each binary digit in its own cell of the matrix.
Next, we define to be the order of the curve, so scalar that will later be multiplied by the generators in the Pedersen-has-to-curve scheme will be taken from the field defined right after.
Next, we define the matrix GA
to be simply the list-of-lists A
we previously defined, where each entry (0/1) is considered as an element in field F
. We transpose it because if you pay attention in the A
matrix we defined in the previous section each column (and not row) should contain the bits of a hash of a specific message.
Finally, we compute GAinv
which is the transpose of our matrix.
Next, we computed the blake2s value of our message – this is our vector from previous section where we will look at each bit as an element in . Finally, we have to compute .
The output we get is a vector of elements from such that the signature for our message will be where is the signature of the message. This part will be done in rust.
Our signature generation is also done using rust and is available here as the full code:
https://gist.github.com/elichai/7401f5423c2693960677ba4f8a9fab14#file-computing_the_sig-rs-L55
Here we'll give some explanation on snippets out of it.
First we define the selectors
array, a very long array such that selectors[i]
is the field element we have obtained using our sage program.
Next, we multiply each selector[i]
by the signature of message (sigs[i]
).
This is done using arkworks library.
Next we make a curve element out of the sum
And we verify we actually got the correct signature.
Assuming we do, we output the signature so it is ready for submission and we're done!