ZK-Hack Puzzle #2 Writeup

# ZK-Hack Puzzle #2 Writeup ## Intro ZK-Hack is a 7-week long event that is still active right now. The event includes cool workshops to introduce people to the field of Zero-Knowledge Proofs and their uses in the real world! It also includes fun puzzles that help to really understand the given material by playing around with the underlying mathematical structures used in some ZK Proof systems (the event is hosted at https://www.zkhack.dev/). This post is a writeup of the [second puzzle](https://www.zkhack.dev/puzzle2.html) that I solved together with [Matan](https://twitter.com/MHamilis) and [Elichai](https://twitter.com/Elichai2), and is a follow-up post to the solution of the first puzzle ([available here](https://hackmd.io/@matan/zkhack_1)). ## Group Dynamics > If it's small, it should not be here > This is the clue we are given as the challenge begins. We are also referred to a paper on [prime order subgroups](https://people.cispa.io/cas.cremers/downloads/papers/prime_order_please.pdf). And as usual the documentation of the [arkworks library](https://docs.rs/ark-algebra-intro/0.3.0/ark_algebra_intro/). The challenge is hosted on Github [here](https://github.com/kobigurk/zkhack-trusted-setup). First, let's talk about prime order subgroups and elliptic curve cofactors. ### Prime Order Subgroups and the Pohlig-Hellman Algorithm The Pohlig-Helman algorithm allows us to compute the [discrete log](https://en.wikipedia.org/wiki/Discrete_logarithm) of some element in an [abelian group](https://en.wikipedia.org/wiki/Abelian_group#Finite_abelian_groups) efficiently, given that its order has small prime factors. For full details read [here](https://en.wikipedia.org/wiki/Pohlig%E2%80%93Hellman_algorithm). We will describe a simplified version to focus on the key points. Given two group elements (using additive notation) $G$ and $x \cdot G$ where the order of $G$ is $n \in \mathbb{N}$ s.t. $n = \prod_{i=1}^{r}p_i$ where $p_i$ are the prime factors of $n$. We wish to compute $x$ (solving the Discrete Log Problem[^discrete_log]). Note that $x \in \mathbb{Z}_n$. We know, by the [Chinese Remainder Theorem](https://en.wikipedia.org/wiki/Chinese_remainder_theorem), that if we can find $x_i = x\ mod\ p_i$ for all $i$, then we can efficiently compute $x\ mod\ n$, which is $x$ since $x \in \mathbb{Z}_n$. Now, we can compute $\bar p_i := \frac{n}{p_i}$. Then, $\bar p_i \cdot G$ generates a subgroup of order $p_i$.[^lagrange] Assuming $p_i$ is small enough, we can iterate over $\alpha \in \mathbb{Z}_{p_i}$ and find $\alpha$ s.t. $\alpha \cdot (\bar p_i \cdot G) = \bar p_i \cdot (x \cdot G)$. Note that $\alpha = x\ mod\ p_i = x_i$. However in reality, this is done using the [baby-step giant-step](https://en.wikipedia.org/wiki/Baby-step_giant-step) algorithm, taking only $O(\sqrt{p_i})$ time. Now we use the Chinese Remainder Theorem to compute $x$. Overall, the time complexity of solving the Discrete Log Problem becomes: $O(\sqrt{p})$ where $p$ is that largest prime factor of $n$. This is why it's "bad" to have small prime order subgroups in a group where we wish the Discrete Log Problem [^discrete_log] to be hard. ### Elliptic Curve Cofactors Elliptic Curves are complex objects with a lot of interesting properties, we will try here to focus specifically on the properties that are relevant to the solution of this challenge. Full details [here](https://hackmd.io/@benjaminion/bls12-381#About-curve-BLS12-381). We are working with the BLS12-381 Curve. However these are actually 2 curves: $E$ is defined by : $y^2 = x^3 + 4$ over $\mathbb{F}_q$ . $E_2$ is defined by: $y^2 = x^3 + 4(1 + u)$ over $\mathbb{F}_{q^2}$. However, the orders of these groups have small prime factors (example below). And we just learned why this is bad. So what is usually done, is working in a subgroup on the curve that is of a large prime order (denote as $r$). This is done by multiplying every point on the curve by a **cofactor**. The **cofactor** is computed as the product of all small prime factors of the order of the group. Example: ```sage= factor(E.order()) > 3 * 11^2 * 10177^2 * 859267^2 * 52437899^2 * 52435875175126190479447740508185965837690552500527637822603658699938581184513 # pick r = the large prime, so: cofactor = 3 * 11^2 * 10177^2 * 859267^2 * 52437899^2 hex(cofactor) > 0x396c8c005555e1568c00aaab0000aaab ``` And this is indeed the [cofactor](https://tools.ietf.org/id/draft-yonezawa-pairing-friendly-curves-02.html#rfc.section.4.2.2) used for $E$. The subgroup of order $r$ is normally denoted as $G1$. (note the $G_1$ in the challenge is different) A similar process has been done to compute the cofactor for $E_2$ to work in the group $G2$ of order $r$ also. ### Some Intuition for the Challenge Ahead Already we can develop some intuition (based on the clues given), that in the challenge we should be able to find an opportunity to compute the discrete log of some group element in a scenario that allows the Pohlig-Hellman algorithm to be utilized. ## The Challenge >Alice has computed a trusted setup for a Groth16 proof scheme. >She decided to use a 128-bit long secret, and she swears that she does not know the secret s needed to get this setup. >The trusted setup is constructed as follows using two additional scalars α and β: >* [s^i] G1 for 0 ⩽ i ⩽ 62, >* [α s^i] G1 for 0 ⩽ i ⩽ 31, >* [β s^i] G1 for 0 ⩽ i ⩽ 31, >* [s^i] G2 for 0 ⩽ i ⩽ 31. >Can you recover the secret anyway? > We then find the corresponding $s^i \cdot G_1$ and $s^i \cdot G_2$ [here](https://github.com/kobigurk/zkhack-trusted-setup/blob/76d366b6304c8eb34c9f2244bb96a63fd61f60ba/src/data.rs#L5). Note that $G_1$ is a point on the BLS12_381 Curve we denoted $E$ above. And $G_2$ is a point on the BLS12_381 Curve we denoted $E_2$. To solve the challenge, all we have to do is find $s$ , or more specifically find $s'$ such that $s' \cdot G_1 = s \cdot G1 \wedge s' \cdot G_2 = s \cdot G_2$. This means that we essentially want to compute the discrete log of either $s \cdot G_1$ or $s \cdot G_2$. ## The Solution Note that for $i = 0$ we have $G_1$ and $G_2$ directly. And for $i = 1$ we have $s \cdot G_1$ and $s \cdot G_2$. Keeping in mind our intuition for this challenge, it seems we must be able to utilize the Pholig-Hellman algorithm here in some way. ### Investigation In order to use the Pohlig-Hellman algorithm we must first know the prime factorizations of the orders of both $E$ and $E_2$, and the subgroups generated by $G_1$ and $G_2$ (of $E$ and $E_2$, respectively). We start with computing these for $G_1$ and $E$. Using [sage](https://www.sagemath.org/) (and defining the curves $E$ and $E_2$ using [this](https://gist.github.com/HarryR/eb5ad0e5de51633678e015a6b06969a1) gist), we compute these to be: ```sage= factor(E.order()) > 3 * 11^2 * 10177^2 * 859267^2 * 52437899^2 * 52435875175126190479447740508185965837690552500527637822603658699938581184513 G1 = E(/*coordinates of ts1_0 in rust code*/) factor(G1.order()) > 3 * 11 * 10177 * 859267 * 52437899 * 52435875175126190479447740508185965837690552500527637822603658699938581184513 ``` Now we can see that the order of the subgroup generated by the point $G_1$ indeed has small prime factors of sizes $3, 11, 10177, 859267, 52437899$. Also, note that given the closure of groups, $s \cdot G_1$ is also in the same subgroup. ### Partially Finding $s$ We can now almost use the Pohlig-Hellman algorithm, the only problem is that $G_1$ and $s \cdot G_1$ are in a group where the big prime $524358...$ (denote as $r$) is also a factor of the order. However, exactly as cofactors are used on curve points to _project_ them into a "safe" subgroup of a large prime order, we can use the same method to project the points into an "unsafe" subgroup: If we multiply $G_1$ by $r$, the order of the subgroup generated by $r \cdot G_1$ is exactly $n_1 := 3 \cdot 11 \cdot 10177 \cdot 859267 \cdot 52437899$. Because of the closure of groups then $s \cdot (r \cdot G_1)$ is also in the same subgroup, and $s \cdot (r \cdot G_1) = r \cdot (s \cdot G_1)$. This means we can run the Pohlig-Hellman algorithm to find $s_1 := s \ mod \ n_1$. Luckily, sage has a [built-in implementation](https://doc.sagemath.org/html/en/reference/groups/sage/groups/generic.html#sage.groups.generic.discrete_log) of Pohlig-Hellman that's as simple to use as: ```sage= s_times_g1 = E(/*coordinates of ts1_1 in rust code*/) discrete_log(s_times_g1,G1,operation='+') > 2335387132884273659 ``` However, $\log_2(s_1) \approx 61$ and we know that $s$ is about 128-bits long. This means we are missing some information (roughly 67-bits of it), about $s$. ### Partially Finding More of $s$ We now come back to $E_2$ and $G_2$ and find their relevant subgroup orders. However, since the order of $E_2$ is roughly the square of the order of $E$ we will need to use publicly available parameters to compute this. (Running `E2.order()` directly won't return a result anytime soon). We know that the cofactor of $E_2$ is: `0x5d543a95414e7f1091d50792876a202cd91de4547085abaa68a205b2e5a7ddfa628f1cb4d9e82ef21537e293a6691ae1616ec6e786f0c70cf1c38e31c7238e5` (taken from [here](https://github.com/arkworks-rs/curves/blob/461e4190b19c15099e5ca86b62b3de6155c62ed5/bls12_381/src/curves/g2.rs#L31)) We also know that the order of the "safe" subgroup of $E_2$ is $r$ (must be the same as in $E$ for the pairing to work). Therefore the order of $E_2$ is precisely $r \cdot cofactor$ by definition of the cofactor. Let's take a look at the factors of $cofactor$: ```sage= factor(E2_cofactor) > 13^2 * 23^2 * 2713 * 11953 * 262069 * 402096035359507321594726366720466575392706800671181159425656785868777272553337714697862511267018014931937703598282857976535744623203249 ``` We see it's just $n_2 :=13^2 \cdot 23^2 \cdot 2713 \cdot 11953 \cdot 262069$ multiplied by some big prime (denote as $bp$) If we use the same _projection_ trick as before and compute $bp \cdot r \cdot G_2$ and $bp \cdot r \cdot (s\cdot G_2)$, these should both be in a subgroup of order $n_2$. However, it is not certain that $bp \cdot r \cdot G_2$ generates a subgroup of order $n_2$ (it could generate a smaller subgroup). So we need to find the order of $bp \cdot r \cdot G_2$. We want to find the smallest number $n'_2$ such that $n'_2 \cdot bp \cdot r \cdot G_2 = \mathcal{O}$, $\mathcal{O}$ being the point at infinity. By omitting prime factors of $n_2$ one at a time and multiplying by $bp \cdot r \cdot G_2$ we can find this easily to be $2713 \cdot 11953 \cdot 262069$. We therefore define our base point to be $base:= 13^2 \cdot 23 ^2 \cdot bp \cdot r \cdot G_2$. And the exponent point to be $exp := 13^2 \cdot 23 ^2 \cdot bp \cdot r \cdot (s \cdot G_2)$. To compute $s_2 := s \ mod \ n'_2$: ```sage= discrete_log(exp, base, 2713 * 11953 * 262069, operation='+') > 6942769366567 ``` Note that $\log_2(s_2) \approx 43$ so can't use this alone to construct our 128-bit integer $s$. ### Putting It All Together We now have $s_1 = s\ mod\ n_1$ and $s_2 = s\ mod\ n'_2$. We should also note that $gcd(n_1, n'_2) = 1$. This means that we can apply the [Chinese Remainder Theorem](https://en.wikipedia.org/wiki/Chinese_remainder_theorem) to compute $s' = s\ mod\ (n_1 \cdot n'_2)$ : ```sage= crt([s_mod_n1, s_mod_nt2], [n1, nt2]) > 62308043734996521086909071585406 ``` Yet again, note that $\log_2(s') \approx 105$. This means that $s' \neq s$, at least if the challenge description is correct. We can also run the [assertions](https://github.com/kobigurk/zkhack-trusted-setup/blob/76d366b6304c8eb34c9f2244bb96a63fd61f60ba/src/bin/verify-trusted-setup.rs#L16) in the challenge code with our $s'$ to verify and see that indeed they do not pass. We're still missing something! Well, if you think about it, knowing that $s' = s \ mod \ (n_1 \cdot n'_2)$ just means that for some $k \in \mathbb{N}$, $s = k\cdot(n_1 \cdot n'_2) + s'$. And since $\log_2(n_1 \cdot n'_2) \approx 106$, and the fact that we know $s$ is only about 128-bits long, $k$ can only be about as large as $2^{22}$. This means we can brute force $k$ and find $s$ ! With a little Rust this time: ```rust= let (_ts1, _ts2) = puzzle_data(); let n1_times_nt2 = Fr::from_str("128602524809671824928355010578973").unwrap(); let s_tag = Fr::from_str("62308043734996521086909071585406").unwrap(); for k in 0..4000000 { let s = n1_times_nt2*Fr::from(k) + s_tag; if _ts1[0].mul(s) == _ts1[1] && _ts2[0].mul(s) == _ts2[1] { println!("{}", s); return; } } ``` After 893,754 iterations, this prints out: `0x56787654567876541234321012343210` In decimal: `114939083266787167213538091034071020048` We run: ```rust= let s = Fr::from_str("114939083266787167213538091034071020048").unwrap(); assert_eq!(_ts1[0].mul(s), _ts1[1]); assert_eq!(_ts2[0].mul(s), _ts2[1]); ``` The asserts pass! We did it! :clap: [^lagrange]: By [Lagrange's theorem](https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)). [^discrete_log]: Given a cyclic group $G$, $g$ a generator, and $g^x$, find $x$.