# ZK Hack - Group Dynamics - WriteUp
In the second ZK-Hack puzzle, we explore an important attack affecting an essential element in zk-SNARKs. As you all know, SNARKs are a part of the broader zkp framework alongside many other technologies such as STARKs, zkEVM, zkTLS, and more. SNARKs are used extensively in DeFi and blockchain protocols to provide anonymity and privacy, with well-known examples like TornadoCash, specifically the GROTH16 proving and verifying scheme is one of the most known low-size proofs used in SNARKs. One important step in GROTH16 is the trusted setup or Power of Tau ceremony, which is the main focus of this article. Essentially, this article will explain a critical issue that can happen during the trusted setup phase in GROTH 16.
[Puzzle link](https://zkhack.dev/events/puzzle2.html)
## Introduction
Before moving to the explanation of this attack and why this attack can happen, lets first look at the puzzle description to give us more hints about what we are working on exactly.
```rust=
pub const PUZZLE_DESCRIPTION: &str = r#"Alice has computed a trusted setup for a Groth16 proof scheme.
She decided to use a 128-bit long secret, and she swears that she does not know the secret s needed to get this setup.
The trusted setup is constructed as follows using two additional scalars α and β:
* [s^i] G1 for 0 ⩽ i ⩽ 62,
* [α s^i] G1 for 0 ⩽ i ⩽ 31,
* [β s^i] G1 for 0 ⩽ i ⩽ 31,
* [s^i] G2 for 0 ⩽ i ⩽ 31.
Can you recover the secret anyway?"#;
```
In other terms, we need to retrieve the secret used by Alice which is 128 bits long denoted as $s$ used to compute the trusted setup. Since we are dealing with elliptic curve points due to the presence of $G_1$ and $G_2$ in the computation, this means we will need to solve the Discrete Log Prob in order to retrieve $s$ in this puzzle, otherwise it would be impossible to find the solution. Another important note is that we have 127 points using $G_1$ (63 + 32 + 32) and 32 points using $G_2$, additionally for the index i = 0 the expression $[s^i] G_1 \quad \text{for} \quad 0 \leq i \leq 62$, will be G_1 (since s^0 = 1) this point is considered the generator of the first curve in Alice system. Similarly the expression, $[s^i] G_2 \quad \text{for} \quad 0 \leq i \leq 31$ will be $G_2$ which is the generator of the second curve, these points will help us later in determining the exact curve order using Sage.
You can see all the points computed in this trusted setup in [data.rs](https://github.com/kobigurk/zkhack-trusted-setup/blob/main/src/data.rs)
## Groth 16
The Groth16 proof scheme is a protocol used in zk SNARK to prove that a statement is correct without revealing the statement itself. The process of making a statement secret is complex to explain, so we will focus on what is essential for this ctf: the **trusted setup phase** or **power of tau ceremony**. This phase of Groth16 is important because it generates cryptographic parameters that will be used in both proving and verifying the correctness of a statement. This ceremony consists of computing elements from two groups $G_1$ and $G_2$, respectively $g_1$ and $g_2$, as generators, and 3 secret scalars noted as $s$, $\alpha$, and $\beta$. A lot of SNARK protocols decide to use MPC ceremonies to generate these elements collaboratively, ensuring that no one has complete knowledge of $s$, $\alpha$, or $\beta$.
A simple approach to illustrate how elements of a trusted setup are computed is:
$$
\begin{align*}
& [s^i] g_1 \quad \text{for} \quad 0 \leq i \leq n \\
& [\alpha s^i] g_1 \quad \text{for} \quad 0 \leq i \leq n \\
& [\beta s^i] g_1 \quad \text{for}\quad 0 \leq i \leq n \\
& [s^i] g_2 \quad \text{for}\quad 0 \leq i \leq n
\end{align*}
$$
**The upper bound value $n$ has some underlying mathematics to determine it but since this article is for beginners, understanding the above notation is more than sufficient.**
## EC
To solve this challenge, we need to explain some theorems and terms that will assist in solving the problem. Note that understanding this terms will help you understand other elliptic curves as algebraic structures, you will be able to determine and understand possible attacks on various curves and not just BLS, i am using BLS here as an example because we use it in this puzzle.
### BLS curve & groups
#### EC As Algebraic Group
An elliptic curve is defined by an equation mostly in the form $y^2 = x^3 + ax + b$ (short Weierstrass form) over a finite field $\mathbb{F}_q$. This is the field over which the elliptic curve is defined with $q$ usually a large prime number in cryptographic dedicated curves, all calculations on the curve are done modulo this field ensuring a finite set of elements to work with.
The equation $y^2 = x^3 + ax + b$ has solutions that forms points on the $(x,y)$ plan , each solution $(x, y)$ where both $x$ and $y$ are in $\mathbb{F}_q$ is a point on the curve. In addition to these points, there is also a unique **point at infinity** which serves as a neutral element. Curves are equipped with mainly two operations **point addition** and **scalar multiplication**. These operations has properties that allow us to create a group structure:
- **Point Addition**: If you have two points on the curve $P = (x_1, y_1)$ and $Q = (x_2, y_2)$, you can define a new point $R = P + Q$ that is also on the curve.
- **Scalar Multiplication**: By repeatedly adding a point to itself you can calculate $kP$, which means $P + P + \dots + P$ (added $k$ times) where $k$ is an integer. This operation is a very important in ECC.
#### EC Group Properties
Since scalar multiplication is essentially repeated point addition, it is safe to say that all operations on elliptic curves revolve around the addition property. Most EC in cryptography are the set of points on the curve that forms not just a **group** but a **cyclic abelian group** under the addition operation, this means that any EC used in real life such as BLS12-381, Ed25519 and secp256k1... satisifies the following properties:
- **Closure**: For any two points $P$ and $Q$ on the curve, the result of $P + Q$ is also a point on the curve. (We need this property in solving the challenge)
- **Associativity**: For each three points $P$, $Q$ and $R$ the assosiciatuve property holds meaning $(P + Q) + R = P + (Q + R)$.
- **Identity Element**: There is an identity element known as the **point at infinity** often denoted $O$. For any point $P$: $P + O = P$.
- **Inverses**: For each point $P$ on the curve there exists an **inverse point** $-P$ such that $P + (-P) = O$, the point at infinity.
- **Commutativity**: For any two points $P$ and $Q$, the result of $P + Q$ is the same as $Q + P$ which is the commutative property. This is essential for the group to be **abelian**.
- **Generator**: The generator point $G$ is a special point that every point on the curve can be expressed as a scalar multiple of $G$ meaning for any points $P$ on the curve: $P = kG$. This means the point $G$ form a cyclic group which is crucial.
#### BLS curve
We can define the BLS curve in a more pristine and fancy way with the notions we got above. BLS curve can be represented as the set of solutions $(x,y)$ for the equation $y^2 = x^3 + 4$ defined over the field $\mathbb{F}_q$ with q being a 381 bit prime. This set of solutions that forms the curve $E$ is called $G1$, they form a cyclic group under $g_1$ as generator.The BLS12-381 curve also has another curve defined over by using a sextic twist, where the equation is slightly modified to $y^2 = x^3 + 4(1+i)$ under the field $F_{q^2}$ we'll call this curve $E_2$, the group of solutions for this curve $G2$ will also form a cyclic group under $g_2$ as generator.
### BLS Order and Cofactors
#### Order & Cofactor of the Curve
The order of an elliptic curve is the **total number of points** on the curve including the point at infinity, this order is often a large prime or a composite number with a large prime factor. For a curve defined over a finite field $\mathbb{F}_q$, the order of the curve $E$ denoted $|E|$ where $r$ is a large prime number that represents the order of the **prime subgroup** and $h$ is the **cofactor** of the curve is:
$$
|E| = h \times r
$$
We can define the cofactor as the integer $h$ that when multiplied by the prime-order subgroup’s order $r$ gives the full curve order. A cofactor greater than 1 implies that the curve contains multiple subgroups.
The main two type of order in the curves are
- **Prime-Order Curves**: In a prime-order curve the total number of points $|E|$ is a prime number. This means the curve has only one subgroup, as there are no cofactors other than 1 and the prime subgroup order itself ($h=1$). Prime-order curves eliminate the risk of multiple subgroups where dlp is easier to solve.
- **Composite-Order Curves**: In a composite-order curve, the total number of points $|E|$ is a composite number, typically of the form $h \times r$ with $h \neq 1$. Composite-order curves contain multiple subgroups, in cryptography operations are often restricted to the prime-order subgroup to prevent attacks.
#### BLS Curve
For the BLS12-381 curve, the main subgroup prime order of the curve $r$ and the cofactor $h$ are specified as follows:
- The prime order subgroup $r$ of the curve is:
$$
r = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001
$$
- The cofactor $h$ for the curve is:
$$
h = 0x396C8C005555E1568C00AAAB0000AAAB
$$
This cofactor is large meaning that the full curve order is composite. As a result, the BLS curve have multipple subrgoups additional care is needed to ensure cryptographic operations are performed in the **prime-order subgroup**.
### Subgroups and Order of point
#### Subgroups of the Curve
From our previous discussion, we know that if the order curve $|E|$ is a composite number because the cofactor $h > 1$, this implies that the curve has multiple subgroups including:
- A **prime-order subgroup** of order $r$, which is essential for cryptographic operations.
- Additional smaller subgroups, resulting from the cofactor $h$.
#### Order of a Point
The order of a point $P$ on an EC is the smallest positive integer $m$ such that:
$$
m \cdot P = O\quad \text{the point at infinity}
$$
To find the order of a point $P$ on the curve in a very simple way we can differentiate two cases mainly, we will calculate a new point $P'$ with the help of the full curve order $r\times h$
$$
P' = r \cdot P
$$
1. If $P'$ is the point at infinity, this means $P$ lies in the **prime-order subgroup** of order $r$.
$$
P' = h \cdot r \cdot P
$$
2. If $P'$ is the point at infinity, this means $P$ lies in one of the smaller subgroups associated with the cofactor $h$ and **not in the prime-order subgroup** of order $r$.
### Factorizing the Curve Order and Solving the DLP
#### Factorizing the Curve Order
When we have a curve with composite order in order to solve DLP using Pohlig-Hellman, we need to factorize the order and break it up into smaller factors:
$$
p_1 \times p_2 \times \dots \times p_n
$$
#### Solving the DLP with Pohlig-Hellman
The Discrete Logarithm Problem on elliptic curves asks for the scalar $k$ in the below equation, where $Q$ and $P$ are valid EC points:
$$
Q = k \cdot P
$$
The Pohlig-Hellman algorithm is efficient when the order can be composed of smaller factors, here’s how it works in simplified steps:
1. Break the DLP into Smaller Problems, using each factor $p_i$ we reduce the DLP for $Q = k \cdot P$ to several smaller DLPs modulo each $p_i$.
3. Solve each of these smaller DLPs independently, since the order of each subgroup $p_i$ is smaller solving the DLP in these subgroups is easier.
4. After solving the DLP for each smaller subgroup, we combine the results to obtain the final solution for $k$ using Chinese Remainder Theorem. The CRT allows us to reconstruct the global solution $k$ from the solutions in each of the smaller subgroups.
## Subgroup Attack in ECC
Using all the terms and theorems discussed in this article, we can craft a dangerous attack in elliptic curve cryptography known as the **subgroup attack** which is the goal of this puzzle. Imagine we are working on a curve with a composite order that can be factorized into smaller factors (like the BLS curve), if we compute a point $Q = k \cdot P$ we need to always ensure that this point is in the **prime-order subgroup** by multiplying it with $r$ and checking whether the resulting point is the point at infinity. If it is not in the prime-order subgroup, and this point is used in a critical cryptographic operation such as GROTH16 trusted setup or in deriving a public key from a private key for a wallet, then we can exploit this by breaking the DLP and find the scalar $k$ using the Pohlig-Hellman algorithm combined with the Chinese Remainder Theorem.
## Challenge Solution
We can now start solving the challenge since we understand all the necessary elements to apply this attack. For the main solution, I used Sage because it is powerful and allows you to create elliptic curves with all the elements in an easy and straightforward way.
### Rust
The first part of our solution is to extract some points from [data.rs](https://github.com/kobigurk/zkhack-trusted-setup/blob/main/src/data.rs) to use in Sage, we only need four points two $G_1$ points ($G_1$ and $sG_1$) and two $G_2$ points ($G_2$ and $sG_2$). Remember that elliptic curves are cyclic groups which means they follow the closure property of groups, this implies that if we can solve the discrete logarithm problem for $G1$, we can solve it for any element $kG1$ since it belongs to the same group as $G1$ (this is also true for $G2$).
```rust=
println!("G1 x-coord in hex: {}", _ts1[0].x);
println!("G1 y-coord in hex: {}", _ts1[0].y);
println!("sG1 x-coord in hex: {}", _ts1[1].x);
println!("sG1 y-coord in hex: {}", _ts1[1].y);
println!("G2 x-coord in hex: {}", _ts2[0].x);
println!("G2 y-coord in hex: {}", _ts2[0].y);
println!("sG2 x-coord in hex: {}", _ts2[1].x);
println!("sG2 y-coord in hex: {}", _ts2[1].y);
```
This step is executed after running Sage to verify if the value found from our algorithm is correct or not.
```rust=
let s = Fr::from_str("114939083266787167213538091034071020048").unwrap();
assert_eq!(_ts1[0].mul(s), _ts1[1]);
assert_eq!(_ts2[0].mul(s), _ts2[1]);
```
### Sage
Let's start by defining the BLS curve in Sage, this definition does not cover all aspects of the BLS curve such as pairings, because they are not needed for the solution to this problem.
```python=
# Prime field for the BLS12-381 curve
p = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
# Prime order of the BLS12-381 subgroup
r = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001
# Define the base field Fp
Fp = GF(p)
# Define the elliptic curve E over Fp y^2 = x^3 + 4
E = EllipticCurve([Fp(0), Fp(4)])
# Ensure the curve order is a multiple of the subgroup order r
assert E.order() % r == 0
# Calculate the cofactor of the curve E
cofactor_E = E.order() // r
# Define the polynomial ring for the quadratic extension
FpT.<T> = Fp[]
# Define the quadratic extension field Fp2
Fp2.<u> = GF(p**2, modulus=T**2 + 1)
# Define the sextic twist of the elliptic curve over Fp2 y^2 = x^3 + 4(1 + i)
E2 = EllipticCurve([Fp2(0), Fp2([4, 4])])
# Ensure the curve order in Fp2 is a multiple of the subgroup order r
assert E2.order() % r == 0
# Calculate the cofactor of the curve E2
cofactor_E2 = E2.order() // r
print(E)
print(E2)
```
Using the curve setup above we can begin testing all the abstract notations we discussed earlier
```python=
# Define the G1 point found from the Rust script
G1=E(0x0F99F411A5F6C484EC5CAD7B9F9C0F01A3D2BB73759BB95567F1FE4910331D32B95ED87E36681230273C9A6677BE3A69,0x12978C5E13A226B039CE22A0F4961D329747F0B78350988DAB4C1263455C826418A667CA97AC55576228FC7AA77D33E5)
# Define the sG1 point found from the Rust script
sG1=E(0x16C2385B2093CC3EDBC0F2257E8F23E98E775F8F6628767E5F4FC0E495285B95B1505F487102FE083E65DC8E9E3A9181,0x0F4B73F63C6FD1F924EAE2982426FC94FBD03FCEE12D9FB01BAF52BE1246A14C53C152D64ED312494A2BC32C4A3E7F9A)
# Multiply G1 by the prime order r to check if G1 lies in the prime subgroup
r_G1 = G1 * r
# Multiply G1 by both the prime order r and the cofactor to check if G1 lies in a subgroup made by the cofactor
h_r_G1 = G1 * r * cofactor_E
# Expected output: False
print(r_G1.is_zero())
# Expected output: True
print(h_r_G1.is_zero())
# Get the order of the point G1
ord_G1 = G1.order()
# Factorize the order of G1 to see its prime factors
print(factor(ord_G1))
```
Now we can solve the DLP using the Pohlig Hellman algorithm which is available as a built-in module in Sage
```python=
# G1 is not in the prime subgroup so we use the factors of its order to solve the DLP using Pohlig Hellman
def compute_dlp_G1(p):
# Divide ord_G1 by the current factor
INT = Integer(ord_G1 / p)
# Solve the discrete log problem for the given factor
return discrete_log(INT * sG1, INT * G1, p, operation='+')
# Factors of G1
facotrs_G1 = [3, 11, 10177, 859267, 52437899]
# Compute solutions store the results in an array
dlp_G1_solutions = [compute_dlp_G1(n) for n in facotrs_G1]
# Print the solutions for each factor
print(dlp_G1_solutions)
# Combine the solutions using CRT
combined_value = crt(dlp_G1_solutions, facotrs_G1)
# Print the combined value for s
print(combined_value)
```
The combined_value will be $2335387132884273659$, since we know that the secret used by Alice is 128 bits this means the result is not the full $s$ used, let's continue solving in the same way for $G_2$ and $sG_2$.
```python=
# Define the G2 point found from the Rust script
G2=E2(0x1173F10AD9F2DBEE8B6C0BB2624B05D72EEC87925F5C3633E2C000E699A580B842D3F35AF1BE77517C86AEBCA1130AE4+u*0x0434043A97DA28EF7100AE559167FC613F057B85451476ABABB27CFF0238A32831A0B4D14BA83C4F97247C8AC339841F, 0x0BEBEC70446CB91BB3D4DC5C8412915E99D612D8807C950AB06BC41583F528FDA9F42EC0FE7CD2991638187EF44258D3+u*0x19528E3B5C90C73A7092BB9AFDC73F86C838F551CCD9DBBA5CC6244CF76AB3372193DBE5B62383FAAE728728D4C1E649)
# Define the sG2 point from the Rust script
sG2=E2(0x165830F15309C878BFE6DD55697860B8823C1AFBDADCC2EF3CD52B56D4956C05A099D52FE4545816830C525F5484A5FA+u*0x179E34EB67D9D2DD32B224CDBA57D4BB7CF562B4A3E33382E88F33882D91663B14738B6772BF53A24653CE1DD2BFE2FA, 0x150598FC4225B44437EC604204BE06A2040FD295A28230B789214B1B12BF9C9DAE6F3759447FD195E92E2B42E03B5006+u*0x12E23B19E117418C568D4FF05B7824E5B54673C3C08D8BCD6D8D107955287A2B075100A51C81EBA44BF5A1ABAD4764A8)
# Multiply G2 by r to check if it lies in the prime subgroup
r_G2 = G2 * r
# Multiply G2 by both r and cofactor to check if it lies in a subgroup made by the cofactor
h_r_G2 = G2 * r * cofactor_E2
# Expected output: False
print(r_G2.is_zero())
# Expected output: True
print(h_r_G2.is_zero())
ord_G2 = 13 * 23 * 2713 * 11953 * 262069 * 402096035359507321594726366720466575392706800671181159425656785868777272553337714697862511267018014931937703598282857976535744623203249 * r
# Verify that ord_G2 is the exact order of G2
assert (ord_G2 * G2).is_zero()
```
Now we can solve DLP for $sG_2$
```python=
# G2 is not in the prime subgroup so we use the factors of its order to solve the DLP using Pohlig-Hellman
def compute_dlp_G2(p):
# Divide ord_G2 by the current factor
INT = Integer(ord_G2 / p)
# Solve the discrete log problem for the given factor
return discrete_log(INT * sG2, INT * G2, p, operation='+')
# Factors of G2
facotrs_G2 = [13, 23, 2713, 11953, 262069]
# Compute s modulo each factor and store the results
dlp_G2_solutions = [compute_dlp_G2(n) for n in facotrs_G2]
# Print the solutions for each factor
print(dlp_G2_solutions)
# Combine solutions from G1 and G2
combined_secret = crt(dlp_G1_solutions + dlp_G2_solutions, facotrs_G1 + facotrs_G2)
# Compute the combined modulus
combined_modulus = prod(facotrs_G1 + facotrs_G2)
# Print the result
print(combined_secret)
print(combined_modulus)
```
The output of this part will be is 115 bits, by solving the DLP for both $sG_1$ and $sG_2$ we were able to extract 115 out of 128 bits of $s$ the remaining bits can be easily determined by brute force. Since we have two modular constraints $s\mod n_1 = r_1$ and $s\mod n_2 = r_2$ the CRT guarantees a unique solution modulo $n_1 \cdot n_2$. The solution where $s'$ is the combined result modulo $n_1 \cdot n_2$ is:
$$
s = k \cdot (n_1 \cdot n_2) + s'
$$
The $k$ variable in the equation has only $128 - 115 = 13$ possibilities in our case.
```python=
def brute_force_crt(combined_modulus, combined_secret):
# Loop through potential candidates for the secret
for k in range(2^13):
# Compute the candidate secret
candidate_secret = combined_modulus * k + combined_secret
# Verify if the candidate matches the point relationships
if G1 * candidate_secret == sG1 and G2 * candidate_secret == sG2:
print(f"Found in k: {k}")
print(f"Secret s: {candidate_secret}")
brute_force_crt(combined_modulus, combined_secret)
```
The result will be
```shell=
Found k: 2989
Found s: 114939083266787167213538091034071020048
```
Finally we managed to find all 128 bits of s.
## Key Resources
[Subgroup Security in Pairing-Based Cryptography](https://eprint.iacr.org/2015/247.pdf)
[Elliptic Curve Cryptography Validation](https://www.nccgroup.com/us/research-blog/an-illustrated-guide-to-elliptic-curve-cryptography-validation/)
## Conclusion
This attack was possible because Alice chose a random point and used it as a generator to compute the trusted setup elements. Since this point was not in the main prime subgroup the solution for DLP became feasible. This attack can be more generalized, imagine if Alice used this point to compute a public key from a private key (a random scalar) in such a case we could easily extract the private key and gain full wallet access. As an active auditor, this opened my mind to many things that can go wrong in the main pillars of blockchain.