SaqrSign - HackMD

# SaqrSign This is a detailed (probably unintended) solution for SaqrSign. Absolutely none of this is mine, this all comes from @ks in the discord, who shared his code. I didn't understand some parts of it, so I took it upon myself to attempt to. ## Original Solution Below is @ks's original solution, written in sage. ```python= import os;os.environ["TERM"] = 'linux' from pwn import * from ntt import NTTDomain, NTTPoints import hashlib from tqdm import trange #r = process(["python", "challenge.py"]) r = remote("54.78.163.105", "30859") N, Q, W, P = 1024, 12289, 4324, 9389 PR.<x_> = PolynomialRing(GF(Q)) QR.<x> = PR.quotient(x_^N + 1) def l2p(A): t = [] while A: t.append(A % Q) A //= Q return QR(t) def HashBall(m: bytes, tau: int) -> list: """ Deterministically generates sparse polynomials with weight tau. """ if isinstance(m, str): m = m.encode() h = hashlib.sha256(m).digest() c = N * [0] for i in range(N - tau, N): hi = int(hashlib.sha256(h + i.to_bytes(2, 'big')).hexdigest(), 16) hi //= i; j = hi % i; hi //= i hi //= 2; k = hi % 2; hi //= 2 c[i] = c[j] c[j] = (1 - 2 * k) % Q return c def ntt(c): c = QR(c).lift() return QR([c(P * pow(W, i, Q)) for i in range(N)]) def intt(c): c = QR(c).lift() c = [c(pow(W, -i, Q)) * pow(N, -1, Q) for i in range(N)] c = [c[i] * pow(P, -i, Q) for i in range(N)] return QR(c) r.recvuntil(b'= ') data = eval(r.recvline().decode()) A = intt(l2p(int(data["A"], 16))) E = intt(l2p(int(data["E"], 16))) def get_sig(msg): #r.sendline(b'S\n' + msg) #r.sendlineafter(b'> ', b'S') #r.sendlineafter(b') ', msg) r.recvuntil(b'| Sig = ') sig = eval(r.recvline()[:-1].decode()) sig_r = bytes.fromhex(sig["r"]) sig_V = intt(l2p(int(sig["V"], 16))) sig_W = intt(l2p(int(sig["W"], 16))) sig_C = QR(HashBall(msg + sig_r, 38)) sig_P = E * sig_V - A * sig_C + sig_W sig_Pint = sum([ZZ(j) * Q ** i for i,j in enumerate(ntt(sig_P).list())]) sig_Pbyt = sig_Pint.to_bytes(-(-len(bin(sig_Pint)[2:]) // 8), 'big') assert sig_r == hashlib.sha256(sig_Pbyt.hex().encode()).digest() return sig_r, sig_V, sig_W, sig_C, sig_P def send_n(msg, n): payload = b'S\n' + msg + b'\n' r.send(payload * n) rs, Vs, Ws, Cs, Ps = [], [], [], [], [] m = 2 * N + 100 send_n(b'asdf', m) for i in trange(m): r0, V0, W0, C0, P0 = get_sig(b'asdf') rs.append(r0) Vs.append(V0) Ws.append(W0) Cs.append(C0) Ps.append(P0) As, Bs = [QR(1)], [QR(0)] for i in trange(1, m): As.append(Vs[i] / Vs[0]) Bs.append(Vs[i] / Vs[0] * Cs[0] - Cs[i]) M = [] for i in trange(m): Av = list(As[i]) + list(-As[i]) Bv = list(Bs[i]) coeff = list(As[i] * QR(list(As[i])[::-1]))[::-1] coeff += list(As[i] * QR(list(Bs[i])[::-1]))[::-1] #coeff.append(vector(Bs[i]) * vector(Bs[i])) M.append(coeff) M = Matrix(GF(Q), M) target_v = vector(GF(Q), [140 - vector(Bs[i]) * vector(Bs[i]) for i in trange(m)]) x = M.solve_right(target_v) Y1_0 = QR(list(x[N:])) / 2 U_0 = Y1_0 + Cs[0] G = Vs[0] / U_0 D = E * G - A sig_Dint = sum([ZZ(j) * Q ** i for i,j in enumerate(ntt(D).list())]) sig_Dbyt = sig_Dint.to_bytes(-(-len(bin(sig_Dint)[2:]) // 8), 'big') r.sendlineafter(b'> ', b'c') r.sendlineafter(b') ', sig_Dbyt.hex().encode()) r.interactive() ``` His explanation is [here](https://discord.com/channels/975328241604788234/1046899101553401877/1280111361698824264). If you're not part of the Flagyard Discord, here's a screenshot instead. ![lt](https://i.imgur.com/s8yxIvT.png "explanation") ## Challenge Let's take a few steps back and provide a simple explanation of what the challenge entails. The challenge implements a scheme called EagleSign. The specification can be found [here](https://csrc.nist.gov/csrc/media/Projects/pqc-dig-sig/documents/round-1/spec-files/EagleSign-spec-web.pdf). "Saqr" (from the challenge name) is the arabic word for "falcon", so that was supposed to hint toward the scheme. Most of the solvers used a recently published paper that uses a statistical attack to crack the private key given enough signatures. That paper is [here](https://eprint.iacr.org/2024/1137.pdf). It also has a provided github library [here](https://github.com/ludopulles/EagleHasFlown) for the attack. The spec can be boiled down to as follows: You have a Quotient Ring $Z_q[x]/x^N + 1$, where $N = 1024$. There are 2 public keys, $A$ and $E$ and 2 private keys $G$ and $D$ ### Keygen Randomly generate a polynomial $G$ with coefficients $\in {0, \pm 1}$. Make sure it has an inverse. generate a polynomial $D$ with the same coefficients, but we don't care if it has an inverse or not. Randomly generate a polynomial $A$ with coefficients $\in Z_q$ and compute. $$ E = (A + D) * G^{-1} $$ ### Signing Generate 2 polynomials $Y_1$ and $Y_2$, $Y_1$ has exactly 140 non-zero values, which are either 1 or -1, and absolute max value of any coefficient in $Y_2$ is 64. Compute $P = A*Y_1 + Y_2$ and hash this to a value $r$ and use this value along with the message to sign $m$ to generate a polynomial $C$ with 38 non-zero values, similar to generation for $Y_1$. Calculate $U = Y_1 + C$ and generate the signature tuple $(r, V, W)$ where $$V = G * U$$ and $$W = Y_2 - D*U$$ Effectively the challenge is you have a signing oracle that will sign anything for you and you need to recover one of the secret keys, $D$. We can calculate $C$ ourselves, given the $r$ value and we can compute $P$ since it's entirely based off of public values, which is effectively what verifying does anyways. We however do not know one part of the $U$, which is $Y_1$, and below is @ks's solution to recover it. Once you recover $Y_1$ you can recover $G$ using the $V$ part of the signature and then recover $D$ using the public key $E$. ## Hadamard Product Looking specifically at the $V$ value from the signature: \begin{split} V_i &= G * U_i \\ &= G * (Y_{1,i} + C_i)\\ \\ G &= \frac{V_i}{Y_{1,i} + C_i}\\ \\ \frac{V_0}{Y_{1,0} + C_0} &= \frac{V_i}{Y_{1,i} + C_i}\\ \\ \frac{V_i}{V_0} &= \frac{Y_{1,i} + C_i}{Y_{1,0} + C_0}\\ \\ \frac{V_i}{V_0}\left(Y_{1,0} + C_0\right) &= Y_{1,i} + C_i\\ \\ Y_{1,i} &= \frac{V_i}{V_0}\left(Y_{1,0} + C_0\right) - C_i \end{split} Let \begin{split} A_i &= \frac{V_i}{V_0}\\ \\ B_i &= \frac{V_i}{V_0} * C_0 - C_i \end{split} Therefore \begin{split} Y_{1,i} &= A_i * Y_{1,0} + B_i \end{split} The unknowns are $Y_{1,i}$, both $A_i$ and $B_i$ are known. Now we introduce the Hadamard product operator, motivated from [this](https://math.stackexchange.com/questions/2259227/sum-of-squared-coefficients-of-polynomial) and [this](https://en.wikipedia.org/wiki/e_(matrices)) as $\bigodot$ Although the operator is normally defined for matrices as element-wise multiplication (`.*` in numpy), we can turn our polynomials into diagonal matrices easily. The polynomials are already represented as lists (or vectors) in the code, so we just need to make a diagonal matrix with that list as the main diagonal. We have $\text{tr}(Y_{1,i} \bigodot Y_{1,i}) = 140$, because the sum of the squared coefficients of $Y_{1,i}$ is 140, since there are 140 non-zero terms in the polynomial $Y_{1,i}$, and those non-zero terms are either 1 or -1. $\text{tr}$ is the trace of a matrix, or sum of its elements. Using the properties of the Hadamard product operator (it's associative and distributive over addition). \begin{split} \text{tr}(Y_{1,i} \bigodot Y_{1,i}) &= \text{tr}\left(\left(A_i * Y_{1,0} + B_i\right) \bigodot \left(A_i * Y_{1,0} + B_i\right)\right)\\ 140 &= \text{tr}\left((A_i * Y_{1,0}) \bigodot (A_i * Y_{1,0})\right) + \text{tr}\left(2 * \left((A_i * Y_{1,0}) \bigodot B_i \right)\right) + \text{tr}\left(B_i \bigodot B_i\right)\\ 140 - \text{tr}\left(B_i \bigodot B_i\right) &= \text{tr}\left((A_i * Y_{1,0}) \bigodot (A_i * Y_{1,0})\right) + \text{tr}\left(2 * \left((A_i * Y_{1,0}) \bigodot B_i \right)\right) \end{split} ### Toy Example In order to see how @ks computed $$ (A_i * Y_{1,0}) \bigodot (A_i * Y_{1,0}) $$ and $$ (A_i * Y_{1,0}) \bigodot B_i $$ We need to understand what's being done internally. I will show a toy example where $N = 4$ instead of $1024$ so it's easier to see. This also means the modulus is $x^4 + 1$. Let's say we have \begin{split} A_i = a_0 + a_1x + a_2x^2 + a_3x^3\\ Y_{1,0} = y_0 + y_1x + y_2x^2 + y_3x^3 \end{split} As a vector $A_i * Y_{1,0}$ looks like, \begin{split} (a_0y_0 - a_3y_1 - a_2y_2 - a_1y_3,\\ a_1y_0 + a_0y_1 - a_3y_2 - a_2y_3,\\ a_2y_0 + a_1y_1 + a_0y_2 - a_3y_3,\\ a_3y_0 + a_2y_1 + a_1y_2 + a_0y_3) \end{split} We are trying to figure out what $Y_{1,0}$ is while we're given the value of $A_i$, so we should try and write things in terms of $Y_{1,0}$. Ideally what we want to achieve is a equivalent expression where the left side of the $\bigodot$ is something known and the right side is something unknown. Then we can then represent the operation through matrix-vector multiplication and solve for unknown $Y_{1,0}$ with linear algebra. I wrote a quick sagemath code to visualize the products ```python= DEGREE = 4 import itertools def cre_poly_mat(v): M = matrix.circulant(v) for i in range(DEGREE): for j in range(i): M[i, j] = -M[i, j] return M N, Q, W, P = 1024, 12289, 4324, 9389 PR = PolynomialRing(GF(Q), [f'a{i}' for i in range(DEGREE)] + [f'y{i}' for i in range(DEGREE)]) all_vars = PR.gens() ais = all_vars[:DEGREE] yis = all_vars[DEGREE:] y_vec = vector(PR, yis) y_mat = cre_poly_mat(yis) y_mat2 = cre_poly_mat(yis[::-1]) a_vec = vector(PR, ais) a_mat = cre_poly_mat(ais) a_mat2 = cre_poly_mat(ais[::-1]) ya_vec = y_vec * a_mat print(ya_vec) ya_vec_2 = ya_vec * ya_vec print("--------------------------------") print(ya_vec_2) print("--------------------------------") for _yi_1, _yi_2 in itertools.combinations_with_replacement(yis, 2): print(_yi_1, _yi_2, ya_vec_2.coefficient(_yi_1 * _yi_2)) print("--------------------------------") print(a_vec * a_mat2) ``` The `ya_vec_2` variable above represents the $\text{tr}\left((A_i * Y_{1,0}) \bigodot (A_i * Y_{1,0})\right)$, so we want to find an equivalent expression. The output is ``` (a0*y0 - a3*y1 - a2*y2 - a1*y3, a1*y0 + a0*y1 - a3*y2 - a2*y3, a2*y0 + a1*y1 + a0*y2 - a3*y3, a3*y0 + a2*y1 + a1*y2 + a0*y3) -------------------------------- a0^2*y0^2 + a1^2*y0^2 + a2^2*y0^2 + a3^2*y0^2 + 2*a0*a1*y0*y1 + 2*a1*a2*y0*y1 - 2*a0*a3*y0*y1 + 2*a2*a3*y0*y1 + a0^2*y1^2 + a1^2*y1^2 + a2^2*y1^2 + a3^2*y1^2 + 2*a0*a1*y1*y2 + 2*a1*a2*y1*y2 - 2*a0*a3*y1*y2 + 2*a2*a3*y1*y2 + a0^2*y2^2 + a1^2*y2^2 + a2^2*y2^2 + a3^2*y2^2 - 2*a0*a1*y0*y3 - 2*a1*a2*y0*y3 + 2*a0*a3*y0*y3 - 2*a2*a3*y0*y3 + 2*a0*a1*y2*y3 + 2*a1*a2*y2*y3 - 2*a0*a3*y2*y3 + 2*a2*a3*y2*y3 + a0^2*y3^2 + a1^2*y3^2 + a2^2*y3^2 + a3^2*y3^2 -------------------------------- y0 y0 a0^2 + a1^2 + a2^2 + a3^2 y0 y1 2*a0*a1 + 2*a1*a2 - 2*a0*a3 + 2*a2*a3 y0 y2 0 y0 y3 -2*a0*a1 - 2*a1*a2 + 2*a0*a3 - 2*a2*a3 y1 y1 a0^2 + a1^2 + a2^2 + a3^2 y1 y2 2*a0*a1 + 2*a1*a2 - 2*a0*a3 + 2*a2*a3 y1 y3 0 y2 y2 a0^2 + a1^2 + a2^2 + a3^2 y2 y3 2*a0*a1 + 2*a1*a2 - 2*a0*a3 + 2*a2*a3 y3 y3 a0^2 + a1^2 + a2^2 + a3^2 -------------------------------- (-a0*a1 - a1*a2 + a0*a3 - a2*a3, 0, a0*a1 + a1*a2 - a0*a3 + a2*a3, a0^2 + a1^2 + a2^2 + a3^2) ``` If you notice, the coefficients of the product of 2 of the $a_i$ variables repeat. The last vector represents the product of $A_i$ and it's "reversed" polynomial, $A_i^*$, ie the same polynomial with the coefficients reversed. More info [here](https://en.wikipedia.org/wiki/Reciprocal_polynomial). You can also see that the product of $A_i$ with $A_i^*$ shares the same values from the coefficients above. If we take the vector (known values): ```python= (-a0*a1 - a1*a2 + a0*a3 - a2*a3, 0, a0*a1 + a1*a2 - a0*a3 + a2*a3, a0^2 + a1^2 + a2^2 + a3^2) ``` And multiply with the following vector (unknown values) ```python= (2*y0*a3, y0*y2 + y1*y3, 2*y0*y1 + 2*y1*y2 + 2*y2*y3, y0^2 + y1^2 + y2^2 + y3^2) ``` We should get the desired output: ```python= a0^2*y0^2 + a1^2*y0^2 + a2^2*y0^2 + a3^2*y0^2 + 2*a0*a1*y0*y1 + 2*a1*a2*y0*y1 - 2*a0*a3*y0*y1 + 2*a2*a3*y0*y1 + a0^2*y1^2 + a1^2*y1^2 + a2^2*y1^2 + a3^2*y1^2 + 2*a0*a1*y1*y2 + 2*a1*a2*y1*y2 - 2*a0*a3*y1*y2 + 2*a2*a3*y1*y2 + a0^2*y2^2 + a1^2*y2^2 + a2^2*y2^2 + a3^2*y2^2 - 2*a0*a1*y0*y3 - 2*a1*a2*y0*y3 + 2*a0*a3*y0*y3 - 2*a2*a3*y0*y3 + 2*a0*a1*y2*y3 + 2*a1*a2*y2*y3 - 2*a0*a3*y2*y3 + 2*a2*a3*y2*y3 + a0^2*y3^2 + a1^2*y3^2 + a2^2*y3^2 + a3^2*y3^2 ``` Which the following code proves: ```python= y_vec_2 = vector(PR, (2*yis[0]*yis[3], yis[0]*yis[2] + yis[1]*yis[3], 2*yis[0]*yis[1] + 2*yis[1]*yis[2] + 2*yis[2]*yis[3], yis[0]^2 + yis[1]^2 + yis[2]^2 + yis[3]^2)) assert ya_vec_2 == (a_vec * a_mat2) * y_vec_2 ``` We accomplished what we wanted, the left side of the Hadamard product operator consists of known values from $A_i$ and the right side consists of unknown values from $Y_{1,i}$ ### Other part Onto the expression $(A_i * Y_{1,0}) \bigodot B_i$: I wrote some similar code to visualize the product: ```python= DEGREE = 4 import itertools def cre_poly_mat(v): M = matrix.circulant(v) for i in range(DEGREE): for j in range(i): M[i, j] = -M[i, j] return M N, Q, W, P = 1024, 12289, 4324, 9389 PR = PolynomialRing(GF(Q), [f'a{i}' for i in range(DEGREE)] + [f'y{i}' for i in range(DEGREE)] + [f'b{i}' for i in range(DEGREE)]) all_vars = PR.gens() ais = all_vars[:DEGREE] yis = all_vars[DEGREE:DEGREE*2] bis = all_vars[DEGREE*2:] y_vec = vector(PR, yis) b_vec = vector(PR, bis) y_mat = cre_poly_mat(yis) y_mat2 = cre_poly_mat(yis[::-1]) b_mat2 = cre_poly_mat(bis[::-1]) a_mat = cre_poly_mat(ais) ya_vec = y_vec * a_mat print(ya_vec) yab_vec = ya_vec * b_vec print(yab_vec) for _yi_1 in yis: print(_yi_1, yab_vec.coefficient(_yi_1)) ``` This is the ouptut: ``` (a0*y0 - a3*y1 - a2*y2 - a1*y3, a1*y0 + a0*y1 - a3*y2 - a2*y3, a2*y0 + a1*y1 + a0*y2 - a3*y3, a3*y0 + a2*y1 + a1*y2 + a0*y3) a0*y0*b0 - a3*y1*b0 - a2*y2*b0 - a1*y3*b0 + a1*y0*b1 + a0*y1*b1 - a3*y2*b1 - a2*y3*b1 + a2*y0*b2 + a1*y1*b2 + a0*y2*b2 - a3*y3*b2 + a3*y0*b3 + a2*y1*b3 + a1*y2*b3 + a0*y3*b3 y0 a0*b0 + a1*b1 + a2*b2 + a3*b3 y1 -a3*b0 + a0*b1 + a1*b2 + a2*b3 y2 -a2*b0 - a3*b1 + a0*b2 + a1*b3 y3 -a1*b0 - a2*b1 - a3*b2 + a0*b3 ``` `yab_vec` is the output of $\text{tr}\left((A_i * Y_{1,0}) \bigodot B_i\right)$. You can also write out that expression as the following: $$ (a_0, a_1, a_2, a_3) * \begin{bmatrix} b_0 & b_1 & b_2 & b_3\\ b_1 & b_2 & b_3 & -b_0\\ b_2 & b_3 & -b_0 & -b_1\\ b_3 & -b_0 & -b_1 & -b_2 \end{bmatrix} \bigodot (y_0, y_1, y_2, y_3) $$ Similar to what we had before, the right side of the $\bigodot$ are unknowns, and the left side are known values. Now in @ks's code he computes the left side by multiplying $A_i$ by the reverse polynomial of $B_i$, so $B_i^*$, and then reverses that product. This is the equivalent of doing $$(a_0, a_1, a_2, a_3) * \begin{bmatrix} b_3 & b_2 & b_1 & b_0\\ -b_0 & b_3 & b_2 & b_1\\ -b_1 & -b_0 & b_3 & b_2\\ -b_2 & -b_1 & -b_0 & b_3 \end{bmatrix}$$ And then reversing the resulting output vector. It should be clear that these 2 matrix-vector products are the same. These were just toy examples to show how what @ks calculated is effectively the Hadamard product. Obviously in the challenge $N = 1024$, so the expressions are a big larger and more complicated. You should be able to use my code to see these expressions by setting the first line to 1024 instead of 4. ## Ending Tying it all together, by computing the left side of the two $\bigodot$ expressions and keeping the output as a vector (not applying the trace part yet), we can make the right side a vector of unknowns, and then use Linear Algebra to recover this vector. Multiplying a row vector by a column vector is effectively doing the trace anyways, so we can represent the original equation with a simple $M*x = b$ equation, and use `solve_right` to solve for unknown vector $x$. Some small notes, the amount of unknowns we have from the first $\bigodot$ expression is equal to $N$, which is 1024, and then the second expression also has that many unknowns, for a total of 2048. We only care about the second-half of unknowns, because that completely recovers the polynomial for $Y_{1, 0}$, as the unknowns are $y_0, y_1, y_2, ..., y_{1023}$. The first half will be unknowns where each unknown can be represented as $\sum y_iy_j$, for some $i, j$. We don't really care what these expressions are. @ks's code also reverses the output polynomial product for both Hadamard product expressions, but you don't need to reverse the first one. It effectively accomplishes nothing since we're not using the recovered values anyways. Hope this made his solution more digestable as a lot of the information is very nuanced. Thanks again to @ks for sharing.