Seccon CTF quals

# Seccon CTF quals Last weekend I played Seccon CTF with my team, about:blankets, and in particular my flatmate Giacomo, aka Sant97. I am very happy that as a team we managed to solve all the crypto challenges! 🎉 You can find the writeup of dual_summon and Tidal wave on [Giacomo's post](https://hackmd.io/@Sant97/Hk8ddpWm1e), while here's the writeup for xiyi, which ended up to be the most difficult challenge, with only 4 solves in total. ## xiyi Author: y011d4 > Calculate \sum_i x_i y_i secretly ### Challenge analysis ```python! # initialize ys = [randbelow(M) for _ in range(L)] # 2: (client) --- n, enc_xs ---> (server) --- enc_alphas, beta_sum_mod_n ---> (client) params = input_json('{"n": ..., "enc_xs": [...]} > ') n, enc_xs = params["n"], params["enc_xs"] assert isinstance(n, int) and n > 0 assert isinstance(enc_xs, list) and len(enc_xs) == L and all([isinstance(x, int) for x in enc_xs]) C = Cryptosystem.from_pubkey(Pubkey(n)) enc_alphas = [] betas = [] for enc_x, y in zip(enc_xs, ys, strict=True): r = Pt(randbelow(n)) enc_alpha = C.add(C.mul(enc_x, Pt(y)), C.encrypt(r)) beta = -r % n enc_alphas.append(enc_alpha) betas.append(beta) beta_sum_mod_n = sum(betas) % n print(json.dumps({"enc_alphas": enc_alphas, "beta_sum_mod_n": beta_sum_mod_n})) # BTW, can you guess ys? params = json.loads(input('{"ys": [...], "p": ..., "q": ...} > ')) guessed_ys, p, q = params["ys"], params["p"], params["q"] assert ( n == p**2 * q and p.bit_length() == q.bit_length() == N and p != q and isPrime(p) and isPrime(q) ), "Don't cheat me!" if guessed_ys == ys: print("Congratz!") print(flag) else: print("Wrong...") print(f"{ys = }") ``` The challenge server exposes a protocol to compute the inner product $\langle x, y\rangle = \sum_{i=1}^n x_iy_i\mod n$, where $x$ is a vector chosen by the client, and $y$ is a secret vector chosen by the server. The goal of the challenge is to recover the server's vector $y$. The way the functionality is implemented is via some homomorphic encryption scheme: the client will generate a public key $n$, and encrypt its inputs into $c_i=\mathsf{Enc}_n(x_i)$. The server will homomorphically evaluate the multiplication by $y_i$, and add randomness $r_i$ into new ciphertexts $z_i$ such that $\mathsf{Dec}(z_i)=x_iy_i+r_i$; the server will then send all the values $z_i$ and the offset sum $\sum r_i$. #### Which HE? We now shift our attention to the homomorphic encryption scheme, which reveals to be [Okamoto-Uchiyama](https://en.wikipedia.org/wiki/Okamoto%E2%80%93Uchiyama_cryptosystem). The public keys are of the form $n=p^2q$, where $p,q$ are two prime numbers of the same size (518 bits in the challenge); there is also a public element $g$, that here is simply $g=\lfloor N/2\rfloor$. The encryption function takes a message $m$ and encrypts it into a ciphertext of the form $c\equiv g^{m+nr}\pmod n$, where $r$ is some randomness. The decryption function uses the apparently magic function $$L(x)=\frac{x^{p-1}-1\mod p^2}{p}\mod p$$ and to decrypt the ciphertext $c$ it computes $$\frac{L(c)}{L(g)}$$ First, we see why the $L$ function makes sense: by Fermat's little theorem, $x^{p-1}\equiv 1\pmod p$, and in particular this means that $x^{p-1}=1+kp$. Now, the function $L$ just returns this $k\mod p$. From this, we quickly get the important property of $L$: it behaves like a logarithm, in the sense that $L(xy)=L(x)+L(y)$. \[Quick proof: if $x^{p-1}=1+k_xp$ and $y^{p-1}=1+k_yp$, then $(xy)^{p-1} \equiv (1+k_xp)(1+k_yp) \equiv 1+p(k_x+k_y)\pmod{p^2}$ \] ### Solution The first thing to notice is that we can choose the public key $n$ freely, but we are subject to the contraints $n=p^2q$ and that $p,q$ both have 518 bits. Another important thing to notice is that all the $y_i$'s are 256 bits long. Looking at how the values returned by the server are generated, we see that, if $c_i=\mathsf{Enc}_n(x_i)$, then the server will compute a ciphertext $z_i$ such that $\mathsf{Dec}(z_i)=x_iy_i+r_i$ in the following way: $$z_i\equiv c_i^{y_i}\cdot g^{r_i+nt_i}\pmod n,$$ where $r_i$ is the randomness to mask $y_i$, and $t_i$ is some additional randomness. #### First, wrong, attempt Our first hope is to find a pair $(n,g)$ such that $L(g)=0$, so that we could compute $L(z_i)=y_i\cdot L(c_i)$ and easily get the $y_i$. However, this means that $g^{p-1}\equiv 1\pmod {p^2}$. If we were able to freely choose $g$, this would be very easy to find. However, the server uses $g=\lfloor n/2\rfloor$. This means that $n=2g$, or $n=2g+1$. Since $n=p^2q$, we have that $g\equiv0,-\frac12\pmod p$. The first one is impossible, and the second one leads to the equation $$2^{p-1}\equiv 1\pmod {p^2}$$ However, the primes of this form are called [Wieferich primes](https://en.wikipedia.org/wiki/Wieferich_prime) and it's not know if there are primes of this form other than $1093$ and $3511$. So, no way to find a tricky $g$. #### More homomorphisms! (Warning: math ahead) One thing that the previous approach pointed us at is the what I will call the "fundamental encryption equation", i.e. $z_i\equiv c_i^{y_i}\cdot g^{r_i+nt_i}\pmod n$, and computing $L$ on this to get $$L(z_i)\equiv y_i\cdot L(c_i) + r_i\cdot L(g)\pmod p$$ In particular, we used the function $L$ to transform an equation in the exponents into a ***linear*** equation. Before, we hoped to get $L(g)=0$ in order to kill the $r_i$; but maybe we will have to keep them. We *just* need more equations! Now, the function $L$ is very cool, because it actually is a homomorphism $$L:(\mathbb Z/n\mathbb Z)^*\to \mathbb Z/p\mathbb Z$$ But wait, we actually know the abstract structure of $(\mathbb Z/n\mathbb Z)^*$! Since $n=p^2q$, we have $$(\mathbb Z/n\mathbb Z)^*\cong (\mathbb Z/p^2\mathbb Z)^* \times (\mathbb Z/q\mathbb Z)^*,$$ where the projections are given simply by modding at $p^2$ and $q$ via CRT. Now, the fun part; every cryptographer knows that $(\mathbb Z/q\mathbb Z)^*\cong \mathbb Z/(q-1)\mathbb Z$: it's the discrete logarithm! And since we can choose $q$, we can do it in such a way that this isomorphism is efficiently computable, for example by choosing $q-1$ to be smooth and computing dlogs via Pohlig-Hellman! For the other part, $(\mathbb Z/p^2\mathbb Z)^*\cong \mathbb Z/p\mathbb Z \times (\mathbb Z/p\mathbb Z)^* \equiv \mathbb Z/p\mathbb Z \times \mathbb Z/(p-1)\mathbb Z$. The second part is again the dlog, and the first part? It's actually our beloved $L$! So, to recap, what have we learned? That if we choose $p,q$ to both have easy dlogs, we get *three* different maps $$ \begin{align*} L &: (\mathbb Z/n\mathbb Z)^*\to \mathbb Z/p\mathbb Z\\ \pi_p&:(\mathbb Z/n\mathbb Z)^*\to \mathbb Z/(p-1)\mathbb Z\\ \pi_q&:(\mathbb Z/n\mathbb Z)^*\to \mathbb Z/(q-1)\mathbb Z \end{align*} $$ Applying these to our fundamental equation, we get three linear equations for any index $i=1,\dots, 32$. $$ \begin{alignedat}{1} L(z_i) &\equiv y_i\cdot L(c_i) &&+ r_i\cdot L(g) &&~ &&\pmod{p}\\ \pi_p(z_i) &\equiv y_i\cdot \pi_p(c_i) &&+ r_i\cdot \pi_p(g) &&+ t_i\cdot n\pi_p(g) &&\pmod{p-1}\\ \pi_q(z_i) &\equiv y_i\cdot \pi_q(c_i) &&+ r_i\cdot \pi_q(g) &&+ t_i\cdot n\pi_q(g) &&\pmod{q-1}\\ \end{alignedat} $$ #### Solving the equations We now have $32\cdot 3$ modular equations, and $32\cdot 3$ unknowns. So an easy Gaussian elimination should solve everything, right? Wrong! The problem is that the moduli are different, and the $r_i$'s and $t_i$'s are actually as big as $n$, so their reduction modulo $p, p-1$ and $q-1$ behaves essentially like threee different variables :( BUT! Here's where any CTF player looks at the linear equations, checks the size of the unknowns, remembers that the $y_i$'s are "small", and immediately thinks of LLL :) \[\*insert "ah shit, here we go again" meme here\*\] Luckily, things are pretty standard now: we rewrite the equations over the integers by adding variables for the multipliers, plug equations and bounds into [rkm's solver](https://github.com/rkm0959/Inequality_Solving_with_CVP) and profit :chart_with_upwards_trend: This just means some patience, and a big whiteboard to write everything down before coding it in sage! ![lavagnona](https://hackmd.io/_uploads/r14SG_I7yl.jpg) Also, don't forget that we know that $\beta\equiv-\sum r_i\pmod n$! #### Scripting We first setup everything, generating smooth $p$ and $q$ ```python! import random def gen_smooth_number(nbits): small_primes = Primes()[1:30] while True: res = 1 while res.nbits() < nbits-5: res *= random.choice(small_primes) for p in small_primes: if (res*p).nbits() == nbits: return (res*p) def gen_smooth_prime(nbits): while True: base = gen_smooth_number(nbits-1) p = 2*base+1 if is_prime(p): return p p, q = gen_smooth_prime(518), gen_smooth_prime(518) N = p^2*q g = N//2 ``` We then create our helper functions, and ensure that our ciphertexts are actually generators of the whole subgroups in order not to lose any information. ```python! Kp = GF(p) Kq = GF(q) gp = Kp.multiplicative_generator() gq = Kq.multiplicative_generator() ct_base = crt([int(gp), int(gq)], [p, q]) def pip(x): return Kp(x).log(gp) def piq(x): return Kq(x).log(gq) def L(x): return (pow(x, p-1, p^2)-1)//p ``` We first prepare offline the ciphertexts and the matrix corresponding to the big system, that we will LLL reduce. ```python! cts = [pow(ct_base, random.randint(0, 2^512-1), N) for _ in range(32)] M = block_matrix(ZZ, [ [diagonal_matrix([L(ct) for ct in cts]), L(g), 0, -p, 0, 0], [diagonal_matrix([pip(ct) for ct in cts]), pip(g), (N*pip(g))%(p-1), 0, -(p-1), 0], [diagonal_matrix([piq(ct) for ct in cts]), piq(g), (N*piq(g))%(q-1), 0, 0, -(q-1)] ], subdivide=False, sparse=False) M = M.stack(vector(ZZ, [0]*32+[-1]*32+[0]*4*32)).augment(vector(ZZ, [0]*3*32+[-N])).stack(identity_matrix(ZZ, 32*6+1)) ``` Now we could simply get the corresponding ciphertexts from the server, compute the functions and ask a solution to rkm's solver, but this was too slow (at least for me), so I had to unpack everything :skull_and_crossbones: If anyone has a better workflow here, please let me know! :pray: Sooo, first we create fake solutions to the equations, and precompute all the weights and the matrix to be reduced: ```python! sols = [0]*(32*3+1) lbounds = sols + [0]*32 + [0]*32*5 + [-50] ubounds = sols + [2^256]*32 + [2^1555]*32*2 + [2^1558]*32*3 + [0] mat, lb, ub = copy(M.transpose()), copy(lbounds), copy(ubounds) num_var = mat.nrows() num_ineq = mat.ncols() weight = None max_element = 0 for i in range(num_var): for j in range(num_ineq): max_element = max(max_element, abs(mat[i, j])) if weight == None: weight = 2^50 * num_ineq * max_element print("Using weight:", weight) # sanity checker if len(lb) != num_ineq: print("Fail: len(lb) != num_ineq") if len(ub) != num_ineq: print("Fail: len(ub) != num_ineq") for i in range(num_ineq): if lb[i] > ub[i]: print("Fail: lb[i] > ub[i] at index", i) # scaling process begins max_diff = max([ub[i] - lb[i] for i in range(num_ineq)]) applied_weights = [] for i in range(num_ineq): ineq_weight = weight if lb[i] == ub[i] else max_diff // (ub[i] - lb[i]) applied_weights.append(ineq_weight) for j in range(num_var): mat[j, i] *= ineq_weight lb[i] *= ineq_weight ub[i] *= ineq_weight base_target = vector([(lb[i] + ub[i]) // 2 for i in range(num_ineq)]) ``` Then we LLL reduce this matrix via fpylll. ```python! from fpylll import IntegerMatrix, LLL Mred = LLL.reduction(IntegerMatrix.from_matrix(mat)) ``` In order to get the solution from this reduced matrix and the solutions computed from the server response, we "just" need to run a CVP, which ideally should be either computing a Gram-Schmidt for Babai's algorithm, or call fpylll CVP function. Unfortunately, neither of those worked, and I ended up copying a [fix by Blupper](https://github.com/fplll/fpylll/pull/274) to fpylll's CVP solver. We are now ready to get the answers we need from the server: ```python! from pwn import remote import json r = remote("xiyi.seccon.games", int(13333)) first_msg = {"n": int(N), "enc_xs": list(map(int, cts))} r.sendlineafter(b"> ", json.dumps(first_msg).encode()) data = json.loads(r.recvline()) true_zs = data["enc_alphas"] Rsum = data["beta_sum_mod_n"] sols = [L(z) for z in true_zs] + [pip(z) for z in true_zs] + [piq(z) for z in true_zs] + [Rsum] cur_target = vector([xxx*weight for xxx in sols] + list(base_target[97:])) result = babai(Mred, cur_target) resp = [result[i] // applied_weights[i] for i in range(len(resp))] found_ys = resp[97:97+32] final_msg = {"p": int(p), "q": int(q), "ys": list(map(int, found_ys))} r.sendlineafter(b"> ", json.dumps(final_msg).encode()) r.recvlines(2) ``` And this prints us the flag! :grin: