# Permuted Write-up > From: Cyber Apocalypse 2024, Hacker Royale > Tags: Crypto > Difficulty: hard > Problem statement: > <Too lengthy and not very helpful, so I am not posting it here. I did not even read it when solving the challenge.> > Solved by: Jackylkk2003 > Files attached: > `source.py`: > ```python= > from Crypto.Cipher import AES > from Crypto.Util.Padding import pad > from Crypto.Util.number import long_to_bytes > > from hashlib import sha256 > from random import shuffle > > from secret import a, b, FLAG > > class Permutation: > def __init__(self, mapping): > self.length = len(mapping) > > assert set(mapping) == set(range(self.length)) # ensure it contains all numbers from 0 to length-1, with no repetitions > self.mapping = list(mapping) > > def __call__(self, *args, **kwargs): > idx, *_ = args > assert idx in range(self.length) > return self.mapping[idx] > > def __mul__(self, other): > ans = [] > > for i in range(self.length): > ans.append(self(other(i))) > > return Permutation(ans) > > def __pow__(self, power, modulo=None): > ans = Permutation.identity(self.length) > ctr = self > > while power > 0: > if power % 2 == 1: > ans *= ctr > ctr *= ctr > power //= 2 > > return ans > > def __str__(self): > return str(self.mapping) > > def identity(length): > return Permutation(range(length)) > > > x = list(range(50_000)) > shuffle(x) > > g = Permutation(x) > print('g =', g) > > A = g**a > print('A =', A) > B = g**b > print('B =', B) > > C = A**b > assert C.mapping == (B**a).mapping > > sec = tuple(C.mapping) > sec = hash(sec) > sec = long_to_bytes(sec) > > hash = sha256() > hash.update(sec) > > key = hash.digest()[16:32] > iv = b"mg'g\xce\x08\xdbYN2\x89\xad\xedlY\xb9" > > cipher = AES.new(key, AES.MODE_CBC, iv) > > encrypted = cipher.encrypt(pad(FLAG, 16)) > print('c =', encrypted) > ``` > `output.txt`: > ```! > g = [11885, 38789, ..., 46319, 30683] > > A = [37452, 41075, ..., 13844, 47293] > > B = [44346, 11814, ..., 46839, 20130] > > c = b'\x89\xba1J\x9c\xfd\xe8\xd0\xe5A*\xa0\rq?!wg\xb0\x85\xeb\xce\x9f\x06\xcbG\x84O\xed\xdb\xcd\xc2\x188\x0cT\xa0\xaaH\x0c\x9e9\xe7\x9d@R\x9b\xbd' > ``` > The ... in `output.txt` are 49996 numbers that is too lengthy to fit in this write-up. ## Problem Analysis In the explanations below, we will assume that all `mapping` lists in Permutation objects have the same length. This assumption is based on the fact that the main operation used in the code is power (`**`), so this assumption will hold at least for the encryption process. Since the code is quite long, let's first have a look at each part of the code and explore the functionalities. ### Multiplication of Permutations ```python=22 def __mul__(self, other): ans = [] for i in range(self.length): ans.append(self(other(i))) return Permutation(ans) ``` Suppose we have `C = A * B`, where `A` and `B` are objects of Permutation class. According to the code, we know that for all `i` in `[0, length_of_permutation - 1]`, `C[i] = A[B[i]]`. For example, suppose `A = Permutation([1, 0, 2])`, `B = Permutation([2, 1, 0])` and `C = A * B`, we have `C = Permutation([A[B[0]], A[B[1]], A[B[2]]])`, that is `Permutation([A[2], A[1], A[0])` and equals to `Permutation([2, 0, 1])`. ![Multiplication example](https://hackmd.io/_uploads/Hk5_3oMC6.png) ### Power of Permutations ```python=30 def __pow__(self, power, modulo=None): ans = Permutation.identity(self.length) ctr = self while power > 0: if power % 2 == 1: ans *= ctr ctr *= ctr power //= 2 return ans ``` After we have the definition of multiplication, the definition of power is very intuitive. It is simply multiplying the permutation by itself `power` times. The code uses repeated squaring method to speed up the computation, just like fast exponentiation. One way to interpret the power is that it applies indexing multiple times. Suppose `B = A ** 3`, then for all `i`, we have `B[i] = A[A[A[i]]]`. Suppose `A = Permutation([1, 0, 2, 4, 5, 3])`, we have ```python A ** 0 = Permutation([0, 1, 2, 3, 4, 5]) A ** 1 = Permutation([1, 0, 2, 4, 5, 3]) A ** 2 = Permutation([0, 1, 2, 5, 3, 4]) A ** 3 = Permutation([1, 0, 2, 3, 4, 5]) A ** 4 = Permutation([0, 1, 2, 4, 5, 3]) A ** 5 = Permutation([1, 0, 2, 5, 3, 4]) A ** 6 = Permutation([0, 1, 2, 3, 4, 5]) A ** 7 = Permutation([1, 0, 2, 4, 5, 3]) A ** 8 = Permutation([0, 1, 2, 5, 3, 4]) A ** 9 = Permutation([1, 0, 2, 3, 4, 5]) ``` We will come back to this example in the later sections. ### Diffie-Hellman Key Exchange ```python=49 x = list(range(50_000)) shuffle(x) g = Permutation(x) print('g =', g) A = g**a print('A =', A) B = g**b print('B =', B) C = A**b assert C.mapping == (B**a).mapping ``` This code is a standard [Diffie-Hellman Key Exchange scheme](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange). We know about `g`, `A`, `B`, and we want to know `((g**a)**b)` (equivalent to `A ** b`) or `((g**b)**a)` (equivalent to `B ** a`). This is a (supposingly) hard problem, that is, no one should be able to effiently find `C`, `a` or `b` by knowing only `g`, `A`, `B`, at least if this key exchange is secure. ### AES Encryption ```python=63 sec = tuple(C.mapping) sec = hash(sec) sec = long_to_bytes(sec) hash = sha256() hash.update(sec) key = hash.digest()[16:32] iv = b"mg'g\xce\x08\xdbYN2\x89\xad\xedlY\xb9" cipher = AES.new(key, AES.MODE_CBC, iv) encrypted = cipher.encrypt(pad(FLAG, 16)) print('c =', encrypted) ``` This is an AES encryption where the key is calculated using `C` and iv is given. The flag is in the plaintext. Nothing really special about this part. ## Naive Non-solution From the above code analysis, we can see that we can easily get the flag if we can know `C`. To know `C`, we need to know either `a` or `b`. Without loss of generality, let's suppose we want to find `a`. A trivial implementation would be: ```python= p = Permutation.identity(50_000) a = 0 while p.mapping != A.mapping: p = p * g a += 1 print(a) ``` But of course it does not work. The reason is that `a` could be large and it is impractical to brute force `a` by a simple loop. We need more insights from more observations. ## Observations ### More on Power > Suppose `A = Permutation([1, 0, 2, 4, 5, 3])`, we have > > ```python > A ** 0 = Permutation([0, 1, 2, 3, 4, 5]) > A ** 1 = Permutation([1, 0, 2, 4, 5, 3]) > A ** 2 = Permutation([0, 1, 2, 5, 3, 4]) > A ** 3 = Permutation([1, 0, 2, 3, 4, 5]) > A ** 4 = Permutation([0, 1, 2, 4, 5, 3]) > A ** 5 = Permutation([1, 0, 2, 5, 3, 4]) > A ** 6 = Permutation([0, 1, 2, 3, 4, 5]) > A ** 7 = Permutation([1, 0, 2, 4, 5, 3]) > A ** 8 = Permutation([0, 1, 2, 5, 3, 4]) > A ** 9 = Permutation([1, 0, 2, 3, 4, 5]) > ``` As promised, we revisit this example. There is one very interesting observation here: ```python A ** 0 = Permutation([0, 1, 2, 3, 4, 5]) A ** 6 = Permutation([0, 1, 2, 3, 4, 5]) A ** 1 = Permutation([1, 0, 2, 4, 5, 3]) A ** 7 = Permutation([1, 0, 2, 4, 5, 3]) A ** 2 = Permutation([0, 1, 2, 5, 3, 4]) A ** 8 = Permutation([0, 1, 2, 5, 3, 4]) A ** 3 = Permutation([1, 0, 2, 3, 4, 5]) A ** 9 = Permutation([1, 0, 2, 3, 4, 5]) ``` There are some powers resulting in the same permutations. This observation alone is not useful since this must be true according to pigeonhole principle. > The domain of the power is non-negative integers, but there are only $N!$ possible permutations of length $N$. Pigeonhole principle suggests that some powers must share the same permutaion results. But we can see that the first pair of duplicate is the powers (0, 6), then (1, 7), (2, 8), (3, 9). It seems like there are some kind of cycles going on. Also, the cycle length (to avoid ambiguity, let's call it period from now on) is not as large as $6!=720$, but just 6. In fact, we can express the permutations as graphs. We construct the graph by pointing `i` to `A[i]`. ![graphical view](https://hackmd.io/_uploads/rJKhP3zRa.png) One can verify that `(A ** i)[j]` can be found by going through `i` directed edges starting at `j`. This graph model of the permutation is critical for us to do further reasoning. ### Finding Period Finding the period of `A` means that finding the smallest positive integer `n` such that `A ** n = identity = [0, 1, 2, ...]`. That is, for all nodes `j`, travelling through `n` directed edges would stop at `j`. It is obvious to see that the period is the LCM of the sizes of each connected components of the graph. ### But we have to find `a`, not the period... Indeed, finding the period is not so important in this problem. But it is a slightly simpler problem to understand before we proceed with the actual problem. But what does it mean by finding `a`? Recall that `A = g ** a`, `a` is a number such that for all `j`, travelling through the edges `a` steps will stop at `A[j]`. ## Solution Suppose we have `g = Permutation([1, 0, 2, 4, 5, 3])` and `A = Permutation([1, 0, 2, 5, 3, 4])`. How exactly to find `a`? We need to have the following properties: * Starting at 0, travelling through `a` edges will reach 1 * Starting at 1, travelling through `a` edges will reach 0 * Starting at 2, travelling through `a` edges will reach 2 * Starting at 3, travelling through `a` edges will reach 5 * Starting at 4, travelling through `a` edges will reach 3 * Starting at 5, travelling through `a` edges will reach 4 And the graph is still: ![graphical view](https://hackmd.io/_uploads/rJKhP3zRa.png) Consider the rule "Starting at 0, travelling through `a` edges will reach 1", there are 2 nodes in the connected component with 0. We need `a` to be odd. Formally, $$a \equiv 1\ (mod\ 2)$$ Using similar arugments, we have all 6 rules: $$a \equiv 1\ (mod\ 2)$$ $$a \equiv 1\ (mod\ 2)$$ $$a \equiv 0\ (mod\ 1)$$ $$a \equiv 2\ (mod\ 3)$$ $$a \equiv 2\ (mod\ 3)$$ $$a \equiv 2\ (mod\ 3)$$ So how to find `a` from all these equations? From some basic discrete math concepts, we know that we can use [Chinese Remainder Theorem (CRT)](https://en.wikipedia.org/wiki/Chinese_remainder_theorem) to solve the system of linear congurential equations. ~~You may want to review your discrete math course if you don't know what CRT is.~~ For each node `j`, we store 2 things: 1. How many steps do we need to reach `A[j]` from `j` (as the remainder of the modulo) 2. How many nodes are there in the connected component. That is, what is the minimum **positive** integer `m` such that travelling `m` steps from `j` would stop at `j` (as the modulo base) After we get `a`, we can easily compute `C` and then the flag. ## Optimization Since the code could be quite slow. Is it possible for us to do some optimization? One important fact is that for all the nodes in the same connected component, they will always yield the same equation (since otherwise the system will be inconsistent). So for each connected component, we only need to choose 1 node and carry out the computation. For the example above, we can reduce the rules into 3. $$a \equiv 1\ (mod\ 2)$$ $$a \equiv 0\ (mod\ 1)$$ $$a \equiv 2\ (mod\ 3)$$ ## Solve script The solve script is in Sage since I am too lazy to write CRT or find packages that can do CRT. ```sage= from output import * # renamed output.txt to output.py and import it from source import Permutation # Remove some code from source so that no error is thrown from Crypto.Cipher import AES from Crypto.Util.Padding import pad from Crypto.Util.number import long_to_bytes from hashlib import sha256 crt_res = [] crt_mod = [] vis = [False] * 50_000 # Ensures no connected components are traversed more than once for i in range(50_000): curr = i cnt = 0 x = 0 y = 0 while not vis[curr]: vis[curr] = True if curr == A[i]: x = cnt cnt += 1 curr = g[curr] if curr == i: y = cnt if y != 0: assert (Permutation(g) ** y).mapping[i] == i # Checking assert (Permutation(g) ** x).mapping[i] == A[i] # Checking crt_res.append(x % y) crt_mod.append(y) a = crt(crt_res, crt_mod) C = Permutation(B) ** a sec = tuple(C.mapping) sec = hash(sec) sec = long_to_bytes(sec) hash = sha256() hash.update(sec) key = hash.digest()[16:32] iv = b"mg'g\xce\x08\xdbYN2\x89\xad\xedlY\xb9" cipher = AES.new(key, AES.MODE_CBC, iv) decrypted = cipher.decrypt(c) print('Flag =', decrypted) # Output: Flag = b'HTB{w3lL_n0T_aLl_gRoUpS_aRe_eQUaL_!!}\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b' ``` ## PS * How do you even think of that graph thing? ~~You can consult your algorithm design course instructors.~~ If you have seen a lot of graph problems and how to model a problem into a graph problem, then you can observe it.