--- tags: Cyber demon, Writeup, OMGACM --- # Alice & Bob Writeup From FE-CTF 2022: Cyber Demon By MadsAnker and Victor4X Please don't solve Alice & Bob 4 :). ## The challenge Challenge source: ```python #!/usr/bin/env python3 import json, os, random, mmap from pyseccomp import SyscallFilter, Arg, ALLOW, EQ, MASKED_EQ, KILL random = random.SystemRandom() ROUNDS = 1000 def recv(io): fd, _ = io r = b'' while True: c = os.read(fd, 1) if c in (b'', b'\0'): break r += c if not r: return return json.loads(r) def send(io, data): _, fd = io os.write(fd, json.dumps(data).encode() + b'\0') def close(io): i, o = io os.close(i) os.close(o) def sandbox(code, name): pi, co = os.pipe() ci, po = os.pipe() pio = pi, po cio = ci, co pid = os.fork() if pid > 0: close(cio) def call(*args): send(pio, args) return recv(pio) return pid, call elif pid == 0: random.seed(0) flt = SyscallFilter(defaction=KILL) flt.add_rule(ALLOW, "write", Arg(0, EQ, co)) flt.add_rule(ALLOW, "read", Arg(0, EQ, ci)) flt.add_rule(ALLOW, "mmap", Arg(3, MASKED_EQ, mmap.MAP_ANONYMOUS, mmap.MAP_ANONYMOUS), Arg(4, EQ, 2**32-1), Arg(5, EQ, 0), ) flt.add_rule(ALLOW, "munmap") flt.add_rule(ALLOW, "brk") flt.load() exec(code) func = locals()[name] while True: args = recv(cio) if not args: break send(cio, func(*args)) sys.exit(0) else: raise RuntimeError("Could not fork") alice = bytes.fromhex(input("alice> ")).decode() alices = [sandbox(alice, "alice") for _ in range(7)] bob_pid, bob = sandbox(bytes.fromhex(input("bob> ")).decode(), "bob") try: for _ in range(ROUNDS): alice_token = list(os.urandom(28)) alice_persona = random.randrange(0, len(alices)) alice_pid, alice = alices[alice_persona] msg = alice(alice_token) print("alice", alice_token,msg) assert type(msg) == list and len(msg) <= 64 and all(x in range(256) for x in msg) random.shuffle(msg) bob_token, bob_persona = bob(msg) print("bob", bob_token,bob_persona) assert alice_token == bob_token and alice_persona == bob_persona with open("flag", "r") as flag: print(flag.read()) finally: for alice_pid, _ in alices: os.kill(alice_pid, 9) os.kill(bob_pid, 9) ``` That's a lot of stuff to read, so let's break it down. The challenge boils down to writing a piece of code for Alice and one for Bob. 1. Alice will get a list of 28 integers between 0 and 255 (inclusive). We will refer to this as the original list. From this, she must produce a message which consists a list of **up to** 64 integers between 0 and 255 (inclusive). 2. Bob will be given a **shuffled** version of the message produced by Alice, and must regenerate the original list. Aditionally Bob must identify which of the 7 Alices sent the message. We start by figuring out the number of possible unique original lists and then show how these can be mapped to a different set of lists that Alice can use to communicate with Bob. ## The math Alice may get any list of 28 integers between 0 and 255, so there are $256^{28}$ possible original lists. Now we count the number of distinct messages that Alice can send to Bob. Notice that two messages are not considered distinct if they only differ in ordering (i.e. they are permutations of each other). We must include this constaint since the message is shuffled before Bob gets it. If we ignore the fact the message can be of varying length, this is actually a well known problem. For a list of fixed message of $k$ elements each taking on one of $n$ possible values, the number of distinct messages is the amount of $k$-combinations of an $n$-set with replacement. Let's explorer futher anyway as this will lead us to a solution. Two messages are distinct iff. they either differ in the amount of a certain number they include (e.g. one message contains two 1s and the other has none), or they differ in the amount of elements that they contain in total (e.g. one message has 10 elements and the other has 11). Let's say that the message has a fixed amount of elements $k$. Then, all we can pick are the amounts of each number that the message contains, which much sum up to $k$. Thus, the number of distinct lists is equal to the number of distinct solutions to the following equation: $$ x_0 + x_1 + \dots + x_{255} = k $$ where $x_i$ is the number of $i$s in the message. This counting problem can be solved using a useful method known as *stars and bars* which you can learn more about [here](https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)). We can count that the total number of solutions to this equation is: $$ {n+k-1 \choose k} $$ We can sum the values of this expression for $k=0,1,\dots,64$ and $n=256$ to get the total number of possible distinct lists: $$ \sum_{k=0}^{64} {256+k-1 \choose k} $$ Taking the base 2 logarithm of this value yields the number of bits of entropy contained in the list: $$ \log2{\left(\sum_{k=0}^{64} {256+k-1 \choose k}\right)} \approx 226.85 $$ This is just above what we need since $$ \log2(256^{28}\cdot7) \approx 226.81 $$ ## Defining the mapping We will define the mapping by defining an order over the set of all possible messages. Interpreting the original list as a base-$256$ number we get some $i \in [256^{28}-1]$. We let the $i$th possible message (as defined by the order), be the message that represents the original list. We take the lexicographical order of possible messages to be the order, with shorter messages always being less than longer ones. Let's consider an example with $n=4$ (as opposed to $256$) and $k=0,1,2,3$.The possible messages written in order would be: | $i$ | message | $k$ | |:------- |:------- |:------- | | 0 | (empty) | 0 | | 1 | 0 | 1 | | 2 | 1 | 1 | | 3 | 2 | 1 | | 4 | 3 | 1 | | 5 | 0,0 | 2 | | 6 | 0,1 | 2 | | 7 | 0,2 | 2 | | 8 | 0,3 | 2 | | 9 | 1,1 | 2 | | 10 | 1,2 | 2 | | 11 | 1,3 | 2 | | 12 | 2,2 | 2 | | 13 | 2,3 | 2 | | 14 | 3,3 | 2 | | 15 | 1,1,1 | 3 | | 16 | 1,1,2 | 3 | | 17 | 1,1,3 | 3 | | 18 | 1,2,2 | 3 | | 19 | 1,2,3 | 3 | | $\dots$ | $\dots$ | $\dots$ | We can compute the $i$'th element as follows: We start by determining the amount of elements needed to reach $i$, $k$. To do this, we find the smallest $k$ such that the number of possible messages with $k$ elements is at least $i$. Formally, $k$ is the smallest number such that $\sum_{r=0}^{k} {256+r-1 \choose r} \geq i$. We can compute compute the $i$'th message recursively as follows: ```python= def getC(i, number, n, k): """ Compute the i'th message **with k elements** in range(0,n-1) starting with number """ if k == 0: # If there are no more numbers left to add return [] if i < ncr(n,k-1): # We are under the number of possible messages for k-1, # so we add number and recurse return getC(i,number,n,k-1)+[number] else: # We are over the number of possible messages for k-1, # so we reduce i by cnr(n,k-1), incerement number and reduce n. # # We have since writing this realized that # only n or number is needed. return getC(i-ncr(n,k-1),number+1,n-1,k) n = 256 k = 0 while (scnr(n, k+1) < i): k += 1 message = getC(i-sncr(n,k),0,256,k) ``` The logic behind this is partially explained by the comments. If the number of possible messages with one less element is larger than $i$, then we should add the current number to the message and proceed to index into the messages wit $k-1$ elements. Otherwise, we should subtract the number of messages with $k-1$ elements from $i$ and proceed with number set 1 higher. Let's consider an example with $n=4$ and $k=0,1,2,3$ as in the previous example. Say we want to compute the $12$'th message ($i = 12$). The number of possible solutions for $k = 0$ is 1, the number of possible solution for $k=1$ is 4, and $10$ for $k=2$. $1+4 < 12 < 1+4+10$ so we set $k=2$. Now we run `getP(7,0,4,2)` (`sncr` sums the number of solutions for $r < k$ so `12-sncr(4,2) = 12-(4+1) = 5`). The number of messages with $k-1 = 1$ is 4, so we enter the else branch. If we look at the table above, this makes sense because we cannot get to $i=12$ by placing $0$ as the first number for $k=2$. Now we call `getC(7-4=3,1,3,2)`. The number of possible messages with $k=1$ is now $3$ (we must start with $1$ and not $0$), so we again enter the else branch. This makes sense because we cannot get to $i=12$ by placing $1$ as the first number for $k=2$. Now we call `getC(3-3=0,2,2,2)`. The number of possible messages with $k=1$ is now $2$ (we must start with $2$), so now we enter the then branch. This makes sense because we **can** get to $i=12$ by placing $2$ as the first number for $k=2$. 2 is added to the end of the solution here, effectively building the solution in reverse. Note that this does not matter since we are not counting on the elements having a specific order. Now we call `getC(0,2,2,1)`. The number of possible messages with $k=0$ is $1$ (just the empty message), so we enter the then branch and add 2 to the result. Lastly, we call `getC(0,2,2,0)` which hits the base case. Bob can do this process in reverse, obtain $i$ and convert it back into a base-256 number to get the original list. Here is the code for obtaning $i$: ```python= def getI(c, number, n, k): if (k == 0): return 0 if number in c: newc = [x for x in c] newc.remove(number) return getI(newc,number,n,k-1) else: return ncr(n,k-1)+getI(c,number+1,n-1,k) ``` ## Figuring out Alice's persona In order to get the flag, Bob also needs to know which Alice was used to generate the message. Alice's persona is a number from 0-6, referencing the index of the called Alice sandbox in the alices list. Encoding this information in the mapping is trivial, as the highest value Alice's original list can have is $256^{28}-1$ (i.e. $i < 256^{28}-1$), so by adding the persona as a multiple of $256^{28}$ to the target value before encoding, Bob will always be able to extract it simply by integer dividing by $256^{28}$. The real issue is that **Alice doesn't know who she is**. When initially uploading code to the challenge, it is exec'd in a forked process with restricted syscalls using the defined `sandbox()` function. Depending on the role of the sandbox, either the now-defined `bob()` or `alice()` function is saved for working with the actual challenge messages. When the sandbox passes data to the `alice()` function, the local scope is completely clean, and none of the `pio, cio, pid, etc.` variables defined when initializing the sandbox are present. Re-creating them is not possible as os.pipe is restricted by the syscall filter. However, the previously mentioned variables *are present* when initially exec'ing the code used for defining `bob()` and `alice()`. In order to include them in the functions, we ended up adding another exec() with an f-string for inserting the variable we needed: (PoC) ```python b"exec(f\"\"\"" + b"def alice(): {...} po={po} {...}" + b"\"\"\")" ``` In the above example, the po variable is put into the correct scope of our `alice()` function. This variable holds a file descriptor that starts at 6 for the first sandbox, and increments by 2 for every subsequent sandbox. Getting the persona is then just: `persona = ({po}-6)//2` Putting all of this together $i$, the number that Alice wishes to encode, is calculated as `i + pow(256,28)*(({po}-6)//2)` after calculating $i$ by interpreting the original list as a base-256 number. ## Conclusion The final solve script is: ```python= def alice(args): import math def ncr(n,k): return math.comb(n+k-1,k) def sncr(n,k): return sum([ncr(n,number) for number in range(k)]) def getC(i, number, n, k): if k == 0: return [] if i < ncr(n,k-1): return getC(i,number,n,k-1)+[number] else: return getC(i-ncr(n,k-1),number+1,n-1,k) i = 0 for arg in args: i*=256 i += arg i += pow(256,28)*(({po}-6)//2) n = 256 k = 0 while (sncr(n, k+1) < i): k += 1 return getC(i-sncr(n,k),0,n,k) def bob(c): import math def ncr(n,k): return math.comb(n+k-1,k) def sncr(n,r): return sum([ncr(n,number) for number in range(r)]) def getI(c, number, n, k): if (k == 0): return 0 if number in c: newc = [x for x in c] newc.remove(number) return getI(newc,number,n,k-1) else: return ncr(n,k-1)+getI(c,number+1,n-1,k) def toBase256(t): if t == 0: return [] return toBase256(t//256)+[t%256] k = len(c) n = 256 target = getI(c,0,n,k)+sncr(n,k) realTarget = target%pow(256,28) aliceid = target//pow(256,28) b256 = toBase256(realTarget) while (len(b256) != 28): b256.insert(0,0) return b256,aliceid ```