"looks like we're keeping the 3rd place"
*goes to sleep*
Meanwhile the scoreboard:
http://chal.hkcert23.pwnable.hk:28134/
No, we're not going to solve a Codeforces problem, but instead we're given a submission attempt for a certain problem, and the goal is to craft a valid input to break the submission program, i.e. to make it either run into an error or yield incorrect results.
In essence, the Codeforces problem involves a maze in which the player can only move right or down, and the submission program has to output the total count of reachable cells, along with the number of possible ways , modulo , to get to each cell in the map. For example, if the map looks like
then there are 2 ways, involving only → and ↓ moves, to get to the bottom right from the top left . The x
represents an obstacle that blocks the player.
This is what the map would look like if we replaced each reachable cell with its possible path counts:
Therefore the output should start with 5
, which is the number of reachable cells, and contain the line 1 2 2
, meaning there are 2 paths leading us to cell .
Below is the provided submission code:
dp[i][j]
refers to the number of paths to cell , and the nested for loop basically sums the number of possible paths to the cells directly above and to the left of the each cell (you can only get to a cell from the top or left) and assigns it to . In fact, it can be proved using induction that this algorithm is indeed correct in computing for all cells.
So which part of the code could go wrong? Well, since @gldanoob did a lot of pwning, he thought there would be a nuance buffer overflow of some sort. It turned out we couldn't overwrite data beyond the array boundaries, without making our input invalid. The program also handles edge cases really well (such as the cells on the map's edges, or an x
on the starting cell).
The key is to notice how the program outputs the results. It checks if dp[i][j]
isn't 0 and prints the line. But since dp[i][j]
holds , the program would ignore the cell even if , which is obviously not zero. That entails if we can craft a map with the number of ways to get to a paricular cell being exactly , the program will
We tried to come up with ways to algorithmically generate maps containing a cell with such a large path count. The first way involves finding a cell with in an empty map (in which the path counts are just binomial coefficients), and begin adding obstacles near it to achieve the path count. Unfortunately the target number is so large, we needed even more precise control of .
Eventually we found out how we could do basic arithmetic operations on by merely adding obstacles. For instance, in
all paths to get to C
are blocked, unless if they pass through A
or B
. Therefore .
To multiply the path count of a cell A
by 2, we can do the following:
This works due to the total of 2 ways to get to B after getting to A.
Since we are able to construct and from A
and B
:
I thought this would be a fun challenge as I might learn about MIPS but at the end it gave me eye strain
Attachment:
https://file.hkcert23.pwnable.hk/mips-rop_e2610ec1ddc37812e250b7ac17cadfe6.zip
nc chal.hkcert23.pwnable.hk 28151
./rop
is MIPS binary containing a buffer overflow vulnerability, and as the title suggests we had to write an exploit using return-oriented programming to get us a shell and cat flag
.
In my case I used gdb-multiarch
and pwndbg
as my debugger, and qemu
to simulate the MIPS runtime.
"what is this minecraft enchanting table ahh language"
- gldanoob
I recognized two calls to puts
(corresponding to the program's output) and one to gets
, which lets us overwrite an unlimited number of bytes beyond a buffer on the stack (before hitting the stack boundary).
Let's run checksec
first for a good measure:
Great. No NX or PIE, which means we can directly inject shellcode into the buffer without leaking an address from the stack via puts()
. For some reason although it has stack protection enabled, the main()
prolog and epilog doesn't seem to contain the corresponding code for stack canaries.
pop rdi
Knowing absolutely nothing other than x86-64 pwning, I immediately started googling and eventually came across this post:
https://ctftime.org/writeup/22613
Instead of chaining the return addresses of the ROP gadgets, like what we do when exploiting an x86 binary, we have to either overwrite ra
or t9
with our return address (depending on whether the gadget ends with jr $ra
or jalr $t9
), which makes thing harder as there are more registers we have to control.
What's even worse is that we can't just overwrite ra
or t9
with a custom address, as assumably ASLR is enabled on the server and there's no way to guess the address of our buffer. Is there any pointer value on the stack we can utilize?
We observe that at the location sp+0x1c
, there resides a pointer to sp
itself (0x2b2aa7b0
), with the current stack frame belonging to __libc_start_main
. If we can find a gadget that loads [sp+0x1c]
into ra
, we can force the program to start executing code at sp
which is controllable by us.
Let's begin searching for gadgets that look something like lw ra, 0x1c(sp)
:
Turns out there are lots of them which could do the trick (due to most of them being a part of function epilogs and static linking). I chose the one at 0x455748
.
(It's actually not that easy - I spent hours to even find a useful gadget)
One last thing, I couldn't fit all my shellcode between sp
and sp+0x1c
, so I would need to alter the execution flow to a larger buffer. Here we could just jr
to the start of the gets()
buffer (at sp-0x50
) and put our actual shellcode there.
Here's what the stack looks like after the call to gets()
(and we injected our evil input):
After the program exits main()
, it loads the address of our gadget into ra
then executes it, which also loads sp
into ra
. Our injected code at sp
then redirects the program to run our evil shellcode at sp-0x50
.
Just an average web app with password authentication and content hosting. What could go wrong?
http://chal-a.hkcert23.pwnable.hk:28107/index
Attachment: https://file.hkcert23.pwnable.hk/secret-notebook_7b1907aba402ecdb7ac74b14972cf0a0.zip
In the site, users can sign up and create public notes freely. They can even retrieve a list of notes other users have written. The only restriction though, is that the Administrator
account has a secret note stored in the same database, and is intended to be non-readable by normal users other than the administrator. And, of course, our goal is to retrieve it.
In short, the users
table is structured like this:
username | password | publicnote | secretnote |
---|---|---|---|
Administrator | ??????? | Welcome! I am admin and ... | hkcert23{REDACTED} |
amogos | ??????? | oi tudo bem? | NULL |
And whenever a user clicks Retrieve Public Notes, the username
and publicnote
fields get selected and here's what gets returned to the user:
Let's look at the part of the code executing the query:
Wait. Did they just format the column
and ascending
parameters into the query string?
Instead of letting the mysql
module to prepare the query, the developer used a custom filter function and just inserts the params as they are into the query.
Here's what the filter function does:
It checks if the query has occurences of certain keywords and symbols, and an HTTP 403 response is sent if the param doesn't pass the filter. And as you've probably guessed it, this sort of custom SQL filters are most likely bypassable.
Let's write down the SQL query that gets executed:
Since the ascending
parameter could only be ASC
or DESC
, we'll just modify the column
param to achieve what we want.
Can we comment out the rest of the query? Nope. The filter blocks all --
and #
comments. We couldn't modify the WHERE
clause either, since our payload is inserted after ORDER BY
, and we can't even have or put a WHERE
clause to begin with.
There's a common SQL trick exploiting the fact that keywords are case-insensitive. We can use a mix of upper and lowercase letters (wHeRe
for example) to bypass checks for WHERE
or where
. Unfortunately that wouldn't work in our case as the filter calls the .lower()
method on our payload before checking.
How about adding a brand new SQL statement? Well, if the filter permitted UNION
, we'd be able to insert another query after it. But as it is also blocked, our only choice is to do blind SQL injection.
SQL subqueries allows flexible and dynamic query writing, however the difference is that doesn't get shown as a part of the query result. It simply returns its values, like a function, to the main query.
For example if this was our query:
The subquery (select password from users
) would just create a temporary table, which helped the WHERE clause to determine if the username was being used as someone's password. The password
field wouldn't be shown as a part of the output.
In our case we'll have to write subqueries in a much more restrictive enviroment. Our query has to be after the ORDER BY
clause (which probably wouldn't do much) and we couldn't even use SELECT
to retrive any fields from the table.
Is it a dead end? No. There's another clever trick which lets us retrieve information from the database even without the output.
I bet you could already tell what it does. If the condition is true, MySQL would evaluate the THEN
clause with ease and we'll receive an HTTP
200 OK
status. On the other hand, if the ELSE
clause gets executed, since exp(1000)
is way beyond what the DOUBLE
type could handle, it throws an error and we'll see the 500 Internal Server Error
status. This could be extremely handy to check if an expression is true.
Now for the final question: what should we put as our condition to test on? The Administrator's password, of course. 😈
We wouldn't have to actually check if the entire password string equals something. We can just utilize the LIKE
operator, which could tell us whether a substring occur in the password. Let's also make sure it only checks for the admin's password, and not everyone else's:
Some knowledge of logic comes handy. Asking if an account being Administrator
implies 0
is in the password, is equivalent to asking if all accounts are either not Administrator
(normal users) or their passwords contain 0
. The query successfully returns some results only if the check is passed on all records in the user
table, hence the "for all" quantifier.
Enough maths. Since the filter forbids quotes, we'll finally replace all the strings with CONCAT
and CHAR
functions.
Now if we send the payload to the server and it doesn't crash and returns us some results, we'd know the admin's password starts with 0
.
0 ~ 9
), as hinted by the source code.200 OK
, remember the corresponding character, and append it to .Not really a hard crypto challenge but I figured the math behind the solution would be worth explaining.
Attachment: https://file.hkcert23.pwnable.hk/solitude_92b66a8882479819f0170a1efa4c8baf.zip
We're given the source code of a script that evaluates an integer polynomial function on a user input , given a random prime odd number . The coefficients are not shown to the user, and our goal is to infer the secret , provided only the value of .
The program only asks for and yields one number. From this point we'll have to either enter the secret exactly, or the program will just restart and all of the coefficients will be regenerated.
Since can be as large as , it's virtually impossible to guess or brute force it. We would have to make sure we choose an , such that wouldn't be in terms of the coefficients other than .
Can't we just use as the input so that ? Unfortunately the program also invalidates our input if , so we can't just make a multiple of .
What else can we do to eliminate the non-constant terms? Well, since the output is modulo , whenever divides , we would know that and we could zero out terms in the polynomial.
The first thing that came to mind is , since for . But how many tries would it take to get a perfect square ? As and , the probability of getting is approximately . And obviously @mystiz doesn't own a quantum computer so it's also a no.
What about the factors of ? Would work if is a multiple of ? Let's do the math:
Can we express the terms in the form , with being an integer? It turns out if is also a multiple of , then whenever occurs we'll know it's just another integer.
Assuming :
We're close. How do we make a multiple of ? Only when is a multiple of .
We are given a PowerPoint file that has a gacha game made using VBA/Macros.
As mentioned in the challenge description, we need to pull a 5 star card, however we only have 10 rolls, and our chances of getting a 5 star is 0%.
If we try to open the VBA file, we can see that it is locked with a password:
As given in the hints, we can crack the password of the VBA. There are many ways to do so, but I used this website to crack/modify the password (follow the steps to modify the password):
https://master.ayra.ch/unlock/
Once you modified the password, you can now open the VBA and read what is inside:
If you look around the code, we can see that in Module 2, we can actually modify the ticket count (see comment in VBA):
But in reality, you don't even need to change the ticket count, just modify the gacha probability and make it so that every roll you get will be a 5 star card!
You can find the code for the gacha probabilities by clicking the "Draw a Card" button:
If we change all the "out" variables to 12 and roll, we get the flag:
Slight problem, the flag contains a lot of similar characters.
You can always try out and see which combination works, but can we leak the flag directly?
It turns out the VBA is using a WebBrowser as display, so we can get the HTML code of the browser by adding in some extra code :
Now, you can remove the unnecessary HTML code and submit the flag!
hkcert23{FIl1liIIlI1III1lll1IlI11ag_Hmrnmmrnmmmrnmn}
"where flag?"
- vow, 2023
We received a .pcap file, and as suggested by the hint, we should open it using Wireshark:
Now here comes the challenge: What should we be looking for?
Perhaps you can try to search some keywords first, something like flag, secrets, file extensions, etc.
Or maybe search for specific protocols first, such as http (usually readable), TCP, etc.
If we search .txt (using search, not filter, open it with Ctrl+F), you would see that the search function returns some results, and if we follow the packets (taught in the hints), we should see these:
(TCP Stream 20)
(TCP Stream 51 / HTTP Packet 1204)
Looking at these, we can see that there is a file named secrets.txt.txt (most likely our flag), and there is something called "Invoke-DNSExfiltrator".
Googling "Invoke-DNSExfiltrator" brings us to a GitHub repository, and you will learn that a technique called "DNS Exfiltration" is being used. (which explains why hint 5 tells you to extract information from DNS packets)
Repository: https://github.com/Arno0x/DNSExfiltrator
So what is DNS Exfiltration? It is basically a method that allows hackers to sneak data or commands into DNS packets.
Now, we know that the flag is most likely contrained in DNS packets, which one is the correct one?
In this case, the DNS packets sent after the 1204 HTTP packet are correct:
There are some clues to determine that these DNS packets the correct ones:
You will also notice that there is a command line argument "-p", which according to the repository, is used to set a password for basic RC4 encryption. In other words, K#2dF!8t@1qZ is most likely a key to decrypt the data.
There are 3 packets in question:
By Googling harder (or maybe reading the source code), you might stumble across a site which explains how the DNSExfiltrator packets work:
Site (In Chinese): https://www.freebuf.com/sectool/223929.html
If you don't know Chinese, no worries, here is a quick rundown on how the .init packets work:
The .init packet will be encoded in Base32, and specifies the information in the image above.
We can use CyberChef to decode the packets: https://gchq.github.io/CyberChef/
Decoding the header data using CyberChef gives us:
The file name is secret.txt.txt, and there are 2 packets containing the data.
Now as for the other 2 DNS packets, the data is arranged like this:
Since there is no specified encoding method after decoding the .init packet, it is safe to assume that it is encoded using Base64URL.
IMPORTANT NOTE: THE DATA IS ENCODED WITH BASE64URL, NOT BASE64.
REFERENCE: https://github.com/Arno0x/DNSExfiltrator/blob/8faa972408b0384416fffd5b4d42a7aa00526ca8/dnsexfiltrator.py#L56
Now we can decode the actual data by removing the heading number and trailing DNS address first, then decode it with Base64URL (Change it in Alphabet) and decrypt it with RC4 using the password:
If you notice the first two letters "PK", this is hinting that this is an PKZIP file. We can use the unzip operation in CyberChef to turn this into an ZIP file and open it:
And there is our flag:
hkcert23{v3ry_5n34ky_w17h_dn53xf1l7r470r_5345623}
Get the flag from the I/Q signal recording.
Frequency Modulation; 440 MHz.
Attachment
Ah yes, forensics. Sound? Looks doable.
*reads description*
I/Q what..?
You get some hints from the description:
And some more hints from the attachment filename SDRuno_20231009_134842Z_440113kHz.wav
:
Before I do dumb shvt and dive deep into any rabbit holes, I quickly checked its spectrograms and metadata.
Nothing.
So I started searching. From our best buddy google, SDR = Software-defined radio, and SDRs come with abilities to demodulate I/Q signals (which is what communicates radio signals).
So of course I downloaded SDRuno, opened the .wav file with it, tuned the frequency to 440MHz (according to desc), and got some weird noises sounding like a thousand birds squeaking on steroids.
Here's where I got stumped. Then what? This has to be some information but what can I decode it with? I had no prior knowledge about radio stuff and I tried a few radio decoders but to no avail.
In the writeup threads the challenge author kindly revealed some information about the mysteries:
tbh I didn't even fully understand those but we don't talk about that
There was also an informative page on SSTV in the thread which had an entire list of decoding softwares supporting SSTV, and I chose Black Cat Software. Luckily this software also supports SDRuno as a plugin, so I had to worry less about the bridging and recording problems.
The Black Cat SSTV software (will call it BC from now on) is more intuitive than SDRuno (which appears to be a complete mess to a layman) and has actually good guides on how to use on ther page.
After following their guides and goofing around for a while, I finally got something rather promising:
'tis a QRcode baby
But imagine being able to scan it – I can't.
So I messed around more with the settings, specifically changed this parameter:
And it worked like a charm:
After cropping and skewing and stretching:
Looks promising right? Imma try scanning it…doesn't work?
I used my android phone and online qrcode readers – all of which failed to read it.
Until @vow pointed his iPhone towards my discord stream, and scanned it with ease.
Turns out iPhone actually has a reason to exist other than peer pressure – scanning QRcodes.
Flag: hkcert23{n0_0n3_pl4ys_r4d10_n0w4d4ys}
Yes I'm still trying with my android. And no its camera is intact.
Attachment: https://file.hkcert23.pwnable.hk/sign-me-a-flag-ii_ee9268d1310ede6d37cd4b5eda18457f.zip
This is an advanced version of the challenge Sign me a flag (I), of which @mystiz kindly provided a partial solution guide. You can read it here and be amazed of his sophisticated understanding of cryptography: https://hackmd.io/@blackb6a/hkcert-ctf-2023-i-en-a58d115f39feab46#求旗簽名-I–Sign-me-a-Flag-I-Crypto
Anyways, the program allows you to sign a message with a hidden random server key combined with the user-provided client key:
And our goal is to forge a signature for the string "gib flag pls"
, with the client key set to null:
And of course, the program bans the word flag
if you try to sign it using the program itself.
The setup is pretty much the same as the previous challenge, however @mystiz cleverly added some subtle differences. In Sign me a flag (I), the XOR operator () was defined as follows:
Since the output had the length of the shorter operand, providing only 1 byte as an operand would give us a 1-byte output as the HMAC
's parameter. And because brute-forcing all the 256 combinations of the 1-byte key was definitely doable, we could easily recover the server key byte by byte.
Unfortunately the function is no longer used in this challenge, and is instead replaced with the xor
method from pwntools
. This library function ensures that no matter what we put as the client key, the resulting key would never be less than 16 bytes:
What's worse is that we can't just feed the same message twice for the program to sign, since an incrementing ID would be prepended to each message before hashing.
Stare at your computer for long enough and you'll eventually discover two interesting properties:
xor
has strings of different lengths as arguments, the shorter string gets warped around (instead of the longer string getting trimmed):Wait.. it feels the second property could be exploited by being used to verify a byte in the server key. To be precise, assume . We want to craft as the HMAC key, which also generates the same hash:
Now what could be? It must be 1 byte longer than (5 bytes in total), and recall the first property, where gets warped around:
Now if also ends with the byte , the byte gets canceled:
This implies signing with the two client keys, and , gives the same hash:
In other words, if we obtain the same signatures from them, we'll know the first byte of must be .
Can we generalize this technique to the other bytes in ? Yep. And in fact we can even verify the other bytes without even knowing the first one. For example, letting and , we can see that
would yield the same hash as
The solution I tried during the competition involved waiting for the signature hash from the server every time I sent a client key, essentially taking as much as round trips from/to the server to fully recover the server key. That doesn't fit really well into the 2-minute time limit set by the server. Can we do better than that?
If you look into how TCP sockets work, basically the I/O is buffered. So if we quickly send lots of packets to the server, they queue up at the server side, and the server processes them in the same order as how we sent them. Exploiting this fact, we can shrink down our runtime to ideally one round trip:
(Practically the server wouldn't be able to keep up if there are too many concurrent requests; I had to keep the batch size around 10k)
There's one more problem waiting for us. Remember how a unique was prepended to every message we sent? In order to verify that
we need to have , which is impossible in reality. How do we get around that?
Well, since the digits of are directly prepended to the message, we can just prepend the digit 0
to beforehand, so it becomes 'the last digit of the ID'. Then when hits , just send the same message again but without the 0
: