TheZZAZZGlitch's April Fools 2024 Event - Glitch Lord's Final Stand writeup

# TheZZAZZGlitch's April Fools 2024 Event - Glitch Lord's Final Stand writeup When I found ZZAZZ's YT channel about two years ago, I fell into the Pokèmon glitching rabbit hole. As a junior software engineer myself, I got hooked on these old architectures and the shenanigans you could pull off on them. After missing the small 2023 event, I decided to join my first-ever "hacking contest" and see where it goes! I set my initial goal quite low, as I had no prior Assembly knowledge. All I wanted was to beat **one** of the security challenges and somehow get into the Top 50 on the leaderboard. ## Challenge 3 Number 3 felt like a neat thing to get into GBZ80 assembly. So I downloaded the file and opened it in Notepad++ while opening the [OPcode reference](https://meganesu.github.io/generate-gb-opcodes/) in my browser. I started with analyzing the *\_StrEQ\_* function. Reading assembly for the first time was painful, but after an hour I managed to kind of understand what was happening. I decided that my strategy would be to thoroughly run through the code from start to finish in the order it ran, analyze it in detail, and hopefully find something to exploit. At this point, I was mainly looking for unterminated strings (or ways to produce these) and maybe some ways to corrupt non-static jumps. ### A looooooooong search When I got to the *.checkLineLength* loop, I indeed found something. The counter would just reset whenever it hit a `$0a` or `$0d`. This meant that there was a bypass. I wrote it down as a potential puzzle piece of an exploit chain. By that time I developed an actual headache and called it a day. The next day it was time for *.getPath*. I made sure to write down in detail how each of the subfunctions behaved, to maybe figure out an edge case that would result in some unintended results (I've been there often enough). However, I found nothing in there. When I reached *.findRoute* I was sure I got my first main clue. This function features a `jp hl` at the end and there were some calls in between. This was the only jump in the entire code that did not have a hard-coded destination and was instead reading from a pointer table. So I spent the next two hours checking *\_StrEq\_* again if there was a way to somehow corrupt `hl` which might be a way to archive arbitrary code execution. After all, that's how our trusty Gen 1 glitch items work. However, this was to no avail. The pointer was loaded correctly and safely stored in the stack with no way to pop twice or push more data into there. This was the first time I got worried the exploit might depend on some flag getting reset or overwritten. This would be very easy to miss, especially for a beginner (and also for ZZAZZ itself, as others figured out). Thankfully somebody linked me to the [gbdev compiler](https://gbdev.io/rgbds-live/), so I could finally run things locally now. After debugging through the / route without any findings, and deeply investigating the *MakeContentLengthHeader* method that looked kinda sus, I realized something completely unrelated. The *wScratchBuffer* was conveniently right before the stack, making a buffer overflow possible then overwrite the stack and run some custom code. Maybe these were later parts in the exploit chain. And I could feed as much data as I wanted into this. Well, up to 0x800 bytes, as the harness enforced this limit (yes I tried that, would've been kind of a cheese through). However, the *\_ScanUtil\_* would cancel whenever it found `$00`, `$0a`, and `$0d`. This made it impossible to put more than 256 characters into the buffer, way too few to reach the stack. Once again my worries popped up that I missed the exploit already. But then I noticed the secret path had one more function I had not checked yet. *URLDecode*. ### Success? Unfortunately, this function would also return on $0a and $0d. But wait, the escape function didn't have this check and it was reading two bytes regardless. So fiddled a `%\r\n` into the input string and indeed, despite the *.getHex* returning $00, the function went on. So I put a string of 200 G's followed by two newlines after the `?` into the inputBuffer and repeated it 4 times. And indeed the buffer overflew, the stack was overwritten and the program counter was sent into oblivion. I got it! Now just see what happens when I put this into the input box on the live server ... And damn. Turns out http does have a valid charset and \r and \n are not allowed here. I'd have to escape them - but then I would be unable to bypass the line length check. I considered if I maybe was on the wrong path, but then I noticed the following message in the discord server. ![image](https://hackmd.io/_uploads/S1mX0OeW0.png) It was barely a hint, but maybe there was more here. Time to unfriend HTTP for real. ### The fight against HTTP So I did what any good dev does. ~~Read the docs~~. And quickly found out why CRLF was banned, it would separate the query string and headers. But I could surely provide headers! If I'd ensure that the last character in each line was a `%`, I could go on. So I switched to Postman, deleted all predefined headers, and added my own. My oversized request did make it through the check, but I still got "invalid password". Looking into the debug console I realized why, the stupid HTTP version added itself after the request string. With the first line not ending with a %, it wouldn't read to my tactical nuke in the headers. But there had to be a way. And indeed, in HTTP 0.9, the version was not required. A `GET /` therefore was a 100% valid request. So I needed to downgrade the version. However, Postman didn't support this and even curl would put it every time. At this point, I was about to write my own HTTP implementation that would just feed the data into the socket. But then I found a Stackoverflow answer, that suggested netcat. Then finally, after an hour and a half of wrestling with bash and netcat to send the data exactly as I wanted, the holy ... Error 502. Never in my life have I been so happy about a 5xx error code. That thing was about to fall. ### It's ACE time So, the last thing to do was to change from just blowing the stack to jumping into *wScratchBuffer* and executing some bytes I could conveniently put there by correctly using escapes. So I booted up my local environment again. The first task was to find the position in the input that would be written to the relevant stack entry and a target to jump to. Once this was done, I needed to make an ACE payload to output the password. I first thought about copying into one of the buffers and then patching the code to not overwrite. However, I found the much easier way to just jump back to .wrongPW and have the pointer to the password in `hl`. The function would then call *\_StrCopy\_* for me and create all the headers needed. ``` ld hl, $E603 jp $C203 ``` After some debugging to make everything align, I sent the request to the server. And I was greeted with the secret page, yet no password. So my local source was different than the server one even before the password. I'd have to make a more complex payload to scan for the password. Or wait. I must have landed just before my target, since *.contextCorrect* was loaded into hl. This instruction was three bytes long. So, increment the jump target by 3 and try again! Now I got `K` as a result. But that was good, that must be the header. By checking the RAM it was clear `hl` was exactly 4 bytes too low. Once I fixed that, my terminal showed me the fruit of effectively 9 hours of work🎊🎊🎊 This was the final request I've sent to the server: ```bash printf "GET /secret?p=%%\r\nlelHax%%: G%%21%%EA%%03%%C3%%C5%%03GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%%\r\nmoreHax%%: GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%%\r\nsuperHax%%: GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%%\r\nurded%%: GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%%0b%%da\r\n\r\n" | nc -v fools2024.online 26273 ``` Payload ``` ld hl, $EA03 jp $C503 ``` ### Closing notes In a nutshell, Challenge 3 was a headache-inducing rollercoaster. I don't think I went from "I think I got it" to "frick that doesn't work" that often in such a time before. But I learned a ton, especially reading assembly and some quirks that come with it (like `xor a` as a more efficient way for `a=0`) * I placed my code at $da0b because when directly writing into memory $0a did its thing * I know I could just remove the ?p= thingy, but I didn't ¯\\\_(ツ)\_/¯ * I used an offset of 4 for fetching the password, but only 3 for the code. That probably caused an extra $03 `inc bc` to be executed, but it worked, so deal with it --- ## Challenge 1 The promise alone was super interesting. We all know about Missingno.'s destructive properties and how it writes well past the allocated buffers meant for the sprites. And apparently, we can recover from the scrambled area. Thankfully I did remember that **Retro Game Mechanics Explained** had a [video on sprite decompression](https://www.youtube.com/watch?v=aF1Yw_wu2cM&t=1569). So watching this was of course my first step. The video turned out as the key to solving the challenge. At the linked timestamp, it is explained that for some reason during decompression, the data is **OR**ed to memory. While this does not matter for the buffers which are cleared before decompression starts, this very much does matter for our data, which would not get overwritten by zero-bytes. And turns out, that the rough area where B1F executes from (namely `$a7d0`) seems to be spared from any none-zero bytes due to a large RLE packet. This means that it is only affected by the delta decoding and the XOR, which both should be reversible! My mind was blown at this point. Not just because it was possible, but also because of the sheer dedication it takes for somebody to even look at this. I also noticed in the [sprite decompression tool](https://rgmechex.com/tech/gen1decompress.html) that none of the steps after the XOR modified any out-of-bounds data. This meant that we should be good to go to reverse starting from the XOR. ### Setting up Before doing anything I wanted to validate it in action. So I fired up bgb, overwrote SRAM bank 0 with `FF`-bytes, and Missingno.'d myself by setting `$D059` to `1F` while having a breakpoint on any r/w operations on SRAM. As I ran through the code, I took a dump of the bank each time I reached a major step in the code. This would allow me to write tests for my code and validate each step to prevent bugs from silking through. I also took notice of the exact order of operations and where they started or ended. So the last byte the XOR targeted was `$A857`. Also, the delta decoding always jumped 104 bytes after finishing a byte. This happened 13 times before it jumped back to redo it with a 1-byte offset. So we had 104 (13x8) columns, each 13 bytes wide. This was very plausible, given Missingno.'s sprite is 13x13 tiles. There still was one important thing I realized. After Missingno. is loaded, the game loads your own Pokemon. There is nothing out of the ordinary happening here, but it clears the sprite buffers in the progress, leaving only the out-of-bounds data intact. So we would need to patch the save file in some way. So I took another dump, this time only of the three sprite buffers right after the XOR was done. ### Undoing the remaining operations For the XOR, it XORed the second to the third buffer, storing it in the third. Since we were out of bounds, we had to do this backward, starting at address `$A857` down to including `$A310`. This worked first try, and the dump post-XOR matched the dump pre-XOR after I ran it through. Since the decompression runs a delta-decode, we need to write a delta-encode now. For this, we initialize the last bit with 0 and then read a byte from memory. For each bit, we only set the bit in the output byte when it was not equal to the last bit. Then we write the output over the input and jump 104 bytes forward to read the next byte. This is done `SPRITE_WIDTH=13` times, while the entire function runs `SPRITE_HEIGHT=104` times. Oh, and we have to do it for buffer 1 first then again for buffer 2 since delta-decoding happens for both buffers. After getting rid of a nasty bug, my test passed, and I was able to reverse the post-XOR dump into the pre-delta-decoding one. Progress! ### Testing my way to success Now I wanted to put a serious test, so I asked **TimoVM** to give me some sample payload I could put at `$A7D0`. Once I had this, I once again overwrote SRAM-bank 0 with `FF`'s, pasted the payload, and dumped once after each bigger step, taking another dump after returning to the overworld. Unsurprisingly, the payload wasn't working anymore. So first I grabbed my Missingno. preimage I took earlier and wrote a few more lines that would overwrite the sprite buffers with it. This gave me a 1:1 match with the post-XOR dump! Then I ran the full code and pasted it into bgb. Sure, the textbox showed just as if nothing happened! So now it was time for the big thing. I stripped the save file down to 8KB, fed it into my code, pasted it into bgb again, and KABOOM, there was my password 🎊🎊🎊 ### Closing notes This challenge was a lot of fun. Not only did I learn a lot about sprite decompression, but also blew my mind several times along the way. Ultimately, I spent about 6 hours on this, though I could have done it a lot faster if I hadn't validated any step thoroughly and skipped the second test that just showed my code was working as intended. ### Source code * To get the `preimage.dump` fight Missingno. with a breakpoint at offset `$2513` then dump `0x498` bytes * I removed the tests here, simply imagine another file being read and iterated over to compare to `sram0` * Coded in go 1.22.2 ```go package main import ( "fmt" "os" ) const ( BUFFER_1_START = 392 // hex 188 BUFFER_2_START = 784 // hex 310 CORRUPTION_END = 2135 // hex 857; last byte written BUFFER_LENGTH = 392 // hex 188 SPRITE_WIDTH = 104 // hex 68 SPRITE_HIGHT = 13 // hex D ) func main() { // input (8KiB for SRAM bank 0) fi_input, err := os.Open("rekt.dump") if err != nil { panic(err) } defer func() { if err := fi_input.Close(); err != nil { panic(err) } }() // an image taken from the 3 sprite buffers after the XOR happened fi_preimage, err := os.Open("preimage.dump") if err != nil { panic(err) } defer func() { if err := fi_preimage.Close(); err != nil { panic(err) } }() // output fo_result, err := os.Create("result.dump") if err != nil { panic(err) } defer func() { if err := fo_result.Close(); err != nil { panic(err) } }() // read the file into the buffer sram0 := make([]byte, 8192) _, err = fi_input.Read(sram0) if err != nil { panic(err) } // read preimage preimage := make([]byte, 1176) _, err = fi_preimage.Read(preimage) if err != nil { panic(err) } // patch SRAM with preimage for i := 0; i < len(preimage); i++ { sram0[i] = preimage[i] } // Undo the XOR for i := CORRUPTION_END; i >= BUFFER_2_START; i-- { byteBuffer2 := sram0[i] byteBuffer1 := sram0[i-BUFFER_LENGTH] sram0[i] = byteBuffer2 ^ byteBuffer1 } // Now undo delta #1 for i := 0; i < SPRITE_WIDTH; i++ { delta_encode_row(sram0, BUFFER_1_START+i) } // Now undo delta #2 for i := 0; i < SPRITE_WIDTH; i++ { delta_encode_row(sram0, BUFFER_2_START+i) } // write output if _, err := fo_result.Write(sram0[:]); err != nil { panic(err) } // pray it works ;) } func delta_encode_row(buf []byte, offset int) { lastBit := byte(0) // initialize with 0 for i := 0; i < SPRITE_HIGHT; i++ { readAddr := offset + i*SPRITE_WIDTH workByte := buf[readAddr] outputByte := byte(0) for pos := byte(7); pos != 255; pos-- { readBit := getBit(workByte, pos) // either 1 or 0 if readBit != lastBit { outputByte = setBit(outputByte, pos) lastBit = readBit } } buf[readAddr] = outputByte } } func setBit(n byte, pos byte) byte { n |= (1 << pos) return n } func getBit(n byte, pos byte) byte { return (n >> pos) & 1 } ``` --- ## Challenges 2 and 4 I did look into challenge 4, but ultimately found I had no real clue where to start. Even looking at the solution I barely understand anything, since I am neither familiar with the Gen 3 games nor with the hardware architecture. To tackle such challenges in the future I'll have to look deeper into the hardware of the GameBoy Advance as well as the decompilations. For challenge 2, it was mostly a lack of time combined with multiple users stating that it was tedious to figure out. So I didn't look at it at all. Maybe I'll attempt this sometime in the future (without cheesing it of course). ## Conclusion and Outlook ![image](https://hackmd.io/_uploads/BJlz9dzXWC.png) Scoring 1.168 points and placing an incredible #17 on the leaderboard, I must say this event was a huge success. Watching videos and learning about how these games were pieced together is one thing, actually picking things apart and reversing algorithms is another. I am definitely going to participate again next April Fools. Hopefully, I'll have a little more time available next time, so I can tackle all of the challenges. Half of the time actually went into 100%ing the game (which was insanely good and definitely not wasted time!), so if the next event is smaller that might actually help :sweat_smile: I am going to participate again next April Fools. Hopefully, I'll have a little more time available next time, so I can tackle all of the challenges. Half of the time went into 100%ing the game (which was insanely good and not wasted time!), so if the next event is smaller that might help :sweat_smile: