Flare-On 5 Challenge
===

> Author: shiki7

![](https://i.imgur.com/XvK2qCT.png)

[TOC]

## Challenge 1. Minesweeper Championship Registration

We're given a `jar` file. Decompiling the contained `class` files with `jad` gives us the key:

```
C:\Users\pc\Documents\CTF\flareon\1\m>jad -p InviteValidator.class
// Decompiled by Jad v1.5.8g. Copyright 2001 Pavel Kouznetsov.
// Jad home page: http://www.kpdus.com/jad.html
// Decompiler options: packimports(3)
// Source File Name: InviteValidator.java

import javax.swing.JOptionPane;

public class InviteValidator
{

    public InviteValidator()
    {
    }

    public static void main(String args[])
    {
        String response = JOptionPane.showInputDialog(null, "Enter your invitation code:", "Minesweeper Championship 2018", 3);
        if(response.equals("GoldenTicket2018@flare-on.com"))
            JOptionPane.showMessageDialog(null, (new StringBuilder("Welcome to the Minesweeper Championship 2018!\nPlease enter the following code to the ctfd.flare-on.com website to compete:\n\n")).append(response).toString(), "Success!", -1);
        else
            JOptionPane.showMessageDialog(null, "Incorrect invitation code. Please try again next year.", "Failure", 0);
    }
}
```

The flag is: `GoldenTicket2018@flare-on.com`

## Challenge 2. Ultimate Minesweeper

We're given a .NET assembly, which is a minesweeper game. According to the decompilation result of `.NET Reflector`, the coordinates of the normal fields are generated dynamically.
I extracted the algorithm and wrote a simple program that prints out the coordinates:

```csharp=
using System;
using System.IO;

// Ch3aters_Alw4ys_W1n@flare-on.com
namespace Test
{
    class Test
    {
        private static UInt32 VALLOC_NODE_LIMIT = 30;
        private static UInt32 VALLOC_TYPE_HEADER_PAGE = 0xfffffc80;
        private static UInt32 VALLOC_TYPE_HEADER_POOL = 0xfffffd81;
        private static UInt32 VALLOC_TYPE_HEADER_RESERVED = 0xfffffef2;
        private static UInt32[] VALLOC_TYPES;

        // Maps a 1-based (row, column) pair to the value stored in VALLOC_TYPES.
        private static uint DeriveVallocType(uint r, uint c)
        {
            return ~((r * VALLOC_NODE_LIMIT) + c);
        }

        // Walks the whole 30x30 board and prints every cell whose derived
        // value appears in VALLOC_TYPES.
        private static void AllocateMemory()
        {
            for (uint i = 0; i < VALLOC_NODE_LIMIT; i++)
            {
                for (uint j = 0; j < VALLOC_NODE_LIMIT; j++)
                {
                    bool flag = true;
                    uint r = i + 1;
                    uint c = j + 1;
                    if (Array.IndexOf(VALLOC_TYPES, DeriveVallocType(r, c)) > -1)
                    {
                        Console.WriteLine("I : {0}, J : {1}", i, j);
                    }
                }
            }
        }

        public static void Main(string[] args)
        {
            VALLOC_TYPES = new uint[] {
                VALLOC_TYPE_HEADER_PAGE,
                VALLOC_TYPE_HEADER_POOL,
                VALLOC_TYPE_HEADER_RESERVED
            };
            AllocateMemory();
            return;
        }
    }
}
```

The flag is: `Ch3aters_Alw4ys_W1n@flare-on.com`

## Challenge 3. FLEGGO

In this challenge we get dozens of executables. A quick glance at some of the binaries reveals that every program compares the input with an embedded resource and, if the correct input is provided, extracts an image along with a character. Notably, every extracted image is marked with a number, presumably the index of the character inside the flag string.
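Each binary's expected input sits in its first resource as a wide-character string, so recovering it mostly means reading UTF-16LE up to the terminator. Here is a minimal Python 3 sketch of that step, assuming the resource bytes are already in hand. `read_wide_string` and the sample bytes are made up for illustration and are not taken from the challenge files:

```python
def read_wide_string(blob: bytes) -> str:
    """Decode a null-terminated UTF-16LE string from the start of blob."""
    end = len(blob) - (len(blob) % 2)      # ignore a trailing odd byte
    for i in range(0, end, 2):
        if blob[i:i + 2] == b"\x00\x00":   # 16-bit terminator
            end = i
            break
    return blob[:end].decode("utf-16-le")


# Made-up sample: a key, its terminator, then unrelated trailing bytes.
sample = "hypothetical-key".encode("utf-16-le") + b"\x00\x00" + b"AB"
key = read_wide_string(sample)
print(key)
```

Stepping two bytes at a time matters here: a single zero byte is just the high half of an ASCII character in UTF-16LE, so only an aligned pair of zero bytes ends the string.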
So I wrote a simple script to help me recover the flag:

```python=
# -*- coding:utf-8 -*-
import pepy
import os
import subprocess
import sys
import glob
import shutil
import IPython
from PIL import Image

# get the key
def parse_one(filename):
    pe = pepy.parse(filename)
    brick = pe.get_resources()[0]
    assert brick.type_str == "B\x00R\x00I\x00C\x00K\x00" # BRICK
    # read wchar_t string
    key = ''
    idx = 0
    while True:
        if brick.data[idx] == 0:
            break
        key += chr(brick.data[idx])
        idx += 2
    return key

# extract essentials: run the binary, feed it the key, parse "image => char"
def extract_essentials(filename, key):
    proc = subprocess.Popen(filename, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    proc.stdin.write(key + '\n')
    proc.wait()
    output = proc.stdout.read()
    essentials = map(lambda x: x.replace(' ', ''), output.split('\r\n')[2].split('=>'))
    return essentials

# mor3_awes0m3_th4n_an_awes0me_p0ssum@flare-on.com
def main(argv):
    # make image directory
    if not os.path.exists("images"):
        os.mkdir("images")
    results = []
    for executable in glob.glob("FLEGGO\\*.exe"):
        print "Processing %s..." % executable
        key = parse_one(executable)
        essentials = extract_essentials(executable, key)
        print essentials
        shutil.move("FLEGGO\\" + essentials[0], "images\\" + essentials[0])
        os.system("open %s" % ("images\\" + essentials[0]))
        number = int(raw_input("Index: ")) # I'm no good at tesseract... can't get it working...
        essentials[0] = number
        results.append(essentials)
    # order the characters by the index read from each image
    results.sort(key=lambda x: x[0])
    flag = ''
    for idx, char in results:
        flag += char
    print flag
    IPython.embed()
    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv))
```

~~And yes, I would use OCR if I could.~~

The flag is: `mor3_awes0m3_th4n_an_awes0me_p0ssum@flare-on.com`

## Challenge 4. binstall

We're provided with a malicious, obfuscated .NET assembly. I first ran `de4dot` on the binary and then analyzed the deobfuscated file in `dnSpy`. The malware first deletes some files and then installs a DLL called `browserassist.dll`.
The DLL is configured as an AppInitDLL, but according to the challenge description it only works in the context of `firefox.exe`. Unfortunately I couldn't get it running, so let's begin our static analysis as usual.

Inspecting the DLL, I found a general decryption routine that uses a hash of `FL@R3ON.EXE` as the key to decrypt some data. Decrypting some of the data revealed a suspicious `pastebin` URL; running the content behind that URL through the same decryption routine gives us several JSON objects containing JavaScript snippets. It looks like the DLL injects this JavaScript into certain web pages. Combining and analyzing the JavaScript gives us the flag.

*Details forgotten.*

*Sadly I lost the progress due to the death of my VM...*

## Challenge 5. Web 2.0

This is a `WebAssembly` reversing challenge. We're provided with:

* `index.html`, the HTML page running the challenge
* `test.wasm`, the WebAssembly executable
* `main.js`, the JavaScript referenced by the index page, which fetches and executes `test.wasm`

Reading `main.js`, we find that `test.wasm` exports a function named `Match`. `main.js` calls this function with two parameters, an array containing some data and our input, and if the function returns `1`, we win the challenge. So the verifying logic must be in the WebAssembly file, and we need to dig through it.

I can't tell you how to efficiently read the WebAssembly specification and manually decompile the whole thing; it's actually quite a painful process. Fortunately I found a shortcut that greatly accelerated my analysis: I used `wasm2c` from [wabt](https://github.com/WebAssembly/wabt) to convert the WebAssembly into a single `c` source file and compiled the generated source into a native binary, which makes the logic human-readable.
Reversing the compiled native binary showed that it's actually a VM and that the provided array is the bytecode, so I could finally write the solver script:

```python=
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from ctypes import c_uint8

code = [
    0xE4, 0x47, 0x30, 0x10, 0x61, 0x24, 0x52, 0x21,
    0x86, 0x40, 0xAD, 0xC1, 0xA0, 0xB4, 0x50, 0x22,
    0xD0, 0x75, 0x32, 0x48, 0x24, 0x86, 0xE3, 0x48,
    0xA1, 0x85, 0x36, 0x6D, 0xCC, 0x33, 0x7B, 0x6E,
    0x93, 0x7F, 0x73, 0x61, 0xA0, 0xF6, 0x86, 0xEA,
    0x55, 0x48, 0x2A, 0xB3, 0xFF, 0x6F, 0x91, 0x90,
    0xA1, 0x93, 0x70, 0x7A, 0x06, 0x2A, 0x6A, 0x66,
    0x64, 0xCA, 0x94, 0x20, 0x4C, 0x10, 0x61, 0x53,
    0x77, 0x72, 0x42, 0xE9, 0x8C, 0x30, 0x2D, 0xF3,
    0x6F, 0x6F, 0xB1, 0x91, 0x65, 0x24, 0x0A, 0x14,
    0x21, 0x42, 0xA3, 0xEF, 0x6F, 0x55, 0x97, 0xD6
]

result = ''
idx = 0
while idx < len(code):
    # the low nibble of the opcode byte selects the operation
    func = code[idx] & 0xF
    if func == 0:
        result += chr(code[idx + 1])
        idx += 2
    elif func == 1:
        result += chr(code[idx + 1] ^ 0xff)
        idx += 2
    elif func == 2:
        result += chr(code[idx + 1] ^ code[idx + 2])
        idx += 3
    elif func == 3:
        result += chr(code[idx + 1] & code[idx + 2])
        idx += 3
    elif func == 4:
        result += chr(code[idx + 1] | code[idx + 2])
        idx += 3
    elif func == 5:
        result += chr(c_uint8(code[idx + 1] + code[idx + 2]).value)
        idx += 3
    elif func == 6:
        result += chr(c_uint8(code[idx + 2] - code[idx + 1]).value)
        idx += 3
    else:
        raise Exception("Invalid instruction")

print result
```

The flag is: `wasm_rulez_js_droolz@flare-on.com`

## Challenge 6. Magic

A traditional Linux `ELF` reversing challenge: we're provided with a stripped x64 ELF binary. The binary roughly does the following:

* calls `srand` with a static seed
* for each part of the input, decrypts the corresponding checking function, calls it, and encrypts it again after it returns
* XORs some data with the input
* goes back to step 2; this loops 666 times

Interestingly, the binary rewrites itself after each input: it shuffles the input checker sequence and saves the result to disk, overwriting the original file.
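Because the seed is static, every run replays the same sequence of shuffles, which is what makes all 666 rounds predictable offline. A minimal Python 3 sketch of that idea follows. The shuffle form `i + rand() % (n - i)`, the count of 33 checkers, and the seed are borrowed from the solver in this write-up. The generator `make_rand` is a simple stand-in LCG, not glibc's `rand`, so the concrete orders it produces are illustrative only:

```python
def make_rand(seed):
    """Stand-in generator (a simple LCG, NOT glibc's rand), seeded once."""
    state = seed & 0x7FFFFFFF

    def rand():
        nonlocal state
        state = (state * 1103515245 + 12345) & 0x7FFFFFFF
        return state

    return rand


def shuffle_checkers(order, rand):
    """Fisher-Yates style shuffle driven by successive rand() values."""
    order = list(order)
    n = len(order)
    for i in range(n):
        j = i + rand() % (n - i)
        order[i], order[j] = order[j], order[i]
    return order


def replay(seed, rounds, count=33):
    """Return the checker order after each round, starting from one seed."""
    rand = make_rand(seed)
    order = list(range(count))
    history = []
    for _ in range(rounds):
        order = shuffle_checkers(order, rand)
        history.append(order)
    return history


first = replay(0x5F215742, rounds=3)
second = replay(0x5F215742, rounds=3)
print(first == second)  # prints True: same seed, same shuffles
```

To reproduce the real binary's orders, the generator has to be the actual libc `rand`, which is why the solver calls into `libc.so.6` through `ctypes` instead of reimplementing it.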
I wrote the following script to extract the input checking sequence:

```python=
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import idaapi
import cPickle as pickle

dataset_addr = 0x605100
check_rounds = 33

'''
00000000 magic_t struc ; (sizeof=0x120, mappedto_6)
00000000 ; XREF: .data:dataset/r
00000000 code_ptr dq ?
00000008 code_len dd ?
0000000C input_offset dd ?
00000010 length dd ?
00000014 key_offset dd ?
00000018 xor_key dq ?
00000020 parameter db 256 dup(?)
00000120 magic_t ends
'''

l2s = lambda x: ''.join(map(chr, x))

def get_contiguous(ea, length):
    result = []
    for i in xrange(length):
        result.append(idaapi.get_byte(ea + i))
    return result

# decrypt checker idx in place inside the IDA database
def xor(idx):
    addr = dataset_addr + 288 * idx
    code_ptr = idaapi.get_64bit(addr)
    code_len = idaapi.get_32bit(addr + 8)
    xor_key = idaapi.get_64bit(addr + 0x18)
    for i in xrange(code_len):
        orig = idaapi.get_byte(code_ptr + i) ^ idaapi.get_byte(xor_key + i)
        idaapi.patch_byte(code_ptr + i, orig)
    return code_len

# read one magic_t entry (0x120 == 288 bytes each)
def dump_data(idx):
    addr = dataset_addr + 288 * idx
    code_ptr = idaapi.get_64bit(addr)
    code_len = idaapi.get_32bit(addr + 8)
    input_offset = idaapi.get_32bit(addr + 0xc)
    length = idaapi.get_32bit(addr + 0x10)
    key_offset = idaapi.get_32bit(addr + 0x14)
    xor_key = idaapi.get_64bit(addr + 0x18)
    parameter = get_contiguous(addr + 0x20, 256)
    name = idaapi.get_name(code_ptr)
    return {
        'code_ptr': code_ptr,
        'code_len': code_len,
        'input_offset': input_offset,
        'length': length,
        'key_offset': key_offset,
        'xor_key': xor_key,
        'parameter': parameter,
        'name': name
    }

rounds = []
for i in xrange(check_rounds):
    rounds.append(dump_data(i))

open("/Users/arch/CTF/flareon/6/dump.pickle", 'wb').write(pickle.dumps(rounds))
print 'Done!'
```

Further reverse engineering reveals the algorithms used inside the checker functions, so I wrote the following script to imitate the program's behaviour and find the correct input.
```python= #!/usr/bin/env python # -*- coding: utf-8 -*- # mag!iC_mUshr00ms_maY_h4ve_g!ven_uS_Santa_ClaUs@flare-on.com from IPython import embed import cPickle as pickle import struct, binascii, itertools, string, base64, ctypes libc = ctypes.CDLL("libc.so.6") u64 = lambda x: struct.unpack("<Q", x)[0] u32 = lambda x: struct.unpack("<L", x)[0] l2s = lambda x: ''.join(map(chr, x)) b64pad = lambda x: x + (4 - len(x) % 4) * '=' w32 = lambda x: ctypes.c_uint32(x).value DATA = "$\\u y:\x12E\x1e \x1dq\x197&\x17g\x03\x10>0g|J\x11\x1b^U\x08\x13b\x11hl|ZD\x17,\x12yY\t$c\x0cmW\x1fe'\x0cj\x0f]&CFY3\x14\\7\nc&\x02\x16,\x00\x00\x00" libc.srand(0x5f215742) def rand(): return w32(libc.rand()) def fib(x): result = 0 v5 = 1 v4 = 0 while x != 0: result = v4 + v5 v4 = v5 v5 = result x -= 1 return result def find_fib(x): for i in xrange(256): if fib(i) % pow(2, 64) == x: return i raise Exception("Not find in fib seq") KEY = 'Tis but a scratch.' def rc4_xor(data, key): key = map(ord, key) def KSA(key): keylength = len(key) S = range(256) j = 0 for i in range(256): j = (j + S[i] + key[i % keylength]) % 256 S[i], S[j] = S[j], S[i] # swap return S def PRGA(S): i = 0 j = 0 while True: i = (i + 1) % 256 j = (j + S[i]) % 256 S[i], S[j] = S[j], S[i] # swap K = S[(S[i] + S[j]) % 256] yield K S = KSA(key) KS = PRGA(S) keys = [] for i in xrange(len(data)): keys.append(KS.next()) return map(lambda x: x[0] ^ x[1], zip(data, keys)) data = pickle.loads(open("../dump.pickle").read()) def solve(): global data result = {} for i in xrange(33): chars = [] name = data[i]['name'] param = data[i]['parameter'] length = data[i]['length'] key_offset = data[i]['key_offset'] inp_off = data[i]['input_offset'] if name.startswith('xor'): for j in xrange(length): chars.append(chr(param[j] ^ 0x2a)) elif name.startswith('add'): for j in xrange(length): chars.append(chr(param[j] - 0xd)) elif name.startswith('strcmp'): for j in xrange(length): chars.append(chr(param[j])) elif name.startswith('fib'): idx = 0 for j in 
xrange(length): par = u64(l2s(param[idx:idx+8])) chars.append(chr(find_fib(par))) idx += 8 elif name.startswith('rc4'): dec = rc4_xor(param[0:length], KEY) for c in dec: chars.append(chr(c)) elif name.startswith('hash'): # crc32b h = u32(l2s(param[0:4])) for comb in itertools.product(string.printable, repeat=length): if w32(binascii.crc32(''.join(comb))) == h: for c in comb: chars.append(c) break else: raise Exception("CRC32B not found") elif name.startswith('hybrid'): newtable = '2a395f64c2a74623'.decode('hex') + 'BpM(GtkS'[::-1] + 'J@8bjR%I'[::-1] + 'P$1-YDEi'[::-1] + 'fqvL!Tyg'[::-1] + '0OWQmhc+'[::-1] + 'l3nu4ZNe'[::-1] + 'Kzaw2&H7'[::-1] origtable = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/' trans = string.maketrans(newtable, origtable) decode_len = (length * 4) / 3 + (length * 4) % 3 par = l2s(param[0:decode_len]).translate(trans) par = b64pad(par) #print par for c in base64.b64decode(par): chars.append(c) else: raise Exception("No coherent decoding function") for idx, c in enumerate(chars): result[inp_off + idx] = c solved = '' for i in xrange(69): solved += result[i] return solved def randomize(): inp_off = 0 for i in xrange(33): idx = i + (rand() % (33 - i)) data[idx]['input_offset'] = inp_off inp_off += data[idx]['length'] tmp = data[i] data[i] = data[idx] data[idx] = tmp rand() # this choose xor key for i in xrange(33): idx = i + (rand() % (33 - i)) tmp = data[i] data[i] = data[idx] data[idx] = tmp return results = [] for i in xrange(666): g = solve() print g results.append(g) randomize() open("./answers.pickle", 'wb').write(pickle.dumps(results)) print 'Done!' embed() ``` Feeding the original program with the inputs gives us the flag. The flag is: `mag!iC_mUshr00ms_maY_h4ve_g!ven_uS_Santa_ClaUs@flare-on.com` ## Challenge 7. WoW This challenge is pretty interesting. The given binary is a 32-bit Windows PE executable, but this binary switches to 64-bit code segment and loads a 64-bit DLL in memory. 
The loaded DLL hooks `NtDeviceIoControlFile` and extracts another 32-bit DLL, and the check logic seems to live inside this 32-bit DLL. The strange thing is that it takes the user input as a port number, connects to `127.0.0.1:{input}`, and finally `recv`s the flag. This hints at why the program hooks `NtDeviceIoControlFile`: on Windows, most socket-related functions issue IOCTLs to `afd.sys`, with `AFD_XXX` as the IoControlCode. The hook intercepts all the `connect` and `recv` calls and redirects them to its own logic. Once the mechanism is clear, the rest is just a piece of cake.

```python=
import struct

# P0rt_Kn0ck1ng_0n_he4v3ns_d00r@flare-on.com
keys = [15, 87, 97, 119, 11, 250, 181, 209, 129, 153, 172, 167, 144, 88, 26, 82, 12, 160, 8, 45, 237, 213, 109, 231, 224, 242, 188, 233, 242]
data = [95, 104, 68, 98, 35, 186, 33, 84, 51, 115, 4, 101, 80, 151, 114, 38, 1, 196, 205, 17, 182, 11, 214, 249, 88, 118, 126, 101, 105]
assert len(keys) == len(data)

def solve():
    for i in xrange(len(keys)):
        data[i] ^= keys[i]
        # every key already used is folded into all the remaining keys
        for j in xrange(i + 1, len(keys)):
            keys[j] ^= keys[i]
    return

solve()
print ''.join(map(chr, data)) + '@flare-on.com'
```

## Challenge 8. Doogie Hacker

The challenge gives us a dump of the first few sectors of a disk. Running this image inside QEMU gives us a prompt asking for a password, so our goal is to find the password. A little reversing shows that the program first reads the code from disk to `0x8000` and jumps to it, and then:

* gets the current time and stores it at `0x87f2` (4 bytes)
* `keyed_xor(0x8809, 0x87f2, 4)`
* reads the user input
* `keyed_xor(0x8809, user_input, len(user_input))`
* `write_string(0x8809)`

So the first problem is the time. The correct time actually comes from the prompt, which tells us it is `February 06, 1990`, so I used a small utility called `faketime` to run QEMU.
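The two `keyed_xor` passes listed above are plain repeating-key XOR, so each pass is its own inverse and the order of the two passes does not matter. A minimal Python 3 sketch of that property follows. The date bytes match the key used in this write-up, while the password and banner are placeholders, and treating the stored buffer as "banner XOR date XOR password" is an assumption made for illustration:

```python
def keyed_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


date_key = bytes.fromhex("19900206")   # date bytes, as used in this write-up
password = b"hypothetical-pw"          # placeholder, not the real password
banner = b"some hidden banner text"    # placeholder plaintext

# Assumed stored form of the buffer: plaintext XORed with both keys.
stored = keyed_xor(keyed_xor(banner, date_key), password)

# What the boot code would print with the right date and password.
shown = keyed_xor(keyed_xor(stored, date_key), password)

# With a different date the first byte already comes out wrong.
wrong = keyed_xor(keyed_xor(stored, bytes.fromhex("20180101")), password)

print(shown == banner, wrong == banner)
```

This is why the date has to be fixed before attacking the password: once the date pass is removed, what remains is a single repeating-key XOR, which is the situation frequency-based key recovery tools are built for.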
And now I can extract the content at `0x8809` XORed with the time:

```python=
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import idaapi

key = '19900206'.decode('hex')

def xor(data, key):
    result = ''
    for idx, c in enumerate(data):
        result += chr(c ^ ord(key[idx % len(key)]))
    return result

# read the null-terminated buffer starting at 0x8809
d = []
start = 0x8809
while True:
    b = idaapi.get_byte(start)
    if b == 0:
        break
    d.append(b)
    start += 1

dt = xor(d, key)
open("/Users/arch/CTF/flareon/8/dump.bin", 'wb').write(dt)
```

The second problem is the key: we don't know how long it is or what the decrypted data looks like. The fascinating tool [xortool](https://github.com/hellman/xortool) suits this situation. Brute-forcing all the possible most frequent characters shows that the original data could be some kind of ASCII art and gives us a pretty good guess at the key:

```
ioperal7dwvmalware
```

... and tweaking this a little bit gives us the correct key: `ioperateonmalware`

The flag is: `R3_PhD@flare-on.com`

## Challenge 9. leet editr

This is a pretty annoying challenge, as it cost me a lot of time to figure out why it refuses to run on a `zh_CN` Windows 7. The program first allocates some pages and copies some data into them, then installs a `VEH` and jumps to the allocated pages. The `VEH` is the key part of this challenge: it dynamically decrypts the data and code whenever the faulting instruction needs them, and encrypts them again after the instruction has executed. I don't know how to use dynamic approaches against this anti-debug technique, as my `x64dbg` always gets deceived by the single-step trap exception even when I tell it to ignore it.
So I wrote a Python script to extract the code and data from the binary:

```python=
import idaapi
import ctypes

'''
If I were to title this piece, it would be A_FLARE_f0r_th3_Dr4m4t1(C)
'''

c32 = lambda x: ctypes.c_uint32(x).value
c8 = lambda x: ctypes.c_uint8(x).value

ss_size = 36
css_addr = 0x40C2B0
css_cnt = 6
dss_addr = 0x40CD00
dss_cnt = 1

def get_contiguous(ea, size):
    result = ''
    for i in xrange(size):
        result += chr(idaapi.get_byte(ea + i))
    return result

# XOR with a single constant byte
def xor1(data, key):
    result = ''
    for i in xrange(len(data)):
        result += chr(ord(data[i]) ^ key)
    return result

# XOR with a byte key that grows by inc after every byte
def xorc(data, key, inc):
    result = ''
    key = c8(key)
    for i in xrange(len(data)):
        result += chr(ord(data[i]) ^ key)
        key = c8(key + inc)
    return result

matome_code = ''
'''
# extract code segments
for i in xrange(css_cnt):
    desc = css_addr + i * ss_size
    enc_type = idaapi.get_32bit(desc)
    assert enc_type == 1 # xor enc
    size = idaapi.get_32bit(desc + 8)
    static_addr = idaapi.get_32bit(desc + 12)
    static_data = idaapi.get_32bit(static_addr)
    code = get_contiguous(static_data, size)
    encparam = idaapi.get_32bit(desc + 28)
    keyaddr = idaapi.get_32bit(encparam + 4)
    key = idaapi.get_byte(keyaddr)
    decrypted = xor1(code, key)
    matome_code += decrypted
    filename = r"C:\Users\pc\Documents\CTF\flareon\9\dump\code{}.bin".format(i)
    open(filename, 'wb').write(decrypted)
open(r"C:\Users\pc\Documents\CTF\flareon\9\dump\all_code.bin", 'wb').write(matome_code)
'''

def rc4_xor(data, key):
    key = map(ord, key)

    def KSA(key):
        keylength = len(key)
        S = range(256)
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % keylength]) % 256
            S[i], S[j] = S[j], S[i] # swap
        return S

    def PRGA(S):
        i = 0
        j = 0
        while True:
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i] # swap
            K = S[(S[i] + S[j]) % 256]
            yield K

    S = KSA(key)
    KS = PRGA(S)
    keys = []
    for i in xrange(len(data)):
        keys.append(KS.next())
    return ''.join(map(lambda x: chr(x[0] ^ x[1]), zip(map(ord, data), keys)))

'''
# extract data segments
data = ''
ptr = 0
size = idaapi.get_32bit(dss_addr + 8)
anatomy = idaapi.get_32bit(dss_addr + 20)
static_addr = idaapi.get_32bit(dss_addr + 12)
static_data = idaapi.get_32bit(static_addr)
idx = 0
encparam = idaapi.get_32bit(dss_addr + 28)
k0s = idaapi.get_32bit(encparam)
k0 = get_contiguous(idaapi.get_32bit(encparam + 4), k0s)
k1s = idaapi.get_32bit(encparam + 8)
k1 = get_contiguous(idaapi.get_32bit(encparam + 12), k1s)
k2s = idaapi.get_32bit(encparam + 16)
k2 = get_contiguous(idaapi.get_32bit(encparam + 20), k2s)
while ptr < size:
    offset = idaapi.get_32bit(anatomy + 8 * idx)
    size = idaapi.get_32bit(anatomy + 4 + 8 * idx)
    if offset == 1:
        break
    encrypted = get_contiguous(static_data + offset, size)
    if idx % 3 == 1:
        decrypted = ''
        for i in xrange(len(encrypted)):
            decrypted += chr(ord(encrypted[i]) ^ c8(ord(k1[i % k1s]) + idx * i))
    elif idx % 3 == 2:
        decrypted = xorc(rc4_xor(encrypted, k2), idx, idx - 1)
    else:
        decrypted = ''
        seed = idx
        for i in xrange(len(encrypted)):
            decrypted += chr(ord(encrypted[i]) ^ c8(57 - 0x93 * seed))
            seed = c32((12345 - 0x3E39B193 * seed)) & 0x7FFFFFFF
    idx += 1
    data += decrypted
open(r"C:\Users\pc\Documents\CTF\flareon\9\dump\all_data.bin", 'wb').write(data)
'''

# extract code
code_addr = 0x40F110
code_size = 0x0C50
key = 'sn00gle-fl00gle-p00dlekins'
encrypted = get_contiguous(code_addr, code_size)
decrypted = rc4_xor(encrypted, key)
dec2 = rc4_xor(decrypted.decode('base64'), 'yummy')
open(r"C:\Users\pc\Documents\CTF\flareon\9\dump\phase2.bin", 'wb').write(dec2)
print 'Done!'
```

This gives us some `vbscript` and the code. The code calls the `Windows Script Host`, inserts several custom objects, and executes the script. The script opens an `Internet Explorer` window, sets its content, and monitors the input edit area. Interestingly, there's a snippet encrypted with `RC4` that detects several debuggers; the script dynamically decrypts and executes this snippet.
So the script first checks whether the ASCII art is the correct one. I put the `FLARE` ASCII art extracted from the data into the input box, and it then asks for the title. The title actually comes from the code (a string embedded inside it). Entering the correct title gives us the flag.

The flag is: `scr1pt1ng_sl4ck1ng_and_h4ck1ng@flare-on.com`

## Challenge 10. golf

This challenge is my personal favourite. It didn't frustrate me too much, and it gave me inspiration for how hardware features can be used to create RE challenges. The challenge is a Windows executable that extracts a driver and loads it while running. Interestingly, the executable contains the special instruction `vmcall`; together with the challenge description `Did you bring your visor?`, this immediately suggested that the driver could be a `hypervisor`. Since I already had some experience with `VMX`-related stuff, understanding how the driver works wasn't too hard for me. The most fascinating part is that the driver uses the `EPT_VIOLATION` vmexit to implement a whole different instruction set, which is an awesome idea.
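To give a feel for what decoding such a custom instruction set involves, here is a small table-driven Python 3 sketch. It is not the author's script: it covers only a handful of opcodes and register ids (numbers taken from the opcode notes in this write-up), and the names `REGS`, `OPCODES`, and `decode` are made up for illustration:

```python
import struct

# Register ids and opcodes below are a small subset, chosen for illustration.
REGS = {0xEE: "rax", 0xEF: "rbx", 0xF0: "rcx", 0xF1: "rdx"}

# opcode -> (mnemonic, little-endian struct layout of the operand bytes)
OPCODES = {
    0x01: ("ret", ""),
    0xAA: ("push", "B"),   # one register byte
    0xBB: ("pop", "B"),
    0xC1: ("add", "BI"),   # register byte + imm32
    0xC6: ("mov", "BQ"),   # register byte + imm64
}


def decode(buf: bytes):
    """Yield one text line per instruction until the buffer is exhausted."""
    pos = 0
    while pos < len(buf):
        name, layout = OPCODES[buf[pos]]
        fmt = "<" + layout
        operands = struct.unpack_from(fmt, buf, pos + 1) if layout else ()
        pos += 1 + struct.calcsize(fmt)
        parts = [REGS[operands[0]]] if operands else []
        parts += [hex(v) for v in operands[1:]]
        yield (name + " " + ", ".join(parts)).strip()


# Hand-assembled sample: mov rax, imm64; add rax, imm32; push rax; ret
program = (bytes([0xC6, 0xEE]) + struct.pack("<Q", 0x1122334455667788)
           + bytes([0xC1, 0xEE]) + struct.pack("<I", 0x10)
           + bytes([0xAA, 0xEE, 0x01]))
listing = list(decode(program))
print("\n".join(listing))
```

Keeping the operand layout next to each opcode means the instruction length falls out of `struct.calcsize`, which avoids hand-maintained `index += n` bookkeeping. The variable-length opcodes (`0x1c` to `0x20`) would still need special handling.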
The script:

```python=
import os, sys, struct

u8 = lambda x: struct.unpack("<B", x)[0]
u16 = lambda x: struct.unpack("<h", x)[0]
u32 = lambda x: struct.unpack("<L", x)[0]
u64 = lambda x: struct.unpack("<Q", x)[0]

'''
0x01: ret
0xBB [reg]: pop reg
0xAA [reg]: push reg
0xC2 [r1] [r2]: add r2, r1
0x00 [r1] [r2]: set r2, r1
0xC3 [reg] [imm32]: sub reg, imm32
0xC1 [reg] [imm32]: add reg, imm32
0xD1 [reg] [imm32]: add32 reg, imm32
0xD3 [reg] [imm32]: sub32 reg, imm32
0xD2 [r1] [r2]: add32 r1, r2
0xD4 [r1] [r2]: sub32 r2, r1
0xC9 [imm32] [imm64]: set [rsp+imm32], imm64
0xD5 [r1] [r2]: xor32 r1, r2
0xC5 [r1] [r2]: xor r2, r1
0xD6 [reg] [imm32]: mov32 reg, imm32
0xC6 [reg] [imm64]: mov reg, imm64
0x30: memset(rdi, *rax, rcx)
0xC8 [reg] [imm32]: set [rsp+imm32], reg
0xD8 [reg] [imm32]: set32 [rsp+imm32], reg
0x1A [reg] [imm32]: set reg, [rsp + imm32]
0xC7 [r1] [r2]: set r1, r2
0x4A [r1] [imm32]: lea r1, rsp + imm32
0x44 [r1] [r2]: testz r1, r2
0x40 [r1] [r2]: cmp32 r1, r2
0x42 [off32] [imm32]: cmp32 [rsp+off32], imm32
0x50 [imm32]: set rip, rip + imm32
0x51 [imm16]: setnz rip, rip + imm16
0x52 [imm16]: setz rip, rip + imm16
0x54 [imm16]: setxx rip, rip + imm16
0xd7 [r1] [imm32]: xor32 r1, [rsp+imm32]
0x19 [r1] [imm32]: set32 r1, [rsp+imm32]
0x1b [imm32]: setal [rsp+imm32]
0x17 [imm32]: set8 [rsp+imm32], al
0xc0 [r1] [imm32]: xor32 r1, [imm32]
0x02 [r1] [r2]: set32 r2, r1
0x43 [r1] [r2]: testz32 r1, r2
0xbe [r1] [imm32]: and32 r1, imm32
0xbc [r1] [imm8]: shr32 r1, imm8
0xb9 [r1] [imm8]: shr r1, imm8
0x41 [r1] [imm32]: cmp32 r1, imm32
0x1d [r1] [r...] [imm32]: set r1, [rsp + imm32]
0x1e [r1] [r...] [imm32]: set r1, [imm32 + r... + rsp]
0x1f [r1] [r...] [imm32]: set r1, [imm32 + r...]
0x20 [r1] [r...] [imm32]:
'''

# 64-bit register ids
def _getreg(idx):
    if idx == 0xF4:
        return "rip"
    elif idx == 0xEE:
        return "rax"
    elif idx == 0xEF:
        return "rbx"
    elif idx == 0xF0:
        return "rcx"
    elif idx == 0xF1:
        return "rdx"
    elif idx == 0xF2:
        return "rsi"
    elif idx == 0xF3:
        return "rdi"
    elif idx == 0xf5:
        return "rsp"
    elif idx == 0xf6:
        return 'rbp'
    elif idx == 0xf7:
        return 'r8'
    elif idx == 0xf8:
        return 'r9'
    elif idx == 0xf9:
        return 'r10'
    elif idx == 0xfa:
        return 'r11'
    elif idx == 0xfb:
        return 'r12'
    elif idx == 0xfc:
        return 'r13'
    elif idx == 0xfd:
        return 'r14'
    elif idx == 0xfe:
        return 'r15'
    #raise Exception("None")
    return None

# 32-bit register ids (the 64-bit ids shifted down by 9)
def _getreg32(idx):
    if idx == 0xF4-9:
        return 'eip'
    elif idx == 0xEE-9:
        return 'eax'
    elif idx == 0xEF-9:
        return 'ebx'
    elif idx == 0xF0-9:
        return 'ecx'
    elif idx == 0xF1-9:
        return 'edx'
    elif idx == 0xF2-9:
        return 'esi'
    elif idx == 0xF3-9:
        return 'edi'
    elif idx == 0xF5-9:
        return 'esp'
    elif idx == 0xf6-9:
        return 'ebp'
    return None

def disasm(buffer):
    index = 0
    while True:
        print '%d:' % index,
        opcode = ord(buffer[index])
        if opcode == 0x01:
            print 'ret'
            index += 1
        elif opcode == 0xbb:
            print 'pop {}'.format(_getreg(u8(buffer[index + 1])))
            index += 2
        elif opcode == 0xaa:
            print 'push {}'.format(_getreg(u8(buffer[index + 1])))
            index += 2
        elif opcode == 0xc2:
            print 'add {}, {}'.format(_getreg(u8(buffer[index + 2])), _getreg(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0x00:
            print 'mov {}, {}'.format(_getreg(u8(buffer[index + 2])), _getreg(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0xc3:
            print 'sub {}, {}'.format(_getreg(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xc1:
            print 'add {}, {}'.format(_getreg(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xd1:
            print 'add32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xd3:
            print 'sub32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xd2:
            print 'add32 {}, {}'.format(_getreg32(u8(buffer[index + 2])), _getreg32(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0xd4:
            print 'sub32 {}, {}'.format(_getreg32(u8(buffer[index + 2])), _getreg32(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0xc9:
            print 'set64 [rsp + {}], {}'.format(u32(buffer[index + 1: index + 5]), u64(buffer[index + 5: index + 13]))
            index += 13
        elif opcode == 0xd5:
            print 'xor32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), _getreg32(u8(buffer[index + 2])))
            index += 3
        elif opcode == 0xc5:
            print 'xor {}, {}'.format(_getreg(u8(buffer[index + 2])), _getreg(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0xd6:
            print 'set32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xc6:
            print 'set64 {}, {}'.format(_getreg(u8(buffer[index + 1])), u64(buffer[index + 2: index + 10]))
            index += 10
        elif opcode == 0x30:
            print 'memset rdi, *rax, rcx'
            index += 1
        elif opcode == 0xc8:
            print 'set64 [rsp + {}], {}'.format(u32(buffer[index + 2: index + 6]), _getreg(u8(buffer[index + 1])))
            index += 6
        elif opcode == 0xd8:
            print 'set32 [rsp + {}], {}'.format(u32(buffer[index + 2: index + 6]), _getreg32(u8(buffer[index + 1])))
            index += 6
        elif opcode == 0x1a:
            print 'set64 {}, [rsp + {}]'.format(_getreg(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xc7:
            print 'set64 {}, {}'.format(_getreg(u8(buffer[index + 2])), _getreg(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0x4a:
            print 'lea {}, [rsp + {}]'.format(_getreg(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0x44:
            print 'testz {}, {}'.format(_getreg(u8(buffer[index + 2])), _getreg(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0x40:
            print 'cmp32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), _getreg32(u8(buffer[index + 2])))
            index += 3
        elif opcode == 0x42:
            print 'cmp32 [rsp + {}], {}'.format(u32(buffer[index + 1: index + 5]), u32(buffer[index + 5: index + 9]))
            index += 9
        elif opcode == 0x50:
            print 'set64 rip, rip + {}'.format(u16(buffer[index + 1: index + 3]))
            index += 3
        elif opcode == 0x51:
            print 'set64nz rip, rip + {}'.format(u16(buffer[index+1:index+3]))
            index += 3
        elif opcode == 0x52:
            print 'set64z rip, rip + {}'.format(u16(buffer[index+1:index+3]))
            index += 3
        elif opcode == 0x54:
            print 'set64xx rip, rip + {}'.format(u16(buffer[index+1:index+3]))
            index += 3
        elif opcode == 0xd7:
            print 'xor32 {}, [rsp + {}]'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0x19:
            print 'set32 {}, [rsp + {}]'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0x1b:
            print 'setal [rsp + {}]'.format(u32(buffer[index + 1: index + 5]))
            index += 5
        elif opcode == 0x17:
            print 'set8 [rsp + {}], al'.format(u32(buffer[index + 1: index + 5]))
            index += 5
        elif opcode == 0xc0:
            print 'xor32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0x02:
            print 'set32 {}, {}'.format(_getreg32(u8(buffer[index + 2])), _getreg32(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0x43:
            print 'testz32 {}, {}'.format(_getreg32(u8(buffer[index + 2])), _getreg32(u8(buffer[index + 1])))
            index += 3
        elif opcode == 0xbe:
            print 'and32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode == 0xbc:
            print 'shr32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u8(buffer[index + 2]))
            index += 3
        elif opcode == 0xb9:
            print 'shr {}, {}'.format(_getreg(u8(buffer[index + 1])), u8(buffer[index + 2]))
            index += 3
        elif opcode == 0x41:
            print 'cmp32 {}, {}'.format(_getreg32(u8(buffer[index + 1])), u32(buffer[index + 2: index + 6]))
            index += 6
        elif opcode >= 0x1c and opcode <= 0x20:
            # variable-length loads: target register, a run of register
            # bytes, then an imm32
            ptr = index
            if opcode == 0x1d:
                ptr += 1
                target = _getreg(u8(buffer[ptr]))
                ptr += 1
                while _getreg(u8(buffer[ptr])) != None:
                    ptr += 1
                imm32 = u32(buffer[ptr: ptr + 4])
                ptr += 4
                print 'set64sx {}, dword ptr [rsp + {}]'.format(target, imm32)
            elif opcode == 0x1e:
                ptr += 1
                target = _getreg32(u8(buffer[ptr]))
                ptr += 1
                s = []
                while _getreg(u8(buffer[ptr])) != None:
                    s.append(_getreg(u8(buffer[ptr])))
                    ptr += 1
                imm32 = u32(buffer[ptr: ptr + 4])
                ptr += 4
                print 'set8sx {}, byte ptr [rsp + {} + {}]'.format(target, '+'.join(s), imm32)
            elif opcode == 0x1f:
                ptr += 1
                target = _getreg32(u8(buffer[ptr]))
                ptr += 1
                s = []
                while _getreg(u8(buffer[ptr])) != None:
                    s.append(_getreg(u8(buffer[ptr])))
                    ptr += 1
                imm32 = u32(buffer[ptr: ptr + 4])
                ptr += 4
                print 'set8sx {}, byte ptr [{} + {}]'.format(target, '+'.join(s), imm32)
            elif opcode == 0x20:
                ptr += 1
                target = _getreg32(u8(buffer[ptr]))
                ptr += 1
                s = []
                while _getreg(u8(buffer[ptr])) != None:
                    s.append(_getreg(u8(buffer[ptr])))
                    ptr += 1
                imm32 = u32(buffer[ptr: ptr + 4])
                ptr += 4
                print 'set32 {}, byte ptr [{} + {}]'.format(target, '+'.join(s), imm32)
            elif opcode == 0x1c:
                ptr += 1
                target = _getreg32(u8(buffer[ptr]))
                ptr += 1
                s = []
                while _getreg(u8(buffer[ptr])) != None:
                    s.append(_getreg(u8(buffer[ptr])))
                    ptr += 1
                imm32 = u32(buffer[ptr: ptr + 4])
                ptr += 4
                print 'set32 {}, byte ptr [rsp + {} + {}]'.format(target, '+'.join(s), imm32)
            index = ptr
        else:
            # unknown opcode: print it and stop
            print 'ud%d' % opcode
            break
    return

code = open("code4.bin", 'rb').read()
disasm(code)
```

The checker is broken into several parts, each with a relatively easy checking algorithm.

## Challenge 11. malware skillz

This challenge is cool, but reverse engineering the c2 module is a time-consuming job...
Script to extract main c2 module from traffic: ```python= # extract dns records import dpkt, struct, IPython from hexdump import hexdump pcap = dpkt.pcapng.Reader(open("pcap.pcap", 'rb')) u32 = lambda x: struct.unpack("<L", x)[0] collected = '' def is_txtdnsresp(buf): dns = dpkt.dns.DNS(buf) if dns.an[0].type == 16: return dns.an[0].text[0] return '' for ts, buf in pcap: eth = dpkt.ethernet.Ethernet(buf) ip = eth.data udp = ip.data if udp.sport == 53: collected += is_txtdnsresp(udp.data) def decode(data): assert len(data) % 2 == 0 idx = 0 result = '' while idx < len(data): low = ord(data[idx]) - ord('A') hi = ord(data[idx+1]) - ord('a') result += chr(low + (hi << 4)) idx += 2 return result decoded = decode(collected) #hexdump(decoded) def rc4_xor(data, key): # damn this is slightly modified! key = map(ord, key) data = map(ord, data) def KSA(key): keylength = len(key) S = range(255) j = 0 for i in range(255): j = (j + S[i] + key[i % keylength]) % 255 S[i], S[j] = S[j], S[i] # swap return S def PRGA(S): i = 0 j = 0 while True: i = (i + 1) % 255 j = (j + S[i]) % 255 S[i], S[j] = S[j], S[i] # swap K = S[(S[i] + S[j]) % 255] yield K S = KSA(key) KS = PRGA(S) keys = [] for i in xrange(len(data)): keys.append(KS.next()) return ''.join(map(lambda x: chr(x[0] ^ x[1]), zip(data, keys))) k = decoded[:0x10] size = u32(decoded[0x10:0x14]) print 'Size:', hex(size) d = decoded[0x14:] decrypted = rc4_xor(d, k) open("decrypted.bin", 'wb').write(decrypted) print 'Done!' 
``` Script to find native function by hash: ```python= import lief, ctypes, os, sys, IPython c32 = lambda x: ctypes.c_uint32(x).value def ror4(x, s): return ((x >> s) | ((x << (32 - s)) & 0xFFFFFFFF)) def get_hash(modname, funcname): modhash = 0 modname += '\x00' # mimic unicode string for ch in map(ord, modname): modhash = c32(ror4(modhash, 0xd)) if ch >= 0x61: ch -= 0x20 modhash = c32(modhash + ch) # mimic unicode string modhash = c32(ror4(modhash, 0xd) + 0) funchash = ord(funcname[0]) funcname += '\x00' for ch in map(ord, funcname[1:]): funchash = c32(ror4(funchash, 0xd)) funchash = c32(funchash + ch) return c32(funchash + modhash) #print hex(get_hash('kernel32.dll', 'LoadLibraryA')) module_mem_list = ['ntdll.dll', 'kernel32.dll', 'kernelbase.dll', 'shell32.dll', 'user32.dll', 'advapi32.dll', 'ws2_32.dll', 'gdi32.dll', 'mpr.dll', 'wininet.dll'] def find_func_by_hash(h): for mod in module_mem_list: pe = lief.PE.parse("dllcoll/" + mod) for func in pe.exported_functions: hv = get_hash(mod, str(func)) if hv == h: return mod, str(func) return '', '' #print find_func_by_hash(get_hash('kernel32.dll', 'LoadLibraryA')) #print find_func_by_hash(0x95898DFF) # CreateMutexW #print find_func_by_hash(0x5DE2C5AA) # GetLastError #print find_func_by_hash(0xE035F044) # Sleep #print find_func_by_hash(0xCF448459) # RtlDeleteCriticalSection #print find_func_by_hash(0xEA61FCB1) # LocalFree #print find_func_by_hash(0x528176EE) # LocalAlloc #print find_func_by_hash(0xDC322193) # RtlInitializeCriticalSection #print find_func_by_hash(0x6B8029) # WSAStartup #print find_func_by_hash(0x42131B45) # CryptAcquireContextW #print find_func_by_hash(0xCEE535FF) # RtlEnterCriticalSection #print find_func_by_hash(0x3A182487) # RtlLeaveCriticalSection print 'Done!' ``` Script to extract c2 traffic: ```python= import lief, dpkt, os, sys, IPython, socket, struct from hexdump import hexdump # use this with mimic.c # we decontaminate this shit... 
dehex = lambda x: x.replace(' ', '').replace('\n', '').decode('hex') chex = lambda x: ','.join(map(lambda y: '0x%02x' % ord(y), x)) u32 = lambda x: struct.unpack("<L", x)[0] u16 = lambda x: struct.unpack("<H", x)[0] client_decstub = ''' {{ char* a1 = NULL; int a2; char* decfunc = RVATOVA(char*, module, 0x65A0); THISCALL(5, decfunc, crypto_c, {encrypted}, sizeof({encrypted}), {iv}, &a1, &a2); hexdump(a1, a2); void* ptr = halloc(a2); memcpy(ptr, a1, a2); char* sa1 = NULL; int sa2; char* encfunc = RVATOVA(char*, module, 0x64F0); THISCALL(5, encfunc, crypto_s, ptr, a2, {iv}, &sa1, &sa2); write_file("server_{idx}.bin", a1, a2); // sent from server }} ''' server_decstub = ''' {{ char* a1 = NULL; int a2; char* decfunc = RVATOVA(char*, module, 0x65A0); THISCALL(5, decfunc, crypto_s, {encrypted}, sizeof({encrypted}), {iv}, &a1, &a2); void* ptr = halloc(a2); memcpy(ptr, a1, a2); char* sa1 = NULL; int sa2; char* encfunc = RVATOVA(char*, module, 0x64F0); THISCALL(5, encfunc, crypto_c, ptr, a2, {iv}, &sa1, &sa2); free(ptr); write_file("client_{idx}.bin", a1, a2); }} ''' defglobals = [] defcode = [] def uchar_arr(name, buf): return 'unsigned char {name}[] = {{ {buf} }};'.format(name=name, buf=chex(buf)) def add_server_packet(data, iv, idx): assert len(iv) == 16 ivname = 'siv%d' % idx dname = 'sd%d' % idx defglobals.append(uchar_arr(dname, data)) defglobals.append(uchar_arr(ivname, iv)) defcode.append(client_decstub.format(encrypted=dname, iv=ivname, idx=idx)) return def add_client_packet(data, iv, idx): assert len(iv) == 16 ivname = 'civ%d' % idx dname = 'cd%d' % idx defglobals.append(uchar_arr(dname, data)) defglobals.append(uchar_arr(ivname, iv)) defcode.append(server_decstub.format(encrypted=dname, iv=ivname, idx=idx)) return pcaps = dpkt.pcapng.Reader(open("./pcap.pcap", 'rb')) if not os.path.exists("./splitted/"): os.mkdir("./splitted/") ''' # this is the first stage, attacking 192.168.221.91 def is_client(pkt): # ethernet pkt if socket.inet_ntoa(pkt.data.src) == 
'192.168.221.91' and socket.inet_ntoa(pkt.data.dst) == '52.0.104.200': return True return False def is_server(pkt): if socket.inet_ntoa(pkt.data.dst) == '192.168.221.91' and socket.inet_ntoa(pkt.data.src) == '52.0.104.200': return True return False def is_valid(pkt): # tcp pkt if len(pkt.data) >= 58 and pkt.data[3] == '\x8f': return True return False c = 0 s = 0 while True: try: ts, buf = pcaps.next() except StopIteration: break packet = dpkt.ethernet.Ethernet(buf) if is_client(packet) and isinstance(packet.data.data, dpkt.tcp.TCP) and is_valid(packet.data.data): tcp_data = packet.data.data.data size = u32(tcp_data[:4]) & 0xFFFFFF print 'Valid client packet (%d), extracting...' % size #if size > 2000: hexdump(tcp_data) buffer = tcp_data rem_len = size - len(tcp_data) while rem_len != 0: ts, buf = pcaps.next() assert is_client(packet) and isinstance(packet.data.data, dpkt.tcp.TCP) pkt = dpkt.ethernet.Ethernet(buf) buffer += pkt.data.data.data rem_len -= len(pkt.data.data.data) assert len(buffer) == size iv = buffer[42:58] data = buffer[58:] add_client_packet(data, iv, c) open("splitted/c_{}.bin".format(c), 'wb').write(buffer) c += 1 elif is_server(packet) and isinstance(packet.data.data, dpkt.tcp.TCP) and is_valid(packet.data.data): tcp_data = packet.data.data.data size = u32(tcp_data[:4]) & 0xFFFFFF print 'Valid server packet (%d), extracting...' 
% size #if size > 2000: hexdump(tcp_data) buffer = tcp_data rem_len = size - len(tcp_data) while rem_len != 0: ts, buf = pcaps.next() assert is_server(packet) and isinstance(packet.data.data, dpkt.tcp.TCP) pkt = dpkt.ethernet.Ethernet(buf) buffer += pkt.data.data.data rem_len -= len(packet.data.data.data) iv = buffer[42:58] data = buffer[58:] add_server_packet(data, iv, s) open("splitted/s_{}.bin".format(s), 'wb').write(buffer) s += 1 ''' def is_client(pkt): # ethernet pkt if socket.inet_ntoa(pkt.data.src) == '192.168.221.91' and socket.inet_ntoa(pkt.data.dst) == '192.168.221.105' and pkt.data.data.dport == 445: return True return False def is_server(pkt): if socket.inet_ntoa(pkt.data.dst) == '192.168.221.91' and socket.inet_ntoa(pkt.data.src) == '192.168.221.105' and pkt.data.data.sport == 445: return True return False def is_valid_client(pkt): data = pkt.data[4:] # netbios if len(data) < 4: return False if data[0:4] == '\xfeSMB': flags = u32(data[0x10:0x14]) if flags & 1 == 0: # resp cmd = u16(data[0xc:0xe]) if cmd == 9: # write return True return False def is_valid_server(pkt): data = pkt.data[4:] # netbios if len(data) < 4: return False if data[0:4] == '\xfeSMB': flags = u32(data[0x10:0x14]) if (flags & 1) != 0: # resp cmd = u16(data[0xc:0xe]) if cmd == 8: # read status = u32(data[0x8:0xc]) if status == 0: return True return False c = 0 s = 0 while True: try: ts, buf = pcaps.next() except StopIteration: break packet = dpkt.ethernet.Ethernet(buf) if isinstance(packet.data.data, dpkt.tcp.TCP): if is_client(packet) and is_valid_client(packet.data.data): wreq = packet.data.data.data[4+64+48:] if len(wreq) <= 58 or wreq[3] != '\x8f': continue size = u32(wreq[0:4]) & 0xffffff print 'Valid client packet (%d), extracting...' 
% size buffer = '' buffer += wreq rem_size = size - len(wreq) while rem_size != 0: ts, buf = pcaps.next() packet = dpkt.ethernet.Ethernet(buf) assert is_client(packet) buffer += packet.data.data.data rem_size -= len(packet.data.data.data) assert len(buffer) == size iv = buffer[42:58] data = buffer[58:] add_client_packet(data, iv, c) c += 1 elif is_server(packet) and is_valid_server(packet.data.data): rrsp = packet.data.data.data[4+64+16:] if len(rrsp) <= 58 or rrsp[3] != '\x8f': continue size = u32(rrsp[0:4]) & 0xffffff print 'Valid server packet (%d), extracting...' % size buffer = '' buffer += rrsp rem_size = size - len(rrsp) while rem_size != 0: ts, buf = pcaps.next() packet = dpkt.ethernet.Ethernet(buf) assert is_server(packet) buffer += packet.data.data.data rem_size -= len(packet.data.data.data) assert len(buffer) == size iv = buffer[42:58] data = buffer[58:] add_server_packet(data, iv, s) s += 1 open("data.h", 'wb').write('\n'.join(defglobals)) open("code.h", 'wb').write('\n'.join(defcode)) print 'Done!'
```

Program to decrypt c2 traffic:

```c
// clang -m32 -o mim.exe mimic.c #include <stdio.h> #include <stdlib.h> #include <Windows.h> ////////////////////////////////////////// // Shit starts here void* load_mem(char* filename) { void* buffer = NULL; FILE* fp = fopen(filename, "rb"); if(!fp) { return buffer; } fseek(fp, 0, SEEK_END); long size = ftell(fp); fseek(fp, 0, SEEK_SET); // So that we don't need to fix the relocation // NO IT'S NOT JUST ABOUT RELOC! We need to fix the import as well! 
buffer = VirtualAlloc((void*)(0x400000), size, MEM_COMMIT | MEM_PRIVATE, PAGE_EXECUTE_READWRITE); if(!buffer) { fclose(fp); return buffer; } fread(buffer, 1, size, fp); fclose(fp); return buffer; } void write_file(char* filename, void* buffer, size_t size) { FILE* fp = fopen(filename, "wb"); fwrite(buffer, 1, size, fp); fclose(fp); return; } char* load_module(char* filename) { return (char*)LoadLibraryA(filename); } #define RVATOVA(type, base, rva) (type)((char*)base + rva) // Some little trick to placate the msvc __thiscall // This apparently does not comply to multithreaded environments #define THISCALL_CALLER __asm__ volatile ( \ "mov (%0), %%ecx\n\t" \ "mov (%1), %%eax\n\t" \ "jmp *%%eax\n\t" \ : \ : "m" (g_this), "m" (g_callsite) \ : "cc" \ ) int g_this, g_callsite; int __stdcall __attribute__((naked, noinline)) __thiscall0() { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall1(int arg0) { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall2(int arg0, int arg1) { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall3(int arg0, int arg1, int arg2) { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall4(int arg0, int arg1, int arg2, int arg3) { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall5(int arg0, int arg1, int arg2, int arg3, int arg4) { THISCALL_CALLER; } int __stdcall __attribute__((naked, noinline)) __thiscall6(int arg0, int arg1, int arg2, int arg3, int arg4, int arg5) { THISCALL_CALLER; } #define THISCALL(nargs, callsite, this, ...) 
\ g_this = (int)this; \ g_callsite = (int)callsite; \ __thiscall##nargs(__VA_ARGS__); #define halloc(x) HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, x) #define hfree(x) HeapFree(GetProcessHeap(), 0, x) void hexdump(unsigned char* buf, int size) { for(int base = 0; base < size; base += 0x10) { printf("%08X: ", base); for(int offset = 0; offset < 0x10 && offset + base < size; offset++) { printf("%02x ", buf[base + offset]); } putchar('\n'); } return; } ////////////////////////////////////////// // Holy god this ends.. char* mod; /* // these are from the first stage unsigned char client[] = {0x0b,0x7d,0xbe,0x80,0xe7,0xb8,0x44,0x1f,0xf3,0x05,0x5c,0xe8,0xd0,0x75,0x7b,0xcb}; unsigned char server[] = {0xc3,0x95,0x08,0x1d,0xe5,0xa3,0xf2,0xe5,0x44,0xe8,0x18,0xae,0x75,0x80,0xd4,0xd3}; */ // second stage unsigned char client[] = {0xe7,0x66,0xe6,0x5a,0xe8,0x50,0x9d,0x68,0x33,0xd7,0x3a,0x37,0xd1,0xec,0x4a,0xd8}; unsigned char server[] = {0x5f,0xa5,0x29,0x40,0x57,0x65,0x44,0xd4,0x4d,0x01,0xfa,0x2a,0x37,0xf4,0x9f,0xc4}; #include "data.h" int main(int argc, char* argv[]) { char* module = load_module("mal.dll"); // this loads a neutralized version... 
if(!module) { printf("Cannot load module, %d\n", GetLastError()); return -1; } mod = module; //THISCALL(3, 0x0, 0x0, 1, 2, 3); void* crypto_c = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 0x148); char* initfunc = RVATOVA(char*, module, 0x63E0); // init structure THISCALL(1, initfunc, crypto_c, 0); void* crypto_s = halloc(0x148); THISCALL(1, initfunc, crypto_s, 0); void* comp_c = halloc(32); char* compinit = RVATOVA(char*, module, 0xA5B0); THISCALL(1, compinit, comp_c, 0); void* comp_s = halloc(32); THISCALL(1, compinit, comp_s, 0); unsigned char key[16]; for(int i = 0; i < 16; i++) { key[i] = client[i] ^ server[i] ^ 0xAA; } char* keyfunc = RVATOVA(char*, module, 0x64B0); THISCALL(2, keyfunc, crypto_c, key, 16); // set key THISCALL(2, keyfunc, crypto_s, key, 16); #pragma clang diagnostic push #pragma clang diagnostic ignored "-Wint-conversion" #pragma clang diagnostic ignored "-Wpointer-sign" #include "code.h" #pragma clang diagnostic pop printf("Survived!\n"); return 0; }
```

## Challenge 12. Suspicious floppy disk

This is an insane challenge: you feel real despair when you unveil the first layer of the abstruse logic, only to find another layer of terror underneath. Generally we can break this challenge into 3 parts.

### 1. Finding the input checker

Playing around with the given image, I found something interesting: the `type MESSAGE.DAT` command also triggers the checker routine. Moreover, if we extract `infohelp.exe` and run it in `dosbox`, the checker routine magically disappears. Further reversing showed that `infohelp.exe` contains no input-checking logic at all. So what is actually going on?

Debugging the image inside `bochs` gave me the answer: the `IVT` entry for `int 13h` is replaced! `int 13h` is the BIOS disk-services interrupt. Looking into the detoured logic shows there's extra behaviour when reading/writing specific sectors:

* When writing to the sector containing `KEY.DAT` data, the hook copies the data written to a global buffer.
* When reading from the sector containing `MESSAGE.DAT` data, the hook enters another function, presumably the checker.

Reversing the checker function showed that it's actually a `subleq` VM. Let's first extract the VM environment and then go on to the second part:

```python=
import os, sys, struct
from hexdump import hexdump

'''
Progress now: I found that int13h hook, but it looks like some kind of red herring, by the sentence "BE SURE TO XXX". I don't know if there's any lower-level hooks or modifications in this level....
'''

def ror8(b, r):
    return ((b >> r) | ((b << (8 - r)) & 0xFF))

assert ror8(1, 1) == 0x80
assert ror8(0x81, 1) == 0xC0

data = open("image.img", 'rb').read()

'''
# dump decrypted boot section.
sector6 = data[5*0x200: 5*0x200+0x200]
data6 = ''
for ch in sector6:
    data6 += chr(ror8(ord(ch), 1))
hexdump(data6)
open("sector6.bin", 'wb').write(data6)
'''

# the following comprehension could be wrong....
###
def chs2l(c, h, s): # chs to lba addressing
    return (c * 2 + h) * 0x12 + s - 1 # 3.5 inch floppy settings

def l2o(lba): # lba to offset
    return lba * 0x200 # 512b sector
###

assert chs2l(0, 0, 6) == 5
print hex(chs2l(0x21, 1, 0x11))

#forbid_section = data[l2o(0x4c8): l2o(0x593)] # these are the section forbidden from reading by the hook
#hexdump(forbid_section)
#message_section = data[l2o(chs2l(0x21, 1, 0x11)): l2o(chs2l(0x21, 1, 0x12))]
#hexdump(message_section)

# extract subleq commands
dump = open("../dump/before.dump", 'rb').read()
start = 0x97c00 + 0x223
end = start + 0x2DAD * 2
subleq = dump[start:end]
hexdump(subleq)
open("subleq.vm", 'wb').write(subleq)
```

### 2. The first layer of terror

I first crafted a small python script which implements the `subleq` VM (the script was enhanced gradually during the analysis; shown is the final version I used):

```python=
#!/usr/bin/env python # -*- coding: utf-8 -*- import struct, os, sys, ctypes, IPython, hashlib from hexdump import hexdump c16 = lambda x: ctypes.c_int16(x).value cu16 = lambda x: ctypes.c_uint16(x).value u16 = lambda x: struct.unpack("<H", x)[0] s16 = lambda x: struct.unpack("<h", x)[0] p16 = lambda x: struct.pack("<H", c16(x))[0] data = bytearray(open("subleq.vm", 'rb').read()) assert len(data) % 2 == 0 vmmem = [] for i in xrange(len(data) / 2): vmmem.append(s16(data[i * 2: i * 2 + 2])) class TaggedInt(object): def __init__(self, v, tag): self.v = v self.tag = tag return def __int__(self): print "Calling int() on '%s'" % (self.tag) return self.v def __sub__(self, x): value = int(x) print "Calling sub(%d) on '%s'" % (value, self.tag) return TaggedInt(self.v - value, self.tag) def __add__(self, x): value = int(x) print "Calling add(%d) on '%s'" % (value, self.tag) return TaggedInt(self.v + value, self.tag) def __neg__(self): print "Calling neg() on '%s'" % (self.tag) return TaggedInt(-self.v, self.tag) def tagged_subtract(a, b): if isinstance(a, TaggedInt): return a - b elif isinstance(b, TaggedInt): return - b + a return a - b def set_input(inp, tag=True): start = 0x1208 / 2 for idx, ch in enumerate(inp): if tag: vmmem[start] = TaggedInt(ord(ch), 'char%d' % (idx)) else: vmmem[start] = ord(ch) start += 1 if tag: vmmem[start] = TaggedInt(0, 'term') else: vmmem[start] = 0 return set_input('\x00' * 63, False) breakpoints = [0x5] def subleq_vm(data, start, end): '''Some debugging helpers''' step = False stop = False def interactive(): step = False stop = False def disasm(addr=None): if addr == None: addr = current if isinstance(addr, str): addr = int(addr, 0) a = data[addr] b = data[addr + 1] pc = data[addr + 2] print '%x: subleq %x, %x' % (addr, b, a), if pc != 0: 
print 'jump=%x' % pc, print '' return def bp(addr=None): if addr == None: print 'Usage: bp <addr>' return if isinstance(addr, str): addr = int(addr, 0) breakpoints.append(addr) return def bd(addr=None): if addr == None: print 'Usage: bd <addr>' return if isinstance(addr, str): addr = int(addr, 0) breakpoints.remove(addr) return def bl(): for idx, bkpt in enumerate(breakpoints): print '%d: %x' % (idx, bkpt) return def get(addr=None): if addr == None: print 'Usage: get <addr>' return if isinstance(addr, str): addr = int(addr, 0) print 'vmmem[%x] = %d' % (addr, vmmem[addr]) return def set(addr=None, val=None): if addr == None or val == None: print 'Usage: set <addr> <val>' return if isinstance(addr, str): addr = int(addr, 0) if isinstance(val, str): val = int(val, 0) vmmem[addr] = val return funcs = {'disasm': disasm, 'bp': bp, 'bd': bd, 'bl': bl, 'get': get, 'set': set} while True: comm = raw_input('>>> ').split() if comm[0] == 'quit': break elif comm[0] == 'kill': stop = True elif comm[0] == 'stepi': step = True else: try: funcs[comm[0]](*comm[1:]) except KeyError: print 'No such command' return (step, stop) '''VM Code''' addr_dict = {} current = start inscount = 0 vmins = 0 while True: if current + 3 > end: break ''' if current in breakpoints or step: print 'Break! 
pc = %x' % current if step: step = False step, stop = interactive() if stop: break ''' if current == 0x520: print '%x: %d' % (vmmem[0x7f6], vmmem[0x7f6 + vmmem[0x7f6]]) vmins += 1 a = data[current] b = data[current + 1] pc = data[current + 2] inscount += 1 if not addr_dict.has_key(current): ''' print current, ': subleq', a, b, if pc != 0: print pc, print '' ''' addr_dict[current] = True if exec_subleq(data, a, b, pc): if cu16(pc) == 0xFFFF: break current = pc else: current += 3 # output if data[4] != 0: char = cu16(data[2]) & 0xFF sys.stderr.write(chr(char)) data[4] = 0 data[2] = 0 print 'Inscount:', inscount print 'VMIns:', vmins return def exec_subleq(data, a, b, pc): a_lit = data[int(a)] b_lit = data[int(b)] b_res = c16(int(tagged_subtract(b_lit, a_lit))) data[b] = b_res if pc != 0: if b_res <= 0: return True return False subleq_vm(vmmem, 5, 0x2dad)
```

Analysing this VM is a pain in the ass: it's really painful staring at thousands of lines of `subleq x, x, x` trying to figure out what the program actually does. Fortunately there are already some resources on this topic, including a [writeup with a Binary Ninja plugin](https://blahcat.github.io/2017/10/13/flareon-4-writeups/#challenge-11) and an [official writeup](https://www.fireeye.com/content/dam/fireeye-www/global/en/blog/threat-research/Flare-On%202017/Challenge11.pdf) published last year on the same topic.

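To give a flavour of what recovering structure from raw `subleq` involves, here is a minimal sketch (not taken from the challenge image) of how an `ADD` is composed from three triples. It assumes the same operand convention as `exec_subleq` above: `mem[b] -= mem[a]`, branching only when the target is non-zero and the result is `<= 0`. The cell layout (`Z`, `X`, `Y`) and the helper `run_subleq` are made up for illustration, it is written for Python 3, and 16-bit wrapping is left out for brevity.

```python
def run_subleq(mem, pc, end):
    """Execute subleq triples in mem[pc:end] and return the memory."""
    while pc + 3 <= end:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]                 # the only operation the machine has
        if c != 0 and mem[b] <= 0:       # a branch target of 0 means "fall through"
            pc = c
        else:
            pc += 3
    return mem

# Hypothetical layout: cells 0..8 hold code, 9..11 hold data.
Z, X, Y = 9, 10, 11
program = [
    X, Z, 0,   # Z -= X  ->  Z = -X
    Z, Y, 0,   # Y -= Z  ->  Y = Y + X
    Z, Z, 0,   # Z -= Z  ->  Z = 0 (scratch cell restored)
]
mem = program + [0, 7, 35]               # Z = 0, X = 7, Y = 35
run_subleq(mem, 0, len(program))
print(mem[X], mem[Y], mem[Z])            # 7 42 0
```

Spotting recurring triples like these (clear, negate, add, copy) is what lets thousands of raw instructions collapse into a handful of readable statements.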
Eventually I arrived at this hand-written pseudocode:

```
_start() {
    push(0x25b7, 0x7f6);
    call(0xfc);
    fix_stack();
    HALT();
}

0xfc(arg1, arg2) {
    //_bp = _sp;
    DECLARE_ARGUMENT(arg2); // 0x10f
    DECLARE_ARGUMENT(arg1); // 0x148
    DECLARE_VAR(v1) // 0x181
    DECLARE_VAR(v2) // 0x185
    DECLARE_VAR(v3) // 0x189
    DECLARE_VAR(v4, -2) // 0x18d
    DECLARE_VAR(v5, 3) // 0x191
    DECLARE_VAR(v6, 4) // 0x195
    DECLARE_VAR(v7, 5) // 0x199
    DECLARE_VAR(v8, 6) // 0x19d
    DECLARE_VAR(v9, 0) // 0x1a1
    DECLARE_VAR(v10, 1) // 0x1a5
    DECLARE_VAR(v11, 0) // 0x1a9
    DECLARE_VAR(v12) // 0x1ad

    v2 = arg2;
    v12 = *v2;
    while(true) {
        v2 = arg2;
        v1 = *v2;
        if(v1 < arg1) {
            v2 += v1;
            v3 = *v2;
            if(v3 + 2 == 0) {
                break;
            }
            push(arg2);
            call(0x520);
            fix_stack();
            v2 = arg2;
            v2 += v8;
            v9 = *v2;
            if(v9 == v10) {
                *v2 = v11;
                v2 = arg2 + v6;
                v9 = *v2;
                // OUT
                r2 = v9;
                r4 = 1;
                // OUT DONE
                *v2 = v11;
            }
            v2 = arg2;
            v2 += v7;
            v9 = *v2;
            if(v9 == v10) {
                *v2 = v11;
                r3 += 1;
                v9 = r1;
                v2 = arg2;
                v2 += v5;
                *v2 = v9;
            }
        } else {
            break;
        }
    }
    v2 = arg2;
    *v2 = v12;
    return;
}

0x520(arg1) {
    _bp523 = _sp;
    DECLARE_ARGUMENT(arg1) // 0x533
    DECLARE_VAR(v2, -2) // 0x537
    DECLARE_VAR(v3) // 0x53b
    [0x7f1] = arg1;
    [0x7f5] = *[0x7f1];
    [0x7f1] += [0x7f5];
    [0x7ef] = *[0x7f1]; // 0x7ef = opcode
    if([0x7ef] == [0x7f2]) { // 0x7f2 == -1, nop
        [0x7f1] = arg1;
        [0x7f5] += 1;
        *[0x7f1] = [0x7f5];
        return;
    }
    [0x7f1] = arg1 + 1;
    [0x7ee] = *(arg1 + 1);
    [0x7f1] = arg1 + [0x7ef];
    [0x7f0] = *(arg1 + [0x7ef]);
    [0x7f4] = [0x7f0] - [0x7ee];
    [0x7ee] = [0x7f4];
    [0x7f1] = arg1 + 1;
    *[0x7f1] = [0x7ee];
    if([0x7ef] != [0x7f3]) { // 0x7f3 == 2
        [0x7f1] = arg1 + [0x7ef];
        *[0x7f1] = [0x7ee];
    }
    [0x7f5] = *arg1;
    if([0x7ee] < 0) {
        [0x7f5] += 1;
    }
    [0x7f5] += 1;
    *arg1 = [0x7f5];
    return;
}
```

... What the hell is this? Why doesn't the logic look like `print(banner); if(check(input)) { print(good); } else { print(fail); }`? Oh my...

### 3. The final layer of despair

Fortunately I hadn't closed my Chrome tab on [esolang](https://esolangs.org/wiki/OISC).
After reviewing the page for a few seconds I found [this](https://esolangs.org/wiki/RSSB). It looks like this is a `subleq` VM running an `rssb` VM. Holy fuck. So, as in the second part, I first wrote an `rssb` VM script:

```python=
#!/usr/bin/env python # -*- coding: utf-8 -*- from hexdump import hexdump import IPython, sys, os, struct, ctypes c16 = lambda x: ctypes.c_int16(x).value cu16 = lambda x: ctypes.c_uint16(x).value u16 = lambda x: struct.unpack("<H", x)[0] s16 = lambda x: struct.unpack("<h", x)[0] p16 = lambda x: struct.pack("<H", c16(x))[0] data = bytearray(open("subleq.vm", 'rb').read()) assert len(data) % 2 == 0 vmmem = [] for i in xrange(len(data) / 2): vmmem.append(s16(data[i * 2: i * 2 + 2])) vmstate = 0x7f6 def set_input(inp): start = 0x1208 / 2 for idx, ch in enumerate(inp): vmmem[start] = ord(ch) start += 1 for i in xrange(64 - (start - (0x1208 / 2))): vmmem[start] = 0 start += 1 return #set_input('\x00' * 63) set_input(raw_input('Input: ')) #set_input(17 * 'A' + '@flare-on.com') #set_input(15 * 'B1' + '@flare-on.com') def extract_result(): start = 0xa79 for idx in xrange(15): print vmmem[vmstate + start + idx], ',', sys.exit(0) return #extract_result() def get_name(v): if v == 0: return 'pc' elif v < 0: return str(v) elif v == 1: return 'acc' elif v == 2: return 'zero' elif v == 4: return 'out' elif v == 6: return 'outst' return hex(v) breakpoints = [0x15b] # https://en.wikipedia.org/wiki/One_instruction_set_computer # Looks like RSSB?, and he implements RSSB inside a SUBLEQ interpreter.... 
holyfuck def rssb_vm(): def interactive(): stop = False step = False def get(addr=None): if addr == None: print 'Usage: get <addr>' return if isinstance(addr, str): addr = int(addr, 0) print 'VM[%x] = %d (%x)' % (addr, vmmem[vmstate + addr], vmmem[vmstate + addr]) return def disasm(addr=None, length=None): if addr == None: print 'Usage: disasm <addr> [len]' return if isinstance(addr, str): addr = int(addr, 0) if isinstance(length, str): length = int(length, 0) elif length == None: length = 1 for i in xrange(length): print '%x: rssb %s' % (addr + i, get_name(vmmem[vmstate + addr + i])) return def set(addr=None, val=None): if addr == None or val == None: print 'Usage: set <addr> <val>' return if isinstance(addr, str): addr = int(addr, 0) if isinstance(val, str): val = int(val, 0) vmmem[vmstate + addr] = val return def bp(addr=None): if addr == None: print 'Usage: bp <addr>' return if isinstance(addr, str): addr = int(addr, 0) breakpoints.append(addr) return def bd(addr=None): if addr == None: print 'Usage: bd <addr>' return if isinstance(addr, str): addr = int(addr, 0) breakpoints.remove(addr) return def bl(): for idx, bpkt in enumerate(breakpoints): print '%d: %x' % (idx, bpkt) return def evaluate(stmt=None): if stmt == None: print 'Usage: eval <stmt>' return print eval(stmt) return def interact(): IPython.embed() return functions = {'get': get, 'set': set, 'bp': bp, 'bd': bd, 'bl': bl, 'eval': evaluate, 'disasm': disasm, 'ipy': interact} while True: command = raw_input('>>> ').split() if command[0] == 'quit': break elif command[0] == 'kill': stop = True elif command[0] == 'stepi': step = True else: try: functions[command[0]](*command[1:]) except KeyError: print 'No such command' return (stop, step) debug = False stop = False step = False trace = False output = True inscount = 0 if trace: tracefile = open("runtrace.txt", 'wb') while vmmem[vmstate] < 0x25b7: if trace: trace_pc = vmmem[vmstate] print >>tracefile, '%x: rssb %s' % (vmmem[vmstate], get_name(vmmem[vmstate + vmmem[vmstate]])), if debug and 
(vmmem[vmstate] in breakpoints or step): print 'Break! pc = %x' % vmmem[vmstate] step = False stop, step = interactive() ''' if vmmem[vmstate] == 0x1f62: print 'Intermediate:', vmmem[vmstate + 0x1ef6], vmmem[vmstate + 0x1ef2] if vmmem[vmstate] == 0x1ff1: print 'Result:', vmmem[vmstate + 0x1ef6] # this is trigger 1 if vmmem[vmstate] == 0x20c2: if vmmem[vmstate + 0x20aa] != 0: print 'Trigger 1' vmmem[vmstate + 0x20aa] = 0 # this is trigger 2 if vmmem[vmstate] == 0x21ba: if vmmem[vmstate + 0x21a2] == 0: print 'Trigger 2' vmmem[vmstate + 0x21a2] = 1 ''' if stop: break if vmmem[vmstate + vmmem[vmstate]] == -2: break exec_rssb() if trace: print >>tracefile, '[acc = %d, vmmem[pc] = %d]' % (vmmem[vmstate + 1], vmmem[vmstate + vmmem[vmstate + trace_pc]]) inscount += 1 if vmmem[vmstate + 6] != 0: if output: sys.stderr.write(chr(vmmem[vmstate + 4])) vmmem[vmstate + 6] = 0 print 'Inscount:', inscount return def exec_rssb(): pc = vmmem[vmstate] + vmstate if vmmem[pc] == -1: vmmem[vmstate] = c16(vmmem[vmstate] + 1) return a = vmmem[vmstate + 1] b = vmmem[vmstate + vmmem[pc]] c = c16(b - a) vmmem[vmstate + 1] = c if vmmem[pc] != 2: vmmem[vmstate + vmmem[pc]] = c if c < 0: vmmem[vmstate] += 1 vmmem[vmstate] += 1 return rssb_vm()
```

... and went on analyzing this. This was much more frustrating: without references like the ones mentioned above, I had to figure out the high-level primitives on my own.
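
As a small illustration of what a primitive looks like at this level, here is a sketch (not taken from the challenge image) of how a "load a cell into the accumulator" falls out of two `rssb` instructions. `rssb_step` mirrors `exec_rssb` above on a plain list whose index 0 is the VM's own address 0. The cell names follow `get_name`, while the memory layout is invented for the example. It is written for Python 3 and skips the 16-bit wrapping.

```python
PC, ACC, ZERO = 0, 1, 2      # special cells, named as in get_name()

def rssb_step(mem):
    """Run one RSSB instruction: subtract acc from the operand cell."""
    operand = mem[mem[PC]]
    if operand == -1:        # the interpreter above treats -1 as a nop
        mem[PC] += 1
        return
    result = mem[operand] - mem[ACC]
    mem[ACC] = result
    if operand != ZERO:      # the zero cell is never written back
        mem[operand] = result
    if result < 0:           # a negative result skips the next instruction
        mem[PC] += 1
    mem[PC] += 1

# Hypothetical layout: cells 0..7 reserved, code at 8..9, one data cell at 10.
mem = [8, 123, 0, 0, 0, 0, 0, 0,
       ACC,   # 8: rssb acc  -> acc = acc - acc = 0
       10,    # 9: rssb 10   -> acc = mem[10] - 0
       55]    # 10: data
rssb_step(mem)
rssb_step(mem)
print(mem[ACC], mem[PC])     # 55 10
```

Once pairs like `rssb acc; rssb x` are recognised as a single "load", the instruction trace becomes much easier to read.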
With the ability to debug this VM and 2 days of dedicated analysis, I finally found the algorithm used to check the password:

```python=
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import z3, ctypes

c16 = lambda x: ctypes.c_int16(x).value
z16 = lambda x: z3.BitVecVal(x, 16)

def main(length):
    s = z3.Solver()
    answers = []
    for i in xrange(length):
        bv = z3.BitVec('x%d' % i, 16)
        s.add(bv >= 0x20, bv <= 0x7f)
        s.add(bv != 0x40)
        answers.append(bv)
    vlen = length
    vb47 = sum(answers) + z16(length * 1024 * 3)
    if length % 2 == 1:
        answers.append(z16(ord('@')))
        length += 1
    vx = z16(0)
    ans = map(lambda x: z16(x), [-897, -3313, -3231, -3759, -1914, -3119, -9385, -9771, -7570, -1843, -1687, -9972, -2015, -3711, -1953])
    for i in xrange(length / 2):
        x1 = (answers[i * 2 + 1] - z16(0x20)) * z16(128) + (answers[i * 2] - z16(0x20))
        x1 ^= vx
        s.add(x1 + vb47 == ans[i])
        vx += 33
    print s.check(), length
    result = ''
    try:
        for idx in xrange(vlen):
            result += chr(s.model()[answers[idx]].as_long())
    except z3.z3types.Z3Exception:
        return False
    print result
    return True

for i in xrange(1, 31):
    main(i)
```

The final answer is: `Av0cad0_Love_2018@flare-on.com`
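
As a sanity check, the same constraints can be replayed forward on the recovered password. The sketch below re-implements the check exactly as the solver script models it (not as extracted from the VM), in Python 3. The names `check`, `to_s16`, `bias` and `key` are mine, with `bias` and `key` standing in for `vb47` and `vx`. Like the solver, it only consults as many constants as there are character pairs.

```python
ANS = [-897, -3313, -3231, -3759, -1914, -3119, -9385, -9771,
       -7570, -1843, -1687, -9972, -2015, -3711, -1953]

def to_s16(x):
    """Wrap an integer to a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def check(password):
    chars = [ord(ch) for ch in password]
    bias = sum(chars) + len(chars) * 1024 * 3     # vb47 in the solver
    if len(chars) % 2 == 1:
        chars.append(ord('@'))                    # odd lengths are padded with '@'
    if len(chars) // 2 > len(ANS):
        return False                              # more pairs than constants
    key = 0                                       # vx in the solver
    for i in range(len(chars) // 2):
        # Two characters are packed into one word, 7 bits each.
        packed = (chars[2 * i + 1] - 0x20) * 128 + (chars[2 * i] - 0x20)
        if to_s16((packed ^ key) + bias) != ANS[i]:
            return False
        key += 33
    return True

print(check("Av0cad0_Love_2018"))   # True
```

Because `bias` depends on the sum of all characters, changing any single character breaks every pair at once, which is why the script hands the problem to a solver instead of brute-forcing pair by pair.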