# Alpacahack Round 8 (Rev) - Writeups (all solved) ## Masking Tape ### 1) Challenge overview The program expects exactly one CLI argument: `./masking-tape <input>`. It transforms your input **byte-by-byte**, then compares two derived byte arrays against two constants stored in `.rodata`: `enc1` and `enc2`. If both comparisons match, it prints **`congratz`**. --- ### 2) What happens to each byte? Inside the loop: ```c __s[i] = ((int)__s[i] << 3) | (__s[i] >> 5); ``` That is an **8-bit rotate-left by 3** (ROL8). For original byte `x`: - `r = ROL8(x, 3)` Then it splits `r` into two complementary “bit groups” using masks: - `0xCC` = `11001100` (keeps bits 2–3 and 6–7) - `0x33` = `00110011` (keeps bits 0–1 and 4–5) The trick: depending on whether `r` is **even or odd**, the program either stores: - If `r` is **even** (`r & 1 == 0`): - `enc1[i] = r & 0xCC` - `enc2[i] = r & 0x33` - If `r` is **odd** (`r & 1 == 1`), it **swaps** them: - `enc1[i] = r & 0x33` - `enc2[i] = r & 0xCC` So `enc1` and `enc2` together contain all bits of `r`, just shuffled. --- ### 3) Inverting the transformation Since the masks are complementary, we can merge them back. Try both possibilities: - Assume even: - `r_even = (enc1[i] & 0xCC) | (enc2[i] & 0x33)` - Assume odd (swapped): - `r_odd = (enc1[i] & 0x33) | (enc2[i] & 0xCC)` Then validate using the parity bit and the expected masked outputs: - if `r_even` is even and produces the same `enc1/enc2` masks → it’s correct, - otherwise `r_odd` must be correct. Finally recover the original byte: - `x = ROR8(r, 3)` (rotate-right 3) Repeat for all bytes → get the original input string (the flag). --- ### 4) Locating `enc1` and `enc2` in the ELF You can find symbol addresses using: ```bash nm -a ./masking-tape | grep -E 'enc1|enc2' readelf -S ./masking-tape | grep rodata ``` In this binary: - `.rodata` starts at Vaddr `0x2000` and file offset `0x2000` - `enc1` is at `0x2020` - `enc2` is at `0x2040` Because the `.rodata` vaddr matches its file offset here, you can directly `seek()` to those offsets and read C-strings. --- ### 5) Solver (Python) ```python #!/usr/bin/env python3 from pathlib import Path BIN = "./masking-tape" ENC1_OFF = 0x2020 ENC2_OFF = 0x2040 def read_cstring(buf: bytes, off: int) -> bytes: end = buf.index(b"\x00", off) return buf[off:end] def ror8(x: int, n: int = 3) -> int: return ((x >> n) | ((x << (8 - n)) & 0xFF)) & 0xFF def main(): data = Path(BIN).read_bytes() enc1 = read_cstring(data, ENC1_OFF) enc2 = read_cstring(data, ENC2_OFF) assert len(enc1) == len(enc2) out = [] for a, b in zip(enc1, enc2): # asumsi r genap r_even = (a & 0xCC) | (b & 0x33) ok_even = (r_even & 1) == 0 and a == (r_even & 0xCC) and b == (r_even & 0x33) if ok_even: r = r_even else: # asumsi r ganjil (swap) r_odd = (a & 0x33) | (b & 0xCC) ok_odd = (r_odd & 1) == 1 and a == (r_odd & 0x33) and b == (r_odd & 0xCC) assert ok_odd r = r_odd out.append(ror8(r, 3)) print(bytes(out).decode()) if __name__ == "__main__": main() ``` --- ## Final **Flag:** `Alpaca{y0u'r3_a_pr0_crack3r}` ## hidden ### 1) Summary The program validates `argv[1]` by: - Packing it into **32-bit (4-byte) words**. - Transforming each word using a **4×32-bit internal state**. - Comparing the resulting word array against a constant array stored in `.data` via `memcmp`. - If equal → prints `congratz`. Because the per-word transform is essentially XOR + bit rotations, it is **reversible**. We can recover the original input by reversing the transform using the known expected output array from `.data`. --- ### 2) What `main()` does High-level flow: 1. Initializes the state from: ``` "AlpacaHackRound8" ``` (16 bytes → 4 little-endian `uint32` words) 2. Packs input into 32-bit words: - `n = (len + 3) >> 2` (ceil word count) - `buf = calloc(n,4)` (zero-padded) - `memcpy(buf, input, len)` 3. For each word: - computes `out[i]` from `buf[i]` and the current state - writes `out[i]` back into the buffer - updates the state depending on `out[i] & 1` 4. Compares the output buffer to the `.data` constant array. --- ### 3) Expected output in `.data` For this binary, the relevant `.data` layout is: - Offset `0x20`: `uint32 n_words` - Offset `0x40`: `n_words * 4` bytes = expected output words In this instance, `n_words = 27`. --- ### 4) The core per-word transform Given input word `x` and state `(s0,s1,s2,s3)`: 1) Compute: - `c = rol(s0, 5) + ror(s1, 3)` - `d = ror(s2, 3) - rol(s3, 5)` (all modulo 2³²) 2) Output: - `out = x ^ c ^ d` 3) State update based on `out` LSB: - If `out & 1`: ``` s0 ^= rol(d,11) s1 ^= rol(d,13) s2 ^= ror(c,15) s3 ^= ror(c,13) ``` - Else: ``` s0 ^= ror(d,13) s1 ^= ror(d,15) s2 ^= rol(c,13) s3 ^= rol(c,11) ``` --- ### 5) Reversing the transform Since: - `out = x ^ c ^ d` We invert with: - `x = out ^ c ^ d` The state update depends on `out`, but we know `out` from the constant target array, so we can replay state transitions exactly. --- ### 6) Solver (Python) ```python #!/usr/bin/env python3 import struct MASK = 0xFFFFFFFF def rol(x, n): x &= MASK return ((x << n) | (x >> (32 - n))) & MASK def ror(x, n): x &= MASK return ((x >> n) | (x << (32 - n))) & MASK def get_section_bytes(path: str, sec_name: str) -> bytes: b = open(path, "rb").read() if b[:4] != b"\x7fELF" or b[4] != 2 or b[5] != 1: raise SystemExit("Not an ELF64 little-endian") e_shoff = struct.unpack_from("<Q", b, 0x28)[0] e_shentsize = struct.unpack_from("<H", b, 0x3A)[0] e_shnum = struct.unpack_from("<H", b, 0x3C)[0] e_shstrndx = struct.unpack_from("<H", b, 0x3E)[0] def shdr(i): off = e_shoff + i * e_shentsize sh_name = struct.unpack_from("<I", b, off + 0x00)[0] sh_off = struct.unpack_from("<Q", b, off + 0x18)[0] sh_size = struct.unpack_from("<Q", b, off + 0x20)[0] return sh_name, sh_off, sh_size # section-header string table _, shstr_off, shstr_size = shdr(e_shstrndx) shstr = b[shstr_off:shstr_off + shstr_size] def name_at(off): end = shstr.find(b"\x00", off) return shstr[off:end].decode("ascii", errors="ignore") for i in range(e_shnum): n_off, off, size = shdr(i) if name_at(n_off) == sec_name: return b[off:off + size] raise SystemExit(f"Section {sec_name} not found") def main(): binpath = "hidden" data = get_section_bytes(binpath, ".data") # Known layout for this challenge binary: # 0x20: uint32 n_words # 0x40: n_words * 4 bytes (expected output words) n_words = struct.unpack_from("<I", data, 0x20)[0] expected = list(struct.unpack_from("<{}I".format(n_words), data, 0x40)) # initial state from "AlpacaHackRound8" (16 bytes -> 4 words LE) state = list(struct.unpack("<4I", b"AlpacaHackRound8")) recovered = [] for outw in expected: c = (rol(state[0], 5) + ror(state[1], 3)) & MASK d = (ror(state[2], 3) - rol(state[3], 5)) & MASK inw = (outw ^ c ^ d) & MASK recovered.append(inw) # update state depends on outw's LSB if outw & 1: state[0] ^= rol(d, 11) state[1] ^= rol(d, 13) state[2] ^= ror(c, 15) state[3] ^= ror(c, 13) else: state[0] ^= ror(d, 13) state[1] ^= ror(d, 15) state[2] ^= rol(c, 13) state[3] ^= rol(c, 11) state = [x & MASK for x in state] raw = struct.pack("<{}I".format(n_words), *recovered) flag = raw.rstrip(b"\x00").decode("ascii", errors="replace") print(flag) if __name__ == "__main__": main() ``` --- ### 7) Result Running the solver prints the recovered input: ``` Alpaca{th15_f145_1s_3xc3ssiv3ly_l3ngthy_but_th1s_1s_t0_3nsur3_th4t_1t_c4nn0t_b3_e4s1ly_s01v3d_us1ng_angr} ``` ## vcipher ### Challenge summary The `vcipher` ELF binary: - asks for a flag, - enforces an **exact 32-character** length, - runs the flag through a cipher, - and compares the produced output against a **hardcoded reference array**. The hint *“Verilog can also be converted to C++”* points to **Verilator**: the binary contains `Verilated...` symbols and a `Vcipher` class typical of Verilog-to-C++ simulation. --- ### 1) Input validation: 32 characters The `.rodata` contains: - `Error: Flag must be exactly 32 characters.` `main` checks the input length against `0x20` (32). So the flag must be 32 chars. --- ### 2) Extract the reference output A hardcoded 8×`uint32` reference array is stored in `.rodata`: ``` 91715a34 0a95c4dc 4e3fd78a eede0660 a4f674b4 4d572096 6856ba7f 7e39cb45 ``` These are the expected outputs `out[0..7]` (little-endian `uint32`). --- ### 3) What `main` does (high level) `main` splits the 32-byte input into **8 blocks** of **4 bytes**. Each block is packed into a little-endian `uint32`, fed into the Verilator module `Vcipher`, clocked until `done=1`, and the output word is collected. Finally, the 8 output words are compared against the reference array. Therefore, reversing `Vcipher` is enough to recover the flag. --- ### 4) Reverse the Verilator core The main logic lives in `Vcipher___024root___sequent__TOP__1`: - On block start: `x = input ^ key`, counter reset - Loop: - while `counter <= 0x24f8896`, do `rol x, 3`, `counter++` - On finish: - `out = x` - key schedule: `key += 0x17CA85FE` The initial key is set in `main`: - `K0 = 0xE8D2BFCD` So per block: - `K_{i+1} = (K_i + 0x17CA85FE) mod 2^32` --- ### 5) Reduce the huge loop: effective rotation is 5 32-bit rotations are periodic: - repeating `ROL(3)` N times equals `ROL((3*N) mod 32)`. The loop performs `0x24f8896 + 1` iterations due to the `<=` condition, hence: - `3*(0x24f8896 + 1) mod 32 = 5` So the cipher per block is: $$ out = rol_5(input \oplus key) $$ Inverse: $$ input = ror_5(out) \oplus key $$ --- ### 6) Recover the flag from the reference array We already know `out[i]`. For each block: - `in_i = ror_5(out_i) ^ key` - `key = (key + 0x17CA85FE) mod 2^32` Then pack each `in_i` as 4 little-endian bytes to rebuild the 32-character ASCII flag. #### Solver (Python) ```python import struct def ror32(x, r): r %= 32 return ((x >> r) | ((x << (32 - r)) & 0xFFFFFFFF)) & 0xFFFFFFFF outs_bytes = bytes.fromhex( "91715a340a95c4dc4e3fd78aeede0660" "a4f674b44d5720966856ba7f7e39cb45" ) outs = struct.unpack("<8I", outs_bytes) K = 0xE8D2BFCD ADD = 0x17CA85FE plain_words = [] for o in outs: p = ror32(o, 5) ^ K plain_words.append(p) K = (K + ADD) & 0xFFFFFFFF flag = b"".join(struct.pack("<I", w) for w in plain_words) print(flag.decode()) ``` Output: ``` Alpaca{V3r1l0g2Cpp_by_v3r1l4t0r} ``` ## Subway ### TL;DR The binary looks like it performs lots of floating-point operations, but it’s a disguise. It stores bits as **IEEE-754 signed zero**: - `0x00000000` = **+0.0** - `0x80000000` = **-0.0** Using sign-bit flipping and signed-zero behavior, it implements boolean gates and then full 32-bit integer arithmetic. The correct input length is **44 bytes**; a valid input prints **`congratz`**. **Flag / valid input:** ``` Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D} ``` **Check:** ```bash ./subway 'Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D}' # congratz ``` --- ### 1) Recon: the input must be 44 bytes `main` verifies: - `argv[1]` exists - `strlen(argv[1]) == 44` Otherwise it prints `"wrong"`. --- ### 2) Main trick: signed zero as bit encoding The constant `0x80000000` (float `-0.0`) appears frequently. The program encodes bits as: - bit 0 → `-0.0` (`0x80000000`) - bit 1 → `+0.0` (`0x00000000`) So every 32-bit word is represented as an array of 32 signed-zero floats. --- ### 3) “Weird floats” are boolean gates A simple example: - `FUN_001011a9(x) = x ^ 0x80000000` This flips the sign bit (`+0.0 ↔ -0.0`) which behaves like **bitwise NOT** in this encoding. Other helper functions combine signed-zero operations to implement AND/OR/XOR and a full-adder, enabling `ADD`/`SUB` over 32 bits. Bottom line: the float layer is just a disguised bitwise engine. --- ### 4) Data layout: 44 bytes → 11 little-endian u32 The input is split into 11 chunks of 4 bytes, interpreted as little-endian u32: $$ x_i = b_{4i} + (b_{4i+1}<<8) + (b_{4i+2}<<16) + (b_{4i+3}<<24) $$ --- ### 5) Stage 1: per-word affine transform (MUL + XOR) `FUN_001020fe` applies the same transform to each word: - `k = 0x0215abf3` - `c = 64*k - 1 (mod 2^32)` - transform: $$ y_i = (x_i \cdot k)\ \oplus\ c \pmod{2^{32}} $$ Multiplication is implemented via the classic **shift-and-add** method (scan multiplier bits and add shifted multiplicand), visible in `FUN_0010209c`. Since `k` is odd, it has a modular inverse modulo \(2^{32}\), so this stage is invertible. --- ### 6) Stage 2: long mixing function is linear (ADD/SUB only) The huge `FUN_0010167a` consists solely of data-independent sequences of word-level **ADD** and **SUB** operations. Therefore it is a linear map: $$ z = M \cdot y \pmod{2^{32}} $$ You can recover the 11×11 matrix \(M\) by feeding basis vectors \(e_0..e_{10}\) and reading outputs as matrix columns. If invertible: $$ y = M^{-1} \cdot z $$ --- ### 7) Final check: target `z` is hidden in `.rodata` (352 floats) The verifier iterates over **11×32 = 352** entries using a constant float table from `.rodata`. After simplification, it enforces each output bit to match a target pattern encoded in the **sign** of those floats: - weight > 0 → target bit = 1 - weight < 0 → target bit = 0 So we can build `z_target` by extracting those 352 floats and packing their sign bits into 11 u32 words. --- ### 8) Solve without brute force: invert all stages Pipeline: $$ x \xrightarrow{\text{affine}} y \xrightarrow{\text{linear mix}} z \xrightarrow{\text{check}} OK $$ Given `z_target`: 1) Compute: $$ y = M^{-1} \cdot z_{target} \pmod{2^{32}} $$ 2) Invert affine: $$ x_i = ( (y_i \oplus c) \cdot k^{-1}) \bmod 2^{32} $$ 3) Pack `x_i` as 44 bytes LE → valid input. ### Solver (Python) ```python #!/usr/bin/env python3 import re, struct, math, subprocess from pathlib import Path BIN = "./subway" MOD = 2**32 def u32(x): return x & 0xffffffff def inv_mod_2pow(a): a &= 0xffffffff if a % 2 == 0: raise ValueError("pivot even (no inverse mod 2^32)") return pow(a, -1, MOD) def mat_inv_mod(M): n = len(M) A = [[u32(M[r][c]) for c in range(n)] for r in range(n)] I = [[0]*n for _ in range(n)] for i in range(n): I[i][i] = 1 for col in range(n): pivot = None for r in range(col, n): if A[r][col] & 1: pivot = r break if pivot is None: raise ValueError(f"No odd pivot in col {col}") if pivot != col: A[col], A[pivot] = A[pivot], A[col] I[col], I[pivot] = I[pivot], I[col] invp = inv_mod_2pow(A[col][col]) for c in range(n): A[col][c] = u32(A[col][c] * invp) I[col][c] = u32(I[col][c] * invp) for r in range(n): if r == col: continue factor = A[r][col] if factor == 0: continue for c in range(n): A[r][c] = u32(A[r][c] - u32(factor * A[col][c])) I[r][c] = u32(I[r][c] - u32(factor * I[col][c])) return I def mat_vec_mul(A, v): n = len(A) out = [0]*n for i in range(n): s = 0 for j in range(n): s = u32(s + u32(A[i][j] * v[j])) out[i] = s return out # - call 0x10a0 = malloc(0x58) -> base output # - call 0x1430 = SUB (a-b) mod 2^32 # - call 0x14c2 = ADD (a+b) mod 2^32 def parse_instr_lines(): out = subprocess.check_output( ["objdump", "-d", "--start-address=0x167a", "--stop-address=0x2092", BIN], text=True ).splitlines() instrs = [] for ln in out: m = re.match(r"\s*([0-9a-f]+):\s+[0-9a-f ]+\s+([a-z][a-z0-9]+)\s*(.*)$", ln) if m: addr = int(m.group(1), 16) mnem = m.group(2) ops = m.group(3).strip() instrs.append((addr, mnem, ops)) return instrs mem_re = re.compile(r'^(?:(-?0x[0-9a-f]+)|(-?\d+))?\((%[a-z0-9]+)\)$') reg_alias = { 'rax':'rax','eax':'rax','rbx':'rbx','ebx':'rbx','rcx':'rcx','ecx':'rcx','rdx':'rdx','edx':'rdx', 'rsi':'rsi','esi':'rsi','rdi':'rdi','edi':'rdi','rbp':'rbp','ebp':'rbp','rsp':'rsp','esp':'rsp' } def reg_size(name): name = name.strip('%') return 32 if name.endswith('d') or name in ('eax','ebx','ecx','edx','esi','edi','ebp','esp') else 64 def parse_reg(op): op = op.strip() if op.startswith('%'): n = op[1:] return reg_alias[n], reg_size(n) return None def parse_mem(op): op = op.strip() m = mem_re.match(op) if not m: return None dh, dd, base = m.groups() if dh is not None: disp = int(dh, 16) if not dh.startswith('-') else -int(dh[1:], 16) elif dd is not None: disp = int(dd, 10) else: disp = 0 base = reg_alias[base[1:]] return base, disp def parse_imm(op): op = op.strip() if op.startswith('$'): s = op[1:] return int(s, 16) if s.startswith('0x') or s.startswith('-0x') else int(s, 10) return None def split_ops(ops): if not ops: return [] return [o.strip() for o in ops.split(',')] def emu_mix(instrs, input_words): regs = {r:0 for r in reg_alias.values()} mem = {} heap = 0x60000000 base_in = 0x50000000 for i,w in enumerate(input_words): mem[base_in + i*8] = u32(w) regs['rdi'] = base_in # arg1 for addr, mnem, ops in instrs: opsl = split_ops(ops) def read_op(op): imm = parse_imm(op) if imm is not None: return imm & 0xffffffffffffffff m = parse_mem(op) if m is not None: base, disp = m a = (regs[base] + disp) & 0xffffffffffffffff return mem.get(a, 0) r = parse_reg(op) if r is not None: canon, bits = r v = regs[canon] return (v & 0xffffffff) if bits == 32 else (v & 0xffffffffffffffff) raise ValueError(f"read fail: {op} at {addr:x}") def write_op(op, val): val &= 0xffffffffffffffff m = parse_mem(op) if m is not None: base, disp = m a = (regs[base] + disp) & 0xffffffffffffffff mem[a] = val return r = parse_reg(op) if r is not None: canon, bits = r regs[canon] = (val & 0xffffffff) if bits == 32 else val return raise ValueError(f"write fail: {op} at {addr:x}") if mnem == 'mov': src, dst = opsl write_op(dst, read_op(src)) elif mnem in ('add','sub'): src, dst = opsl a = read_op(dst); b = read_op(src) write_op(dst, (a + b) if mnem == 'add' else (a - b)) elif mnem == 'call': target = int(opsl[0].split()[0], 16) if target == 0x10a0: # malloc@plt size = regs['rdi'] & 0xffffffff regs['rax'] = heap # init 0x58 bytes as 0 for off in range(0, size, 8): mem[heap + off] = 0 heap += 0x1000 elif target == 0x1430: # SUB regs['rax'] = u32((regs['rdi'] & 0xffffffff) - (regs['rsi'] & 0xffffffff)) elif target == 0x14c2: # ADD regs['rax'] = u32((regs['rdi'] & 0xffffffff) + (regs['rsi'] & 0xffffffff)) else: raise ValueError(f"unexpected call {target:x} at {addr:x}") elif mnem in ('push','pop','ret'): if mnem == 'ret': break # stack ops ignored (not needed for our dataflow) else: raise ValueError(f"unsupported {mnem} at {addr:x}") base_out = regs['rbx'] out = [u32(mem.get(base_out + i*8, 0)) for i in range(11)] return out def main(): b = Path(BIN).read_bytes() # weights float[352] in rodata: 0x3040..0x35c0 weights = struct.unpack("<352f", b[0x3040:0x35c0]) z = [] for i in range(11): wblk = weights[i*32:(i+1)*32] word = 0 for j, w in enumerate(wblk): word |= ((1 if w > 0 else 0) << j) z.append(u32(word)) instrs = parse_instr_lines() M = [[0]*11 for _ in range(11)] for i in range(11): vec = [0]*11 vec[i] = 1 out = emu_mix(instrs, vec) for r in range(11): M[r][i] = out[r] Minv = mat_inv_mod(M) y = mat_vec_mul(Minv, z) # invert affine: y_i = (x_i * k) XOR c k = 0x0215abf3 c = u32(u32(64*k) - 1) invk = inv_mod_2pow(k) x = [u32(u32((yi ^ c) * invk)) for yi in y] flag = b"".join(struct.pack("<I", w) for w in x) print(flag.decode()) if __name__ == "__main__": main() ``` Result: ``` Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D} ``` --- # Notes Thanks for reading! If you like Reverse Engineering writeups like this, consider following for more CTF notes, tooling tips, and reversing tricks.