# Alpacahack Round 8 (Rev) - Writeups (all solved)
## Masking Tape
### 1) Challenge overview
The program expects exactly one CLI argument: `./masking-tape <input>`.
It transforms your input **byte-by-byte**, then compares two derived byte arrays against two constants stored in `.rodata`: `enc1` and `enc2`.
If both comparisons match, it prints **`congratz`**.
---
### 2) What happens to each byte?
Inside the loop:
```c
__s[i] = ((int)__s[i] << 3) | (__s[i] >> 5);
```
That is an **8-bit rotate-left by 3** (ROL8). For original byte `x`:
- `r = ROL8(x, 3)`
Then it splits `r` into two complementary “bit groups” using masks:
- `0xCC` = `11001100` (keeps bits 2–3 and 6–7)
- `0x33` = `00110011` (keeps bits 0–1 and 4–5)
The trick: depending on whether `r` is **even or odd**, the program either stores:
- If `r` is **even** (`r & 1 == 0`):
- `enc1[i] = r & 0xCC`
- `enc2[i] = r & 0x33`
- If `r` is **odd** (`r & 1 == 1`), it **swaps** them:
- `enc1[i] = r & 0x33`
- `enc2[i] = r & 0xCC`
So `enc1` and `enc2` together contain all bits of `r`, just shuffled.
---
### 3) Inverting the transformation
Since the masks are complementary, we can merge them back.
Try both possibilities:
- Assume even:
- `r_even = (enc1[i] & 0xCC) | (enc2[i] & 0x33)`
- Assume odd (swapped):
- `r_odd = (enc1[i] & 0x33) | (enc2[i] & 0xCC)`
Then validate using the parity bit and the expected masked outputs:
- if `r_even` is even and produces the same `enc1/enc2` masks → it’s correct,
- otherwise `r_odd` must be correct.
Finally recover the original byte:
- `x = ROR8(r, 3)` (rotate-right 3)
Repeat for all bytes → get the original input string (the flag).
---
### 4) Locating `enc1` and `enc2` in the ELF
You can find symbol addresses using:
```bash
nm -a ./masking-tape | grep -E 'enc1|enc2'
readelf -S ./masking-tape | grep rodata
```
In this binary:
- `.rodata` starts at Vaddr `0x2000` and file offset `0x2000`
- `enc1` is at `0x2020`
- `enc2` is at `0x2040`
Because the `.rodata` vaddr matches its file offset here, you can directly `seek()` to those offsets and read C-strings.
---
### 5) Solver (Python)
```python
#!/usr/bin/env python3
from pathlib import Path
BIN = "./masking-tape"
ENC1_OFF = 0x2020
ENC2_OFF = 0x2040
def read_cstring(buf: bytes, off: int) -> bytes:
end = buf.index(b"\x00", off)
return buf[off:end]
def ror8(x: int, n: int = 3) -> int:
return ((x >> n) | ((x << (8 - n)) & 0xFF)) & 0xFF
def main():
data = Path(BIN).read_bytes()
enc1 = read_cstring(data, ENC1_OFF)
enc2 = read_cstring(data, ENC2_OFF)
assert len(enc1) == len(enc2)
out = []
for a, b in zip(enc1, enc2):
# asumsi r genap
r_even = (a & 0xCC) | (b & 0x33)
ok_even = (r_even & 1) == 0 and a == (r_even & 0xCC) and b == (r_even & 0x33)
if ok_even:
r = r_even
else:
# asumsi r ganjil (swap)
r_odd = (a & 0x33) | (b & 0xCC)
ok_odd = (r_odd & 1) == 1 and a == (r_odd & 0x33) and b == (r_odd & 0xCC)
assert ok_odd
r = r_odd
out.append(ror8(r, 3))
print(bytes(out).decode())
if __name__ == "__main__":
main()
```
---
## Final
**Flag:** `Alpaca{y0u'r3_a_pr0_crack3r}`
## hidden
### 1) Summary
The program validates `argv[1]` by:
- Packing it into **32-bit (4-byte) words**.
- Transforming each word using a **4×32-bit internal state**.
- Comparing the resulting word array against a constant array stored in `.data` via `memcmp`.
- If equal → prints `congratz`.
Because the per-word transform is essentially XOR + bit rotations, it is **reversible**. We can recover the original input by reversing the transform using the known expected output array from `.data`.
---
### 2) What `main()` does
High-level flow:
1. Initializes the state from:
```
"AlpacaHackRound8"
```
(16 bytes → 4 little-endian `uint32` words)
2. Packs input into 32-bit words:
- `n = (len + 3) >> 2` (ceil word count)
- `buf = calloc(n,4)` (zero-padded)
- `memcpy(buf, input, len)`
3. For each word:
- computes `out[i]` from `buf[i]` and the current state
- writes `out[i]` back into the buffer
- updates the state depending on `out[i] & 1`
4. Compares the output buffer to the `.data` constant array.
---
### 3) Expected output in `.data`
For this binary, the relevant `.data` layout is:
- Offset `0x20`: `uint32 n_words`
- Offset `0x40`: `n_words * 4` bytes = expected output words
In this instance, `n_words = 27`.
---
### 4) The core per-word transform
Given input word `x` and state `(s0,s1,s2,s3)`:
1) Compute:
- `c = rol(s0, 5) + ror(s1, 3)`
- `d = ror(s2, 3) - rol(s3, 5)`
(all modulo 2³²)
2) Output:
- `out = x ^ c ^ d`
3) State update based on `out` LSB:
- If `out & 1`:
```
s0 ^= rol(d,11)
s1 ^= rol(d,13)
s2 ^= ror(c,15)
s3 ^= ror(c,13)
```
- Else:
```
s0 ^= ror(d,13)
s1 ^= ror(d,15)
s2 ^= rol(c,13)
s3 ^= rol(c,11)
```
---
### 5) Reversing the transform
Since:
- `out = x ^ c ^ d`
We invert with:
- `x = out ^ c ^ d`
The state update depends on `out`, but we know `out` from the constant target array, so we can replay state transitions exactly.
---
### 6) Solver (Python)
```python
#!/usr/bin/env python3
import struct
MASK = 0xFFFFFFFF
def rol(x, n):
x &= MASK
return ((x << n) | (x >> (32 - n))) & MASK
def ror(x, n):
x &= MASK
return ((x >> n) | (x << (32 - n))) & MASK
def get_section_bytes(path: str, sec_name: str) -> bytes:
b = open(path, "rb").read()
if b[:4] != b"\x7fELF" or b[4] != 2 or b[5] != 1:
raise SystemExit("Not an ELF64 little-endian")
e_shoff = struct.unpack_from("<Q", b, 0x28)[0]
e_shentsize = struct.unpack_from("<H", b, 0x3A)[0]
e_shnum = struct.unpack_from("<H", b, 0x3C)[0]
e_shstrndx = struct.unpack_from("<H", b, 0x3E)[0]
def shdr(i):
off = e_shoff + i * e_shentsize
sh_name = struct.unpack_from("<I", b, off + 0x00)[0]
sh_off = struct.unpack_from("<Q", b, off + 0x18)[0]
sh_size = struct.unpack_from("<Q", b, off + 0x20)[0]
return sh_name, sh_off, sh_size
# section-header string table
_, shstr_off, shstr_size = shdr(e_shstrndx)
shstr = b[shstr_off:shstr_off + shstr_size]
def name_at(off):
end = shstr.find(b"\x00", off)
return shstr[off:end].decode("ascii", errors="ignore")
for i in range(e_shnum):
n_off, off, size = shdr(i)
if name_at(n_off) == sec_name:
return b[off:off + size]
raise SystemExit(f"Section {sec_name} not found")
def main():
binpath = "hidden"
data = get_section_bytes(binpath, ".data")
# Known layout for this challenge binary:
# 0x20: uint32 n_words
# 0x40: n_words * 4 bytes (expected output words)
n_words = struct.unpack_from("<I", data, 0x20)[0]
expected = list(struct.unpack_from("<{}I".format(n_words), data, 0x40))
# initial state from "AlpacaHackRound8" (16 bytes -> 4 words LE)
state = list(struct.unpack("<4I", b"AlpacaHackRound8"))
recovered = []
for outw in expected:
c = (rol(state[0], 5) + ror(state[1], 3)) & MASK
d = (ror(state[2], 3) - rol(state[3], 5)) & MASK
inw = (outw ^ c ^ d) & MASK
recovered.append(inw)
# update state depends on outw's LSB
if outw & 1:
state[0] ^= rol(d, 11)
state[1] ^= rol(d, 13)
state[2] ^= ror(c, 15)
state[3] ^= ror(c, 13)
else:
state[0] ^= ror(d, 13)
state[1] ^= ror(d, 15)
state[2] ^= rol(c, 13)
state[3] ^= rol(c, 11)
state = [x & MASK for x in state]
raw = struct.pack("<{}I".format(n_words), *recovered)
flag = raw.rstrip(b"\x00").decode("ascii", errors="replace")
print(flag)
if __name__ == "__main__":
main()
```
---
### 7) Result
Running the solver prints the recovered input:
```
Alpaca{th15_f145_1s_3xc3ssiv3ly_l3ngthy_but_th1s_1s_t0_3nsur3_th4t_1t_c4nn0t_b3_e4s1ly_s01v3d_us1ng_angr}
```
## vcipher
### Challenge summary
The `vcipher` ELF binary:
- asks for a flag,
- enforces an **exact 32-character** length,
- runs the flag through a cipher,
- and compares the produced output against a **hardcoded reference array**.
The hint *“Verilog can also be converted to C++”* points to **Verilator**: the binary contains `Verilated...` symbols and a `Vcipher` class typical of Verilog-to-C++ simulation.
---
### 1) Input validation: 32 characters
The `.rodata` contains:
- `Error: Flag must be exactly 32 characters.`
`main` checks the input length against `0x20` (32). So the flag must be 32 chars.
---
### 2) Extract the reference output
A hardcoded 8×`uint32` reference array is stored in `.rodata`:
```
91715a34 0a95c4dc 4e3fd78a eede0660
a4f674b4 4d572096 6856ba7f 7e39cb45
```
These are the expected outputs `out[0..7]` (little-endian `uint32`).
---
### 3) What `main` does (high level)
`main` splits the 32-byte input into **8 blocks** of **4 bytes**.
Each block is packed into a little-endian `uint32`, fed into the Verilator module `Vcipher`, clocked until `done=1`, and the output word is collected.
Finally, the 8 output words are compared against the reference array. Therefore, reversing `Vcipher` is enough to recover the flag.
---
### 4) Reverse the Verilator core
The main logic lives in `Vcipher___024root___sequent__TOP__1`:
- On block start: `x = input ^ key`, counter reset
- Loop:
- while `counter <= 0x24f8896`, do `rol x, 3`, `counter++`
- On finish:
- `out = x`
- key schedule: `key += 0x17CA85FE`
The initial key is set in `main`:
- `K0 = 0xE8D2BFCD`
So per block:
- `K_{i+1} = (K_i + 0x17CA85FE) mod 2^32`
---
### 5) Reduce the huge loop: effective rotation is 5
32-bit rotations are periodic:
- repeating `ROL(3)` N times equals `ROL((3*N) mod 32)`.
The loop performs `0x24f8896 + 1` iterations due to the `<=` condition, hence:
- `3*(0x24f8896 + 1) mod 32 = 5`
So the cipher per block is:
$$
out = rol_5(input \oplus key)
$$
Inverse:
$$
input = ror_5(out) \oplus key
$$
---
### 6) Recover the flag from the reference array
We already know `out[i]`. For each block:
- `in_i = ror_5(out_i) ^ key`
- `key = (key + 0x17CA85FE) mod 2^32`
Then pack each `in_i` as 4 little-endian bytes to rebuild the 32-character ASCII flag.
#### Solver (Python)
```python
import struct
def ror32(x, r):
r %= 32
return ((x >> r) | ((x << (32 - r)) & 0xFFFFFFFF)) & 0xFFFFFFFF
outs_bytes = bytes.fromhex(
"91715a340a95c4dc4e3fd78aeede0660"
"a4f674b44d5720966856ba7f7e39cb45"
)
outs = struct.unpack("<8I", outs_bytes)
K = 0xE8D2BFCD
ADD = 0x17CA85FE
plain_words = []
for o in outs:
p = ror32(o, 5) ^ K
plain_words.append(p)
K = (K + ADD) & 0xFFFFFFFF
flag = b"".join(struct.pack("<I", w) for w in plain_words)
print(flag.decode())
```
Output:
```
Alpaca{V3r1l0g2Cpp_by_v3r1l4t0r}
```
## Subway
### TL;DR
The binary looks like it performs lots of floating-point operations, but it’s a disguise. It stores bits as **IEEE-754 signed zero**:
- `0x00000000` = **+0.0**
- `0x80000000` = **-0.0**
Using sign-bit flipping and signed-zero behavior, it implements boolean gates and then full 32-bit integer arithmetic.
The correct input length is **44 bytes**; a valid input prints **`congratz`**.
**Flag / valid input:**
```
Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D}
```
**Check:**
```bash
./subway 'Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D}'
# congratz
```
---
### 1) Recon: the input must be 44 bytes
`main` verifies:
- `argv[1]` exists
- `strlen(argv[1]) == 44`
Otherwise it prints `"wrong"`.
---
### 2) Main trick: signed zero as bit encoding
The constant `0x80000000` (float `-0.0`) appears frequently. The program encodes bits as:
- bit 0 → `-0.0` (`0x80000000`)
- bit 1 → `+0.0` (`0x00000000`)
So every 32-bit word is represented as an array of 32 signed-zero floats.
---
### 3) “Weird floats” are boolean gates
A simple example:
- `FUN_001011a9(x) = x ^ 0x80000000`
This flips the sign bit (`+0.0 ↔ -0.0`) which behaves like **bitwise NOT** in this encoding.
Other helper functions combine signed-zero operations to implement AND/OR/XOR and a full-adder, enabling `ADD`/`SUB` over 32 bits.
Bottom line: the float layer is just a disguised bitwise engine.
---
### 4) Data layout: 44 bytes → 11 little-endian u32
The input is split into 11 chunks of 4 bytes, interpreted as little-endian u32:
$$
x_i = b_{4i} + (b_{4i+1}<<8) + (b_{4i+2}<<16) + (b_{4i+3}<<24)
$$
---
### 5) Stage 1: per-word affine transform (MUL + XOR)
`FUN_001020fe` applies the same transform to each word:
- `k = 0x0215abf3`
- `c = 64*k - 1 (mod 2^32)`
- transform:
$$
y_i = (x_i \cdot k)\ \oplus\ c \pmod{2^{32}}
$$
Multiplication is implemented via the classic **shift-and-add** method (scan multiplier bits and add shifted multiplicand), visible in `FUN_0010209c`.
Since `k` is odd, it has a modular inverse modulo \(2^{32}\), so this stage is invertible.
---
### 6) Stage 2: long mixing function is linear (ADD/SUB only)
The huge `FUN_0010167a` consists solely of data-independent sequences of word-level **ADD** and **SUB** operations. Therefore it is a linear map:
$$
z = M \cdot y \pmod{2^{32}}
$$
You can recover the 11×11 matrix \(M\) by feeding basis vectors \(e_0..e_{10}\) and reading outputs as matrix columns. If invertible:
$$
y = M^{-1} \cdot z
$$
---
### 7) Final check: target `z` is hidden in `.rodata` (352 floats)
The verifier iterates over **11×32 = 352** entries using a constant float table from `.rodata`. After simplification, it enforces each output bit to match a target pattern encoded in the **sign** of those floats:
- weight > 0 → target bit = 1
- weight < 0 → target bit = 0
So we can build `z_target` by extracting those 352 floats and packing their sign bits into 11 u32 words.
---
### 8) Solve without brute force: invert all stages
Pipeline:
$$
x \xrightarrow{\text{affine}} y \xrightarrow{\text{linear mix}} z \xrightarrow{\text{check}} OK
$$
Given `z_target`:
1) Compute:
$$
y = M^{-1} \cdot z_{target} \pmod{2^{32}}
$$
2) Invert affine:
$$
x_i = ( (y_i \oplus c) \cdot k^{-1}) \bmod 2^{32}
$$
3) Pack `x_i` as 44 bytes LE → valid input.
### Solver (Python)
```python
#!/usr/bin/env python3
import re, struct, math, subprocess
from pathlib import Path
BIN = "./subway"
MOD = 2**32
def u32(x): return x & 0xffffffff
def inv_mod_2pow(a):
a &= 0xffffffff
if a % 2 == 0:
raise ValueError("pivot even (no inverse mod 2^32)")
return pow(a, -1, MOD)
def mat_inv_mod(M):
n = len(M)
A = [[u32(M[r][c]) for c in range(n)] for r in range(n)]
I = [[0]*n for _ in range(n)]
for i in range(n): I[i][i] = 1
for col in range(n):
pivot = None
for r in range(col, n):
if A[r][col] & 1:
pivot = r
break
if pivot is None:
raise ValueError(f"No odd pivot in col {col}")
if pivot != col:
A[col], A[pivot] = A[pivot], A[col]
I[col], I[pivot] = I[pivot], I[col]
invp = inv_mod_2pow(A[col][col])
for c in range(n):
A[col][c] = u32(A[col][c] * invp)
I[col][c] = u32(I[col][c] * invp)
for r in range(n):
if r == col: continue
factor = A[r][col]
if factor == 0: continue
for c in range(n):
A[r][c] = u32(A[r][c] - u32(factor * A[col][c]))
I[r][c] = u32(I[r][c] - u32(factor * I[col][c]))
return I
def mat_vec_mul(A, v):
n = len(A)
out = [0]*n
for i in range(n):
s = 0
for j in range(n):
s = u32(s + u32(A[i][j] * v[j]))
out[i] = s
return out
# - call 0x10a0 = malloc(0x58) -> base output
# - call 0x1430 = SUB (a-b) mod 2^32
# - call 0x14c2 = ADD (a+b) mod 2^32
def parse_instr_lines():
out = subprocess.check_output(
["objdump", "-d", "--start-address=0x167a", "--stop-address=0x2092", BIN],
text=True
).splitlines()
instrs = []
for ln in out:
m = re.match(r"\s*([0-9a-f]+):\s+[0-9a-f ]+\s+([a-z][a-z0-9]+)\s*(.*)$", ln)
if m:
addr = int(m.group(1), 16)
mnem = m.group(2)
ops = m.group(3).strip()
instrs.append((addr, mnem, ops))
return instrs
mem_re = re.compile(r'^(?:(-?0x[0-9a-f]+)|(-?\d+))?\((%[a-z0-9]+)\)$')
reg_alias = {
'rax':'rax','eax':'rax','rbx':'rbx','ebx':'rbx','rcx':'rcx','ecx':'rcx','rdx':'rdx','edx':'rdx',
'rsi':'rsi','esi':'rsi','rdi':'rdi','edi':'rdi','rbp':'rbp','ebp':'rbp','rsp':'rsp','esp':'rsp'
}
def reg_size(name):
name = name.strip('%')
return 32 if name.endswith('d') or name in ('eax','ebx','ecx','edx','esi','edi','ebp','esp') else 64
def parse_reg(op):
op = op.strip()
if op.startswith('%'):
n = op[1:]
return reg_alias[n], reg_size(n)
return None
def parse_mem(op):
op = op.strip()
m = mem_re.match(op)
if not m: return None
dh, dd, base = m.groups()
if dh is not None:
disp = int(dh, 16) if not dh.startswith('-') else -int(dh[1:], 16)
elif dd is not None:
disp = int(dd, 10)
else:
disp = 0
base = reg_alias[base[1:]]
return base, disp
def parse_imm(op):
op = op.strip()
if op.startswith('$'):
s = op[1:]
return int(s, 16) if s.startswith('0x') or s.startswith('-0x') else int(s, 10)
return None
def split_ops(ops):
if not ops: return []
return [o.strip() for o in ops.split(',')]
def emu_mix(instrs, input_words):
regs = {r:0 for r in reg_alias.values()}
mem = {}
heap = 0x60000000
base_in = 0x50000000
for i,w in enumerate(input_words):
mem[base_in + i*8] = u32(w)
regs['rdi'] = base_in # arg1
for addr, mnem, ops in instrs:
opsl = split_ops(ops)
def read_op(op):
imm = parse_imm(op)
if imm is not None:
return imm & 0xffffffffffffffff
m = parse_mem(op)
if m is not None:
base, disp = m
a = (regs[base] + disp) & 0xffffffffffffffff
return mem.get(a, 0)
r = parse_reg(op)
if r is not None:
canon, bits = r
v = regs[canon]
return (v & 0xffffffff) if bits == 32 else (v & 0xffffffffffffffff)
raise ValueError(f"read fail: {op} at {addr:x}")
def write_op(op, val):
val &= 0xffffffffffffffff
m = parse_mem(op)
if m is not None:
base, disp = m
a = (regs[base] + disp) & 0xffffffffffffffff
mem[a] = val
return
r = parse_reg(op)
if r is not None:
canon, bits = r
regs[canon] = (val & 0xffffffff) if bits == 32 else val
return
raise ValueError(f"write fail: {op} at {addr:x}")
if mnem == 'mov':
src, dst = opsl
write_op(dst, read_op(src))
elif mnem in ('add','sub'):
src, dst = opsl
a = read_op(dst); b = read_op(src)
write_op(dst, (a + b) if mnem == 'add' else (a - b))
elif mnem == 'call':
target = int(opsl[0].split()[0], 16)
if target == 0x10a0: # malloc@plt
size = regs['rdi'] & 0xffffffff
regs['rax'] = heap
# init 0x58 bytes as 0
for off in range(0, size, 8):
mem[heap + off] = 0
heap += 0x1000
elif target == 0x1430: # SUB
regs['rax'] = u32((regs['rdi'] & 0xffffffff) - (regs['rsi'] & 0xffffffff))
elif target == 0x14c2: # ADD
regs['rax'] = u32((regs['rdi'] & 0xffffffff) + (regs['rsi'] & 0xffffffff))
else:
raise ValueError(f"unexpected call {target:x} at {addr:x}")
elif mnem in ('push','pop','ret'):
if mnem == 'ret':
break
# stack ops ignored (not needed for our dataflow)
else:
raise ValueError(f"unsupported {mnem} at {addr:x}")
base_out = regs['rbx']
out = [u32(mem.get(base_out + i*8, 0)) for i in range(11)]
return out
def main():
b = Path(BIN).read_bytes()
# weights float[352] in rodata: 0x3040..0x35c0
weights = struct.unpack("<352f", b[0x3040:0x35c0])
z = []
for i in range(11):
wblk = weights[i*32:(i+1)*32]
word = 0
for j, w in enumerate(wblk):
word |= ((1 if w > 0 else 0) << j)
z.append(u32(word))
instrs = parse_instr_lines()
M = [[0]*11 for _ in range(11)]
for i in range(11):
vec = [0]*11
vec[i] = 1
out = emu_mix(instrs, vec)
for r in range(11):
M[r][i] = out[r]
Minv = mat_inv_mod(M)
y = mat_vec_mul(Minv, z)
# invert affine: y_i = (x_i * k) XOR c
k = 0x0215abf3
c = u32(u32(64*k) - 1)
invk = inv_mod_2pow(k)
x = [u32(u32((yi ^ c) * invk)) for yi in y]
flag = b"".join(struct.pack("<I", w) for w in x)
print(flag.decode())
if __name__ == "__main__":
main()
```
Result:
```
Alpaca{l0gic4l_0per4ti0ns_by_subtr4ction_;D}
```
---
# Notes
Thanks for reading! If you like Reverse Engineering writeups like this, consider following for more CTF notes, tooling tips, and reversing tricks.