fun-reversing-challenge

# fun-reversing-challenge This is an EVM reverse-engineering challenge. The server will show the flag if `isSolved()` returns true. ```solidity import "Challenge.sol"; interface ChallengeInterface { function solved() external view returns (bool); } contract Setup { ChallengeInterface public challenge; constructor() { challenge = ChallengeInterface(address(new Challenge())); } function isSolved() public view returns (bool) { return challenge.solved(); } } ``` To solve this, we need to analyze the `Challenge` contract and make `challenge.solved()` true. The source code of `Challenge` is not given to participants. Only EVM opcodes could be collected using JSON-RPC API ([eth_getCode](https://docs.alchemy.com/reference/eth-getcode)). ## High-level operations The contract accepts two function selectors: - **0x799320bb** aka `solved()` returns `uint8/bool` value from storage slot 0x0 (`bool solved`), - **0xc64b3bb5** accepts an input in unknown format and sets `solved = 1` if the input meets certain condition. The rest of this article discusses about **0xc64b3bb5**. ## Bottom-up analysis As part of bottom-up analysis, we first collected possible entry offsets of functions by static analysis and decompiled each function. Then we filtered out some known library-generated functions. - Safe mathematics - func_cd3: `+` - func_cb4: `*` - func_ca1: `-` - func_ce6: `++` - Others - func_{b3d,8c5,9d4} calculates something - func_a85 compares the calculation result with constant values - func_888 calls func_a85 and sets `solved = 1` if the returned value is 1 ### func_b3d: modular multiplication over GF(2^8) with modulo 0x11b ```solidity function func_b3d (uint256 a0, uint256 a1, uint256 a2) returns (uint256) { uint256 v0; uint256 v1; uint256 v2; v0 = a1; v1 = a2; v2 = 0x0; do { if (v1 & 0x1) { v2 = v0 ^ v2; } if (v0 & 0x80) { v0 = 0x1b ^ ((uint8)v0 << 1); } else { v0 = (uint8)v0 << 1; } v1 = (uint8)v1 >> 0x1; } while ((uint8)v1); return v2; } ``` This routine seemed like a multiplication over the finite field GF(2^8), which is described in [Wikipedia](https://en.wikipedia.org/wiki/Finite_field_arithmetic#C_programming_example). The whole routine was the same, including the modulo used in the multiplication. ### func_a85: comparison over two 6x6 matrices ```solidity function func_a85 (uint256 a0, uint256 a1, uint256 a2) returns (uint8) { ... v0 = a1; v1 = a2; v2 = 0x0; while (v2 < 0x6) { v3 = 0x0; while (v3 < 0x6) { if (v2 < 0x6) { v4 = MLOAD((v2 << 0x5) + v1); if (v3 < 0x6) { v5 = (uint8)MLOAD((v3 << 0x5) + v4); if (v2 < 0x6) { v6 = MLOAD((v2 << 0x5) + v0); if (v3 < 0x6) { if ((uint8)MLOAD((v3 << 0x5) + v6) != v5) { return 0x0; } v3 = safe_increment(v3); continue; } } } } MSTORE32(0x0, 0xe0 << 0x4e487b71); MSTORE32(0x4, 0x32); revert(0x0, 0x24); } v2 = safe_increment(v2); } return 0x1; } ``` The decompiled code above has weird redundancies; while iterating `v2` over `[0, 6)`, it checks if `v2 < 0x6` before each memory access: `MLOAD`, `MSTORE32`. If the comparison fails, it simply reverts. From this behavior, we can guess them as boundary checks against fixed-size array like `uint8[6][6]`. Then it accesses `uint256` values at the offset `MEM[v2 << 0x5] + (v3 << 5)` from `a1(=v0)` and `(a2=v1)`. There are two solidity-specific design decisions to note (from [Solidity documentation](https://docs.soliditylang.org/en/v0.8.15/internals/layout_in_memory.html)): - Each `uint8` actually occupies 32 bytes in the memory (to avoid additional shift operations?) > Elements in memory arrays in Solidity always occupy multiples of 32 bytes ... - Instead of using `6x6 (36)` elements, it uses the pointers to `uint8[6]` arrays, like double pointers in other languages. > Multi-dimensional memory arrays are pointers to memory arrays. Also after googling `0x4e487b71`, we learned that this `revert()` is [generated by solidity](https://blog.soliditylang.org/2020/10/28/solidity-0.8.x-preview/) for array bound checks. With this in mind, we can strip the code as below, which becomes a matrix comparison: ```solidity function func_a85(uint8[6][6] a1, uint8[6][6] a2) { for(uint v2 = 0; v2 < 6; v2++) { for(uint v3 = 0; v3 < 6; v3++) { if(a1[v2][v3] != a2[v2][v3]) { return 0; } } } return 1; } ``` ### func_8c5, func_9d4: matrix multiplication and addition This is what we got after removing those redundancies from another two functions: ```solidity function func_8c5(a1, a2) { func_b92(); v2 = func_b92(); for(uint v3 = 0; v3 < 6; v3++) { for(uint v4 = 0; v4 < 6; v4++) { for(uint v5 = 0; v5 < 6; v5++) { v9 = func_b3d(a1[v3][v5], a2[v5][v4]); v2[v3][v4] ^= v9 } } } } function func_9d4(a1, a2) { for(uint v2 = 0; v2 < 6; v2++) { for(uint v3 = 0; v3 < 6; v3++) { a1[v2][v3] ^= a2[v2][v3] } } } ``` In finite field arithmetic, xor (`^`) is used for addition, so `func_9d4` is matrix addition and `func_8c5` is multiplication. ## Analyzing the remaining parts Then, from the function trace, we found out that it calls 1. matrix multiplication with M0 2. matrix addition with M1 3. comparison with M2 where M0 ~ M2 is hardcoded matrices. Maybe we can just reverse these steps in 3-2-1 and check the result. Before that, we should dump the matrices. Since the current EVM debuggers does not have interactive memory viewer/dumper, we patched `0x8c5` to log the whole memory into a log, using `LOG0` instruction. For testing we used `forge test -vvvv` inside [Foundry](https://getfoundry.sh/). Code patch: ``` PUSH3 0x100000 ; memory length PUSH1 0x0 ; memory offset LOG0 ; dump memory to log SELFDESRTUCT ; for cleanup => 621000006000A0FF ``` Dumped matrices: ```python M0 = [ [0xbe, 0x9a, 0xc2, 0x24, 0x7f, 0x4d], [0x59, 0xde, 0x3b, 0x61, 0x0a, 0x1a], [0xc8, 0x18, 0x96, 0x0e, 0x94, 0x4d], [0xe3, 0x64, 0x8c, 0x6d, 0x76, 0xfe], [0x16, 0xd1, 0x41, 0x8e, 0x0e, 0x50], [0xe7, 0x42, 0xa4, 0x87, 0x8e, 0x6b], ] M1 = [ [0x23, 0xab, 0x1e, 0x4c, 0xe9, 0x0e], [0xef, 0x53, 0xb4, 0xac, 0x18, 0xb1], [0x3c, 0xc2, 0x2f, 0x34, 0x4a, 0x18], [0x65, 0x94, 0x67, 0xd3, 0x59, 0x29], [0xa0, 0x27, 0x4a, 0x73, 0xcd, 0x88], [0x5e, 0x32, 0x50, 0x20, 0x80, 0x0e], ] M2 = [ [0x62, 0x62, 0x2d, 0x57, 0x82, 0x2a], [0xc8, 0xb8, 0x6b, 0x66, 0x8b, 0x19], [0x49, 0x8b, 0x3a, 0x88, 0xd9, 0x81], [0xdb, 0x15, 0x6b, 0xcc, 0x12, 0xdb], [0x91, 0xc0, 0x11, 0x56, 0xa6, 0xd9], [0x28, 0xf8, 0x2b, 0x47, 0x5d, 0xe2], ] ``` Because the decompiler emitted the top-level function as one large function, we gave it a try without further analysis. Unfortunately, just running the steps in reverse order didn't gave us a printable text or flag (which was technically not required but sus). Since we also need to figure out the input format required by `0xc64b3bb5`, we decided to analyze how the functions above are called exactly using EVM disassembler e.g., [ida-evm](https://github.com/crytic/ida-evm). ![](https://i.imgur.com/PIMV0kx.png) ^ EVM control flow graph generated by IDA Pro and ida-evm Reading the assembly, we found the following input-related checks: - The selector takes single argument in `bytes` format with the length of 42 ``` evm:00000071 JUMPDEST evm:00000072 PUSH1 0x2a evm:00000074 DUP2 evm:00000075 EQ ; func_bdd() == 0x2a ``` - The input is in `PCTF{...}` format ``` ; Checks if the 1st byte is 0x50 (P) evm:000000C2 PUSH1 0x5 evm:000000C4 PUSH1 0xfc evm:000000C6 SHL evm:000000C7 EQ ; 0x5 << 0xfc == input[0] << 0xf8 ; the 2nd byte is 0x43 (C) evm:000000F6 PUSH1 0x43 evm:000000F8 PUSH1 0xf8 evm:000000FA SHL evm:000000FB EQ ; 0x43 << 0xf8 == input[1] << 0xf8 ... ``` Then the character inside `PCTF{...}` is 36 bytes, which fits into 6x6 matrix. Now we'll analyze how each functions are called. ```js evm:0000081C routine: ; evm:0000081C PUSH2 0x838 ; #2 JUMP at 0x837 returns to 0x838 evm:0000081F DUP4 evm:00000820 PUSH2 0x832 ; #1 JUMP at 0x831 returns to 0x832 evm:00000823 DUP7 evm:00000824 DUP6 evm:00000825 PUSH2 0x8c5 ; multiply matrices evm:00000828 SWAP1 ; x, func_8c5 evm:00000829 SWAP2 ; y, func_8c5, x evm:0000082A SWAP1 ; func_8c5, y, x evm:0000082B PUSH4 0xffffffff evm:00000830 AND evm:00000831 JUMP evm:00000832 ; ---------------------------------------------------------------------- evm:00000832 JUMPDEST evm:00000833 SWAP1 evm:00000834 PUSH2 0x9d4 ; add matrices evm:00000837 JUMP evm:00000838 ; ---------------------------------------------------------------------- evm:00000838 JUMPDEST evm:00000839 SWAP4 evm:0000083A POP evm:0000083B PUSH2 0x848 ; #4 JUMP at 0x837 returns to 0x848 evm:0000083E DUP4 evm:0000083F PUSH2 0x832 ; #3 JUMP at 0x847 returns to 0x832 evm:00000842 DUP5 evm:00000843 DUP8 evm:00000844 PUSH2 0x8c5 evm:00000847 JUMP evm:00000848 ; ---------------------------------------------------------------------- ... repeated 6 times evm:00000888 JUMPDEST evm:00000889 SWAP4 evm:0000088A POP evm:0000088B PUSH2 0x894 evm:0000088E DUP5 evm:0000088F DUP3 evm:00000890 PUSH2 0xa85 ; compare matrices evm:00000893 JUMP ``` `0000081C` ~ `00000831` sets up EVM stack like: ``` | func_8c5 (multiply) | arg1 | arg2 | 00000832 | 00000838 | ``` After func_8c5 is finished, it returns to the caller function by executing `JUMP`, which is `00000832`. Then `00000832` sets up EVM stack like below: ``` | func_9d4 (add) | arg1 | arg2 | 00000838 | ``` So we can see that multiplication comes first, followed by addition. Interestingly, `00000838` ~ `00000847` uses `00000832` again after calling the multiplication for the second time. So, `00000832` is used like a reusable piece of code for matrix addition. This pattern is repeated 6 times, then the comparison function at 0xa85 is called. From the above, we now know that the step 1, 2 is repeated 6 times. By doing this in reverse step, we can get the flag. ## Solution ```python M0 = [ [0xbe, 0x9a, 0xc2, 0x24, 0x7f, 0x4d], [0x59, 0xde, 0x3b, 0x61, 0x0a, 0x1a], [0xc8, 0x18, 0x96, 0x0e, 0x94, 0x4d], [0xe3, 0x64, 0x8c, 0x6d, 0x76, 0xfe], [0x16, 0xd1, 0x41, 0x8e, 0x0e, 0x50], [0xe7, 0x42, 0xa4, 0x87, 0x8e, 0x6b], ] M1 = [ [0x23, 0xab, 0x1e, 0x4c, 0xe9, 0x0e], [0xef, 0x53, 0xb4, 0xac, 0x18, 0xb1], [0x3c, 0xc2, 0x2f, 0x34, 0x4a, 0x18], [0x65, 0x94, 0x67, 0xd3, 0x59, 0x29], [0xa0, 0x27, 0x4a, 0x73, 0xcd, 0x88], [0x5e, 0x32, 0x50, 0x20, 0x80, 0x0e], ] M2 = [ [0x62, 0x62, 0x2d, 0x57, 0x82, 0x2a], [0xc8, 0xb8, 0x6b, 0x66, 0x8b, 0x19], [0x49, 0x8b, 0x3a, 0x88, 0xd9, 0x81], [0xdb, 0x15, 0x6b, 0xcc, 0x12, 0xdb], [0x91, 0xc0, 0x11, 0x56, 0xa6, 0xd9], [0x28, 0xf8, 0x2b, 0x47, 0x5d, 0xe2], ] # Setup GF(2^8) finite field with modulus 0x11b (x^8 + x^4 + x^3 + x + 1) F.<x> = GF(2^8, modulus=GF(2^9).fetch_int(0x11b)) M0 = Matrix(F, Matrix(M0).apply_map(F.fetch_int)) M1 = Matrix(F, Matrix(M1).apply_map(F.fetch_int)) M2 = Matrix(F, Matrix(M2).apply_map(F.fetch_int)) def print_flag(flag): flag_int = bytearray(36) for i in range(6): for j in range(6): flag_int[6*i+j] = flag[i, j].integer_representation() print(flag_int) flag = M2 for i in range(6): flag = flag + M0 flag = (M1 ^ -1) * flag print_flag(flag) ``` Flag: **PCTF{this is my flag 1236a9erqA346924ASDA}**