About uGoku - HackMD

## Introduction [uGoku](https://dreamhack.io/wargame/challenges/2577) is a reversing challenge that I created. (Give it a try!) I designed this problem out of a personal interest in [obfuscation tools](https://github.com/xoreaxeaxeax/movfuscator), so it didn't actually require a significant amount of effort to create. Therefore, I would like to briefly analyze the obfuscator used in this challenge. ## Movfuscator ![{3A414A63-71F8-4934-9329-897D1FA4D9F5}](https://hackmd.io/_uploads/HJZ-I7BPbe.png) The visual contrast in the Control Flow Graphs above—complex branching versus a single straight line—is not merely aesthetic it represents a fundamental shift in how the CPU processes logic. To understand the `uGoku` challenge, we must analyze how the Movfuscator simulates a CPU's execution unit entirely through memory transport. ### Simulation of the ALU via Lookup Tables In a standard compiled binary, an operation like addition or exclusive-or is handled directly by the CPU's Arithmetic Logic Unit. The Movfuscator, however, is restricted from using these instructions. To bypass this, it implements a software-based ALU using massive lookup tables stored in the data section. Instead of asking the CPU to compute `A + B`, the compiler treats A and B as coordinates within a pre-calculated memory array. The result is not computed at runtime but is simply fetched from a specific memory address, effectively turning calculation into a data retrieval task. ### Flattening Control Flow with Predicated Execution The most significant hurdle for reverse engineers is the complete elimination of branching instructions like `JMP` or `JNE`. Standard programs rely on these to skip code based on conditions (e.g., if-else statements). The Movfuscator adopts a strategy known as predicated execution, where the program executes the code paths for both the True and False conditions linearly. This is why the graph on the right appears as a single, unbroken block. The program runs every instruction from top to bottom, but it uses a pointer-based selector to determine which result is kept. If a condition is false, the results of that code block are silently written to a dummy memory location and discarded, while the valid results are written to the actual variables. ### Conversion of Logic to Data Dependencies Ultimately, the Movfuscator converts Control Flow dependencies into Data dependencies. Boolean logic is no longer represented by CPU flags but by memory indices. This effectively blinds standard static analysis tools like IDA Pro or Ghidra, which rely on identifying branching patterns to reconstruct functions and loops. For the analyst, this means the challenge shifts from tracing code execution to understanding complex memory access patterns. ## Demovfuscator <center> <img src="https://hackmd.io/_uploads/BJ4ACmHDbx.png" width="30%"> <p><em>Example of CFG recovered by Demovfuscator (<a href="https://blog.ammaraskar.com/swampctf-future/">Ammar Askar</a>)</em></p> </center> If Movfuscator is the spear that destroys control flow, Demovfuscator is the shield designed to reconstruct it. Since the Movfuscator transformation is deterministic—meaning it follows a specific set of rules to convert logic into data movement—it is theoretically possible to reverse this process. This is where Demovfuscator comes in. ### The Principle of Reconstruction Demovfuscator operates on the premise that the flattened control flow still contains the original logic, just hidden within the data. Its primary goal is to identify the dispatcher—the mechanism the Movfuscator uses to decide which block of code effectively runs and which does not. To achieve this, Demovfuscator typically utilizes Symbolic Execution. Instead of running the program with real data, it runs it with symbolic variables. By tracking how these symbols move through the infinite stream of `MOV` instructions, the tool attempts to pinpoint the moments where a decision is made (e.g., If variable X is 1, write to memory A if 0, write to memory B). ### Patching the Binary Once these decision points are identified, Demovfuscator attempts to patch the binary. It replaces the complex MOV-based selection logic with standard x86 branching instructions (`JMP`, `JE`, `JNE`). - Goal: To regenerate a Control Flow Graph that a human can read. - Result: Ideally, the massive linear block is broken back down into functions and loops. ## Limitations and Reality However, as experienced in this challenge, Demovfuscator is not a magic bullet. It often fails to produce a clean, readable CFG for several reasons: 1. **Complex Arithmetic Dependencies**: If the lookup tables or the arithmetic used to calculate the jump targets are sufficiently complex, the symbolic execution engine may time out or fail to resolve the path. 2. **Partial Reconstruction**: Often, it can only recover parts of the graph, leaving large chunks of code as unintelligible islands of linear instructions. 3. **Heuristic Failures**: Since it relies on pattern matching specific behaviors of the Movfuscator, custom tweaks or specific compiler versions can break the tool's assumptions. In many cases, including this challenge, the output of Demovfuscator might still be too messy to analyze directly, forcing the reverse engineer to fall back on dynamic analysis techniques like execution tracing. ## Conclusion Reverse engineering is the art of perseverance if you endure the struggle, the answer will eventually reveal itself. While obfuscators are designed to provoke frustration and rage, the sheer euphoria of dismantling them is what ultimately drives us. This concludes my brief analysis of the obfuscator used in this challenge. ## Related Talk [DEF CON 23 - Chris Domas - Repsych: Psychological Warfare in Reverse Engineering](https://www.youtube.com/watch?v=HlUe0TUHOIc)