[Rev] Oak Official Writeup
=========================
This is the writeup from oak's challenge author.
oak is a rev challenge that ships with two binaries, `client` and `server`, a log file `access.log`, and a script `run.sh`. Inspecting `run.sh` tells us how the programs were run:
1. Start the `server` binary.
2. Execute the `client` binary with `flag.jpg` as the input file.
However, `flag.jpg` is not included in the challenge files. Quickly decompiling `server` (which is small), we can find that `access.log` is generated by `server` while it processes `client` requests. The goal is therefore to recover `flag.jpg` from `access.log`.
Explanation
===========
Server logic
------------
The logic in `server` is quite simple. It receives requests from `client` and responds to them in the following format:
```
Read request:
|op code (1byte) = 0|index (4bytes)|
Read response:
|len (2bytes)|data (len bytes)|
Write request:
|op code (1byte) = 1|index (4bytes)|len (2bytes)|data (len bytes)|
Write response:
(No response)
```
The request-processing logic is also simple. For a write, it stores the data into an array at `index`. For a read, it returns the data from the array at `index`. If there is no data at `index`, the read returns empty data.
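This handling can be modeled in a few lines of Python (a sketch of the behavior described above, not the actual server implementation; a dict stands in for the server's array):

```python
def handle_request(memory, op, index, data=None):
    """Model of the server's request handling. Writes store data at
    index and produce no response; reads of an index that was never
    written return empty data."""
    if op == 1:                      # write request
        memory[index] = data
        return None
    return memory.get(index, b"")    # read request
```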
For each read/write, the `server` also records the operation in `access.log`, in the following format:
```
|op code (1byte)|index (4bytes)|
if op code = 1 (write operation): |len (2bytes)|data (len bytes)|
```
Conceptually, `server` is a RAM that logs every read/write operation.
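Given the record layout above, parsing `access.log` is straightforward. Here is a sketch in Python; the little-endian field encoding is an assumption for illustration, not something confirmed from the binary:

```python
import struct

def parse_access_log(log: bytes):
    """Yield (op, index, data) records from the raw bytes of access.log.
    data is None for read records. Little-endian fields are assumed."""
    off = 0
    while off < len(log):
        op = log[off]
        (index,) = struct.unpack_from("<I", log, off + 1)
        off += 5
        data = None
        if op == 1:  # write record also carries |len (2 bytes)|data|
            (length,) = struct.unpack_from("<H", log, off)
            off += 2
            data = log[off:off + length]
            off += length
        yield op, index, data
```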
Client logic
------------
The major part of this challenge is in `client`.
Inspecting the func `client::main` first, we can see that it reads the input file and processes it byte by byte. Each byte is passed to the func `encoder::Encoder::encode`. The return value of `encode` is effectively discarded: it is written to `/dev/null`.
**Func encoder::Encoder::encode**
A lot happens in `encode`. From the symbol names, we can tell there is a `node` struct (size: 0x40c) and two helper funcs, `load_node` and `create_node`. This is a trie data structure [1]. Each node contains:
1. `address` (4bytes)
2. `symbol` (4bytes)
3. `dict` (pointers to children nodes, 4bytes * 256)
Thoroughly inspecting `encode`, the logic is roughly:
1. Get the `current_node` by checking if the `previous_node` is set.
a. If not, take the root node.
b. If set, load the `previous_node`.
2. If `current_node.dict[input_byte]` is 0 (no child for this byte):
a. Create the new node, `current_node.dict[input_byte] = new node address`.
b. `previous_node = root_node.dict[input_byte]`.
c. Return the `current_node.symbol`.
3. Otherwise, set `previous_node = current_node.dict[input_byte]` and return nothing.
The func `create_node` simply creates an empty node with `symbol = counter++`, where `counter` starts at 0.
This is actually the LZW compression algorithm [3]. The binary `client` is a simple LZW file compressor (though the output is written to `/dev/null`). The trie is the LZW dictionary and grows as compression proceeds.
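The steps above can be modeled in a short Python sketch. Names follow the writeup; note that in the real client the nodes live in remote RAM rather than local memory, and the child table is a 256-pointer array rather than a dict:

```python
class Node:
    def __init__(self, symbol):
        self.symbol = symbol
        self.dict = {}  # byte -> Node; stands in for the 256-pointer array

class Encoder:
    def __init__(self):
        self.counter = 0
        self.root = self.create_node()
        self.previous = None

    def create_node(self):
        node = Node(self.counter)   # symbol = counter++
        self.counter += 1
        return node

    def encode(self, byte):
        # 1. current node is previous_node if set, else the root
        current = self.previous if self.previous is not None else self.root
        child = current.dict.get(byte)
        if child is not None:
            # 3. child exists: keep extending the current match
            self.previous = child
            return None
        # 2. dictionary miss: insert a new node, restart the match,
        #    and emit the symbol of the longest match
        current.dict[byte] = self.create_node()
        self.previous = self.root.dict.get(byte)
        return current.symbol
```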
RAM logic
---------
So the func `encode` runs the LZW algorithm, but the trie (the LZW dictionary) is stored in remote RAM and accessed through `ram::client::RamClient`.
To know how to recover the LZW dictionary from `access.log`, we need to understand how `RamClient` interacts with the `server`.
`RamClient` is a complicated class. It exposes two methods, `ram::client::RamClient::write` and `ram::client::RamClient::read`, both of which call `_ZN6client3ram6client9RamClient7process17h4efae0e402f6c495E.llvm.7732481198818634916` (`ram::client::RamClient::process`).
Fully understanding `process` is not required to solve this challenge. It is an oblivious RAM [4] implementation called Path ORAM; the algorithm details can be found in the original paper [2].
At a high level, for each read/write operation (read/write at `address`), `process` does the following steps:
1. Convert `address` to its `oram_position` with an array.
2. Calculate the `oram_index` corresponding to the `oram_position` at each "height", with the func `client::ram::client::get_idx`. All the `oram_index` values lie on a single root-to-leaf path of a binary tree.

3. Read each "bucket" at those `oram_index` values by querying the `server` (`oram_index` serves as the `index` in the server request), and hold them in a "stash".
4. Decrypt each bucket with a hard-coded key in the `client` binary.
5. Find the target block with `address` in the stash.
a. If it is a write operation, update the block content.
6. Write back some buckets from the stash to `server`. The list of `index` to write to is the same as step 2, from the leaf to the root.
a. Buckets are re-encrypted with the same key; the nonce for each bucket is `|nonce_counter (8bytes)|oram_index (4bytes)|`. The `nonce_counter` starts at 0 and is incremented by 1 on each call to `process`.
b. The written-back blocks are removed from stash.
Notice that the stash is written back in a FIFO manner, and not many blocks stay in the stash at any time (proof in the Path ORAM paper).
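Step 2 amounts to walking a root-to-leaf path in a complete binary tree. A sketch of that index computation, assuming the usual heap-style layout with the root at bucket index 1 (an assumption for illustration; the client's `get_idx` may differ in detail):

```python
def path_indices(position, height):
    """Bucket indices on the path from the root (index 1) down to the
    leaf bucket for a given ORAM position, in a tree of the given
    height. Heap layout: node i has children 2*i and 2*i + 1."""
    leaf = (1 << height) + position   # index of the leaf bucket
    # leaf >> k is the ancestor k levels above the leaf
    return [leaf >> (height - level) for level in range(height + 1)]
```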
Solution
========
1. Decrypt the written buckets in the `access.log`
a. Ignore all read requests.
b. We know that each call to `process` makes the ORAM issue a sequence of write requests (step 6 above), and that each sequence must end with a write to index 1. So we can split the log into per-call write sequences.
c. For each write sequence, the encryption key is known (hard-coded in `client`) and the nonce is as described in step 6 above. The cipher, ChaCha20-Poly1305, and the key can be found in the func `client::ram::client::RamClient::new`.
2. Now we know the content of each written bucket. Each bucket contains 2 blocks and the memory addresses associated with them, in the following format:
```
|block A address (4bytes)|block A data (32bytes)|
|block B address (4bytes)|block B data (32bytes)|
```
3. With steps 1 and 2, we know where LZW wrote the trie nodes and the content of each node. But because of the ORAM's protection, we cannot recover the precise order of the writes. That is, two memory writes issued by `encode` are not guaranteed to appear in `access.log` in order, and may be merged if they target the same address. The precise addresses of memory reads are not recoverable either.
4. The trick lies in the design of LZW. LZW always finds the longest string in the dictionary (trie) that matches the current input, then inserts `the longest match + the next character (the one that broke the match)` back into the dictionary under the next symbol. On the trie, this creates a new child node under the leaf of the longest match, carrying the next symbol. As described above, symbols are assigned incrementally. So we only need to observe the final state of the trie in memory and sort all nodes by their symbols; the exact order in which LZW built the trie (dictionary) is then recovered.
5. To get the final state of the memory, we can simply replay all writes in `access.log`. Although their order is scrambled, the final memory state will be correct (a few blocks might still be in the "stash", but most of the data is up to date). With the trick in step 4, we can recover each longest string LZW matched while compressing `flag.jpg`, and restore the file (essentially by concatenating those longest matched strings).
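The core trick of steps 4 and 5 can be demonstrated with a textbook LZW variant whose dictionary is pre-seeded with all single bytes (a simplification of the client's trie, not its exact logic): even given only the unordered set of dictionary entries, sorting them by symbol recovers the build order, and the input can be reassembled from it plus the final pending match.

```python
def lzw_compress(data: bytes):
    """Textbook LZW over a dict. Returns the created dictionary entries
    as (symbol, string) pairs, plus the final pending match."""
    d = {bytes([i]): i for i in range(256)}
    next_sym, created, s = 256, [], b""
    for b in data:
        c = bytes([b])
        if s + c in d:
            s += c                      # keep extending the match
        else:
            d[s + c] = next_sym         # new entry: longest match + next char
            created.append((next_sym, s + c))
            next_sym += 1
            s = c                       # the next match starts at that char
    return created, s

def reconstruct(created, final_match):
    """Rebuild the input from the dictionary entries alone. Sorting by
    symbol restores creation order; each entry overlaps the next match
    by exactly one character."""
    created = sorted(created)           # symbol order == creation order
    if not created:
        return final_match
    out = created[0][1]
    for _, string in created[1:]:
        out += string[1:]
    return out + final_match[1:]
```

Shuffling or reversing `created` before calling `reconstruct` makes no difference, which is exactly why the scrambled write order in `access.log` does not matter.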
References
==========
1. Trie, Wikipedia [[link](https://en.wikipedia.org/wiki/Trie)]
2. Path ORAM: An Extremely Simple Oblivious RAM Protocol, Journal of the ACM, 2018 [[link](https://dl.acm.org/doi/10.1145/3177872)]
3. LZW algorithm, Wikipedia [[link](https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch)]
4. Oblivious RAM, Wikipedia [[link](https://en.wikipedia.org/wiki/Oblivious_RAM)]