---
tags: decompiler
---

# To do (outdated):
- [ ] Cover samples to extract MWEs **[In Progress]**
- [ ] Changing program flows via uncompyle6 **[In Progress]**
    - [ ] Log various program flows
    - [ ] Create timebomb malware that misleads forensic analyst
- [ ] Use different Python versions on same MWE to generate different bytecode **[In Progress]**
    - [ ] Check variation in bytecode
    - [x] Check if can be decompiled
    - [x] Check whether the remaining non-decompiled code can be handled by a decompiler for another Python version
- [ ] Contextualize instructions around error-prone instructions
    - [ ] Get MWEs w.r.t. errors on same instruction
    - [ ] Compare surrounding instructions for all MWEs
    - [ ] Infer rules for same context
- [ ] Prune out samples to make analysis faster
- [x] Setup improvements **[Done - July 9]**
    - [x] Change OS to Linux
    - [x] Use Docker instead of pyenv
- [ ] Other:
    - [ ] See problematic grammar. One [case](https://hackmd.io/@aliahad97/rkPmyJCpO#Error-115).

# End goal
- Create tool to manipulate bytecode
    - Utilise AST and focus on leaf nodes responsible for errors
    - Backtrack to points to manipulate sub-tree to logically equivalent sub-tree (must be decompilable)
    - Acts as an optimizer for decompilers

# Analysis
Current Stats for Error 1:

| MWEs Extracted | Unique Samples | Samples Covered | Cannot Recreate | Total Samples |
| -------- | -------- | -------- | -------- | -------- |
| 89 | 67 | 71 | 8 | 2190 |

- Reasons for cannot recreate:
    - Uses Python 3.4 and fails only on Python 3.4
    - Uses Python 2.7, but recompiling the code with Python 2.7 passes (e.g. [here](https://hackmd.io/@aliahad97/Hk4eJlCCO#Error-165---Cannot-recreate-Py-27))
        - Note that the magic number in [this example](https://hackmd.io/@aliahad97/BJNrSJ0pd#Error-156---Cannot-Recreate-Py-27) is peculiar and is larger than what I could look up. Need to confirm whether it is really valid.
- Other interesting samples that need to be revisited:
    - [1/38](https://hackmd.io/@aliahad97/r1feMvmRd#Error-138)
    - [1/48](https://hackmd.io/@aliahad97/BJNrSJ0pd#Error-148---Py-34)
- Analysis 1 [July 16] - [link](https://hackmd.io/@aliahad97/BkjdsWkAd)
- Logical breakdowns: [link](https://hackmd.io/@yonghwikwon/BytjRwVAu)

# Changing control flows:
Here is the link to the log of changes in control flow through decompilation: [link](https://hackmd.io/@aliahad97/ry3JqUjad)

# My MWEs
- Link to MWEs extracted by UTD: [here](https://hackmd.io/@aliahad97/Hyu8mVX6O)

## Error Template
Source: [link](https://svn.apache.org/repos/infra/infrastructure/trunk/projects/asfpy/asfpy/ldap.py)
Python version: 3.8
Decompyle3: Failed
Uncompyle6: Failed
Error: `Deparsing stopped due to parse error`

| Py3.8 | Py3.7 | Py3.6 | Py2.7 |
| -------- | -------- | -------- | -------- |
| Pass | Fail | Pass | Pass |

### MWE:
```python=
```
#### Closest Solution:
```python=
```
```c=
```
#### MWE data:
Bytecode for MWE:
```c=
```
Output uncompyle6:
```python=
```
Output Decompyle3:
```python=
```
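
# Magic number check (sketch)
The open question under Analysis about a peculiar magic number could be checked mechanically. Below is a minimal sketch, assuming the standard CPython `.pyc` layout, where the first four bytes are a 2-byte little-endian number followed by `\r\n`. The names `pyc_magic`, `describe`, and `KNOWN_MAGIC` are hypothetical helpers, not part of uncompyle6 or Decompyle3, and the table is partial: it lists only final-release values for the versions mentioned in these notes.

```python
import importlib.util
import struct

# Illustrative, partial table (assumption: final-release magic numbers
# for the CPython versions that appear in these notes).
KNOWN_MAGIC = {
    62211: "2.7",
    3310: "3.4",
    3379: "3.6",
    3394: "3.7",
    3413: "3.8",
}


def pyc_magic(header: bytes) -> int:
    """Return the magic number stored in the first 4 bytes of a .pyc file."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        raise ValueError("not a CPython .pyc header")
    # The magic number is an unsigned 16-bit little-endian integer.
    return struct.unpack("<H", header[:2])[0]


def describe(header: bytes) -> str:
    """Map a .pyc header to a version label, or 'unknown' if not in the table."""
    return KNOWN_MAGIC.get(pyc_magic(header), "unknown")


# Magic number of the interpreter running this script, for comparison.
running_magic = pyc_magic(importlib.util.MAGIC_NUMBER)

# Usage on a real file (path is a placeholder):
# with open("sample.pyc", "rb") as f:
#     print(describe(f.read(4)))
```

A value that comes back as `unknown` is not necessarily invalid, since the table is incomplete; it only flags the sample for a manual lookup against the full list in CPython's source.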