---
tags: comp-decomp
title: Overview
---
# Impact of slicing code objects and decompiling
We slice a code object into permutations, i.e. we generate every possible sub-code-object of a given code object. Each sub-code-object is then fed to the decompiler.

- Take a single code object
- Create permutations of the code object
- Run the decompiler on each chunk of the code object
- Collect all string code outputs
- Split each output by line (`'\n'` character)
- Create a set of all the lines collected

All slicing experiments are continued [**here**](https://hackmd.io/leaart_aT3qfitcuvvyt5A?view#Overview).
:::info
Note that the bigger the code object, the larger the set of permutations. This needs optimizing so that garbage code is pruned out.
Example: if a code object starts with `STORE_FAST`, it never decompiles, so all such code objects can be pruned.
:::
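The steps listed above can be sketched roughly as follows. This is only a sketch under assumptions: "permutations" is read here as every contiguous, instruction-aligned slice of the raw bytecode, each instruction is assumed to occupy 2 bytes, and `decompile` is a hypothetical callable standing in for whichever decompiler is used (it is assumed to return `None` when decompilation fails).

```python
from typing import Callable, Iterator, Optional, Set


def instruction_slices(code: bytes, width: int = 2) -> Iterator[bytes]:
    """Yield every contiguous, instruction-aligned slice of `code`.

    Assumption: each instruction occupies `width` bytes (opcode, argument).
    """
    n = len(code) // width
    for start in range(n):
        for end in range(start + 1, n + 1):
            yield code[start * width:end * width]


def collect_lines(
    code: bytes,
    decompile: Callable[[bytes], Optional[str]],  # hypothetical decompiler hook
) -> Set[str]:
    """Decompile every slice and gather the distinct source lines produced."""
    lines: Set[str] = set()
    for chunk in instruction_slices(code):
        source = decompile(chunk)
        if source is None:  # this slice failed to decompile
            continue
        lines.update(source.split("\n"))
    return lines
```

Under this reading, a code object with `n` instructions yields `n * (n + 1) / 2` slices, which is why pruning matters as the code grows.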
## Calculating similarity between code objects
The similarity between code objects is achieved by applying subsequence matching (longest common subsequence) on the opcodes of the two code objects.
- Each matching opcode is scored 1
- Matches must preserve the relative order of the opcodes

Recursive solution for score:
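```python
def score(c1, c2):
    # c1, c2: raw bytecode. The [2:] step assumes 2-byte instructions
    # (opcode, argument), so index 0 is always an opcode.
    if len(c1) == 0 or len(c2) == 0:
        return 0
    if c1[0] == c2[0]:
        # Opcodes match: count the match, or skip an instruction on either side.
        return max(
            1 + score(c1[2:], c2[2:]),
            score(c1[2:], c2),
            score(c1, c2[2:]),
        )
    else:
        # No match: skip one instruction on either side.
        return max(
            score(c1[2:], c2),
            score(c1, c2[2:]),
        )
```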
Final result: `score(c1, c2) / (max(len(c1), len(c2)) / 2)`, i.e. the number of matching opcodes divided by the larger of the two code objects' opcode counts.
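The plain recursion takes exponential time on longer code objects. A possible alternative is a bottom-up dynamic-programming version, sketched below. It is not necessarily what the experiments use; the names `lcs_score` and `similarity` are made up for illustration, and the code assumes 2-byte instructions so that every even byte offset holds an opcode.

```python
def lcs_score(ops1, ops2) -> int:
    """Length of the longest common subsequence of two opcode sequences."""
    prev = [0] * (len(ops2) + 1)
    for a in ops1:
        curr = [0]
        for j, b in enumerate(ops2, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)  # extend the match
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip on one side
        prev = curr
    return prev[-1]


def similarity(c1: bytes, c2: bytes) -> float:
    """Matching opcodes divided by the larger opcode count."""
    ops1, ops2 = c1[::2], c2[::2]  # keep opcodes, drop argument bytes
    longest = max(len(ops1), len(ops2))
    if longest == 0:
        return 0.0  # two empty code strings: defined as 0 here (an assumption)
    return lcs_score(ops1, ops2) / longest
```

For example, opcode sequences `[1, 2, 3]` and `[1, 3]` share a common subsequence of length 2, so the similarity is 2 / 3.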
:::info
Note that we take the larger of the two opcode counts to avoid false alarms, i.e. cases where one code object is a subset of the other.
Example [here](https://hackmd.io/leaart_aT3qfitcuvvyt5A?view#Example-4). In the example, an additional return instruction is added in the decompiled code while the logic changes. Handling this would either require CFG analysis or could trivially be fixed by checking for subsets.

(Left: Decompiled code. Right: Original code.)
:::
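One way the subset check mentioned in the note could look is sketched below. It interprets "subset" as "the opcodes of one code object appear, in order, inside the other", which is an assumption; the function name is hypothetical, and 2-byte instructions are assumed as before.

```python
def is_opcode_subsequence(small: bytes, big: bytes) -> bool:
    """True if the opcodes of `small` occur in order within those of `big`."""
    remaining = iter(big[::2])  # opcodes of the larger code object
    # `op in remaining` consumes the iterator up to the first hit,
    # so later opcodes can only match further along in `big`.
    return all(op in remaining for op in small[::2])
```

A pair for which this returns `True` while the similarity score is below 1 would be a candidate for the "extra instruction" case rather than a real mismatch.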
## Optimization:
The following instructions cause an infinite loop when they appear at the start of a slice, so such slices can be pruned:
- `STORE_FAST`
- `JUMP_ABSOLUTE`
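A minimal sketch of such a pruning filter, using the standard `dis.opmap` table to translate opcode names into numbers. The helper name `should_prune` is made up, and the check assumes the first byte of a slice is its first opcode. Opcode names vary between Python versions (`JUMP_ABSOLUTE`, for instance, is not present on newer interpreters), so names missing from `dis.opmap` are skipped.

```python
import dis

# Opcode names that should never start a slice (taken from the list above).
PRUNE_AT_START = ("STORE_FAST", "JUMP_ABSOLUTE")

# Resolve names to opcode numbers for the running interpreter,
# ignoring names that this Python version does not define.
PRUNE_OPCODES = {dis.opmap[name] for name in PRUNE_AT_START if name in dis.opmap}


def should_prune(chunk: bytes) -> bool:
    """True if the slice starts with an opcode known to hang decompilation."""
    return len(chunk) > 0 and chunk[0] in PRUNE_OPCODES
```

Since the lookup runs against the interpreter executing the script, it only makes sense when that interpreter matches the Python version that produced the bytecode.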
## New instructions discovered?
- Aggregated `if` conditions in loops are broken down, as shown [**here**](https://hackmd.io/leaart_aT3qfitcuvvyt5A?view#Example-3)
## Failed decompilations: can they be fixed?
# Lit-review:
- [Dr. Eric Schulte](https://eschulte.github.io/):
- [Evolving exact binaries](https://eschulte.github.io/data/bed.pdf)
- [AST manipulation](https://grammatech.github.io/prj/sel/)