# Reversing Binary Obfuscation
### Sudhakar Verma
---
## whoami
+ Engineer(3) at CrowdStrike
+ Occasional (Reverse)Engineer
+ IIT Indore CSE 2017
+ CTFs
---
## Why this topic
+ Compiler Course
+ Malware Analysis
+ Software Protection
---
## Binaries
+ PE(EXE/DLL/SYS) in Windows
+ ELF in Unix-like systems
+ Mach-O in macOS/iOS
---
### Only C/C++/ASM examples
---
## Reverse Engineering
Wikipedia
>is a process or method through the application of which one attempts to understand through deductive reasoning how a device, process, system, or piece of software accomplishes a task with very little (if any) insight into exactly how it does so.
---
## Reverse Engineering
In this talk's context, RE means deducing the functionality of a binary, or recovering something close to its high-level code, from machine or assembly code.
---
### Uses (White/Gray)
+ Understanding Closed Source Systems
+ Malware Analysis
+ OS internals
+ Vulnerability Research
---
### Uses (Black)
+ Game Hacking - Cheats, Hacks etc.
+ Cracking Protections
+ Vulnerability Research
---
## Simple Terms
---
### asm
> is any low-level programming language in which there is a very strong correspondence between the instructions in the language and the architecture's machine code instructions. Because assembly depends on the machine code instructions, every assembly language is designed for exactly one specific computer architecture
This talk uses x86/x86-64; instructions are explained as they come up.
---
### x86
See https://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture
+ Registers
    + GPRs: (e)a/b/c/d(x), (e)si/di, (e)bp/sp
    + Instruction Pointer: (e)ip
    + Misc: FLAGS, segment registers
---
### x86
+ Instructions
    + Very long list: https://en.wikipedia.org/wiki/X86_instruction_listings
    + Common: mov, add, sub, mul, xor, push, pop, call, leave, ret etc.
---
### x86
+ Stack
+ Calling Conventions
---
### Basic Block
A straight-line chunk of code with no jumps in or out, except at its start and end
```c=
a = b + c;
b = b * b;
c = x(b);
a = b + c;
```
---
### Control Flow Graph
A control flow graph (CFG) represents the possible execution paths through a program. It is a directed graph whose nodes are basic blocks and whose edges are the jumps between them.
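As a rough illustration, here is a small hypothetical function (`clamp_sum` and the labels B0..B3 are made up for this sketch) with its basic blocks marked in comments and the resulting CFG edges listed at the end.

```c
/* Hypothetical example; block labels B0..B3 exist only for illustration. */
int clamp_sum(int a, int b, int limit)
{
    /* B0: entry block, ends at the conditional jump */
    int s = a + b;
    if (s > limit) {
        /* B1: runs only when s > limit */
        s = limit;
    } else {
        /* B2: runs otherwise */
        s = s * 2;
    }
    /* B3: join block, reached from both B1 and B2 */
    return s;
}
/* CFG edges: B0 -> B1, B0 -> B2, B1 -> B3, B2 -> B3 */
```

A disassembler's graph view would draw these four blocks as boxes, with one arrow per edge.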
---
## Tools
---
### Disassembler
Analyzes machine code and displays assembly code
---
### Decompiler
Analyzes machine code and displays a higher level code
---
### Debugger
Analyzes the program at runtime
---
### Popular
+ IDA Pro - disassembler, decompiler, debugger - GUI ~$5000
+ Ghidra - disassembler, decompiler, debugger - GUI Free
+ Radare2 - disassembler, decompiler, debugger - GUI+CMD Free
+ WinDbg - debugger (Windows) - Proprietary
+ x64dbg - debugger (Windows) - Open Source
+ GDB - debugger (Linux) - Open Source
---
### Misc
Others - Binary Ninja, Hopper, ImmunityDbg, OllyDbg etc.
*The best tools are those that let you script operations.*
---
## Obfuscation
>the action of making something obscure, unclear, or unintelligible.
> In software development, obfuscation is the deliberate act of creating source or machine code that is difficult for humans to understand.
---
### Why
+ Software Protection
+ Anti Analysis - Malware
+ Compressing Binary sizes
+ DRM
---
## Compiler level Obfuscations
---
### Compilers
Compilers are among the best program analysis tools available. They already have extensive infrastructure for analyzing patterns in high-level code and generating low-level code for them.
---
### LLVM
> The LLVM Project is a collection of modular and reusable compiler and toolchain technologies
LLVM is very extensible - you can write plugins that can hook various steps during linting, analysis or code generation.
---
### Passes
>Passes perform the transformations and optimizations that make up the compiler, they build the analysis results that are used by these transformations, and they are, above all, a structuring technique for compiler code.
---
## Techniques
---
### Instruction Substitution
Simple binary operations usually map directly to x86 instructions: add, sub, mul, div, or, xor etc.
The same result can be computed in other, less obvious ways.
```
a = b | c
a = (b & c) | (b ^ c)
```
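A few more substitutions of the same kind, as a minimal sketch. The function names are made up for illustration, and `uint32_t` is used so that wraparound is well defined in C.

```c
#include <stdint.h>

/* b | c, rewritten with AND and XOR (the identity shown above) */
uint32_t or_subst(uint32_t b, uint32_t c)  { return (b & c) | (b ^ c); }

/* b + c: XOR gives the sum without carries, AND gives the carry bits */
uint32_t add_subst(uint32_t b, uint32_t c) { return (b ^ c) + 2u * (b & c); }

/* b - c: two's-complement negation of c, then add */
uint32_t sub_subst(uint32_t b, uint32_t c) { return b + ~c + 1u; }

/* b ^ c: bits set in either operand, minus bits set in both */
uint32_t xor_subst(uint32_t b, uint32_t c) { return (b | c) & ~(b & c); }
```

Each function returns the same value as the plain operator for every input, but compiles to a longer instruction sequence.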
---
### Instruction Substitution
```c=
int a, b, c;
...
a = b + c;
```
---
### Instruction Substitution
can be written as
```c=
int a, b, c;
...
int x = rand();
a = b + x;
a += c;
a -= x;
```
---
### Instruction Substitution
```c=
int sum(int a, int b) {
    return a + b;
}
```
---
### Analysis
The compiler emits expressions that look somewhat different from the original. However, an optimizing decompiler (such as IDA's) can often recognize such patterns and simplify the expressions for the analyst.
---
### Bogus Control Flow
Normal functions have a set of local variables that contribute to the return value. A compiler can add extra variables, conditions, and branches around them that do not change the return value in any way.
This increases the function size and the time an analyst needs to recover the smaller, simpler logic.
---
### Bogus Control Flow
```c=
const char *mod2(int a) {
    if (a % 2 == 0) {
        return "even";
    } else if (a % 2 == 1) {
        return "odd";
    }
    /* Reached for negative odd a, since a % 2 is -1 in C. */
    return "impossible";
}
```
```
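A hand-written sketch of what bogus control flow can look like at source level, assuming an opaque predicate guards the real logic. The names `opaque_x` and `parity_bogus` are hypothetical, and an actual pass would work on the compiler's intermediate representation rather than on C source.

```c
/* Hypothetical global; volatile keeps the compiler from folding the predicate away. */
static volatile unsigned opaque_x = 7;

const char *parity_bogus(unsigned a)
{
    unsigned x = opaque_x;

    /* Opaque predicate: x * (x + 1) is a product of two consecutive
     * integers, so it is always even, even after unsigned wraparound. */
    if ((x * (x + 1u)) % 2u == 0u) {
        /* Real logic */
        return (a % 2u == 0u) ? "even" : "odd";
    }

    /* Bogus block: never executed, but an analyst has to prove that. */
    a = (a ^ x) * 31u + 5u;
    return (a & 4u) ? "even" : "odd";
}
```

The bogus branch adds a basic block and an extra edge to the CFG without ever affecting the result.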
---
### Bogus Control Flow
\# Show IDA
---
### Analysis
+ Taint analysis - MIASM, Symbolic Execution
---
### Control Flow Flattening
Adds control variables and rewrites the CFG so that each basic block becomes a case in a dispatcher-style switch statement
---
### Control Flow Flattening
```
int foo(int a) {
    if (a == 1) {
        return a;
    }
    for (int i = 2; i < 10; i++) {
        a += i;
    }
    return a;
}
```
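A hand-flattened version of the function above, as a sketch of the dispatcher pattern. The name `foo_flat`, the `state` variable, and its numbering are made up for illustration; a real pass would do this on the compiler's intermediate representation and usually pick less readable state values.

```c
/* Original function, kept for comparison. */
int foo(int a)
{
    if (a == 1) {
        return a;
    }
    for (int i = 2; i < 10; i++) {
        a += i;
    }
    return a;
}

/* Flattened equivalent: every basic block becomes a case, and the
 * control variable `state` decides which block runs next. */
int foo_flat(int a)
{
    int i = 0;
    int state = 0;

    for (;;) {
        switch (state) {
        case 0: state = (a == 1) ? 5 : 1; break;   /* entry check */
        case 1: i = 2; state = 2; break;           /* loop init */
        case 2: state = (i < 10) ? 3 : 5; break;   /* loop condition */
        case 3: a += i; state = 4; break;          /* loop body */
        case 4: i++; state = 2; break;             /* loop increment */
        case 5: return a;                          /* exit */
        }
    }
}
```

In the CFG every block now jumps back to the dispatcher, so the original `if` and `for` structure is no longer visible in the graph.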
---
### Control Flow Flattening
An optimizing decompiler or taint analysis alone will not undo this, because the control variables introduced by the pass decide which block runs next and so get entangled with the real inputs and outputs.
Undoing it requires lifting the machine code to a higher-level representation and writing analysis passes that detect the control variable and stitch the original basic blocks back together.
---
## Binary Level Protections
---
### Packing
Also known as compressing. The original program is compressed and embedded in a new binary, which unpacks it at runtime and loads it into memory.
Interesting read : open source UPX packer.
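A toy sketch of the pack/unpack round trip, assuming simple run-length encoding as a stand-in for a real compressor. `rle_pack` and `rle_unpack` are hypothetical names, and this is not how UPX itself compresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy "packer": run-length encodes src into (count, byte) pairs.
 * Returns the packed size, or 0 if dst is too small. */
size_t rle_pack(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ) {
        size_t run = 1;
        while (i + run < n && src[i + run] == src[i] && run < 255) {
            run++;
        }
        if (out + 2 > cap) {
            return 0;
        }
        dst[out++] = (uint8_t)run;
        dst[out++] = src[i];
        i += run;
    }
    return out;
}

/* Toy "unpacking stub": expands the pairs back into the original bytes.
 * Returns the unpacked size, or 0 if dst is too small. */
size_t rle_unpack(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    size_t out = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = src[i];
        if (out + run > cap) {
            return 0;
        }
        for (size_t k = 0; k < run; k++) {
            dst[out++] = src[i + 1];
        }
    }
    return out;
}
```

In a packed binary, the equivalent of `rle_unpack` runs first, writes the original code into memory, and then jumps to it.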
---
### Analysis
Since the packed binary has to unpack the original program in order to run it, analysts can let it do so (or reimplement the unpacking) and dump the result with a tool of their choice
---
### Encryption
The program encrypts parts of the program and decrypts them at runtime when needed.
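A minimal sketch of the idea, assuming a single-byte XOR key and an encrypted string rather than encrypted code. `XOR_KEY`, `blob`, and `xor_crypt` are hypothetical names, and real protectors use stronger schemes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-byte key. */
#define XOR_KEY 0x5Au

/* The bytes of "secret" XORed with XOR_KEY ahead of time, so the
 * plaintext does not appear in the binary's data section. */
static uint8_t blob[] = { 0x29, 0x3F, 0x39, 0x28, 0x3F, 0x2E };

/* XOR is its own inverse: one call decrypts the buffer in place,
 * a second call re-encrypts it. */
void xor_crypt(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        buf[i] = (uint8_t)(buf[i] ^ XOR_KEY);
    }
}
```

Calling `xor_crypt(blob, sizeof blob)` just before use recovers the plaintext, and calling it again afterwards hides it from a later memory dump.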
---
# DEMO
---
## Questions
---
## Thanks
{"metaMigratedAt":"2023-06-16T00:56:09.830Z","metaMigratedFrom":"YAML","title":"Reversing Binary Obfuscation","breaks":true,"description":"CHARUSAT talk","contributors":"[{\"id\":\"f41e1afe-84d9-46f4-ab07-3940dc41035c\",\"add\":6924,\"del\":86}]"}