Artifact Overview
===

This document is an overview of the artifact that accompanies the conditionally accepted OOPSLA'19 paper

> *BDA: Practical Dependence Analysis for Binary Executables by Unbiased Whole-program Path Sampling and Per-path Abstract Interpretation*

The authors are *Zhuo Zhang*, *Wei You*, *Guanhong Tao*, *Guannan Wei*, *Yonghwi Kwon* and *Xiangyu Zhang*.

## Get Started

The artifact is provided as a [*VirtualBox Appliance*](https://purdue0-my.sharepoint.com/:u:/g/personal/zhan3299_purdue_edu/Ea9iP5J1D7lFo9tGntfozFEBF-eeIJ7qiwaSM3Ckit_lKw?e=5cbuND) named *BDA-Artifact.ova*, with Ubuntu 18.04.2 LTS installed. Necessary software such as `radare2`, `python2.7` and `sqlite3` is pre-installed, along with the `Rust` nightly toolchain. The *md5* of the provided appliance is `MD5 (BDA-Artifact.ova) = 2bde8ade5728d447eaf0da81323b3023`.

### Setup Virtual Machine

Desktop users (Windows, Linux Desktop, Macintosh or Solaris) can download Oracle VirtualBox from [www.virtualbox.org](https://www.virtualbox.org) and import the VM by following the instructions [here](https://docs.oracle.com/cd/E26217_01/E26796/html/qs-import-vm.html).

Linux terminal users can run

```bash
# Install Oracle VirtualBox (package name assumed to be `virtualbox` on Ubuntu/Debian)
sudo apt install virtualbox
# Import the pre-built appliance
VBoxManage import BDA-Artifact.ova
# Start the virtual machine in headless mode
VBoxHeadless --startvm BDA-Artifact &
```

**Note** that the virtual machine is configured with 15 GB of memory and needs nearly 20 GB of disk space to run. 10 GB of memory might work; users can adjust this before starting the VM, using either the GUI or the following command. Please make sure your environment satisfies these requirements.

```bash
# Reset memory to 10G if necessary
VBoxManage modifyvm BDA-Artifact --memory 10240
```

**Moreover**, once the guest VM is set up, VirtualBox forwards port *12345* of the host machine to port *22* of the guest machine. Please make sure port *12345* on the host machine is not occupied.
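Before importing the appliance, it may be worth checking the download against the digest given above. The following is a minimal sketch, not part of the artifact: the helper name `verify_md5` is hypothetical, and it assumes GNU coreutils `md5sum` is available (on macOS, `md5` prints the digest in the `MD5 (file) = ...` form instead).

```shell
#!/usr/bin/env bash
# Sketch: compare a file's MD5 digest with an expected value.
# `verify_md5` is a hypothetical helper, not shipped with the artifact.
# Assumes GNU coreutils `md5sum` is on the PATH.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <file>"; keep only the digest
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got '$actual')" >&2
        return 1
    fi
}

# Usage (digest copied from the text above):
#   verify_md5 BDA-Artifact.ova 2bde8ade5728d447eaf0da81323b3023
```

A mismatch usually means the download is incomplete and should be repeated before running `VBoxManage import`.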
### Login Virtual Machine

Desktop users can log in to the virtual machine through the Oracle VirtualBox GUI, using the following *username* and *password*.

> username: bda
>
> password: bda

Terminal users can log in over SSH from the host machine with the same credentials.

```bash
ssh -p 12345 bda@127.0.0.1
```

## Description

This section briefly describes our implementation of *BDA*, which is located at `~/sabre` in the VM. *BDA* is written in Rust, with 15K *LoC*. The important parts of *BDA* are listed below.

+ `src/`: Main implementation of *BDA*, which we named the **sabre** framework, short for *Sampling-based Analysis for Binary Reverse Engineering*.
+ `src/analyzer/`: Analysis plugins for *sabre*, including the posterior analysis plugin for *BDA* (*Section 6*).
+ `src/engine/`: Radare2-based per-path abstract interpreter, which combines *emulator* and *path\_sampler* (*Section 5*).
+ `src/medium/`: Intermediate Representation (IR) and Abstract Domains.
+ `src/medium/metadata/`: Abstract Domains for the per-path abstract interpreter (*Section 5*).
+ `src/medium/HIG/`: Customized low-level IR of *sabre*, which is able to represent *CFG* and *DDG*.
+ `src/medium/UBG/`: Weighted high-level IR of *sabre*, used for unbiased whole-program path sampling (*Section 4*).
+ `artifacts/`: Artifact working directory.
+ `artifacts/clean.sh`: Bash script for cleaning current analysis results.
+ `artifacts/run.sh`: Bash script for running analysis.
+ `artifacts/rebuild.sh`: Bash script for rebuilding *BDA*.

## Step by Step Instructions

### Build BDA

Please cd to the *Artifact Directory* via

```bash
cd ~/sabre/artifacts
```

The bash scripts will not work correctly if the user's `$PWD` is not `$HOME/sabre/artifacts`.

To (re-)build *BDA*, run

```bash
./rebuild.sh
```

Note that this step might take 10 to 30 minutes and needs a network connection to download third-party packages.

### Explanation

There are five figures/tables in *Section 7 Evaluation*:

+ **Fig. 9. Path coverage**
+ **Fig. 10. Effect of sampling**
+ **Table 4. Memory Dependence**
+ **Table 5. Effect of posterior analysis**
+ **Table 6. Runtime overhead**

In this section, we describe how to reproduce the empirical evaluation for *Fig. 9*, *Table 4* and *Table 5*.

According to *Table 6. Runtime overhead*, some of the benchmarks require *BDA* to run for a long time (more than 24h per executable in the VM) and consume a large amount of memory (more than 50G), and their results occupy more than 10G of disk space. Preparing and distributing such a VM, and reproducing the results with it, would be very inconvenient. We therefore picked several relatively small benchmarks to be evaluated. Note that, in theory, the size of the benchmark does not influence *BDA's* accuracy.

We do not evaluate `Table 6. Runtime overhead` because performance claims cannot be reproduced in a VM. We also skip `Fig. 10. Effect of sampling`, because it requires repeating the normal analysis 10 times, which takes much longer than a single analysis (around one whole day for a small executable and more than a week for large ones). If any user thinks it is necessary to evaluate these missing parts, please contact `zhan3299@purdue.edu` for further discussion.

### Instructions

#### Evaluate Benchmarks One By One

As mentioned above, evaluating *BDA* needs a long time and a large amount of memory. We therefore recommend evaluating *BDA* one target at a time.

```bash
# cd to Artifact Directory
cd ~/sabre/artifacts
# show all analysis targets
ls
181.mcf/    # Time: 0.5 - 1.5 h; Memory: 1 - 2 GB
164.gzip/   # Time: 3 - 5 h; Memory: 3 - 5 GB
256.bzip2/  # Time: 3 - 5 h; Memory: 3 - 5 GB
254.gap/    # Time: 4 - 8 h; Memory: 4 - 9 GB
252.eon/    # Time: 5 - 9 h; Memory: 8 - 12 GB
```

Every directory under `~/sabre/artifacts` is a pre-prepared analysis target; the five benchmarks above are the ones we provide for evaluation. We use `181.mcf` as an example to show how to evaluate *BDA*.
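Since each target comes with its own memory estimate, a quick pre-flight check can show whether the VM is large enough before a multi-hour run. This is a minimal sketch, not part of the artifact: the helper name `has_memory_gb` is hypothetical, and it assumes a Linux guest where `/proc/meminfo` is available.

```shell
#!/usr/bin/env bash
# Sketch: check whether total RAM is at least N GB.
# `has_memory_gb` is a hypothetical helper, not shipped with the artifact.
# $1 = required memory in GB
# $2 = meminfo file to read (optional, defaults to /proc/meminfo on Linux)
has_memory_gb() {
    local required_gb="$1" meminfo="${2:-/proc/meminfo}" total_kb
    # MemTotal is reported in kB and is slightly below the nominal RAM size
    total_kb="$(awk '/^MemTotal:/ {print $2}' "$meminfo")"
    [ -n "$total_kb" ] && [ "$total_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# Usage, with the upper memory bound for 181.mcf taken from the list above:
#   has_memory_gb 2 || echo "less than 2 GB of RAM available to the VM" >&2
```

Because `MemTotal` excludes memory reserved by the kernel, a VM configured with exactly the required amount may fail this check by a small margin; treat the result as a hint rather than a hard limit.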
Start the analysis by passing the target's directory name to `run.sh`:

```bash
cd ~/sabre/artifacts
./run.sh 181.mcf
```

The script first prints the estimated time and memory consumption for the given analysis target as a warning

```bash
./run.sh 181.mcf
Start to run analysis for 181.mcf
This analysis might take 30 to 90 minutes, and consume 1 to 2 GB memory
Please make sure whether environment is valid, and continue? (y/n)
```

Press `y` to begin the analysis, and wait for the result. A few log lines are printed while the analysis runs. Note that if the memory requirement is not satisfied, the analysis will be very slow due to memory swapping.

```bash
./run.sh 181.mcf
Start to run analysis for 181.mcf
This analysis might take 30 to 90 minutes, and consume 1 to 2 GB memory
Please make sure whether environment is valid, and continue? (y/n)
y
Running migration ValueInit_create_table
Running migration VariableInit_create_table
[1/4] Sampling done.
[2/4] Calculating dependence with analysis done.
[3/4] Calculating dependence without analysis done.
```

When the analysis ends, it outputs the results as

```bash
./run.sh 181.mcf
Start to run analysis for 181.mcf
This analysis might take 30 to 90 minutes, and consume 1 to 2 GB memory
Please make sure whether environment is valid, and continue? (y/n)
y
Running migration ValueInit_create_table
Running migration VariableInit_create_table
[1/4] Sampling done.
[2/4] Calculating dependence with analysis done.
[3/4] Calculating dependence without analysis done.
[4/4] Testing intra-procedure paths coverage done.
Finial Report:
# This data is for Table 4. (181.mcf BDA part)
# Due to randomization, following is accepted range:
# MISS: 0 ~ 15
# Extra: 1K ~ 3K
# Mistyped: 10% ~ 20%
# *Less* Miss/Extra/Mistyped means BDA is more accurate.
Report (Analysis Enable: true):
FOUND: 4554
MISS: 2(0.10%)
EXTRA: 2506
MISTYPED: 682(14.98%)
----------------------------------------------------------------------
# This data is for Table 5 w/o analysis (181.mcf part).
# Due to randomization, this data will vary in a large range.
# *More* Miss means posterior analysis is more necessary.
Report (Analysis Enable: false):
FOUND: 2506
MISS: 36(1.76%)
EXTRA: 492
MISTYPED: 88(3.51%)
----------------------------------------------------------------------
# This data is for Fig 9 (181.mcf part).
Covered Rate:
0.0: 0%
0.1: 0%
0.2: 0%
0.3: 0%
0.4: 0%
0.5: 0%
0.6: 0%
0.7: 0%
0.8: 0%
0.9: 0%
1.0: 100%
======================================================================
```

Because our analysis is randomized, the final results should fall within a range rather than match an exact number. We show this range in the comments for Table 4 (which is our final result), e.g.

```bash
# Due to randomization, following is accepted range:
# MISS: 0 ~ 15
# Extra: 1K ~ 3K
# Mistyped: 10% ~ 20%
```

For Table 5, because posterior analysis is disabled and the whole-program paths are uncountable, the result may vary widely, so we do not give an accepted range for it.

For Fig. 9, the covered rate indicates the percentage of functions for which *BDA* has achieved each level of coverage. The first number is the coverage level, and the second number is the percentage of functions. For example, `0.9: 6%` means that *6%* of functions have reached the 0.9 coverage level (*BDA* covered *90%*~*99%* of their intra-procedure paths).

To print more internal logs, run

```bash
DUMP_LOG=true ./run.sh 181.mcf
```

The other benchmarks can be evaluated in the same way

```bash
./run.sh 181.mcf/
./run.sh 164.gzip/
./run.sh 256.bzip2/
./run.sh 254.gap/
./run.sh 252.eon/
```

#### Evaluate All Together

We also offer the capability to run all pre-prepared targets.

```bash
# Clean current analysis results
./clean.sh
# Run all
./run.sh
```

Running the full analysis might take a whole day. After it finishes, run

```bash
# Gather analysis result
./result.sh
cat reports.txt
```
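Comparing each report with its accepted range by hand gets tedious across five targets, so the check can be scripted. The sketch below is an illustration only: the helper names `first_count` and `in_range` are hypothetical, and it assumes that the report text has the same `FIELD: <count>` layout as the sample `181.mcf` output, with `#` comment lines that should be skipped. The actual layout of `reports.txt` is not documented here and may differ.

```shell
#!/usr/bin/env bash
# Sketch: pull the first count for a field out of a report and range-check it.
# `first_count` and `in_range` are hypothetical helpers, not part of the artifact.

# $1 = field name as printed in the report (e.g. MISS, EXTRA), $2 = report file
first_count() {
    # Skip '#' comment lines, which also mention the field names
    grep -v '^[[:space:]]*#' "$2" \
        | grep -o "$1: [0-9][0-9]*" \
        | head -n 1 \
        | awk '{print $2}'
}

# $1 = value, $2 = lower bound, $3 = upper bound (both inclusive)
in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Usage, with the Table 4 range for 181.mcf (reading "1K ~ 3K" as 1000 to 3000):
#   miss="$(first_count MISS reports.txt)"
#   in_range "$miss" 0 15 && echo "MISS is within the accepted range"
```

The first match corresponds to the report with analysis enabled, which is the one the accepted ranges apply to.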