
# FIRMCORN: Vulnerability-Oriented Fuzzing of IoT Firmware via Optimized Virtual Execution

---

## Abstract

- FIRMCORN is a vulnerability-oriented fuzzer for IoT firmware. **It mainly improves on three problems.**
- It focuses on three typical problems of IoT firmware fuzzing:
    1. the high throughput required by fuzzing
    2. the inaccuracy of emulation compared with real devices
    3. the instability of emulation due to the lack of hardware
- It is an IoT fuzzing tool whose code is designed to observe entry points and to look for vulnerabilities according to fuzzing characteristics.
- FIRMCORN needed only 2 hours on one machine to find two 0-day vulnerabilities.

---

## Introduction

==**Preface**==

- Because IoT devices have limited resources, protection mechanisms that are common on desktop systems, such as ASLR and stack canaries, are not widely used on them; exploiting IoT devices is therefore even easier.

==**Introduction to fuzzing**==

- FIRM-AFL is based on FIRMADYNE and applies AFL to IoT firmware vulnerability mining through greybox fuzzing, but it can only test firmware that FIRMADYNE can emulate. [FIRMADYNE](https://github.com/firmadyne/firmadyne)

==**Problems that may arise during fuzzing**==

1. Accurate testing of IoT firmware has to be based on real devices, so large-scale parallel testing requires a considerable amount of real hardware.
2. Emulation-based schemes are computing-resource intensive, and the emulation environment differs from the device's actual operating environment.
3. Device firmware depends on hardware, so emulation sometimes crashes when that hardware is missing.
4. The paper then notes that greybox fuzzing aimed at raising code coverage does not work well either: only a small part of the code is vulnerable, so covering everything is very inefficient.

==**Comparison with other papers**==

- FIE can only support automated analysis of firmware using MSP430 microcontrollers.
- FIRMADYNE can analyze firmware based on full-system emulation, but **NVRAM emulation failures often cause the mining process to crash.**
- IoTFuzzer can only test the security of app-based IoT devices.

==**The main problems they set out to solve**==

1. the high-throughput requirement of fuzzing
2. the inaccuracy of emulation compared with real devices
3. instability due to emulation crashes caused by the lack of hardware

We propose ==optimized virtual execution==, with the main idea of ==optimizing the virtual execution initial environment== and the ==execution process==. More specifically, ==we improve emulation accuracy by incorporating the real context of the actual device==, ==system throughput by using heuristic algorithms to skip unnecessary functions of fuzzing==, and ==fuzzing stability by hooking hardware dependency functions==. ==We design a vulnerable-code search algorithm to determine the vulnerabilities of firmware for vulnerability-oriented fuzz testing.==

==Contribution==

- We summarize existing methods for analyzing IoT firmware. Most existing methods do not sufficiently solve the typical problems of firmware fuzzing; therefore, we propose a novel technology called optimized virtual execution and use it as the basis for firmware fuzzing.
- We propose and implement a vulnerable-code search algorithm that performs static analysis on IoT firmware to obtain vulnerable parts.

---

## Background

==**Introduction to Firmware**==

Firmware falls into two categories: high-level and low-level.

- Low-level firmware mainly exists in an EEPROM and is difficult to modify or update.
- High-level firmware usually resides in Flash.

Firmware works between the underlying hardware and the upper-layer software, and provides a simple call interface for the software by effectively managing the hardware.

==**Introduction to Fuzzing**==

Fuzz testing can be classified into whitebox, blackbox, and greybox fuzzing according to how much of the target program's behavior and information is known. Reference: [fuzz testing](https://www.itread01.com/content/1550332088.html)

---

## OVERVIEW

Existing approaches to analyzing IoT firmware:

1. Hardware Interface Debugging
    > The debugging method directly debugs an IoT device through its hardware interface, such as the UART or JTAG debug interface [13]. This method is accurate and reliable.
2. Full Static Analysis
3. User-Mode Emulation
4. Full-system Emulation
5. Augmented Process Emulation
6. Multi-target Orchestration Analysis

Before the fuzz testing process starts, FIRMCORN first analyzes the firmware using the vulnerable-code search algorithm and determines the entry point of fuzzing, runs to the entry point on the actual device, and dumps the context information at that location as the initial state of fuzzing. Then, FIRMCORN sets up the registers and memory layout at the entry point in the CPU emulator. It then uses heuristic algorithms to collect functions that cannot be emulated or do not need to be emulated before fuzzing. These include hardware-dependent functions, functions that read and write dynamically allocated memory, and functions that are not necessary for fuzz testing; these are hereafter called hardware-specific, unresolved, and unnecessary functions, respectively. Finally, starting from the entry point, hooks are added to the above functions or filters to start fuzz testing for vulnerable code.

**Optimized Virtual Execution:** Virtual execution does not actually run the firmware on a device; instead, it reads and executes some of the firmware instructions through a CPU emulator such as [QEMU](https://www.qemu.org/). However, a CPU emulator can be unstable and inaccurate, and some output functions are not needed at all (e.g. puts).

Optimized virtual execution optimizes the initial environment of virtual execution by using context dumped from the actual IoT device. Heuristic algorithms search for three types of functions to optimize the execution process, which makes virtual execution faster, more accurate, and more stable. In the implementation of FIRMCORN, we adopt optimized virtual execution as the basis of fuzzing.

---

![](https://i.imgur.com/tZex2Mx.png)

## Detailed Design

- **PREANALYSIS**
    - complexity grouping
        > In this stage, all functions in the firmware are classified by complexity.
        > The higher a function's complexity, the more complex its logic and the more likely it is to contain a vulnerability.
        > To avoid missing vulnerabilities in low-complexity functions, we divide all functions into groups and assess the vulnerability features within each group.
        > A function's complexity is reflected in two aspects: the logical complexity of the function itself and the complexity of its reference relationships, measured by CYCLOMATIC COMPLEXITY and NUMBER OF TIMES A FUNCTION IS CALLED, respectively.
        - **CYCLOMATIC COMPLEXITY**: We calculate the cyclomatic complexity of a function according to the number of nodes and edges of the function's control flow graph.
        - **NUMBER OF TIMES A FUNCTION IS CALLED**: If a function is vulnerable and is called multiple times, it implies that there are multiple ways to trigger its vulnerability.
    - vulnerability-feature ranking stage
        > In the vulnerability-feature ranking stage, the functions in each group are sorted by a vulnerability-feature index, and the most vulnerable function in each group is identified. A function's vulnerability features are reflected in two aspects: SENSITIVITY FUNCTION CALL INDEX and NUMBER OF MEMORY OPERATIONS.

        That is, calls to sensitive functions and improper memory-access operations.

        ![](https://i.imgur.com/ojZ85XB.png)

        Idea: the preanalysis groups the functions first, and the algorithm then searches within the groups.

- **DUMP CONTEXT**

    To keep the emulation environment accurate, they dump the register and memory layout information from the real device. Devices such as D-Link DIR-series routers give the user only a web interface in order to simplify configuration, so several ways of executing system commands are used:

    - BASED ON TELNET/SSH SERVICE: We can easily get the shell of the device and execute system commands.
    - BASED ON DEVICE DEBUG INTERFACE: We can generally obtain the shell of the device through the UART port.
    - BASED ON FIRMWARE UPDATE MECHANISM: If the firmware is not verified during the update process, it can be extracted, the startup script rcS can be modified so that a Telnet or SSH service is provided when the device powers on, and the firmware can be repacked and finally flashed to the device.

    After getting a shell on the device, we upload a statically linked gdbserver to the device over the network, use gdbserver to specify a debug port on the device side, connect to that remote debug port with gdb from the host side, and then run to the entry point to prepare for dumping the context. In other words: modify the firmware update so that rcS starts a Telnet or SSH service, then attach through gdbserver on the debugging port.

- **HOOK**
    > Before emulation starts, the framework analyzes the GOT information of the firmware program. After setting up the initial CPU environment, we traverse the GOT and read memory to get the address stored in each GOT entry, which gives the actual addresses of the functions. However, because of ELF's lazy binding mechanism [19], address binding will not have been completed for some functions in the GOT. Next, by parsing the symbol tables of the dynamic link libraries in the firmware, we can obtain the offsets of the library functions within those libraries.
    > Choose a function denoted funcX; its address in memory is defined as $mem\_addr_{funcX}$, and the offset of funcX in the dynamic link library is $offset_{funcX}$. We therefore obtain the actual load address $libc_{addr}$ of the dynamic link library in memory as follows:

    $$
    libc_{addr} = mem\_addr_{funcX} - offset_{funcX}
    $$

    Key point: FIRMCORN provides the user with an interface to replace the original function with a custom function, or to skip the address resolution of __dl_runtime_resolve [21] and jump directly to the function's actual address in memory. Likewise, based on $GOT_{mem}$ and $GOT_{orig}$, FIRMCORN can automatically identify and skip unnecessary functions during fuzzing, which speeds up virtual execution. The principle of this process is shown in Figure 2. For statically linked binaries, the compiler builds the required library files into the executable at compile time. The approach can still identify functions automatically and add hooks by analyzing the binary's symbol table.

    ==In short: hooks let custom functions, run under the Unicorn Engine (a CPU emulator), replace some of the original functions, so that unnecessary functions can be skipped during fuzzing and virtual execution runs faster.==

    In FIRMCORN's hook submodule, we use the hook_add function provided by the Unicorn Engine [20] to add a callback of type UC_HOOK_CODE that monitors whether the firmware's current execution address is in $GOT_{mem}$ or $GOT_{orig}$.

- **OPTIMIZED VIRTUAL EXECUTION**
    - CPU EMULATOR: The core is built on the Unicorn Engine API. Unicorn Engine [22] keeps only the CPU emulator part of QEMU, removes the emulation of other devices, and provides Python bindings.
    - MULTI ARCHITECTURE: Argument passing differs between x86 and MIPS, so a uniform interface is provided for running on different architectures.
    - HEURISTIC OPTIMIZATION: We specifically describe the three types of functions, namely unresolved, unnecessary, and hardware-specific functions.
        - UNRESOLVED FUNCTION
            > Although the context information is extracted as much as possible, dynamically allocated memory, such as heap space, may not be initialized and thus not obtained at the entry point; therefore, importing this part of memory in advance is not possible. If a library function reads and writes this part of the memory, it will cause errors in the emulation process. We define these functions as unresolved functions in the framework.

            In short: some dynamically allocated memory (such as the heap) is not initialized and can cause errors in the framework; functions that touch memory the system cannot resolve are placed in this category.
        - UNNECESSARY FUNCTION
            > Some functions are not needed during fuzzing, for example puts and similar functions. We define these as unnecessary functions. To make fuzzing more efficient, our framework provides an interface for users to skip certain functions. When such a function is reached, the program counter (PC) register is set to the address of the next instruction and the stack is balanced.
        - HARDWARE-SPECIFIC FUNCTION
            > IoT device firmware can access hardware features, such as reading GPIO pins or the NVRAM area; during emulation, however, the absence of this hardware causes the program to stop running and crash.

- **FUZZ TEST**

---

## Evaluation

![](https://i.imgur.com/XFpdVy9.png)

Here they use the benchmark program nbench (a performance-testing tool) to evaluate the efficiency of virtual execution.

![](https://i.imgur.com/sg1du46.png)

---

###### tags: `thesis`
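The load-address arithmetic from the HOOK step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FIRMCORN's code: the helper names `library_base` and `resolve_symbols`, the symbol offsets, and the runtime address are all hypothetical values chosen for the example.

```python
from typing import Dict


def library_base(mem_addr_func: int, offset_func: int) -> int:
    """Load address of a shared library: the runtime address of one
    already-bound function minus that function's offset in the library."""
    base = mem_addr_func - offset_func
    if base < 0:
        raise ValueError("offset is larger than the runtime address")
    return base


def resolve_symbols(base: int, offsets: Dict[str, int]) -> Dict[str, int]:
    """Map each library symbol to its runtime address, including symbols
    whose GOT entries are still unbound because of lazy binding."""
    return {name: base + off for name, off in offsets.items()}


# Hypothetical offsets, as they might be read from the library's symbol table.
symbol_offsets = {"puts": 0x5F0A0, "strcpy": 0x3A2B0, "system": 0x4C110}

# Hypothetical runtime address of puts, as read from a bound GOT entry.
base = library_base(0x77E5F0A0, symbol_offsets["puts"])
runtime = resolve_symbols(base, symbol_offsets)

print(hex(base))               # 0x77e00000
print(hex(runtime["strcpy"]))  # 0x77e3a2b0
```

Addresses resolved this way are the kind of values a code hook could compare the current program counter against when deciding whether to skip or replace a call.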