# 2019q1 Homework1 (lab0)
contributed by < `cjwind` >

## Environment
* Debian 9.8
* ~~Linux #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23) x86_64~~
* Linux 4.19.0-0.bpo.2-amd64 #1 SMP Debian 4.19.16-1~bpo9+1 (2019-02-07) x86_64

:::info
To make sure the later experiments go smoothly, upgrading the Linux kernel to v4.15 or later is recommended.
:notes: jserv
:::

### Upgrading the kernel
```
$ echo "deb http://ftp.debian.org/debian stretch-backports main" | sudo tee -a /etc/apt/sources.list > /dev/null
$ sudo apt-get update
$ sudo apt-get -t stretch-backports upgrade
$ sudo apt-get install linux-image-4.19.0-0.bpo.2-amd64

# Upgrade nvidia driver
$ sudo apt-get install linux-headers-4.19.0-0.bpo.2-amd64
$ sudo apt-get -t stretch-backports install nvidia-driver
```

## Requirements
Implement a queue whose element data are C strings, supporting both FIFO and LIFO.
* NULL queue: `queue_t *` is `NULL`
* empty queue: `queue_t *` points to a valid structure whose `list_ele_t *head` is `NULL`

queue operations:
* q_new: Create a new, empty queue.
* q_free: Free all storage used by a queue.
    * Does the queue become a NULL queue or an empty queue afterwards?
        * The `queue_t *` itself has been freed.
* q_insert_head: Attempt to insert a new element at the head of the queue (LIFO discipline).
* q_insert_tail: Attempt to insert a new element at the tail of the queue (FIFO discipline).
    * need O(1)
* q_remove_head: Attempt to remove the element at the head of the queue.
* q_size: Compute the number of elements in the queue.
    * need O(1)
* q_reverse: Reorder the list so that the queue elements are reversed in order.
    * Must not allocate or free list elements.

## How qtest works
### Checking `remove_head` overflow
While implementing `q_remove_head()` I made the silly mistake of writing `sp[bufsize] = '\0';`, and `make test` reported `ERROR: copying of string in remove_head overflowed destination buffer.`

Tracing `do_remove_head()` in `qtest.c` shows how it checks for overflow:

1. `malloc` a buffer `removes` of size `string length + padding (1024) + 1` to hold the string returned by `q_remove_head()`.
2. Initialize the first and the last position of `removes` to `'\0'` and every other position to `'X'`.
3. 
   After calling `q_remove_head()`, check whether `removes[string_length + 1]` is still `'X'`; if it is not, an overflow occurred.

With this check, an implementation of `q_remove_head()` that writes `sp[bufsize + 1] = '\0'` is buggy code, yet it still passes. A [PR](https://github.com/sysprog21/lab0-c/pull/7) addresses this problem.

:::warning
Please submit a pull request and take part in the follow-up discussion on GitHub.
:notes: jserv
:::

### Signal handler & Exception Handling
During implementation I used a pointer that I had forgotten to check. `qtest` printed the error message `Segmentation fault occurred. You dereferenced a NULL or invalid` and kept running, whereas a segmentation fault normally crashes the program outright.

`queue_init()` in `qtest.c` uses `signal()` to install signal handlers for `SIGSEGV` and `SIGALRM`. The handlers call `trigger_exception()`.

`signal()` behaves differently across platforms and even across Linux versions. For portability, the manual recommends using `signal()` only to set a signal's disposition to the default (`SIG_DFL`) or to ignore it (`SIG_IGN`); when a custom handler has to be installed, use `sigaction()` instead.

#### Nonlocal gotos `sigsetjmp()` & `siglongjmp()`
> The functions described on this page are used for performing "nonlocal gotos": transferring execution from one function to a predetermined location in another function. The setjmp() function dynamically establishes the target to which control will later be transferred, and longjmp() performs the transfer of execution.

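
The nonlocal goto described in the quote above can be sketched as follows, combined with `sigaction()` as the manual recommends. This is only an illustration under stated assumptions, not the `qtest` implementation: `run_guarded()`, `on_signal()`, `op_ok()` and `op_fault()` are hypothetical names, and the fault is simulated with `raise(SIGSEGV)` instead of a real invalid dereference.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Jump target shared between the guard and the signal handler. */
static sigjmp_buf env;
/* Set only while env holds a valid jump target. */
static volatile sig_atomic_t jmp_ready = 0;

/* Hypothetical handler: jump back to the point saved by sigsetjmp(). */
static void on_signal(int sig)
{
    (void) sig;
    if (jmp_ready)
        siglongjmp(env, 1);
}

/* Hypothetical helper: run op() and report whether it finished
 * without raising SIGSEGV. */
bool run_guarded(void (*op)(void))
{
    assert(op != NULL);

    struct sigaction sa, old;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return false;

    bool ok;
    /* The second argument 1 asks sigsetjmp() to save the signal mask,
     * so siglongjmp() restores it when leaving the handler. */
    if (sigsetjmp(env, 1) == 0) {
        jmp_ready = 1;
        op();
        ok = true; /* op() returned normally */
    } else {
        ok = false; /* arrived here through siglongjmp() */
    }

    jmp_ready = 0;
    sigaction(SIGSEGV, &old, NULL); /* restore the previous disposition */
    return ok;
}

/* Hypothetical operations used to exercise the guard. */
void op_ok(void) {}

void op_fault(void)
{
    raise(SIGSEGV); /* stand-in for dereferencing a bad pointer */
}
```

Calling `run_guarded(op_fault)` twice in a row should return `false` both times. The second call depends on the saved signal mask: `SIGSEGV` is blocked while its handler runs, and `siglongjmp()` unblocks it again only because `sigsetjmp(env, 1)` saved the mask.
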
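
`setjmp(jmp_buf env)` saves information about the calling environment in `env`: typically the stack pointer, the instruction pointer, possibly the signal mask, and the contents of other registers. `longjmp(jmp_buf env, int val)` uses the information stored in `env` to resume execution at the place where `setjmp()` was called.

When called directly, `setjmp()` returns 0. After `longjmp()` is called, execution continues as if `setjmp()` had just returned again, this time with the `val` passed to `longjmp()` as the return value. If you accidentally pass 0, it thoughtfully returns 1 instead.

Whether `setjmp()` saves the signal mask differs between platforms. For an error-handling use case like the one in `qtest`, `sigsetjmp()` and `siglongjmp()` are used so that the signal mask is saved and restored.

#### Exception handling
The exception handling mechanism in `qtest` calls `exception_setup()` before a queue operation and calls `exception_cancel()` afterwards to clean up the variables related to exception handling. `exception_setup()` uses `sigsetjmp()` to set the point that the nonlocal goto returns to. `trigger_exception()`, called by the custom signal handler, calls `siglongjmp()` so that execution jumps back into `exception_setup()`, which then returns to the point just before the queue operation was called. Taking `q_free()` as an example:

```c
if (exception_setup(true))
    q_free(q);
exception_cancel();
```

`exception_setup()` returns true when `sigsetjmp()` is called normally and false when it is entered through `siglongjmp()`. With a check like the one above, the program can keep running normally after a `SIGSEGV` or `SIGALRM` signal.

:::info
Android has a special low-level program named `debuggerd` that can catch SIGSEGV raised while other programs are running. See:
* [Diagnosing Native Crashes](https://source.android.com/devices/tech/debug/native-crash)
* [debuggerd_handler.cpp](https://android.googlesource.com/platform/system/core/+/master/debuggerd/handler/debuggerd_handler.cpp)
* [理解 Native Crash 處理流程 (Understanding the native crash handling flow)](http://gityuan.com/2016/06/25/android-native-crash/)

Compare these and think about what role this kind of exception handling mechanism plays in building a complete framework.
:notes: jserv
:::

### `test_malloc()` & `test_free()`
Before freeing the queue is fully implemented, the tests print `ERROR: Freed queue, but X blocks are still allocated`.

`harness.h` uses macros to define `malloc` as `test_malloc` and `free` as `test_free` for the program under test, so writing `malloc()` in `queue.c` actually calls `test_malloc()`. It uses `INTERNAL` to tell the test program apart from the program under test:

```c
#ifdef INTERNAL
// ...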
#else
#define malloc test_malloc
#define free test_free
#endif
```

`qtest.c` and `harness.c` define `INTERNAL`. Wrapping `malloc()` in `test_malloc()` lets the tests control how `malloc()` behaves, which makes it possible to simulate different scenarios. It is a bit like separating the program under test from the real memory operations by adding an intermediate layer between them.

`test_malloc()` allocates `size + sizeof(block_ele_t) + sizeof(size_t)` bytes of memory, where `size` is the size originally requested, and a `block_ele_t *` points to this memory.

```c
typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;
```

`block_ele_t` is an element of a doubly linked list, and a newly allocated block is added at the front of this list. It records the payload size, which is the size of the memory originally requested. It also holds a magic header, and the `test_malloc()` implementation stores a magic footer as well; the extra `sizeof(size_t)` in the allocation is there to hold that footer. As far as I remember, the fields of a structure are laid out in memory in declaration order, so let's check with gdb:

```
new_block
$1 = (block_ele_t *) 0x555555761900    // points to the start of the structure
&(new_block->next)
$2 = (struct BELE **) 0x555555761900
&(new_block->prev)
$3 = (struct BELE **) 0x555555761908
&(new_block->payload_size)
$4 = (size_t *) 0x555555761910
&(new_block->magic_header)
$5 = (size_t *) 0x555555761918
&(new_block->payload[0])
$6 = (unsigned char *) 0x555555761920 ""
&(new_block->payload)
$7 = (unsigned char (*)[]) 0x555555761920
size
$8 = 24
```

The magic header and magic footer are used to check whether the allocated memory has been corrupted.

`test_free(void *p)` checks whether `p` is NULL and whether the memory to be freed is still valid (not corrupted). If everything is fine, it removes the corresponding `block_ele_t` from the list and finally frees the memory. As mentioned at the beginning, `qtest` can tell whether any memory block was not freed; it does so by having `test_malloc()` and `test_free()` keep track of the number of allocated memory blocks.

## Android `debuggerd`
A native crash is a crash at the C/C++ level.

[Debugging Native Android Platform Code](https://source.android.com/devices/tech/debug/index.html) says that when a dynamically linked executable starts, it registers several signal handlers. On a crash, these handlers cause a crash dump to be written to logcat (presumably some kind of logging mechanism) and to a file called a tombstone (a gravestone, XD). The tombstone contains various extra information about the crashed process, such as the stack traces of all threads, the full memory map, and the open
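file descriptors. Developers can analyze the cause of a crash from the crash dump and the tombstone.

`debuggerd` and `debuggerd64` are the daemons that handled crashes before Android 8.0. In Android 8.0 and later, `crash_dump32` and `crash_dump64` are spawned as needed.

### Diagnosing Native Crashes
[Diagnosing Native Crashes](https://source.android.com/devices/tech/debug/native-crash) covers various kinds of crashes. Besides the common `SIGABRT` and `SIGSEGV`, there are security checks (such as the C library checking buffers, whether a forbidden system call was made, and whether a stack buffer overflow happened) and checks for misuse of fds (use-after-close, double-close and so on), each of which is also reported through a signal. It looks like signals and the crash mechanism are used to perform extra checks on programs, filling in for features the language itself does not directly provide?

The compiler option `-fstack-protector` can check whether a function's stack has overflowed. It adds some data for checking in the [function prologue](https://en.wikipedia.org/wiki/Function_prologue); checking whether that data has changed is enough to tell whether a buffer overflow happened.

> Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects.

### What the mechanism is for & painful memories (?)
A mechanism like this gives developers more information when a program crashes or hangs, which makes it easier to find the problem.

`debuggerd` can be used to get the stack trace and tombstone of a running process. Does that mean it can get the stack trace of a process that is hung (for example because of a deadlock)? I do not quite remember whether, back on Windows, we could always get a full dump (one that contains at least the call stacks of all threads) when the application hung. The application itself had a mechanism similar to crash dump recording, but a full dump had to be obtained some other way (QA always produced them... I do not know how...). Sometimes a deadlock could be diagnosed from the full dump (I think there were cases where nothing could be found even with a full dump... I do not know whether we could not read it or the information really was not there...). Without a full dump... well... we could only guess which code might be involved from the reproduction steps, the logs, and how the whole system reacted during the hang, and then read the code to find what was wrong, which depends heavily on inspiration. Later the application was ported to Linux, where a crash only produced a basic core dump; when it hung or left no core dump... it turned into a disastrous séance. *(`debuggerd` is an Android thing, but why did I not learn about something like this earlier...)*

### Trace note
`debuggerd_signal_handler()` in [debuggerd_handler.cpp](https://android.googlesource.com/platform/system/core/+/master/debuggerd/handler/debuggerd_handler.cpp) uses `clone()` to create a child process that runs `debuggerd_dispatch_pseudothread()`, which in turn uses `__fork()` to create a child process that calls `execle()` on `crash_dump32` or `crash_dump64`. After that, `debuggerd_dispatch_pseudothread()` creates one more child process for `crash_dump` to use.

*In between there are operations that create processes and pipes; I do not really understand why the processes are created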
or where each pipe ends up.*

*`debuggerd_init()` registers the signal handlers, but I could not find any caller other than `debuggerd/crasher/crasher.cpp`.*

`crash_dump` uses `ptrace` on the parent process and suspends all of its threads. Judging from [crash_dump.cpp](https://android.googlesource.com/platform/system/core/+/refs/heads/master/debuggerd/crash_dump.cpp), after interrupting the target process it reads the crash info from a pipe, then obtains an output fd from `tombstoned` and writes the crash info to that fd.

The function that does the main work of the `debuggerd` command is `debuggerd_trigger_dump()`. It appears to connect to `tombstoned` through a socket first, then send a signal to the process whose pid was given on the command line, and finally write out whatever it reads from the pipe.

---

```
$ gdb ./qtest
(gdb) run < trace.txt
(gdb) source gdb-script
```

:::info
You can write a [GDB script](https://sourceware.org/gdb/download/onlinedocs/gdb/Extending-GDB.html) to automate specific operations.
:notes: jserv
:::

###### tags: `Linux 核心設計 2019`