2018q3 Homework3 (review)

# 2018q3 Homework3 (review) contributed by < [`dange0`](https://github.com/dange0) > ###### tags: `sysprog2018` [TOC] ## 第二週 ### [測驗 `1`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-1) ```clike= const char *p; void dont_do_this(void) { const char c_str[] = "This will change"; p = c_str; } ``` C99/C11 6.2.4 object referred out of lifetime is undefined behavior #### AddressSanitizer [AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 是一款 c/c++ 的記憶體偵錯工具，而這款工具可以偵測到的錯誤有以下這幾類： - Use after free (dangling pointer dereference) - Heap buffer overflow - Stack buffer overflow - Global buffer overflow - Use after return - Use after scope - Initialization order bugs - Memory leaks 於測驗題中的漏洞則是被歸類於 `use after return`。 Stack-use-after-return 的錯誤發生於使用一個已經被 return 的 stack object 以下為官方文件的範例： **Use-after-return.c** ```clike int *ptr; void FunctionThatEscapesLocalObject() { int local[100]; ptr = &local[0]; } int main(int argc, char **argv) { FunctionThatEscapesLocalObject(); return ptr[argc]; } ``` 接著可以嘗試使用 `AddressSanitizer` 觀察 `use after return` 會產生怎麼樣的錯誤訊息與報告依照官方說明，只需要在編譯時期加入 flag `-fsanitize=address` 編譯器就會將 AddressSanitizer 模組加入執行檔 ```shell $ gcc -g -o Use-after-return -fsanitize=address Use-after-return.c ``` 接著需要在全域變數內設定 ASAN_OPTIONS，因為 AddressSanitizer 預設是沒有開啟 stack-use-after-return 的偵測 ```shell $ export ASAN_OPTIONS=detect_stack_use_after_return=1 ``` 直接執行之後，就可以看到錯誤報告 ```shell $ ./Use-after-return ================================================================= ==1478==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f5a40e00024 at pc 0x0000004009d4 bp 0x7ffc5b3f3110 sp 0x7ffc5b3f3100 READ of size 4 at 0x7f5a40e00024 thread T0 #0 0x4009d3 in main /tmp/Use-after-return.c:10 #1 0x7f5a4468e82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #2 0x400748 in _start (/tmp/Use-after-return+0x400748) Address 0x7f5a40e00024 is located in stack of thread T0 at offset 36 in frame #0 0x400825 in FunctionThatEscapesLocalObject /tmp/Use-after-return.c:3 This frame has 1 object(s): [32, 432) 'local' <== Memory access at offset 36 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-use-after-return /tmp/Use-after-return.c:10 main Shadow bytes around the buggy address: 0x0febc81b7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0febc81b7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0febc81b7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0febc81b7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0febc81b7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x0febc81b8000: f5 f5 f5 f5[f5]f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 0x0febc81b8010: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 0x0febc81b8020: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 0x0febc81b8030: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00 0x0febc81b8040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0febc81b8050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe ==1478==ABORTING ``` - AddressSanitizer 發現該程式碼具有 `use after return` 之問題 - 於錯誤報告中間描述記憶體行為的部分有很多 f5，對應到下面的表可以發現 f5 代表的就是 Stack after return #### CVE 於 CVE 網站上搜尋到的多為攻擊 heap 的 `use after free` 類型攻擊，並未找到 `use after return` 之攻擊案例，目前也沒有想到這類漏洞的攻擊情境。 --- ### [測驗 `3`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-3) Linux 核心程式碼 [include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) 提到以下程式碼，為何每個 `head` 使用時都要先加上 `()` 呢？ ```clike #define list_for_each_prev(pos, head) \ for (pos = (head)->prev; pos != (head); pos = pos->prev) ``` :::success 延伸問題: 在 Linux 核心原始程式碼找出類似上述「走訪節點」的片段，討論其實作技巧和考量點 ::: --- ### [測驗 `5`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-5) 以下程式是合法 C99 程式嗎？ **1.** ```clike #include <stdio.h> int main() { return (********puts)("Hello"); } ``` 依據 C99 規格書的解釋 :::info **C99 6.5.3.1** The unary `*` operator denotes indirection. ==If the operand points to a function, the result is a function designator==; ::: 當 dereference function designator 時，依舊會得到 function designator 因此 `(********puts)` 可視為 `(*(*(*(*(*(*(*(*puts))))))))` ，其結果仍然為 `puts` **2.** ```clike #include <stdio.h> int main() { return (**&puts)("Hello"); } ``` 依據 C99 規格書的解釋 :::info **C99 6.5.3.2** Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, ==\*&E is a function designator or an lvalue equal to E==. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points. ::: 對 function designator 做 `*&` 操作，其結果仍然為 function designator，因此該程式碼依舊合法。 **3.** ```clike #include <stdio.h> int main() { return (&**&puts)("Hello"); } ``` 依據 C99 規格書的解釋 :::info **C99 6.5.3.2** Thus, ==&*E is equivalent to E (even if E is a null pointer)==, and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, \*&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points. ::: 對 function designator 做 `&*` 操作，其結果仍然為 function designator，因此該程式碼合法。 :::warning **Question** ```clike #include <stdio.h> int main() { return (&(&puts))("Hello"); } ``` 編譯階段的錯誤訊息為： ``` error: lvalue required as unary ‘&’ operand ``` 為什麼連續兩個 `&`，程式碼就變成非法呢？ ::: 依據 C99 規格書對於 `&` 的限制 :::info **C99 6.5.3.2** 1. Constraints The operand of the unary & operator shall be either a ==function designator==, the result of a [] or unary * operator, or an ==lvalue== that designates an object that is not a bit-field and is not declared with the register storage-class specifier. ::: 只有 function designator 或是 lvalue 等等才可以使用 unary `&` operator，其中並不包含 `rvalue`。回到上面的問題，經過第一次的 &puts 之後得到的結果為 puts function 開始地址的值，為 `rvalue`，因此與上述條件抵觸，故不合法。 ## 第三週 ### [測驗 `2`](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-2) 以下程式碼，在 little-endian 硬體上，會返回 `1`，反之，在 big-endian 硬體上會返回 `0`: ```clike int main() { union { int a; char b; } c = { .a = 1 }; return c.b == 1; } ``` #### Big-endian v.s Little-endian - Little-endian - 字串中的高位放在記憶體的高位 - 字串中的低位放在記憶體的低位 - 解析的時候從高位往低位解析 - Big-endian - 字串中的低位放在記憶體的高位 - 字串中的高位放在記憶體的低位 - 解析的時候從低位往高位解析 ![](https://i.imgur.com/UlIHwb5.png) [source](https://agilescientific.com/blog/2017/3/31/little-endian-is-legal) #### little-endian 中不同型態的解析 ```clike int main() { int a = 0x12345678; printf("char\t%x\n",(char)a); printf("short\t%x\n",(short)a); printf("int\t%x\n",(int)a); } ``` Result： ```shell char 78 short 5678 int 12345678 ``` ![](https://i.imgur.com/nDYZeBM.png =350x) 從上述結果，可以得知以下結論： - 在取記憶體的值時，會依照 pointer 的型態決定取多少位置的值 - 取值時會從最低位開始往高位取 #### 回到 union ```clike int main() { union { int a; char b; } c = { .a = 1 }; return c.b == 1; } ``` 在 little-endian 架構下，記憶體狀況如下： ![](https://i.imgur.com/IqT60HR.png =350x) 在 big-endian 架構下，記憶體狀況如下： ![](https://i.imgur.com/81HEKxG.png =350x) 因此當 `c` 用 char 解析時，little-endian 得到之結果為 1，反之，則為 0 #### 類似程式碼 ```cpp #include <stdio.h> int main() { int i = 1; char *c = (char *) &i; if (*c) printf("LITTLE_ENDIAN\n"); else printf("BIG_ENDIAN\n"); return 0; } ``` [source](https://notfalse.net/19/byte-order#C) 原理與 `union` 相似，只是他使用的是透過強制轉型改變 pointer type --- ### [測驗 `3`](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-3) 以下程式碼計算 parity bit: ```cpp #include <stdint.h> int parity(uint32_t x) x ^= x >> 1; x ^= x >> 2; x = (x & 0x11111111U) * 0x11111111U; return (x >> 28) & 1; } ``` #### 原理解釋 ![](https://www.electronicshub.org/wp-content/uploads/2015/07/TRUTH-TABLE-1.jpg) XOR 的結果會將奇數與偶數的特性保留，奇數個 1 XOR 完的結果為 1，偶數個 1 XOR 完的結果為 0 因此經過 `x ^= x >> 1` 與 `x ^= x >> 2` 後，每個 bit 都保留了 4 個 bit 奇偶的結果把 `0x11111111U` 轉為二進位表示 `00010001000100010001000100010001` 因為剛剛的 XOR 已經讓第一個 bit 保存了 1~4 個 bit 的奇偶結果，接下來只需要每隔 4 個 bit 取一次就好了。 x \* 00010001000100010001000100010001 可以看成是做 8 次的 `x += x<<4` 最後，第 28 bit 會紀錄著所有結果奇偶值的結果。 #### SIMD