# 2018q3 Homework3 (review)
contributed by < [`dange0`](https://github.com/dange0) >
###### tags: `sysprog2018`
[TOC]
## 第二週
### [測驗 `1`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-1)
```clike=
const char *p;
void dont_do_this(void) {
const char c_str[] = "This will change";
p = c_str;
}
```
C99/C11 6.2.4
object referred out of lifetime is undefined behavior
#### AddressSanitizer
[AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 是一款 c/c++ 的記憶體偵錯工具,而這款工具可以偵測到的錯誤有以下這幾類:
- Use after free (dangling pointer dereference)
- Heap buffer overflow
- Stack buffer overflow
- Global buffer overflow
- Use after return
- Use after scope
- Initialization order bugs
- Memory leaks
於測驗題中的漏洞則是被歸類於 `use after return`。
Stack-use-after-return 的錯誤發生於使用一個已經被 return 的 stack object
以下為官方文件的範例:
**Use-after-return.c**
```clike
int *ptr;
void FunctionThatEscapesLocalObject() {
int local[100];
ptr = &local[0];
}
int main(int argc, char **argv) {
FunctionThatEscapesLocalObject();
return ptr[argc];
}
```
接著可以嘗試使用 `AddressSanitizer` 觀察 `use after return` 會產生怎麼樣的錯誤訊息與報告
依照官方說明,只需要在編譯時期加入 flag `-fsanitize=address` 編譯器就會將 AddressSanitizer 模組加入執行檔
```shell
$ gcc -g -o Use-after-return -fsanitize=address Use-after-return.c
```
接著需要在全域變數內設定 ASAN_OPTIONS,因為 AddressSanitizer 預設是沒有開啟 stack-use-after-return 的偵測
```shell
$ export ASAN_OPTIONS=detect_stack_use_after_return=1
```
直接執行之後,就可以看到錯誤報告
```shell
$ ./Use-after-return
=================================================================
==1478==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f5a40e00024 at pc 0x0000004009d4 bp 0x7ffc5b3f3110 sp 0x7ffc5b3f3100
READ of size 4 at 0x7f5a40e00024 thread T0
#0 0x4009d3 in main /tmp/Use-after-return.c:10
#1 0x7f5a4468e82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#2 0x400748 in _start (/tmp/Use-after-return+0x400748)
Address 0x7f5a40e00024 is located in stack of thread T0 at offset 36 in frame
#0 0x400825 in FunctionThatEscapesLocalObject /tmp/Use-after-return.c:3
This frame has 1 object(s):
[32, 432) 'local' <== Memory access at offset 36 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-return /tmp/Use-after-return.c:10 main
Shadow bytes around the buggy address:
0x0febc81b7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0febc81b7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0febc81b7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0febc81b7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0febc81b7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0febc81b8000: f5 f5 f5 f5[f5]f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0febc81b8010: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0febc81b8020: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0febc81b8030: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0febc81b8040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0febc81b8050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==1478==ABORTING
```
- AddressSanitizer 發現該程式碼具有 `use after return` 之問題
- 於錯誤報告中間描述記憶體行為的部分有很多 f5,對應到下面的表可以發現 f5 代表的就是 Stack after return
#### CVE
於 CVE 網站上搜尋到的多為攻擊 heap 的 `use after free` 類型攻擊,並未找到 `use after return` 之攻擊案例,目前也沒有想到這類漏洞的攻擊情境。
---
### [測驗 `3`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-3)
Linux 核心程式碼 [include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) 提到以下程式碼,為何每個 `head` 使用時都要先加上 `()` 呢?
```clike
#define list_for_each_prev(pos, head) \
for (pos = (head)->prev; pos != (head); pos = pos->prev)
```
:::success
延伸問題: 在 Linux 核心原始程式碼找出類似上述「走訪節點」的片段,討論其實作技巧和考量點
:::
---
### [測驗 `5`](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-5)
以下程式是合法 C99 程式嗎?
**1.**
```clike
#include <stdio.h>
int main() { return (********puts)("Hello"); }
```
依據 C99 規格書的解釋
:::info
**C99 6.5.3.1**
The unary `*` operator denotes indirection. ==If the operand points to a function, the result is a function designator==;
:::
當 dereference function designator 時,依舊會得到 function designator
因此 `(********puts)` 可視為 `(*(*(*(*(*(*(*(*puts))))))))` ,其結果仍然為 `puts`
**2.**
```clike
#include <stdio.h>
int main() { return (**&puts)("Hello"); }
```
依據 C99 規格書的解釋
:::info
**C99 6.5.3.2**
Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, ==\*&E is a function designator or an lvalue equal to E==. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
:::
對 function designator 做 `*&` 操作,其結果仍然為 function designator,因此該程式碼依舊合法。
**3.**
```clike
#include <stdio.h>
int main() { return (&**&puts)("Hello"); }
```
依據 C99 規格書的解釋
:::info
**C99 6.5.3.2**
Thus, ==&*E is equivalent to E (even if E is a null pointer)==, and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, \*&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
:::
對 function designator 做 `&*` 操作,其結果仍然為 function designator,因此該程式碼合法。
:::warning
**Question**
```clike
#include <stdio.h>
int main() { return (&(&puts))("Hello"); }
```
編譯階段的錯誤訊息為:
```
error: lvalue required as unary ‘&’ operand
```
為什麼連續兩個 `&`,程式碼就變成非法呢?
:::
依據 C99 規格書對於 `&` 的限制
:::info
**C99 6.5.3.2**
1. Constraints
The operand of the unary & operator shall be either a ==function designator==, the result of a [] or unary * operator, or an ==lvalue== that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
:::
只有 function designator 或是 lvalue 等等才可以使用 unary `&` operator,其中並不包含 `rvalue`。
回到上面的問題,經過第一次的 &puts 之後得到的結果為 puts function 開始地址的值,為 `rvalue`,因此與上述條件抵觸,故不合法。
## 第三週
### [測驗 `2`](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-2)
以下程式碼,在 little-endian 硬體上,會返回 `1`,反之,在 big-endian 硬體上會返回 `0`:
```clike
int main() {
union { int a; char b;
} c = { .a = 1 };
return c.b == 1;
}
```
#### Big-endian v.s Little-endian
- Little-endian
- 字串中的高位放在記憶體的高位
- 字串中的低位放在記憶體的低位
- 解析的時候從高位往低位解析
- Big-endian
- 字串中的低位放在記憶體的高位
- 字串中的高位放在記憶體的低位
- 解析的時候從低位往高位解析
![](https://i.imgur.com/UlIHwb5.png)
[source](https://agilescientific.com/blog/2017/3/31/little-endian-is-legal)
#### little-endian 中不同型態的解析
```clike
int main() {
int a = 0x12345678;
printf("char\t%x\n",(char)a);
printf("short\t%x\n",(short)a);
printf("int\t%x\n",(int)a);
}
```
Result:
```shell
char 78
short 5678
int 12345678
```
![](https://i.imgur.com/nDYZeBM.png =350x)
從上述結果,可以得知以下結論:
- 在取記憶體的值時,會依照 pointer 的型態決定取多少位置的值
- 取值時會從最低位開始往高位取
#### 回到 union
```clike
int main() {
union { int a; char b;
} c = { .a = 1 };
return c.b == 1;
}
```
在 little-endian 架構下,記憶體狀況如下:
![](https://i.imgur.com/IqT60HR.png =350x)
在 big-endian 架構下,記憶體狀況如下:
![](https://i.imgur.com/81HEKxG.png =350x)
因此當 `c` 用 char 解析時,little-endian 得到之結果為 1,反之,則為 0
#### 類似程式碼
```cpp
#include <stdio.h>
int main() {
int i = 1;
char *c = (char *) &i;
if (*c)
printf("LITTLE_ENDIAN\n");
else
printf("BIG_ENDIAN\n");
return 0;
}
```
[source](https://notfalse.net/19/byte-order#C)
原理與 `union` 相似,只是他使用的是透過強制轉型改變 pointer type
---
### [測驗 `3`](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-3)
以下程式碼計算 parity bit:
```cpp
#include <stdint.h>
int parity(uint32_t x)
x ^= x >> 1;
x ^= x >> 2;
x = (x & 0x11111111U) * 0x11111111U;
return (x >> 28) & 1;
}
```
#### 原理解釋
![](https://www.electronicshub.org/wp-content/uploads/2015/07/TRUTH-TABLE-1.jpg)
XOR 的結果會將奇數與偶數的特性保留,奇數個 1 XOR 完的結果為 1,偶數個 1 XOR 完的結果為 0
因此經過 `x ^= x >> 1` 與 `x ^= x >> 2` 後,每個 bit 都保留了 4 個 bit 奇偶的結果
把 `0x11111111U` 轉為二進位表示 `00010001000100010001000100010001`
因為剛剛的 XOR 已經讓第一個 bit 保存了 1~4 個 bit 的奇偶結果,接下來只需要每隔 4 個 bit 取一次就好了。
x \* 00010001000100010001000100010001 可以看成是做 8 次的 `x += x<<4`
最後,第 28 bit 會紀錄著所有結果奇偶值的結果。
#### SIMD