2018q3 E05 - HackMD

# 2018q3 E05 contributed by <[jesus255221](https://github.com/jesus255221)> ## 測驗 `1` 正確答案為： ```C= #include <stdint.h> int64_t abs64(int64_t x) { int64_t y = x >> (64 - 1); return (x ^ y) - y; } ``` 概念是把`y`設成`x`的 sign bit，如果`x`為負數，`y`將為 1 那將`x`與 1 xor 將會產生所有 bits 皆相反的`x`，屆時在加一極為二補數如果`x`為正數，sign bit 將為 0，將`x` xor 0 還是`x`，減零當然還是`x` ### Underflow 的問題我們都知道二補數的負數可以表示比正數多一個，假設現在有四位元，正數最高表示到`0111`，也就是7。負數最小表示到`1000`，也就是 -8 。假設上述程式`int`只有四位元，將 -8 也就是`1000`傳入此程式會產生`1111`，也就是最小的負數`-1` 當然是不對的結果。也就是説最小的負數將會造成 underflow ### PRNG 的測試 ```clike= #include<stdio.h> #include<stdint.h> static uint64_t r = 0xdeadbeef; int64_t rand64() { r ^= r >> 12; r ^= r << 25; r ^= r >> 27; return (int64_t) (r * 2685821657736338717); } int64_t abs64(int64_t x) { int64_t y = x >> (64 - 1); return (x ^ y) - y; } int main(){ int i; for(i=0;i<__LONG_MAX__;i++){ abs64(rand64()); } } ``` 利用 PRNG 來測試有一個缺點，因為是隨機變數產生，如果產生相同的數字便會導致有些數字沒有測試到。如果要全部測到，連續的數字可能是比較好的選擇。當然，在有限的時間還是測得完，但是這個有限的時間可能還是太長，可能要挑 corner case 來測試是比較經濟的做法。 ### Timing attack 事實上常數運算時間例如上面的`abs64`可以有效避免 timing attack 也就是 Side-channel attack 的一種。利用計算「計算的時間」藉以推敲密碼等敏感資訊。[這邊](https://codahale.com/a-lesson-in-timing-attacks/)有一個 java 的弱點闡述，詳細說明了網路也可以用 timing attack 來推敲密碼。因此，許多密碼學的 github 專案都採用 constant time calculation 來比較字串相同與否。在 C 密碼學套件 [libsodium](https://github.com/jedisct1/libsodium) 裡面也有這樣的程式碼，在密碼學中，比較字串是否相同是最基礎的功能，驗證的基礎，裡面就有常數時間的比較 ```clike= int sodium_memcmp(const void *const b1_, const void *const b2_, size_t len) { #ifdef HAVE_WEAK_SYMBOLS const unsigned char *b1 = (const unsigned char *) b1_; const unsigned char *b2 = (const unsigned char *) b2_; #else const volatile unsigned char *volatile b1 = (const volatile unsigned char *volatile) b1_; const volatile unsigned char *volatile b2 = (const volatile unsigned char *volatile) b2_; #endif size_t i; volatile unsigned char d = 0U; #if HAVE_WEAK_SYMBOLS _sodium_dummy_symbol_to_prevent_memcmp_lto(b1, b2, len); #endif for (i = 0U; i < len; i++) { d |= b1[i] ^ b2[i]; } return (1 & ((d - 1) >> 8)) - 1; } ``` 可以觀察到對每個位元都進行 xor 運算來檢查，雖然慢但是卻可以確保不論字串長度如何，比較時間都相同，不會因為正確的字元比對而改變執行時間提早結束，防止利用時間差來逆運算出此密碼的值，避免 timing attack。 ## 測驗 `2` ```C= int *global_var; void f(void) { int x = 3; global_var = &x; ... /* Can the compiler perform TCO here? */ g(); } ``` 思考程式註解，在第 8 行能否施加 TCO 呢？選出最適合的解釋。 ==作答區== * `(a)` 編譯器不可能施加 TCO * `(b)` 編譯器一定可施加 TCO * `(c)` 只要函式 `g` 沒有對 `global_var` 指標作 dereference，那麼 TCO 就有機會 :::success 延伸問題: 1. 探討 TCO 和遞迴程式的原理 2. 分析上述實驗的行為和解釋 gcc 對 TCO 的操作 3. 在 [Android 原始程式碼](https://android.googlesource.com/) 裡頭找出 [__builtin_return_address](https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) 的應用並解說 ::: ### TCO 是什麼？可以吃嗎？不行。TCO 的全名是 Tail Call Optimization，顧名思義也就是尾呼叫優化，也就是說把函數的呼叫放到尾巴的地方來執行，那這樣的好處是什麼呢？根據 stackoverflow 論壇上的回答 > Tail-call optimization is where you are able to avoid allocating a new stack frame for a function because the calling function will simply return the value that it gets from the called function. The most common use is tail-recursion, where a recursive function written to take advantage of tail-call optimization can use constant stack space. > 以下來自 [wikipedia](https://zh.wikipedia.org/wiki/尾调用) 以連加為例，在 python 裡面沒有 tail recursion 的版本： ```python= def recsum(x): if x == 1: return x else: return x + recsum(x - 1) ``` Recursion 會展開成 ``` recsum(5) 5 + recsum(4) 5 + (4 + recsum(3)) 5 + (4 + (3 + recsum(2))) 5 + (4 + (3 + (2 + recsum(1)))) 5 + (4 + (3 + (2 + 1))) 5 + (4 + (3 + 3)) 5 + (4 + 6) 5 + 10 15 ``` 上面每個遞迴都必須用一個區域變數來紀錄呼叫前的變數數值，有五個變數就必須記五次。有 tail recursion 的版本： ```python= def tailrecsum(x, running_total=0): if x == 0: return running_total else: return tailrecsum(x - 1, running_total + x) ``` Recursion 會展開成 ``` tailrecsum(5, 0) tailrecsum(4, 5) tailrecsum(3, 9) tailrecsum(2, 12) tailrecsum(1, 14) tailrecsum(0, 15) 15 ``` 最下面並沒有用區域變數來計算，需要控制的變數就是兩個，不需要額外用 stack 空間來存放，用完上一次的變數就可以將其替代，不必紀錄任何資訊。達到近乎 iteration 的效能，同時又擁有簡潔的程式碼。 ## 測驗 `3` 分別考慮以下 4 個程式，探討其行為。 - [ ] `ptr1.c` ```C int main() { return *((int *) 0); } ``` 根據 C99 6.6.9 > An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it shall be created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly by the use of an expression of array or function type. The array-subscript [] and member-access . and -> operators, the address & and indirection * unary operators, and pointer casts may be used in the creation of an address constant, but the value of an object shall not be accessed by use of these operators. > 言下之意就是你不應該用這些運算子來得到 object 的內容。上面就是宣告一個位址為 0 的指標位址去取它的值，根據規格書，不應該這樣存取。所以遇到 Segementation fault 是合理的。 --- - [ ] `ptr2.c` ```C int main() { return &*((int *) 0); } ``` 根據 C99 6.5.3.3 註解 87 提到 > It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, *&E is a function designator or an lvalue equal to E. > ```clike= int function(){ return (int *) 0; } int main(){ int i = function(); printf("%d\n",i); return 0; } ``` ``` test_2.c:13:9: warning: return makes integer from pointer without a cast [-Wint-conversion] return (int *) 0; ``` 執行結果依舊是 0 可見`int`的回傳型態強至把`int *`轉型成`int`，保留原本的值零，再回傳，所以適合法的程式碼。 --- - [ ] `ptr3.c` ```C #include <stddef.h> int main() { return &*NULL; } ``` 根據 C99 6.5.3.3 註解 87 提到 > Thus, &*E is equivalent to E (even if E is a null pointer) > ```clike= #include<stdio.h> #include <stddef.h> int function(){ return NULL; } int main(){ int i = function(); printf("%d\n",i); return 0; } ``` ```w test_2.c:3:21: warning: return makes integer from pointer without a cast [-Wint-conversion] int main() { return &*NULL; } ``` 上述輸出為零，代表`return NULL`，與`return 0`作用相同，不過會有轉型的警告 --- - [ ] `ptr4.c` ```C #include <stddef.h> int main() { return &*(*main - (ptrdiff_t) **main); } ``` `main`是 function designator，根據C99規格書 6.5.3.2 第四點表示 > The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; > 又根據 C99 6.3.2.1 表示 >a function designator with type ''function returning type'' is converted to an expression that has type ''pointer to function returning type''. > 所以 pointer to function 就變成 function designator ，然後 function designator 又被轉成 pointer to function `ptrdiff_t`型態則是被定義為`long`，因此`*main - (ptrdiff_t) **main`就等於 main 的地址檢調 main 的地址，也就是零。根據 C99 6.5.3.3 註解 87 提到 >It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, *&E is a function designator or an lvalue equal to E. Thus, &*E is equivalent to E (even if E is a null pointer) > 所以就等於回傳 0 ```warning test_2.c:7:12: warning: return makes integer from pointer without a cast [-Wint-conversion] return &*(*main - (ptrdiff_t) **main); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` ## 姬兮兮的錯誤訊息 ```clike int main() { return &*NULL; } ``` ``` test_2.c: In function ‘main’: test_2.c:3:22: warning: dereferencing ‘void *’ pointer int main() { return &*NULL; } ^ test_2.c:3:21: warning: return makes integer from pointer without a cast [-Wint-conversion] int main() { return &*NULL; } ``` 第一個錯誤訊息是因為 `NULL`被定義成 `((void*) 0)`，直接將 void pointer dereference 會出現警告，第二個錯誤訊息則是因為回傳值為`int`可是我們回傳的卻是`(void*) 0`，型態並不對。 ```clike int main() { return (int)(long) &*(int*)NULL; } ``` 這樣就可以將警告消除 ``` int main() { return &*(*main - (ptrdiff_t) **main); } ``` ```clike test_2.c:6:12: warning: return makes integer from pointer without a cast [-Wint-conversion] return &*(*main - (ptrdiff_t) **main); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` ```clike int main() { return (int)(long)&*(*main - (ptrdiff_t) **main); } ``` 這樣就可以把警告消除，基本上警告都是說你型態並不對，必須轉型。不過都還是可以編譯過並且順利執行。 --- ## 閱讀感想其實我自己也有這樣的經歷，之前電機系有一門課程要我們自己發想然後做出一個成品，我和同學一起想做腳踏車的方向燈和大燈。實際在組裝電路的時候才發現自己學的知識完全派不上用場，到底 LED 要用幾歐姆的電阻呢？到底要怎麼樣線才能整齊？要怎麼樣才能把線材固定在腳踏車上面？甚至我們學了三年連一個焊接都沒有辦法焊的堪用，常常接觸不良。實作與理論的鴻溝需要非常努力才能跨越。

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.