2016q3 Homework1 (phonebook)

# 2016q3 Homework1 (phonebook) contributed by <`LanKuDot`> ## Git Hook * 當某事件發生時，git 可以透過 hook 觸發對應腳本 * 所有 hook 腳本被存放在 `.git/hooks` 中，可以為 shell script, Perl, Python 等 * 在 `.git/hooks` 中，以 `.sample` 為副檔名為 git 提供的範例腳本，要使用去掉 `.sample` 即可 * 一些 hook * `pre-commit`：在 commit 之前執行，可以用來檢查程式碼可否運行。當 hook 回傳非 0 值，git 會放棄這次的 commit，但可以用 `git commit --no-verify` 來忽略。 * `commit-msg`：可以用來查 commit message 是否符合特定格式。 * `post-commit`：commit 後執行，可以用作通知。 ## Astyle * 查看作業 `pre-commit` 提供的 script * astyle option * \-\-style=kr：用 k&r 風格的 brackets ```c int Foo(int bar) { /* Function, namespace, class 要斷行 */ if (bar != 0) { /* 其他的不能斷行 */ Foo2(); return 0; } else return 1; } ``` * \-\-indent=space=4：縮排用 4 空格 * \-\-indent-switchs：switch statement 中的所有 case block 都要 indent ```c switch (foo) { case 1: x += 19; break; default: x += 1; break; } ``` * 實做 * [L17] 先用 `git diff`，指定 ACMR (Added, Copied, Modified, Renamed)，配合 `grep` 將檔名為 `*.cpp, *.c, *.h` 的檔案列出來 * 然後針對每一個檔案 * [L19] 先將指定檔案從 index 複製到暫存檔中，輸出內容為 `暫存檔名原本檔名`，所以利用 `cut -f 1` 濾出第一個 section 的文字，也就是暫存檔名 * [L20] 做一份這個暫存檔的暫存檔 * [L21] `astyle [options] < inputFile > beautifuledFile`：將其中一個暫存檔丟給 astyle 做美化並輸出到另一個暫存檔 * [L22] 只要比較這兩個檔案，有不同就代表被美化過，所以就讓回傳值為 1，並輸出訊息 * 在 vim 將 tab 轉換為 space ``` :set tabstop=4 "設定一個 tab 為 4 個 space :set expandtab "設定按下 tab 時為插入 spaces :retab "將既有的 tab 轉換成 spaces :.retab "只轉換目前所在之行的 tab ``` ## Hash table 這次使用 hash table 來建立搜尋的 index * hash function：將不同長度的資料轉換成相同大小的函式。以 hash 出來的結果呈 uniform distribution 為佳。 * hash table：一種資料結構，元素為 key-vaule pair，以 hash 過後的資料作為 key，原始資料作為 value。通常會以 key 作為 index，而使用 linked list 儲存擁有同樣 key 的 values。 ### Array of pointers * 使用 array of pointers 作為 hash table。array 的index 就是 hash table 的 index，每個 index 都指向一個 linked list。 * 必需要 pass "pointer to array of pointers" 到 append function，也就是 `append(char lastname[], entry **e)`，e 是 pointer of pointer to entry，所以只要 pass array of pointers 的名子到 function 就可以了。參考下面程式： ```c= void printElement(int **e, int index) { printf("e: %p\n", e); printf("e[%d]: %p\n", index, e[index]); printf("*e[%d]: %d\n", index, *e[index]); } int main(void) { int a[5] = {1, 2, 3, 4, 5}; int *b[5] = {a, a+1, a+2, a+3, a+4}; printf("a: %p\n", a); printf("b: %p\n", b); printElement(e, 0); return 0; } ``` * 輸出 ``` a: 0x7ffc69a05830 b: 0x7ffc69a05850 e: 0x7ffc69a05850 e[0]: 0x7ffc69a05830 *e[0]: 1 ``` ### Build hash table 1. **Get hash value**：以 `lastName` 作為 key，丟入 hash function 得到 hash value 2. **Get hash index**：將 hash value 依照 hash table 的大小做 modulo，找到對應的 index。如大小是 4096，則 `index = (hashValue & 0xFFF)` 3. **Update linked list**：將新的 entry 放到 linked list 的頭，所以越早加入的資料，會被放到越後面 ### Search in the hash table 1. **Get hash index**：跟 build hash table 中一樣 2. **Search the entry**：在指定 index 的 linked list 中搜尋 entry 這邊我將搜尋函式分成 * `entry *findName(char lastName[], entry **pHashTable)`：用於外部呼叫，傳入整個 hash table 的指標 * `static entry *findEntry(const char value[], entry * const pHead)`：`findName` 找到 hash index 後將對應的 linked list 傳入即可。可以在這裡實作搜尋 linked list 的優化。 ### 效能分析 #### 原始 vs [Jenkins](https://en.wikipedia.org/wiki/Jenkins_hash_function#one-at-a-time) 觀察發現優化版的 `append` 所花的時間比原始版略高，推測因為其中要計算 hash value 而消耗比較多的時間。至於 `findName` 則有顯著改進，主要是因為在這裡指定的搜尋字詞為字典最後一個字，而以我更新 linked list 的方式，最後一個字會在最前頭。另外 cache-miss 在優化版本中少了 10%，branch-misses 則略高於原始版本 ![](https://i.imgur.com/zlOl9uA.png) * 原始版本 ``` Performance counter stats for './phonebook_orig' (100 runs): 875,020 cache-misses # 41.627 % of all cache refs ( +- 0.47% ) [33.30%] 2,102,031 cache-references ( +- 0.34% ) [33.88%] 266,077,121 instructions # 0.97 insns per cycle ( +- 0.26% ) [51.48%] 273,433,837 cycles ( +- 0.18% ) [50.88%] 1,536,735 branch-misses # 3.05% of all branches ( +- 0.20% ) [50.94%] 50,326,010 branches ( +- 0.10% ) [33.10%] 0.217956438 seconds time elapsed ( +- 0.17% ) ``` * 優化版本 ``` Performance counter stats for './phonebook_opt' (100 runs): 610,906 cache-misses # 33.933 % of all cache refs ( +- 0.71% ) [33.52%] 1,800,335 cache-references ( +- 0.34% ) [33.70%] 308,570,975 instructions # 1.00 insns per cycle ( +- 0.16% ) [50.48%] 309,035,581 cycles ( +- 0.13% ) [50.37%] 1,815,016 branch-misses # 3.47% of all branches ( +- 0.38% ) [50.72%] 52,364,968 branches ( +- 0.09% ) [33.47%] 0.242260209 seconds time elapsed ( +- 0.11% ) ``` #### 在 hash function 中使用 while 或 for 迴圈有趣的是，我本來以為使用 while 迴圈來取得每一個 character 效能會比較好，但結果發現 cache-reference 變高了： ``` For loop: 610,906 cache-misses # 33.933 % of all cache refs ( +- 0.71% ) [33.52%] 1,800,335 cache-references ( +- 0.34% ) [33.70%] While loop: 658,116 cache-misses # 34.765 % of all cache refs ( +- 0.58% ) [33.80%] 1,893,031 cache-references ( +- 0.28% ) [34.57%] ``` 推測使用 while loop 必須要一直檢測指向 key 的 pointer 是否為 null character，所以在每次計算迴圈會 access cache 兩次，一次是為了 while loop 條件判斷，一次是計算 hash value。而如果使用 for loop 則只要判斷目前的 iteration 有沒有超過範圍就好了。 #### 原始 vs [djb2](http://www.cse.yorku.ca/~oz/hash.html) 由於 djb2 hash function 演算法迴圈中計算次數比 jenkins 少，可以發現 `append` 的耗費時間少了約 0.03 秒。另外 cache 的 reference 次數也降低了 ![](https://i.imgur.com/dEI9fcx.png) ``` Performance counter stats for './phonebook_opt' (100 runs): 504,582 cache-misses # 29.810 % of all cache refs ( +- 0.86% ) [32.27%] 1,692,631 cache-references ( +- 0.40% ) [33.96%] 291,627,296 instructions # 1.09 insns per cycle ( +- 0.15% ) [52.19%] 267,873,460 cycles ( +- 0.19% ) [52.11%] 1,781,568 branch-misses # 3.36% of all branches ( +- 0.52% ) [51.16%] 53,045,564 branches ( +- 0.11% ) [32.76%] 0.214523920 seconds time elapsed ( +- 0.33% ) ``` #### 原始 vs [lose-lose](http://www.cse.yorku.ca/~oz/hash.html) lose-lose hash function 就更簡單，純粹將每個字元的值加起來，作為 hash value。主要進步為 cache 的 reference 次數又更少了。 ``` Performance counter stats for './phonebook_opt' (100 runs): 457,842 cache-misses # 35.023 % of all cache refs ( +- 0.56% ) [33.77%] 1,307,246 cache-references ( +- 0.59% ) [35.36%] 275,425,010 instructions # 1.04 insns per cycle ( +- 0.06% ) [52.46%] 263,738,002 cycles ( +- 0.09% ) [51.85%] 1,814,252 branch-misses # 3.38% of all branches ( +- 0.33% ) [50.02%] 53,643,758 branches ( +- 0.08% ) [31.75%] 0.205535865 seconds time elapsed ( +- 0.11% ) ``` ### Hash function 資料分佈觀察到 Jenkins 跟 djb2 分佈得很平均，約在 80~100 之間。但是 lose-lose 因為只是將字串的值做相加，所以集中分佈在某些區塊，當遇到 worst case，就要搜索 1300 多個 entry。 * Jenkins ![](https://i.imgur.com/GiaRPIN.png) * djb2 ![](https://i.imgur.com/mVhyzwh.png) * lose-lose ![](https://i.imgur.com/HQtRFSP.png) ## 表格整理 ### Zyxel * 執行時間 (sec) | hash | `append()` | `find()` | | :---- | ----: | ----: | |原始|0.141627|0.022409| |Jenkins|0.178912|0.000001| |djb2|0.150415|0.000001| |lose-lose|0.148396|0.000001| * Perf (K) | hash | CM | CR | % | Instr. | Cycles | I/C | | :---- | ----: | ----: | ----: | ----: | ----: | ----: | |原始|875|2102|41.6|266077|273433|0.97| |Jenkins|610|1800|33.9|308570|309035|1.00| |djb2|504|1692|29.8|291627|267873|1.09| |lose-lose|457|1307|35.0|275425|263738|1.04| ###### tags: `sysprog2016`