2017q3 Homework1 (phonebook)

# 2017q3 Homework1 (phonebook) contributed by <`louis222220`> ## [Github](https://github.com/louis222220/phonebook) ## 開發環境 - 筆電, i7-2630QM - 四核8執行緒 - 8GB記憶體 - Ubuntu 16.04 `$ lspci` ``` Architecture: x86_64 CPU 作業模式： 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 每核心執行緒數：2 每通訊端核心數：4 Socket(s): 1 NUMA 節點： 1 供應商識別號： GenuineIntel CPU 家族： 6 型號： 42 Model name: Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz 製程： 7 CPU MHz： 799.926 CPU max MHz: 2900.0000 CPU min MHz: 800.0000 BogoMIPS: 3991.31 虛擬： VT-x L1d 快取： 32K L1i 快取： 32K L2 快取： 256K L3 快取： 6144K NUMA node0 CPU(s): 0-7 ``` ## 理解原本的程式 ### 首先有幾個東西要先去看懂 1. 工具部份： 1. Perf 的使用 2. gnuplot 的使用 2. 程式部份： 1. Makefile 2. main.c 3. calculate.c 3. phonebook_orig.[ch] 4. 各個程式的 output 3. 基礎觀念 1. [Cache](https://hackmd.io/s/S14L26-_l) ### 實際執行 - 原本的程式有72%的 cache miss ``` Performance counter stats for './phonebook_orig' (100 runs): 1,441,116 cache-misses # 72.629 % of all cache refs ( +- 0.23% ) 1,984,214 cache-references ( +- 0.21% ) 271,843,344 instructions # 1.31 insn per cycle ( +- 0.02% ) 207,662,600 cycles ( +- 0.23% ) 0.079798692 seconds time elapsed ( +- 0.46% ) ``` - 平均的執行時間 - append(): 0.057s - findname(): 0.0063s - 看`main.c`可以發現，`findname()`所尋找的字串是`zyxel`，如果字典`append()`的方式是依照字母順序排列，`zyxel`會形成 worst case ![](https://i.imgur.com/xwyjtqv.png) --- ## 可修改的方向根據 [phonebook](https://hackmd.io/s/HJJUmdXsZ) 所提到的，考慮以下幾個方向的實作 - 使用 binary search tree 或其他資料結構改寫 - 藉由 hash function 來加速查詢 - 使用字串壓縮的演算法，降低資料表示的成本 - 建立 phone book 時，既然 lastName 是索引值，可以優先建立搜尋表格，其他資料可稍後再補上 --- ## `Hash Table` 實作 ~~(超過作業截止時間了)~~ - 嘗試了用`Hash Table`重建 phonebook，對 C 的 struct, malloc, pointer array 的使用還不太熟，花了不少時間 - 關於字串的 Hash Function 參考了 [Hash Functions](http://www.cse.yorku.ca/~oz/hash.html) 的 `djb2` 演算法，計算出的 hash value(`unsigned long int`)再取餘數，餘數最初嘗試用 `10,000` - `djb2`: 字串中每個 `char` 以$c_0...c_i...c_n$表示 $i=0: hash_0 = 0$ $i>0: hash_i = hash_{i-1} \times 33 + c_{i-1}$ $hash value = hash_{n+1}$ - `append()` 字串取出來的 Hash value 一定會出現 `collision`，因此 `bucket` 寫成 `linked list`，當目前取到的 `bucket` 已有存放值，則前往 list 中的下一個 bucket - `findName()` 取出 hash value，取 hash table 中相對應的 bucket，依序對 list 中的 lastName 做比較 **實際寫完，執行 `./phonebook_opt` 時，發現 `append()` 跟 `findName()` 的時間都超乎想像** - `append()` 0.5秒，幾乎接近了 `orig` 版本的10倍時間 - `findName()`0.000001秒，卻只用了 `orig` 的0.016% ``` ./phonebook_opt (Hash 餘數 Modulo 取10,000) execution time of append() : 0.548063 sec execution time of findName() : 0.000001 sec ./phonebook_orig execution time of append() : 0.061525 sec execution time of findName() : 0.006193 sec ``` 又嘗試了調整 Modulo 的數字大小，改成`100,000`，`append()`的時間就縮短了，不過依然大約是 `orig` 的3倍時間 ``` ./phonebook_opt (Hash 餘數 Modulo 取100,000) execution time of append() : 0.176066 sec execution time of findName() : 0.000000 sec ``` 用 `$ make plot` 畫出來的圖 ![](https://i.imgur.com/C3aC5gm.png) `perf stat` 的值 cache misses 的數量增加了，但是比例下降，猜測是`append()`的時間上升了，而 `findName()` 所需效能節省到原本的0.02% ``` Performance counter stats for './phonebook_orig' (100 runs): 1,453,152 cache-misses # 72.732 % of all cache refs ( +- 0.20% ) 1,997,960 cache-references ( +- 0.15% ) 271,930,707 instructions # 1.31 insn per cycle ( +- 0.02% ) 207,438,507 cycles ( +- 0.15% ) 0.079685426 seconds time elapsed ( +- 0.38% ) ``` ``` Performance counter stats for './phonebook_opt' (100 runs): 1,848,849 cache-misses # 51.985 % of all cache refs ( +- 0.14% ) 3,556,481 cache-references ( +- 0.07% ) 296,866,655 instructions # 0.67 insn per cycle ( +- 0.02% ) 444,283,193 cycles ( +- 0.16% ) 0.166358651 seconds time elapsed ( +- 0.26% ) ``` - 應該要再找出有什麼部份可以使 `append()` 的時間縮短 - 可以嘗試讓 Hash Table存有每一個 list 的最後一個 node - 嘗試改成 `List` 指向頭跟尾的兩個 `bucket` 的確縮短了`append()`的時間 ![](https://i.imgur.com/uMxR3lP.png) - `cache miss`也確實減少了 ``` Performance counter stats for './phonebook_opt' (100 runs): 621,450 cache-misses # 41.212 % of all cache refs ( +- 0.79% ) 1,507,930 cache-references ( +- 0.15% ) 279,136,739 instructions # 1.04 insn per cycle ( +- 0.02% ) 267,269,180 cycles ( +- 0.44% ) 0.101254126 seconds time elapsed ( +- 0.50% ) ``` --- ## Reference 1. [cache 原理和實驗](https://hackmd.io/s/S14L26-_l) 1. [Hash Functions](http://www.cse.yorku.ca/~oz/hash.html)