# 2017q3 Homework1 (phonebook) contributed by < `changyuanhua` > ## 開發環境 * 輸入指令 ` lscpu ` ``` Architecture: x86_64 CPU 作業模式: 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 每核心執行緒數:1 每通訊端核心數:4 Socket(s): 1 NUMA 節點: 1 供應商識別號: GenuineIntel CPU 家族: 6 型號: 94 Model name: Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz 製程: 3 CPU MHz: 799.890 CPU max MHz: 3200.0000 CPU min MHz: 800.0000 BogoMIPS: 4608.00 虛擬: VT-x L1d 快取: 32K L1i 快取: 32K L2 快取: 256K L3 快取: 6144K NUMA node0 CPU(s): 0-3 ``` ## 未優化前效能分析 * code ```clike= /* original version */ entry *findName(char lastName[], entry *pHead) { while (pHead != NULL) { if (strcasecmp(lastName, pHead->lastName) == 0) return pHead; pHead = pHead->pNext; } return NULL; } entry *append(char lastName[], entry *e) { /* allocate memory for the new entry and put lastName */ e->pNext = (entry *) malloc(sizeof(entry)); e = e->pNext; strcpy(e->lastName, lastName); e->pNext = NULL; return e; } ``` entry *findName function 是一個重頭找 lastname 到尾的程式碼 entry *append function 是一個分配記憶體以及存放 lastName 的程式碼 * 要先清空 cache ` $ echo 1 | sudo tee /proc/sys/vm/drop_caches ` 再分析效能 * 輸入指令 ` ./phonebook_orig ` ``` size of entry : 136 bytes execution time of append() : 0.046820 sec execution time of findName() : 0.005392 sec ``` 顯示執行時間,以及 entry 大小 * 當輸入指令 ` make cache-test ` 後,會顯現下面資訊,是執行效能分析 `perf stat --repeat 100 -e cache-misses,cache-references,instructions,cycles ./phonebook_orig ` ``` Performance counter stats for './phonebook_orig' (100 runs): 4,616,916 cache-misses # 92.740 % of all cache refs ( +- 0.18% ) 4,978,335 cache-references ( +- 0.07% ) 262,592,789 instructions # 1.54 insn per cycle ( +- 0.02% ) 171,063,276 cycles ( +- 0.07% ) 0.059616342 seconds time elapsed ( +- 0.23% ) ``` 可以看出 cache-misses 蠻大的 * plot ![](https://i.imgur.com/MMlYw64.png) 這圖是2個相同的程式碼,是我用來試試 gnuplot 繪圖 ## 實驗 #### 實驗1: phonebook_opt.h 檔,刪除沒有用到的陣列,減少structure大小 >請注意程式碼縮排,為四個空白[name=課程助教][color=red] >>ok, 謝謝[name=changyuanhua][color=black] * code ```clike= typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; /* char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; */ struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` * size 縮減 ``` size of entry : 24 bytes execution time of append() : 0.079211 sec execution time of findName() : 0.003613 sec ``` 因為把目前沒用的到陣列註解掉,所以 entry size 下降,但我發現下降的 size 跟他的陣列大小並沒有相對應,像是我單獨註解 char state[2] 整體的 size 就沒有下降,依然為136 bytes * cache misses ``` Performance counter stats for './phonebook_opt' (100 runs): 1,010,356 cache-misses # 62.711 % of all cache refs ( +- 0.70% ) 1,611,123 cache-references ( +- 0.12% ) 241,093,207 instructions # 2.21 insn per cycle ( +- 0.02% ) 108,989,869 cycles ( +- 0.07% ) 0.037727117 seconds time elapsed ( +- 0.51% ) ``` * plot ![](https://i.imgur.com/9i3vt7O.png) * 觀察與結論 與原始檔案相互比較,可以看到 cache misses 值明顯的下降(92.740% -> 62.711%),也可以看到 size of entry 從 136 bytes -> 24 bytes ,以及 append() 和 findName() 的時間下降 >記得push回repo歐 [name=亮谷][color=#0fff56] >ok,謝謝[name=changyuanhua][color=black] #### 實驗2: phonebook_opt.h 檔,減少structure大小 * code ```clike= typedef struct __PHONE_BOOK_DETAIL { char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; } phonebookdetail; typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; phonebookdetail *detail; struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` * size 縮減 ``` size of entry : 32 bytes execution time of append() : 0.032439 sec execution time of findName() : 0.001829 sec ``` * cache misses ``` Performance counter stats for './phonebook_opt' (100 runs): 1,425,648 cache-misses # 63.673 % of all cache refs ( +- 0.48% ) 2,239,025 cache-references ( +- 0.14% ) 244,508,202 instructions # 2.13 insn per cycle ( +- 0.02% ) 114,603,354 cycles ( +- 0.10% ) 0.039536986 seconds time elapsed ( +- 0.72% ) ``` * plot ![](https://i.imgur.com/QT1ZJcs.png) * 觀察與結論 與原始檔案相互比較,可以看到 cache misses 值明顯的下降(92.740% -> 63.673%),也可以看到 size of entry 從 136 bytes -> 32 bytes ,以及 append() 和 findName() 的時間下降,再與實驗一做比較,除了 size of entry 值,可以發現並沒有明顯的差異,只是稍長一些,因為 size of entry 值變大,但有保有其他資訊,比較符合真實電話簿的性能 #### 實驗3:DKBRHash * code ```cstyle= unsigned int BKDRHash(char lastName[]) { unsigned int seed=131; // 31 131 1313 13131 etc.. unsigned int hash=0; while(*lastName) hash = hash * seed + (*lastName++); return hash; } ``` * ``` size of entry : 136 bytes execution time of append() : 0.047629 sec execution time of findName() : 0.000001 sec ``` * cache misses ``` Performance counter stats for './phonebook_hash' (100 runs): 1,243,609 cache-misses # 47.014 % of all cache refs ( +- 0.10% ) 2,645,181 cache-references ( +- 0.10% ) 262,250,871 instructions # 1.74 insn per cycle ( +- 0.02% ) 150,897,816 cycles ( +- 0.05% ) 0.051010663 seconds time elapsed ( +- 0.26% ) ``` * plot ![](https://i.imgur.com/tKjkihJ.png) * 觀察與結論 與原始檔案相互比較,可以看到 cache misses 值明顯的下降(92.740% -> 47.014%),也可以看到 findName() 的時間顯著下降,但 append() 時間卻略為上升,因為多了些額外的運算 #### 實驗4:DKBRHash + structure 減少 * ``` size of entry : 32 bytes execution time of append() : 0.042075 sec execution time of findName() : 0.000001 sec ``` * cache miss ``` Performance counter stats for './phonebook_hash' (100 runs): 470,619 cache-misses # 31.964 % of all cache refs ( +- 0.22% ) 1,472,352 cache-references ( +- 0.19% ) 244,445,874 instructions # 1.98 insn per cycle ( +- 0.02% ) 123,749,401 cycles ( +- 0.06% ) 0.042495388 seconds time elapsed ( +- 0.27% ) ``` * plot ![](https://i.imgur.com/0BDdPG0.png) * 觀察與結論 與原始檔案相互比較,可以看到 cache misses 值明顯的下降(92.740% -> 31.964%),也可以看到 size of entry 從 136 bytes -> 32 bytes ,以及 findName() 的時間顯著下降,但 append() 時間卻略為上升 ## 問題討論 1.回答以下問題:本例選用的 dataset (也就是輸入電話簿的姓名) 是否存在問題? * 有代表性嗎?跟真實英文姓氏差異大嗎? 在 word.txt 中可以看到很多字都是無意義的,只是英文字母的排列組合 * 資料難道「數大就是美」嗎? 並非數大就是美,我認為是看你目前的需求,因為當資料量龐大資訊齊全,你所需要花費搜尋的時間也較多,而假設你只需要姓名與電話,但資料中多了生日與地址,你將多花費不必要的時間 * 如果我們要建立符合真實電話簿情境的輸入,該怎麼作? 電話簿裡應該要是真實的英文姓名,而不是一些英文字母的排列組合 ## 參考資料 [nekoneko](https://hackmd.io/s/rJCs_ZVa#) [hash function](https://hackmd.io/s/HJln3jU_e#) [bkdrhash](http://blog.csdn.net/wanglx_/article/details/40400693) [jkrvivian](https://hackmd.io/s/SyOQgOQa#) [hash table 簡單說明](http://www.csie.ntnu.edu.tw/~u91029/Set.html) [hash table 說明](http://alrightchiu.github.io/SecondRound/hash-tableintrojian-jie.html) [字符串 hash 函數比較](https://www.byvoid.com/zhs/blog/string-hash-compare) [BKDRhash 實現](http://www.codeceo.com/article/hash-based-on-bkdrhash.html)