2018q1 Homework1 (phonebook)

# 2018q1 Homework1 (phonebook) contributed by <`potter903p`> ### Reviewed by `rex662624` * git commit message 好像只有標題，沒有內文敘述做了甚麼。詳閱[如何寫好 Git Commit Message](https://blog.louie.lu/2017/03/21/%E5%A6%82%E4%BD%95%E5%AF%AB%E4%B8%80%E5%80%8B-git-commit-message/) * 提到了原本的 dataset 有可能是用來做網路的 id ，但網路的 id 常見會混雜著數字，因此可以去分析收集網路遊戲常出現的 id ，做出適用於線上遊戲的 phonebook 。 * 可以詳述為何從 elf_hash 換成 bkdr_hash 後， append 的時間會下降。 ### Reviewed by `f74034067` * 可以解釋 hash function 中選擇 TABLE_SIZE 大小的理由，並且比較不同 TABLE_SIZE 對效能的影響。 ### Reviewed by `chasingfar` * 使用 hash function 的 findName 出現執行時間 0 秒的情況，相當不合理，可以試著改變 `main.c` 的實驗方式改善，例如增加 `findName()`執行次數 ## 開發環境 `$ uname -a` ``` Linux ubuntu 4.13.0-36-generic ``` `$ lscpu` ``` Architecture: x86_64 CPU 作業模式： 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 每核心執行緒數： 1 每通訊端核心數： 4 Socket(s): 1 NUMA 節點： 1 供應商識別號： GenuineIntel CPU 家族： 6 型號： 94 Model name: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz 製程： 3 CPU MHz： 200.000 CPU max MHz: 3600.0000 CPU min MHz: 800.0000 BogoMIPS: 6384.00 虛擬： VT-x L1d 快取： 32K L1i 快取： 32K L2 快取： 256K L3 快取： 6144K NUMA node0 CPU(s): 0-3 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt rsb_ctxsw tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp ``` ## 案例討論 Phonebook 透過模擬查詢電話簿的動作來探討 [cache-miss](https://hackmd.io/s/S14L26-_l) 的問題。先透過閱讀 [Programming Small](https://hackmd.io/s/SkfffA-jZ) 來找尋改進方向。 ### 優化前首先先透過 `$ make cache-test` 來得知優化前原程式的 **cache-miss** ``` Performance counter stats for './phonebook_orig' (100 runs): 4,682,968 cache-misses # 93.690 % of all cache refs ( +- 0.18% ) 4,998,381 cache-references ( +- 0.08% ) 262,912,912 instructions # 1.42 insn per cycle ( +- 0.02% ) 184,722,190 cycles ( +- 0.10% ) 0.056613553 seconds time elapsed ( +- 0.35% ) ``` 以及執行指令所需的**時間**和**資料大小** `$ ./phonebook_orig` ``` size of entry : 136 bytes execution time of append() : 0.055452 sec execution time of findName() : 0.004388 sec ``` ### 優化後 --- * **第一種優化方式：使用體積較小的 struct** 由於查詢時只有使用到 last name 因此這裡把它獨立出來製作一個體積較小的 struct 供查詢時使用，並且再用一個指標指回原本完整的資料，藉此降低每一筆暫存在 cache 中的資料大小。修改前 ```C= typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` 修改後（下方之 __PHONE_BOOK_LAST_NAME） ```C= typedef struct __PHONE_BOOK_DETAIL { char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; } detail; typedef struct __PHONE_BOOK_LAST_NAME { char lastName[MAX_LAST_NAME_SIZE]; struct __PHONE_BOOK_DETAIL *pDetail; struct __PHONE_BOOK_LAST_NAME *pNext; } entry; ``` cache miss 由原本的 **93.690%** 降低到 **66.647%** ``` Performance counter stats for './phonebook_opt' (100 runs): 1,502,093 cache-misses # 66.647 % of all cache refs ( +- 0.54% ) 2,253,802 cache-references ( +- 0.13% ) 245,431,736 instructions # 2.07 insn per cycle ( +- 0.02% ) 118,527,900 cycles ( +- 0.06% ) 0.035853028 seconds time elapsed ( +- 0.22% ) ``` 資料大小順利的由原本的 **136bytes** 降低到 **32bytes** ``` size of entry : 32 bytes execution time of append() : 0.040925 sec execution time of findName() : 0.001571 sec ``` 執行時間差異比較如下圖 ![](![](https://i.imgur.com/Ge2OPRT.png) --- * **第二種優化方式：使用 hash function** 在縮小了 struct 的大小後有了明顯的改善因此在實作 hash function 的版本時仍然沿用了縮小版的結構，而在查找資料時發現 bkdr_hash 能大大的減少碰撞的發生，而同樣屬於 hash 其中的 elf_hash 效能卻顯的較差，因此會在後段進行兩種作法的效能差異比較。 **首先是 elf_hash** ```c= unsigned int elf_hash(char *str) { unsigned int g, h = 0; int ch; while ((ch = *str++) != '\0') { h = (h << 4) + ch; if ((g = (h & 0xf0000000)) != 0) { h ^= g >> 24; h ^= g; } } return h % TABLE_SIZE; } ``` cache miss 由優化過的 **66.647%** 再下降到了 **31.659%** ``` Performance counter stats for './phonebook_hash' (100 runs): 594,410 cache-misses # 31.659 % of all cache refs ( +- 0.68% ) 1,877,542 cache-references ( +- 0.36% ) 258,716,620 instructions # 1.89 insn per cycle ( +- 0.02% ) 136,849,497 cycles ( +- 0.15% ) 0.042056547 seconds time elapsed ( +- 0.32% ) ``` ``` size of entry : 32 bytes execution time of append() : 0.052029 sec execution time of findName() : 0.000000 sec ``` 執行時間差異比較如下圖 ![](https://i.imgur.com/7alPcQI.png) **再來是 bkdr_hash** ```c= unsigned int bkdr_hash(char *str) { unsigned int seed = 131; unsigned int hash = 0; while (*str) { hash = hash * seed + (*str++); } return hash % TABLE_SIZE; } ``` cacahe miss 又從 **31.659%** 下降到了 **30.992%** ``` Performance counter stats for './phonebook_hash' (100 runs): 619,045 cache-misses # 30.922 % of all cache refs ( +- 1.51% ) 2,001,965 cache-references ( +- 0.23% ) 239,384,042 instructions # 1.84 insn per cycle ( +- 0.02% ) 130,308,336 cycles ( +- 0.21% ) 0.040043888 seconds time elapsed ( +- 0.55% ``` append 時間也從 **0.05** 秒下降到了 **0.04** 秒 ``` size of entry : 32 bytes execution time of append() : 0.039979 sec execution time of findName() : 0.000000 sec ``` 執行時間差異比較如下圖 ![](https://i.imgur.com/IMueOt3.png) --- * ### 問題回答：本例選用的 dataset (也就是輸入電話簿的姓名) 是否存在問題？ * **有代表性嗎？跟真實英文姓氏差異大嗎？** 在 `word.txt` 的開頭以及結尾中不難發現存在著一些不合常理的名字(如下) ``` aaaa aaaaa aaaaaa aaaaaaa aaaaaaaa . . . . zzzz zzzzz zzzzzz zzzzzzz zzzzzzzz ``` 所以我認為這份資料跟真實英文姓氏是有蠻大的差異，但是如果把他看做是網路上註冊會員時填寫的資訊的話又會覺得是有可能的，因為有些人並不想填上真實的個人資料所以選擇填上不真實的個人資訊，而如果網站又沒有實名制的需求時這種資料的輸入往往是無法被阻止的。 :::warning 知道有問題，為何不更換再來做實驗呢？等你！ :notes: jserv ::: * **「數大就是美」嗎？** 這端看於所紀錄的資料是要用在何種層面的，以現實生活中需要使用到的電話簿來說，數據資料的真實性以及準確性是比資料多寡來的重要的，想想看一本宣稱收錄了全台 20 萬筆個人資料的電話簿但是裏面滿滿的都是如上述那種不真實的數據或是和使用者本身（我）不相干的資料，相比之下實用性可能會比只收集了 5 萬筆成大校友的資料還來的不實用。況且過多的資料也會拖慢了搜尋時的速度，因此資料的數量不是愈多愈好，而是要考量到實際用途的需要。 * **如果我們要建立符合真實電話簿情境的輸入，該怎麼作？** 第一時間想到的方法是去戶政機關或是政府的開放資源去尋找最真實的資料，但是想了想這是關係到個人的隱私權問題，基本上是不會讓外人去隨意查詢或存取的，因此退而求其次就想到了透過查詢[線上的命名網站](http://ename.dict.cn/list/all)去找尋最接近真實生活中會使用到的人名。 [因為自動飲料機而延畢的那一年心得和啟發](https://hackmd.io/hFEEJ-fhR4KfkvPjPJbQlA) --- ## 參考資料 [ryanpatiency 開發紀錄](https://hackmd.io/s/B1SSiTM_G) [raygoah 開發紀錄](https://hackmd.io/s/r1vwfqSdf) [hash functions](https://sourceware.org/ml/binutils/2006-06/msg00424.html) [hash function 觀念和實務](https://hackmd.io/s/HJln3jU_e) [hash function 效能比較](http://blog.csdn.net/wanglx_/article/details/40300363)