Phonebook - HackMD

# Phonebook * Original version 把實做交錯在 main.c 與 phonebook_orig.c，導致後續要加入優化版本較不方便，目前有兩項解決辦法： 1. 在 main.c 裡將各自相關的部份使用巨集判斷式隔離，在編譯指令加入定義`gcc -D`，決定該用哪個版本，也可定義在標頭檔。這方法會在 main.c 放一堆巨集判斷式，看起來非常凌亂。 2. main.c 裡使用一致介面，定義實做在對應 .c 檔內： ```c void *init(); void *findName(char [], void *); void *append(char [], void *); void memory_free(void *); ``` 這個方式會需要對 original version 稍微進行修改。為了拿掉 entry *e 這個指標，在 append 時選擇對頭端進行，鍊結方式變成反向，而搜尋字串則是等效改成搜尋 "aaaaaaugh"。 * **Perf** (Performance Event) * **cache-misses**: Usually this indicates **Last Level Cache** misses. * **cache-references**: Usually this indicates **Last Level Cache** accesses but this may vary depending on your CPU. Cache misses 並不是所有種類的 cache miss 次數的總和，而是最後一層 cache 的 miss 次數。我認為這是優化版本其 cache-references 次數會降低的原因。透過更聰明的程式編排，能在更上層的 cache (smaller but faster) 裡找到需要的資料，所以參照 Last Level Cach 的次數與在此發生 miss 的次數也就跟著降低。 **References**: * http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html * http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial * Original version 的動態記憶體釋放應該有誤： ```c if(pHead->pNext) free(pHead->pNext); free(pHead); ``` * __builtin___clear_cache(char *begin, char *end); This function is used to flush the processor's **instruction cache** for the region of memory between begin inclusive and end exclusive. * **Gnuplot** command for element distribution ``` > set title 'element distribution' > set xrange[0 : 65535] > set terminal png > set output 'xxx.png' > plot "xxx.txt" with points pointtype 0 notitle ``` ## Original Version ``` size of entry : 136 bytes execution time of append() : 0.035750 sec execution time of findName() : 0.003906 sec ``` ``` Performance counter stats for './phonebook_orig' (100 runs): 2,010,723 cache-misses #87.332% of all cache refs ( +- 0.13% ) 2,302,381 cache-references ( +- 0.07% ) 387,959,515 instructions #1.51 insns per cycle ( +- 0.01% ) 256,717,122 cycles ( +- 0.13% ) 0.070331387 seconds time elapsed ( +- 1.31% ) ``` ## Optimized Version ### Hash 以下使用了三種雜湊函式，且分別對原始版本進行效能比較。雜湊表大小為 **2^16^** (65536) ，碰撞(collision)的情形以鍊結串列方式處理。於 Makefile 內編譯選項定義使用雜湊函式符號，全域函數指標 ```uint32_t (*hash_func)(unsigned char *)```會指向對應函式。以下為個雜湊函式優化情況： * **Fowler–Noll–Vo**(**FNV**) FNV hashes are designed to be **fast** while maintaining a **low collision rate**. The high dispersion of the FNV hashes makes them well suited for hashing nearly identical **strings** such as URLs, hostnames, filenames, text, IP addresses, etc. ``` size of entry : 32 bytes execution time of append() : 0.052116 sec execution time of findName() : 0.000000 sec ``` ``` Performance counter stats for './phonebook_opt' (100 runs): 820,913 cache-misses #58.291% of all cache refs ( +- 0.16% ) 1,408,308 cache-references ( +- 0.12% ) 280,872,352 instructions #1.10 insns per cycle ( +- 0.02% ) 255,427,037 cycles ( +- 0.28% ) 0.043733445 seconds time elapsed ( +- 1.57% ) ``` ![](https://i.imgur.com/SdlnQ4Z.png) ![](https://i.imgur.com/HMMvE3K.png) * **DJB2** ``` size of entry : 32 bytes execution time of append() : 0.048434 sec execution time of findName() : 0.000000 sec ``` ``` Performance counter stats for './phonebook_opt' (100 runs): 779,938 cache-misses #57.892% of all cache refs ( +- 0.12% ) 1,347,239 cache-references ( +- 0.09% ) 291,625,741 instructions #1.28 insns per cycle ( +- 0.02% ) 227,187,432 cycles ( +- 0.43% ) 0.076748609 seconds time elapsed ( +- 2.21% ) ``` ![](https://i.imgur.com/SA16jq6.png) ![](https://i.imgur.com/qkeYO43.png) * **SDBM** ``` size of entry : 32 bytes execution time of append() : 0.041675 sec execution time of findName() : 0.000000 sec ``` ``` Performance counter stats for './phonebook_opt' (100 runs): 862,401 cache-misses #64.358% of all cache refs ( +- 0.13% ) 1,340,008 cache-references ( +- 0.09% ) 294,506,718 instructions #1.18 insns per cycle ( +- 0.02% ) 248,938,359 cycles ( +- 0.22% ) 0.067740143 seconds time elapsed ( +- 1.24% ) ``` ![](https://i.imgur.com/wN1qlfO.png) ![](https://i.imgur.com/Pe3ahpw.png) **Source of this article:** https://hackmd.io/s/BJBBVaqT **Reference:** * FNV_1a_16: http://isthe.com/chongo/tech/comp/fnv/ * DJB2 & SDBM: http://www.cse.yorku.ca/~oz/hash.html