# Phonebook
* Original version 把實做交錯在 main.c 與 phonebook_orig.c,導致後續要加入優化版本較不方便,目前有兩項解決辦法:
1. 在 main.c 裡將各自相關的部份使用巨集判斷式隔離,在編譯指令加入定義`gcc -D`,決定該用哪個版本,也可定義在標頭檔。這方法會在 main.c 放一堆巨集判斷式,看起來非常凌亂。
2. main.c 裡使用一致介面,定義實做在對應 .c 檔內:
```c
void *init();
void *findName(char [], void *);
void *append(char [], void *);
void memory_free(void *);
```
這個方式會需要對 original version 稍微進行修改。為了拿掉 entry *e 這個指標,在 append 時選擇對頭端進行,鍊結方式變成反向,而搜尋字串則是等效改成搜尋 "aaaaaaugh"。
* **Perf** (Performance Event)
* **cache-misses**: Usually this indicates **Last Level Cache** misses.
* **cache-references**: Usually this indicates **Last Level Cache** accesses but this may vary depending on your CPU.
Cache misses 並不是所有種類的 cache miss 次數的總和,而是最後一層 cache 的 miss 次數。我認為這是優化版本其 cache-references 次數會降低的原因。透過更聰明的程式編排,能在更上層的 cache (smaller but faster) 裡找到需要的資料,所以參照 Last Level Cach 的次數與在此發生 miss 的次數也就跟著降低。
**References**:
* http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html
* http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial
* Original version 的動態記憶體釋放應該有誤:
```c
if(pHead->pNext) free(pHead->pNext);
free(pHead);
```
* __builtin___clear_cache(char *begin, char *end);
This function is used to flush the processor's **instruction cache** for the region of memory between begin inclusive and end exclusive.
* **Gnuplot** command for element distribution
```
> set title 'element distribution'
> set xrange[0 : 65535]
> set terminal png
> set output 'xxx.png'
> plot "xxx.txt" with points pointtype 0 notitle
```
## Original Version
```
size of entry : 136 bytes
execution time of append() : 0.035750 sec
execution time of findName() : 0.003906 sec
```
```
Performance counter stats for './phonebook_orig' (100 runs):
2,010,723 cache-misses #87.332% of all cache refs ( +- 0.13% )
2,302,381 cache-references ( +- 0.07% )
387,959,515 instructions #1.51 insns per cycle ( +- 0.01% )
256,717,122 cycles ( +- 0.13% )
0.070331387 seconds time elapsed ( +- 1.31% )
```
## Optimized Version
### Hash
以下使用了三種雜湊函式,且分別對原始版本進行效能比較。雜湊表大小為 **2^16^** (65536) ,碰撞(collision)的情形以鍊結串列方式處理。
於 Makefile 內編譯選項定義使用雜湊函式符號,全域函數指標 ```uint32_t (*hash_func)(unsigned char *)```會指向對應函式。
以下為個雜湊函式優化情況:
* **Fowler–Noll–Vo**(**FNV**)
FNV hashes are designed to be **fast** while maintaining a **low collision rate**. The high dispersion of the FNV hashes makes them well suited for hashing nearly identical **strings** such as URLs, hostnames, filenames, text, IP addresses, etc.
```
size of entry : 32 bytes
execution time of append() : 0.052116 sec
execution time of findName() : 0.000000 sec
```
```
Performance counter stats for './phonebook_opt' (100 runs):
820,913 cache-misses #58.291% of all cache refs ( +- 0.16% )
1,408,308 cache-references ( +- 0.12% )
280,872,352 instructions #1.10 insns per cycle ( +- 0.02% )
255,427,037 cycles ( +- 0.28% )
0.043733445 seconds time elapsed ( +- 1.57% )
```


* **DJB2**
```
size of entry : 32 bytes
execution time of append() : 0.048434 sec
execution time of findName() : 0.000000 sec
```
```
Performance counter stats for './phonebook_opt' (100 runs):
779,938 cache-misses #57.892% of all cache refs ( +- 0.12% )
1,347,239 cache-references ( +- 0.09% )
291,625,741 instructions #1.28 insns per cycle ( +- 0.02% )
227,187,432 cycles ( +- 0.43% )
0.076748609 seconds time elapsed ( +- 2.21% )
```


* **SDBM**
```
size of entry : 32 bytes
execution time of append() : 0.041675 sec
execution time of findName() : 0.000000 sec
```
```
Performance counter stats for './phonebook_opt' (100 runs):
862,401 cache-misses #64.358% of all cache refs ( +- 0.13% )
1,340,008 cache-references ( +- 0.09% )
294,506,718 instructions #1.18 insns per cycle ( +- 0.02% )
248,938,359 cycles ( +- 0.22% )
0.067740143 seconds time elapsed ( +- 1.24% )
```


**Source of this article:** https://hackmd.io/s/BJBBVaqT
**Reference:**
* FNV_1a_16: http://isthe.com/chongo/tech/comp/fnv/
* DJB2 & SDBM: http://www.cse.yorku.ca/~oz/hash.html