2017q3 Homework1 (phonebook)

contributed by <`hfming225`>

開發環境

OS:ubuntu 16.04 LTS

$ lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 94
Model name:            Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Stepping:              3
CPU MHz:               799.987
CPU max MHz:           4000.0000
CPU min MHz:           800.0000
BogoMIPS:              6816.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7

原始版本

執行原始版本的執行檔
$ ./phonebook_orig

結果輸出

size of entry : 136 bytes
execution time of append() : 0.045451 sec
execution time of findName() : 0.005975 sec

利用 gun plot 畫出圖表比較
$ make plot

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

執行 perf 測試 cache 的使用情況
$ perf stat --repeat 100 -e cache-misses,cache-references,instructions,cycles ./phonebook_orig

Performance counter stats for './phonebook_orig' (100 runs):

          474,7523    cache-misses     # 94.549 % of all cache refs 
          502,1225    cache-references
       2,6221,1698    instructions     # 1.22  insn per cycle 
       2,1522,4338    cycles                                            
       
       0.062030741 seconds time elapsed

發現 cache miss 發生機率很高，32K byte 的 L1 cache 面對大量資料完全不夠

優化策略：減少 cache miss

優化

1.縮小 struct 體積

主要找的只有 lastName 而已，但因為原本的結構體中包含其他資訊，從原版的輸出結果也知道他的結構體有 size of entry : 136 bytes
~~最簡單的方法就是將他們都註解掉~~，但這樣一來，這本電話簿也沒用處

所以不影響功能的情況下，定義一個新的結構體 info 利用指標的方式定指到記憶體

















typedef struct __PHONE_BOOK_INFO {
    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
} info;

typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    info *data;
    struct __PHONE_BOOK_ENTRY *pNext;
} entry;

執行優化版本的執行檔
$ ./phonebook_opt1

結果輸出

size of entry : 32 bytes
execution time of append() : 0.055854 sec
execution time of findName() : 0.001749 sec

發現結構體大小減少到 32 bytes

Performance counter stats for './phonebook_opt1' (100 runs):

          164,9669   cache-misses        #70.497 % of all cache refs  
          234,0056   cache-references                           
       2,4459,1679   instructions        #1.79  insn per cycle 
       1,3664,7824   cycles                                               
       0.038676563 seconds time elapsed

cache-misses 從

94.549 %

降低到

70.497 %

利用 gun plot 畫出圖表比較
make plot

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

2.字首作為index

經過第一部的優化後，由於結構體的減小，在 append 與 find 上都有一定的提升
接下來，我參考像字典一樣的查找方式，以每一個字的第一個字母分類

建立 (append)： 多條連結表(link-list)來作為資料的儲存，將同樣字首的用連結表(link-list)放在一起
尋找 (find) : 先找到對應的一隻連結表(link-list)，接著查找那隻連結表(link-list)，以加快速度

利用 gun plot 畫出圖表比較
make plot

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

        53,6310    cache-misses    #70.457 % of all cache refs
        76,1186    cache-references
    1,9597,0674    instructions    #1.67 insn per cycle
    1,1747,5227    cycles    
    0.032382626 seconds time elapsed

cache-misses 次數從

164, 9669

降低到

53, 6310

3.hash function

雖然第二版的優化，已經有不錯的時間降低，但是 find 時間偏高，我認為在實際狀況中，查詢的次數會比較多，而且一連結表的資料結構來建立已經非常的容易加入新資料，所以接下來引入 hash function 建立資料，來減少 find 時間

利用 lastName 的字串當做搜尋的 key ，在 append() 運用到 hash function，建立一個 hash table 提供更快速的搜尋

 Performance counter stats for './phonebook_opt' (100 runs):

           67,9381      cache-misses  # 26.860 % of all cache refs
          252,9378      cache-references 
       2,4348,9124      instructions  # 1.64  insn per cycle
       1,4806,5325      cycles  
       0.040450149 seconds time elapsed

cache-misses 次數雖然較於法（二）提升，但是比例減少很多，從圖表中我們知道，建立資料庫的時間提升了，但是在查詢時間減少很多，這也說明了 cache-miss 多數在建立時發生，所以相較於 cache-refs 比例也減少

利用 gun plot 畫出圖表比較
make plot

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

修改 runtime.gp 計算總合，可以 gain 到好處

參考資料
perf
hash

2017q3 Homework1 (phonebook)

contributed by <hfming225>

開發環境

OS:ubuntu 16.04 LTS

原始版本

優化

1.縮小 struct 體積

2.字首作為index

3.hash function

contributed by <`hfming225`>