2016q3 Homework1 (phonebook)

contributed by <jeff6907>

請依照格式填寫標題和"contributed by"課程助教

預期目標

準備 GNU/Linux 開發工具
學習使用 Git 與 GitHub
學習效能分析工具
研究軟體最佳化

首先安裝開發環境 Lubuntu 16.04

先下載官方64bit安裝的ISO檔案
上網查如何建立一個開機USB 製作開機USB教學
進入BIOS選擇 UEFI USB開機依序進行安裝動作
開發環境建置完成

lscpu    查看CPU、快取內容 

CPU:Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K

下載開發編寫所需套件之前使用Vim 所以額外安裝

$ sudo apt-get update
$ sudo apt-get install build-essential
$ sudo apt-get install linux-tools-common linux-tools-generic
$ sudo apt-get install astyle colordiff gnuplot

$ sudo apt-get install vim

閱讀git基本教學
完成ssh綁定並fork專案 github
閱讀perf、gnuplot工具教學，觀看課程解說

Phonebook 分析

未優化前探討

$ make run

size of entry : 136 bytes
execution time of append() : 0.049215 sec
execution time of findName() : 0.004698 sec

$ make cache-test

  Performance counter stats for './phonebook_orig' (100 runs):

          191,9147      cache-misses              #   94.189 % of all cache refs      ( +-  0.64% )
          203,7547      cache-references                                              ( +-  0.63% )
       2,6132,5661      instructions              #    1.44  insns per cycle          ( +-  0.02% )
       1,8125,5269      cycles                                                        ( +-  0.31% )

       0.083815399 seconds time elapsed                                          ( +-  1.72% )

catche-miss 高達94% 代表大部分時間都是浪費掉的

運作中如何發生miss?硬體操作應該是怎樣進行
以前計算機結構學過的一些知識幾乎都忘光了，要找時間好好惡補一下
淺談memory cache

優化版本測試1

結構體的有效資料是一個問題，在搜尋過程中大部分的資料幾乎沒用到，只針對last name搜尋，嘗試改寫結構測試。

註解沒打開圖形一直跑出原本的樣子卻又看不出來那邊有問題差點崩潰 QQ

#define OPT 1

把用不到的資料成員自己存放在detail_entry，縮小搜尋的結構。













typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
    struct __PHONE_BOOK_ENTRY *pNext;
} detail_entry;






typedef struct _LASTNAME_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    detail_entry *detail;
    struct _LASTNAME_ENTRY *pNext;
} entry;

原本錯誤的資料

Performance counter stats for './phonebook_opt' (100 runs):
2,4047 cache-misses # 37.837 % of all cache refs ( +- 0.79% ) 6,3555 cache-references ( +- 0.58% ) 9177,8996 instructions # 1.68 insns per cycle ( +- 0.00% ) 5456,2927 cycles ( +- 0.24% ) 0.015713119 seconds time elapsed

結構改完原先94.189% 現在優化後37.837% cache-miss 大幅度降低
append()效能從0.067 -> 0.014 將近 4倍
findName()效能從 0.0058 ->0.000006 約960倍！！！

phonebook_opt.c 只有把phonebook_orig.c 填過去
但是看了幾位同學優化結構體出來的效果，
findName下降幅度沒有這麼誇張，覺得有點納悶陳致佑
再重新檢查後發現code寫錯誤 *append() 內一開始就回傳 NULL 所以後續根本沒有進行動作。

撰寫過程細節要注意！！！

修訂後正常版本的資料

size of entry : 32 bytes
execution time of append() : 0.058506 sec
execution time of findName() : 0.003626 sec

資料size從 136bytes 降為32bytes

 Performance counter stats for './phonebook_opt' (100 runs):

           33,0411      cache-misses              #   76.053 % of all cache refs      ( +-  0.25% )
           43,4447      cache-references                                              ( +-  0.34% )
       2,0402,4276      instructions              #    1.75  insns per cycle          ( +-  0.02% )
       1,1676,7439      cycles                                                        ( +-  0.30% )

       0.036660150 seconds time elapsed                                          ( +-  2.52% )

結構改完cache從94%降為76%
append()效能 0.0675 -> 0.0329
findName() 0.059 -> 0.0022
效能進步程度約一半

優化版本測試2

根據老師提示嘗試HASH的作法，先查資料複習hash概念跟作法
發現自己對於hash觀念非常薄弱，概念好像有聽過但卻很模糊
可能是過去沒有真正修過演算法，近日趕緊找線上課程補回來

透過關鍵字建立一個雜湊表，而不用完整一一比對目標資料，可以加速查詢的時間速度
依照不同得需求有許多演算法可以參考(應該要好好熟讀並嘗試實作看看T.T)

看了dada跟louielu的資料，都推薦使用BKDR hash function的演算法；查了一些資料發現還是有點不太懂，其他演算法也是蠻相近的，使用不同的乘數跟加法或者xor方法，這些

BKDR演算法效能比較高是因為使用 131 這個質數?

先試著模仿網路的程式碼實作看看效能









unsigned int BKDRHash(char *str)
{
    unsigned int hash = 0;
    while (*str) {
        hash = hash * 131 + (*str++);
    }
	
    return (hash & 0x7FFFFFFF);
}

參考dada的共筆資料
 參考louielu的共筆資料
 中文hash table wiki
hashc函式衝突率分析
 建中培訓講義

size of entry : 32 bytes
execution time of append() : 0.055206 sec
execution time of findName() : 0.000040 sec

 Performance counter stats for './phonebook_hash' (100 runs):

           32,1662      cache-misses              #   50.916 % of all cache refs      ( +-  0.10% )
           63,1753      cache-references                                              ( +-  0.32% )
       2,2877,0781      instructions              #    1.66  insns per cycle          ( +-  0.02% )
       1,3750,3651      cycles                                                        ( +-  0.74% )

       0.058662636 seconds time elapsed                                          ( +-  2.78% )

cache-miss 降為 50.916%

append() 上升幅度差異不大 0.0563
findName() 時間降到 0.000041

同樣的code 上面hashtable size 設定為1024 底下為4096
findName() 效能卻進步快一半，愈高愈好？還是有最佳化的範圍值？
待查詢驗證…

size of entry : 32 bytes
execution time of append() : 0.036046 sec
execution time of findName() : 0.000017 sec

 Performance counter stats for './phonebook_hash' (100 runs):

           32,5422      cache-misses              #   43.617 % of all cache refs      ( +-  0.44% )
           74,6086      cache-references                                              ( +-  0.15% )
       2,2930,0036      instructions              #    1.67  insns per cycle          ( +-  0.02% )
       1,3771,3216      cycles                                                        ( +-  0.17% )

       0.058282629 seconds time elapsed                                          ( +-  2.47% )

Hashtable size = 16384

size of entry : 32 bytes
execution time of append() : 0.036036 sec
execution time of findName() : 0.000010 sec

 Performance counter stats for './phonebook_hash' (100 runs):

           35,6412      cache-misses              #   36.596 % of all cache refs      ( +-  0.54% )
           97,3921      cache-references                                              ( +-  0.11% )
       2,3148,5077      instructions              #    1.60  insns per cycle          ( +-  0.02% )
       1,4431,1873      cycles                                                        ( +-  0.16% )

       0.064716269 seconds time elapsed                                          ( +-  2.40% )

Hashtable size = 32768

size of entry : 32 bytes
execution time of append() : 0.053946 sec
execution time of findName() : 0.000010 sec

 Performance counter stats for './phonebook_hash' (100 runs):

           44,1435      cache-misses              #   38.960 % of all cache refs      ( +-  0.26% )
          113,3038      cache-references                                              ( +-  0.13% )
       2,3406,4289      instructions              #    1.51  insns per cycle          ( +-  0.02% )
       1,5466,7970      cycles                                                        ( +-  0.22% )

       0.065952826 seconds time elapsed                                          ( +-  2.30% )

發現提高 table的SIZE可以提高執行速率，並降低 cache-miss
但是高於某一個值 cache-miss反而開始上升，並不是愈高愈好
真正原因 –– 查

設定 gnuplot

Makefile 先設定完成 
了解make plot 執行什麼動作
plot: output.txt
        gnuplot scripts/runtime.gp
$ cd scripts
編輯 runtime.gp

reset
set ylabel 'time(sec)'
set style fill solid
set title 'perfomance comparison'
set term png enhanced font 'verdana,10'
set output 'runtime.png'

plot [:][:0.100]'output.txt'\
    using 2:xtic(1) with histogram title 'original',        \
''  using 3:xtic(1) with histogram title 'optimized',       \
''  using 4:xtic(1) with histogram title 'BKDRHash',        \
''  using ($0-0.2):($2+0.008):2 with labels title ' ',      \
''  using ($0+0.1):($3+0.006):3 with labels title ' ',      \
''  using ($0+0.4):($4+0.004):4 with labels title ' ',

參數要自己調整範圍，讓數字不要重疊再一起