2016q3 Homework1 (phonebook)

contributed by <andy19950>

Reviewed by `jserv`

沒有嘗試不同的 data set，原本的程式輸入是已排序、非典型英文姓氏，這與現實不匹配
在 append() 中，malloc() 是個顯著的時間開銷，缺乏減緩效能衝擊的方案，而且沒考慮到 malloc() 失敗的情況
- 在上圖的環境中，可用記憶體不足以載入 35 萬筆電話資料，於是連 phonebook_orig 執行都會失敗:

$ ./phonebook_orig 
size of entry : 136 bytes
程式記憶體區段錯誤

卻乏搜尋演算法的評估和效能分析
main.c 無法透過 function pointer 來切換和比較不同實做的效能落差，應該先設計一份可通用的軟體界面，然後將 binary tree, hash table, trie 等不同實做機制加入
將 append() 和 findName() 時間加入統計的意義不大，真實應用往往是個別操作，特別在圖表的呈現

環境架設

安裝作業系統： ubuntu 16.04 LTS
申請 github 帳號
安裝perf gnuplot

更改資料結構

資料結構改為以 "lastName" 為index的較小結構
剩餘的詳細資料用另一個結構表示
用 *detail 讓每個人有對應的詳細資料

















typedef struct __BOOK_INDEX_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    struct __BOOK_INDEX_ENTRY *pNext;
    struct __PHONE_BOOK_ENTRY *detail;
} entry;

typedef struct __PHONE_BOOK_ENTRY {
    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
};

orig (cache miss： 94%)

Performance counter stats for './phonebook_orig' (100 runs):

  2,130,447	cache-misses	#94.137 % of all cache refs	( +-  0.12% )
  2,263,135	cache-references				( +-  0.12% )
261,167,670	instructions	#1.29  insns per cycle		( +-  0.02% )
202,575,413	cycles						( +-  0.90% )
0.073190313	seconds time elapsed				( +-  1.60% )

opt_v1 –structure reduction– (cache miss： 63%)

Performance counter stats for './phonebook_opt' (100 runs):

    423,413	cache-misses   #63.932 % of all cache refs      ( +-  1.04% )
    662,282	cache-references                                ( +-  1.36% )
243,923,231	instructions    #1.66  insns per cycle          ( +-  0.02% )
146,548,771	cycles                                          ( +-  1.06% )
0.048755741	seconds time elapsed                            ( +-  2.23% )

實做hash function

我把table size分作26等份，用lastName的第一個字當作分區的指標
把剩下字母的ASCII code相加當作分區內的分散值
最後再 MOD 我預設的TABLE_SIZE
經過實驗TABLE_SIZE設為10000最適合我的演算法
沒有用到的 hash table 欄位可以低到100左右










unsigned int hash(char *name)
{
    unsigned int num=0;
    for(int i=1; i<strlen(name); i++) {
        num += name[i] ;
    }
    num %= (TABLE_SIZE / 26);
    num += (name[0] - 'a') * (TABLE_SIZE / 26);
    return num % TABLE_SIZE;
}

使用 hash 後的結果

會因為 append 的方式改變而大幅度減少 cache miss 的次數
執行時間也會因此延長！！

第一種： hash完直接接到第一個節點（cache miss： 67~74%）




/*  Version 1 append immediately  */
int loc = hash(lastName);
e->pNext = hash_table[loc];
hash_table[loc] = e;

Performance counter stats for './phonebook_opt_v2' (100 runs):

    214,742	cache-misses	#72.231 % of all cache refs	( +-  0.36% )
    297,299	cache-references                        	( +-  0.59% )
296,020,025	instructions	#1.98  insns per cycle		( +-  0.34% )
149,720,323	cycles                                  	( +-  0.39% )
0.055791399	seconds time elapsed                   	( +-  2.73% )

第二種︰hash完若有collision就走到linked list的最後面然後新增 (cache miss： 6~9%)









/*  Version 2 append to "the end of" linked list  */
if(hash_table[loc] == NULL)
    hash_table[loc] = e;
else{
    entry* ptr = hash_table[loc];
    while(ptr->pNext != NULL)
    	ptr = ptr->pNext;
    ptr->pNext = e;
}

Performance counter stats for './phonebook_opt_v2' (100 runs):

    273,206	cache-misses	#6.246 % of all cache refs	( +-  1.18% )
  4,373,970	cache-references				( +-  0.53% )
362,481,024	instructions	#0.79  insns per cycle     	( +-  0.18% )
458,004,336	cycles						( +-  0.34% )
0.166064657	seconds time elapsed				( +-  0.97% )

小結：

使用第二種方式可以減少cache miss但會花費許多時間去走到linked list的最後面。
但是為什麼減少 cache miss 還需要了解一下。
老實說我使用的 hash function 對這次的 dataset比較有效，因為它每個字母幾乎平均分配，若對於現實生活來說可能效果就不會這麼好。

修改 calculate.c

為了讓 gnuplot 能畫出三種版本的比較
修改為 orgi, opt_v1, opt_v2
opt_v1為 struct reduction 版本
opt_v2為 hash 版本
計算三種版本的平均時間
用%.4f來輸出結果讓gnuplot畫出來比較不擠，但會造成微小的誤差


fprintf(output, "append() %.4f %.4f %.4f\n",orig_sum_a / 100.0, opt_sum_a / 100.0, opt_sum2_a / 100.0);
fprintf(output, "findName() %.4f %.4f %.4f", orig_sum_f / 100.0, opt_sum_f / 100.0, opt_sum2_f / 100.0);

修改 runtime.gp

新增第三個欄位來畫出 hash 後的結果
修改一下 y軸長度讓比較差異更明顯
修改輸出答案的位置讓它比較好看











/*------[:.num] 修改num能調整 y軸長度------*/
plot [:][:.1]'output.txt' using 2:xtic(1) with histogram title 'original', \

/*-----第一個括號為結果的x軸位置，第二個為y軸位置-----*/
'' using ($0-0.1):(0.01):2 with labels title ' ', \
'' using ($0+0.1):(0.015):3 with labels title ' ', \
'' using ($0+0.4):(0.01):4 with labels title ' '

/*-----新增hash function的欄位-----*/
'' using 4:xtic(1) with histogram title 'hash function' , \
'' using ($0+0.4):(0.01):4 with labels title ' '

gnuplot 結果

使用第一種append版本
使用第二種append版本

Futrue Work

~~了解 gnuplot 語法把圖畫的好看一點~~
~~更改hash function 讓collision次數減少 –目前想法是把第一個字的權重改高~~
弄清楚第二種append為什麼可以降低cache miss
~~熟悉git使用方式~~
完成挑戰題

Reference

Git 初學筆記 - 指令操作教學

2016q3 Homework1 (phonebook)

Reviewed by jserv

環境架設

更改資料結構

orig (cache miss： 94%)

opt_v1 –structure reduction– (cache miss： 63%)

實做hash function

使用 hash 後的結果

第一種： hash完直接接到第一個節點（cache miss： 67~74%）

第二種︰hash完若有collision就走到linked list的最後面然後新增 (cache miss： 6~9%)

小結：

修改 calculate.c

修改 runtime.gp

gnuplot 結果

Futrue Work

Reference

Reviewed by `jserv`