owned this note
owned this note
Published
Linked with GitHub
# 2016q3 Homework1 Phonebook
contributed by <`finalallpass`>
### Reviewed bt `RayPan`
* 錯別字 struc -> struct
* 尚未比較不同hash function之間的造成的效能差異
* 尚未闡明BKDR hash hunction造成cache misses較djb2 hash function低的原因
---
## 安裝開發環境Ubuntu 14.04
* 由於手邊正好有14.04版本的光碟便拿來安裝win10和Ubuntu雙系統,但遇到了問題如下圖
![](https://i.imgur.com/taG6qy8.jpg)
所幸最後在學長的幫忙下有安裝成功。(有向老師粉專詢問不過當時按下Ctrl+Alt+F7也沒有反應)
安裝過後花了點時間了解linux。
>> 請升級到 Ubuntu 16.04,否則可能會遇到套件太舊的狀況 [name=jserv]
>> 畫面本來就是 Linux 終端,就可以輸入指令
---
## 1.修改struc
嘗試將原本的phonebook_opt.h更改為
```
#ifndef _PHONEBOOK_H
#define _PHONEBOOK_H
#define MAX_LAST_NAME_SIZE 16
/*
typedef struct __PHONE_BOOK{
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
};
*/
typedef struct __PHINE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
//struct __PHONE_BOOK;
struct __PHONE_BOOK_ENTRY *pNext;
}entry;
entry *findName(char lastName[], entry *pHead);
entry *append(char lastName[], entry *e);
#endif
```
這樣跑出的結果為
```
size of entry : 24 bytes
execution time of append() : 0.032780 sec
execution time of findName() : 0.001741 sec
```
可以看到跟原本相比進步了不少
```
size of entry : 136 bytes
execution time of append() : 0.050244 sec
execution time of findName() : 0.005072 sec
```
使用gnuplot將圖形描繪出方便對照
![image alt](https://i.imgur.com/C4Rav3X.png)
然後再對照兩者的cache missed
未改善前的cache-misses高達90.513%
```
Performance counter stats for './phonebook_orig' (100 runs):
4,486,432 cache-misses # 90.513 % of all cache refs
4,946,858 cache-references
262,201,908 instructions # 1.59 insns per cycle
```
改善struc之降為68.396%
```
Performance counter stats for './phonebook_opt' (100 runs):
956,466 cache-misses # 60.158 % of all cache refs
1,625,589 cache-references
241,776,255 instructions # 2.33 insns per cycle
103,304,383 cycles
```
和原本相比已經好上了不少,90% -> 60%。
接下來想說在新的結構中加上指標指向原本更加詳細的資訊
```
typedef struct __PHONE_BOOK{
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
}more;
typedef struct __PHINE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
struct __PHONE_BOOK *more;
struct __PHONE_BOOK_ENTRY *pNext;
}entry;
```
```
entry *append(char lastName[], entry *e)
{
e->pNext = (entry *) malloc(sizeof(entry));
e->more= (more *)malloc(sizeof(more));
e = e->pNext;
strcpy(e->lastName, lastName);
e->pNext = NULL;
return e;
}
```
這樣會變成
```
size of entry : 32 bytes
execution time of append() : 0.049442 sec
execution time of findName() : 0.001894 sec
```
entry的部份因為結構中多了一個指標而多了8bytes
![](https://i.imgur.com/Hg6RBZR.png)
和原本entry為24bytes時相比,append的時間稍微多了一些,但還是比original的好。
## 使用hash function來進一步的優化
* 由於之前沒有接觸過hash還有對於雙重指標的不熟悉,昨天花了時間在閱讀書籍相關資料上,今天來嘗試將系統進一步的改善
* 在更改的過程中遇到問題,當使用BKDR hash function時,出現(core dumped)錯誤。
```
hashIndex hash(char *key)
{
unsigned int seed = 13;
unsigned int hashVal = 0;
while (*key) {
hashVal = hashVal * seed + (*key++);
}
return hashVal;
}
```
試算了一下BKDR可能出現的hash值非常大,所以會造成overflow 導致錯誤。那修改最後return的部份,去取餘數。
```
unsigned int seed = 13;
unsigned int hashVal = 0;
while (*key) {
hashVal = hashVal * seed + (*key++);
}
return hashVal % tablesize;
```
重新測試一次
```
size of entry : 24 bytes
execution time of append() : 0.048484 sec
execution time of findName() : 0.000027 sec
Performance counter stats for './phonebook_opt' (100 runs):
397,301 cache-misses # 34.368 % of all cache refs
1,164,390 cache-references
282,565,019 instructions # 2.17 insns per cycle
130,712,769 cycles
0.045373804 seconds time elapsed ( +- 2.29% )
```
* 所以先嘗試了另一種演算法,就沒有出現錯誤。
```
hashIndex hash(char *key)
{
unsigned int hashVal = 0;
while (*key != '\0') {
hashVal = (hashVal << 5) + *key++;
}
return hashVal % sizetable;
}
```
* 修改後結果為
```
size of entry : 24 bytes
execution time of append() : 0.055109 sec
execution time of findName() : 0.000005 sec
```
可以看到findName的時間大幅度的減少
再來檢測cache-mises
```
Performance counter stats for './phonebook_opt' (100 runs):
381,434 cache-misses # 32.500 % of all cache refs
1,153,347 cache-references
283,006,169 instructions # 2.18 insns per cycle
123,954,407 cycles
0.047213072 seconds time elapsed ( +- 1.06% )
```
和只修改structure相比cache-misses也少了快要30%
不過為了做hash function的優化,我有修改main.c的程式碼,現在要去研究一下Makefile將圖繪出比較。
* 運用了#if defined在main.c中讓他可以同時run opt跟orig。可以得到下圖。
![](https://i.imgur.com/A2XFBHr.png)
* 雖然append時間變成但findName變得十分的快。
[參考共筆](https://github.com/scvzxb/phonebook)
###### tags作業一