2017q1 Homework1(phonebook)

# 2017q1 Homework1(phonebook) contributed by < `potinglee` > ### Reviewed by `yachiyang01` -git commit message需用命令語氣，內文與標題需一個空白行 ### Reviewed by `harryoooooooooo` * C/C++標準中，free(), delete等函式不須再另外判斷是否為NULL，傳入NULL並不會發生錯誤 >> 注意一致的格式 [name="jserv"] >進度嚴重落後，請趕工![name=課程助教][color=red] ## 開發環境 * OS： Ubuntu 16.04.2 LTS * L1d cache： 32K * L1i cache： 32K * L2 cache： 256K * L3 cache： 6144K * Architecture： x86_64 * CPU 作業模式： 32-bit, 64-bit * Byte Order： Little Endian * CPU(s)： 4 * Model name： Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz * **memory** description： System memory physical id： 0 size： 7872MiB ## 未優化版本 * 執行： `$ ./phonebook_orig` * append() 及 findName() 時間如下 ``` size of entry ： 136 bytes execution time of append() ： 0.083434 sec execution time of findName() ： 0.005348 sec ``` * catch test： `$ make cache-test` * cache-misses 為 92% ``` Performance counter stats for './phonebook_orig' (100 runs)： 457,4257 cache-misses # 92.288 % of all cache refs ( +- 0.18% ) 495,6511 cache-references ( +- 0.10% ) 2,6246,7397 instructions # 1.50 insn per cycle ( +- 0.02% ) 1,7458,4735 cycles ( +- 0.19% ) 0.066569148 seconds time elapsed ( +- 0.26% ) ``` ## 優化版本1 - 減少資料結構大小原本的資料結構為： ```c= typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` 因為只搜尋 lastName，所以把其他不會用到的成員移除。結果如下： ```c= typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` * 執行： `$ ./phonebook_opt` * append() 及 findName() 時間如下 ``` size of entry ： 24 bytes execution time of append() ： 0.039963 sec execution time of findName() ： 0.001991 sec ``` * catch test： `$ make cache-test` * cache-misses 減少為 63% ``` Performance counter stats for './phonebook_opt' (100 runs)： 102,7372 cache-misses # 63.629 % of all cache refs ( +- 0.76% ) 161,4624 cache-references ( +- 0.13% ) 2,4126,7291 instructions # 2.16 insn per cycle ( +- 0.02% ) 1,1166,5508 cycles ( +- 0.19% ) 0.041329077 seconds time elapsed ( +- 0.30% ) ``` * 和未優化版本的比較圖： ![](https：//i.imgur.com/5puco8J.png) ## 優化版本2 - 使用二元搜尋樹將原本的用 linked list 建立的資料，改用二元搜尋樹建立。原本建立 linked list 的程式： ```c= entry *append(char lastName[], entry *e) { /* allocate memory for the new entry and put lastName */ e->pNext = (entry *) malloc(sizeof(entry)); e = e->pNext; strcpy(e->lastName, lastName); e->pNext = NULL; return e; } ``` 建立二元搜尋樹的程式： ```c= entry *append(char lastName[], entry *e) { if(e == NULL) { strcpy(e->lastName, lastName); return e; } else { //create a new node entry* newnode = (entry *) malloc(sizeof(entry)); strcpy(newnode->lastName, lastName); newnode->pRight = newnode->pLeft = NULL; entry *tmp = e; entry *prev = NULL; while(tmp != NULL) { prev = tmp; if(strcasecmp(lastName, tmp->lastName) > 0) tmp = tmp->pRight; else if(strcasecmp(lastName, tmp->lastName) < 0) tmp = tmp->pLeft; } if(prev->pRight == tmp) prev->pRight = newnode; else prev->pLeft = newnode; return newnode; } } ``` 搜尋二元搜尋樹的 findName()： ```c= entry *findName(char lastName[], entry *pHead) { if(pHead == NULL) { /* Element is not found */ return NULL; } while(pHead != NULL) { if(strcasecmp(lastName, pHead->lastName) == 0) { return pHead; } else { if(strcasecmp(lastName, pHead->lastName) > 0) { pHead = pHead->pRight; } else { pHead = pHead->pLeft; } } } return pHead; } ``` * 執行： `$ ./phonebook_opt` * append() 及 findName() 時間如下 ``` size of entry ： 32 bytes execution time of append() ： 0.309874 sec execution time of findName() ： 0.003658 sec ``` * catch test： `$ make cache-test` * cache-misses 減少為 68%，但因資料結構較大的緣故，cache-miss較優化版本1增加 5%。 ``` Performance counter stats for './phonebook_opt' (100 runs)： 160,2956 cache-misses # 68.397 % of all cache refs ( +- 0.50% ) 234,3613 cache-references ( +- 0.11% ) 3,2665,4220 instructions # 2.38 insn per cycle ( +- 0.01% ) 1,3712,2147 cycles ( +- 0.16% ) 0.050254760 seconds time elapsed ( +- 0.32% ) ``` * 和未優化版本的比較圖： ![](https：//i.imgur.com/I0d3JTA.png) 雖然使用了二元搜尋樹，但 findNmae()的時間卻沒減少，反而增加。推測是由於每次加新資料都從 tree 的尾端，因此會形成這種狀況： ```graphviz digraph bst { aaaa -> aaaaa -> aaaaaa -> aaaaaaa [label = pRight]; aaaa -> NULL [label = pLeft]; aaaaa -> NULL [label = pLeft]; aaaaaa -> NULL [label = pLeft]; aaaaaaa -> NULL [label = pLeft]; } ``` 每一個 node 雖有兩個指標，實際上卻只用到一個(另一個指向 NULL)，因此整個二元搜尋樹等同於普通的 linked list，而無法發揮二元搜尋樹的優勢。 >> 請及早調整輸入的 data set [name="jserv"] ## 優化版本3 - 用樹旋轉改進以上的問題（working） >> 可以先將 words.txt 透過 -R 打亂排序 >> [name=課程助教][color=red]

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.