owned this note
owned this note
Published
Linked with GitHub
<style>
h2.part{color:#0099B0;}
h3.part{color:#D92424;}
h4.part{color:#005BB0;}
h5.part{color:#FD6F0A;}
h6.part{color:#4400B0;}
</style>
# 2016q3 Homework1 (phonebook)
contributed by <`f5120125`>
###### tags: `sysprog`
>>請依照指定格式填寫標題和"contributed by" [name=課程助教]
### Reviewed by `GreenYo0413`
- 使用二元搜尋數來當作資料結構是個不錯的想法,但產生分枝的方法可以再詳述,像是使用strcmp還是自己定義的hash function。
>>已新增二元搜尋樹和雜湊函數[name=f5120125]
### 開發環境
#### Ubuntu 14.04 LTS
```
$ lscpu
```
- CPU: Intel® Core™ i5 CPU 650 @ 3.20GHz × 4
- Mem: 8 GiB
- Cache:
L1d cache: 32 KB
L1i cache: 32 KB
L2 cache: 256 KB
L3 cache: 4096 KB
### 前置作業
#### Github
```
$ git branch <branch name>
$ git checkout <branch name>
$ git add .
$ git commit -m "my commit"
$ git pull
$ git push origin <branch name>
```
#### Gnuplot
```
reset
set ylabel 'time(sec)'
set style fill solid
set title 'perfomance comparison'
set term png size 1024,512 enhanced font 'Verdana,11'
set output 'runtime.png'
plot [:][:0.08]'output.txt' using 2:xtic(1) with histogram title 'original', \
'' using ($0-0.15):($2+0.001):2 with labels title ' ', \
'' using 3:xtic(1) with histogram title 'small-struct' , \
```
#### Perf
參考[[1]](http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial)和[[2]](http://mropengate.blogspot.tw/2016/04/perf.html)的文章所列出幾個常用指令學習perf
### ==列出所有可以觸發perf採樣點的EVENT:==
```
$ perf list
```
- 採樣點類型
1. [Hardware event]
2. [Software event]
3. [Hardware cache event]
4. [Kernel PMU event]
### ==提供指定程序執行的概括情況==
- 更詳細的說明
```
$ perf stat --help
```
- 以phonebook_orig為例:
```
$ perf stat ./phonebook_orig
```
- 取得概略資訊:
```
pid: 6196
size of entry : 136 bytes
execution time of append() : 0.057027 sec
execution time of findName() : 0.009792 sec
Performance counter stats for './phonebook_orig':
89.716798 task-clock (msec) # 0.994 CPUs utilized
18 context-switches # 0.201 K/sec
0 cpu-migrations # 0.000 K/sec
12,357 page-faults # 0.138 M/sec
291,019,350 cycles # 3.244 GHz (82.23%)
167,571,318 stalled-cycles-frontend # 57.58% frontend cycles idle (82.19%)
98,915,774 stalled-cycles-backend # 33.99% backend cycles idle (64.47%)
272,060,112 instructions # 0.93 insns per cycle
# 0.62 stalled cycles per insn (82.27%)
46,888,250 branches # 522.625 M/sec (86.55%)
1,274,939 branch-misses # 2.72% of all branches (85.12%)
0.090283053 seconds time elapsed
```
### ==察看整體系統負載:==
```
$ sudo perf top
```
- 更詳細的說明
```
$ perf top --help
```
### ==小結==
往後可以根據==觸發採樣點的EVENT類型==, 去得知需要的資訊
- 例如, 若想知道cache miss在整體系統的狀況
```
$ perf top -e cache-misses
```
## 案例分析 - Phonebook
### 目標
- **減少Cache miss**
- **提升效能**
#### ►根據[提示](https://hackmd.io/s/S1rbwmZ6)可以減少==Cache miss==或==提升效能==的修改方向:
- [x] 改用較小的結構
- [x] 改用二元搜尋
- [x] 使用雜湊函數
### 1. 未優化版本
#### ►原本的結構 ==**(共計 136 bytes)**==
- 16+16+16+10+10+16+16+16+2+5=123
compiler做alignment後有128-byte
最後加上pointer的size(在64-bit中為8-byte):128+8=136
```clike=
#define MAX_LAST_NAME_SIZE 16
#define OPT 1
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
struct __PHONE_BOOK_ENTRY *pNext;
}detail;
```
### 2. 優化版本 - ==使用較小的結構==
#### ►新增結構 ==**(只有32 bytes)**==
```clike=
typedef struct __LAST_NAME{
char lastName[ MAX_NAME_SIZE ];
struct __LAST_NAME *pNext;
struct __PHONE_BOOK_ENTRY *dNext;
}entry;
```
#### ►執行時間
```
size of entry : 32 bytes
execution time of append() : 0.034547 sec
execution time of findName() : 0.003680 sec
```
- [ ] 尚未釐清
- 16+16+16+10+10+16+16+16+2+5=123
compiler做alignment後有128-byte
最後加上pointer的size(在64-bit中為8-byte):128+8=136
在Makefile中已經將最佳化關掉,為什麼還是會有alignment的最佳化
上網查到關閉alignment的方法為使用關鍵字:__attribute__((packed));
### 3. 優化版本 - ==使用二元搜尋==
#### ►採用二元搜尋樹的原因
| Binary Search Tree | Average | Worst |
| ---------- | -------------- | ------|
| **space** | $\Theta(n)$ | $O(n)$|
| **search** | $\Theta(log~n)$| $O(n)$|
| **insert** | $\Theta(log~n)$| $O(n)$|
| **delete** | $\Theta(log~n)$| $O(n)$|
- 由上表可得知再==一般情況==下==搜尋==資料可以在 ==$\Theta(log~n)$==
#### ►如何建立二元搜尋樹
- [x] Top-down appraoch ==$O(nlog~n)$==
- 由root開始一筆一筆insert資料來建立二元搜尋樹
- [x] [Bottom-up approach](http://www.geeksforgeeks.org/sorted-linked-list-to-balanced-bst/) ==$O(n)$==
- 由於Input為 **sorted list**. 藉由已知的資料集, 我們可從leaf開始建立二元搜尋樹
#### ►二元搜尋樹的實作
##### Tree node
```clike=
typedef struct _PHONE_BOOK_BST {
struct __LAST_NAME *data;
struct _PHONE_BOOK_BST *left;
struct _PHONE_BOOK_BST *right;
}BST;
```
##### Search function
- 假設str1為欲在樹中搜尋的字串, str2為目前探訪節點的字串
- 利用[strcasecmp](https://linux.die.net/man/3/strcasecmp)或者[strncasecmp]()
**```strcasecmp(const char *str1, const char *str2)```**
**```strncasecmp(const char *str1, const char *str2, size_t len)```**
- 皆是 ==**case insensitive**==
- 若 str1 和 str2 相同, 回傳值==0
- 若 str1 > str2, 回傳值>0 (往 right child 探訪)
- 若 str1 < str2, 回傳值<0 (往 left child 探訪)
##### Build tree
###### ♦如何建立*balanced* tree structure
- 當root為資料集中間數, 樹的結構能夠較balanced, 所以以n筆sorted list來看, 我們將中間數放在root, 高度預期為 $\Theta(log~n)$
###### ♦從sorted list中的何處開始建立subtree到完成整棵binary search tree
- 下列為從資料集中挑出的已排序資料
```graphviz
digraph linkedList {
rankdir=LR;
node [shape=record];
edge [tailclip=false];
a [label="{ <data> aaron | <ref> }"]
b [label="{ <data> abdulkarim | <ref> }"];
c [label="{ <data> burton | <ref> }"];
d [label="{ <data> cob | <ref> }"];
e [label="{ <data> desmund | <ref> }"];
f [label="{ <data> dru | <ref> }"];
g [label="{ <data> elijoh | <ref> }"];
NULL [shape=box];
a:ref:c -> b:data [arrowhead=vee, arrowtail=dot, dir=both];
b:ref:c -> c:data [arrowhead=vee, arrowtail=dot, dir=both];
c:ref:c -> d:data [arrowhead=vee, arrowtail=dot, dir=both];
d:ref:c -> e:data [arrowhead=vee, arrowtail=dot, dir=both];
e:ref:c -> f:data [arrowhead=vee, arrowtail=dot, dir=both];
f:ref:c -> g:data [arrowhead=vee, arrowtail=dot, dir=both];
g:ref:c -> NULL [arrowhead=vee, arrowtail=dot, dir=both];
}
```
- 預期建立的binary search tree
```graphviz
digraph G {
nodesep=0.8;
ranksep=1;
{node[style=invis,label=""]; __root_cob__;
}
{node[style=invis, label="", width=.1]; __dru_children__; __abdulkarim_children__;
}
{rank=same; abdulkarim; dru; __root_cob__}
{rank=same; aaron; burton; __abdulkarim_children__}
{rank=same; desmund; elijah; __dru_children__}
cob -> abdulkarim;
cob -> dru;
abdulkarim -> aaron;
abdulkarim -> burton;
dru -> desmund;
dru -> elijah;
{edge[style=invis];
//Distantiate nodes
cob -> __root_cob__;
abdulkarim -> __root_cob__ -> dru;
//Force ordering between children
dru -> __dru_children__;
desmund -> __dru_children__ -> elijah;
abdulkarim -> __abdulkarim_children__;
aaron -> __abdulkarim_children__ -> burton;
}
}
```
##### Algorithm
```clike=
BST *sortedListToBSTRecur(entry** node, int len)
{
/*
base step:
if len is zero, then return NULL
*/
//build left subtree recursively
BST *left = sortedListToBSTRecur(node, len>>1);
/*
allocate root here and set its attributes
*/
/*
Move the node to the next in this recursion step.
When this step returns, the previous recursion step
will get the next node
*/
*node = (*node)->pNext;
//build right node
root->right = sortedListToBSTRecur(node, len-(len>>1)-1);
return root;
}
```
#### ►分析演算法
##### step 1
- 遞迴建立左節點, 且因sorted list從第一筆開始, 其資料為最小, 要將其放入正確位置則必須深入binary tree的高度, 因此長度會在每次遞迴時除以2,直到碰到NULL
##### step 2
- allocate節點並放入資料
##### step 3
- 因為已經處理完目前的資料, 要將指標移向sorted list的下一筆資料
##### step 4
- 遞迴建立右節點
##### Time Complexity
- 由於每個節點探訪一次並馬上allocate空間並放入資料, 其時間複雜度為 ==$O(n)$==
#### ►執行時間
```
size of entry : 32 bytes
size of tree node : 24 bytes
execution time of append() : 0.071578 sec
execution time of findName() : 0.000019 sec
```
### 4. 優化版本 - ==使用雜湊函數==
- 如何[實作hash table](https://www.tutorialspoint.com/data_structures_algorithms/hash_data_structure.htm)
- 定義TABLE_SIZE
- 實作hash_function去取得key值
- 實作hash_append建立hash table
- 實作hash_find_name去找尋給定的last name
- HashTable
```clike=
typedef struct _HASH_TABLE_{
int len;
HashItem *next;
}HashTable;
```
- HashItem
```clike=
typedef struct _HASH_ITEM_{
char lastName[ MAX_LAST_NAME_SIZE ];
int key;
struct _HASH_ITEM_ *next;
}HashItem;
```
- hash_append
我將新增的item插入slut前端, 每個slut有自己的linked-list. 由於file是由小排序到大讀取, 因此linked-list會是由大到小
- 估計append的時間和建立linked-list差不多
- 由於每個slut有自己的linked-list. 若資料分佈均勻, findName時間可以下降
- 執行時間
```
size of hash item : 32 bytes
execution time of append() : 0.049486 sec
execution time of findName() : 0.000014 sec
```
### 5. ==runtime==的比較圖
![image alt](http://i.imgur.com/U8OKQZH.png)