# 2017q1 Homework1 (phonebook)
contributed by < `twzjwang` >
作業說明: [B01: phonebook](https://hackmd.io/s/rJYD4UPKe)
github: [twzjwang/phonebook](https://github.com/twzjwang/phonebook)
### Reviewed by `zhanyangch`
* 修改 runtime.gp 的指令順序可以使數字不被蓋住
# 開發環境
- Ubuntu 14.04.5 LTS
Linux 4.4.0-62-generic
>> 這個 distribution 太舊,之後的實驗可能會有問題,請升級到 Ubuntu 16.04 之後 [name="jserv"]
- cpu
version: Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
- memory
size: 4GiB
# 開發前
- 安裝Ubuntu系統
- 安裝開發工具
- vim
- git
- build-essential
- linux-tools-common
- linux-tools-generic
- cppcheck
- astyle
- colordiff
- gnuplot
- ...
- github設定
# 未優化版本
- 清空 cache
- `$ echo 1 | sudo tee /proc/sys/vm/drop_caches`
- 執行 phonebook_orig
- `$ ./phonebook_orig`
- 結果
```
size of entry : 136 bytes
execution time of append() : 0.108467 sec
execution time of findName() : 0.006022 sec
```
- cache-test `$ make cache-test`
-
```
Performance counter stats for './phonebook_orig' (100 runs):
2,080,686 cache-misses # 96.031 % of all cache refs ( +- 0.16% )
2,166,682 cache-references ( +- 0.18% )
262,557,158 instructions # 1.41 insns per cycle ( +- 0.02% )
185,753,200 cycles ( +- 0.12% )
0.075103558 seconds time elapsed ( +- 0.94% )
```
# 實驗一 - 用 binary search tree
>> 不要濫用「優化」(optimize) 這詞,在你有具體改進之前,都只是實驗(experiment) [name="jserv"]
## 方法 1: 使用strcmp
>> 避免用 version 1, 2, 3, ... 等寫法,明確標注你的嘗試途徑,尤其在 Git commit messages 更該如此,原始程式碼和技術報告都是寫給人看的! [name="jserv"]
- 用`strcmp`判斷,left < parent, right > parent
- append 時間太久 => 直接 ctrl+c
```C
int strcmp(const char *s1, const char *s2)
{
while(*s1 && (*s1 == *s2))
s1++, s2++;
return *(const unsigned char*) s1 - *(const unsigned char*) s2;
}
```
>> 注意一致的 coding style,應該寫作 `char *s1`,而非 `char *s1`,然後是 `*s1 == *s2` (有空白) 而非 `*s1==*s2` [name="jserv"]
- 改用`all-names.txt` find `zora`測試
- 結果
- findName 時間大幅縮短(優化前 20%)
- append 時間太久(優化前的 90 倍)
- cache miss 約為 4.795%
```
Performance counter stats for './phonebook_orig' (100 runs):
69,435 cache-misses # 77.038 % of all cache refs ( +- 1.23% )
90,131 cache-references ( +- 1.26% )
12,058,151 instructions # 1.13 insns per cycle ( +- 0.03% )
10,699,046 cycles ( +- 1.63% )
0.004715377 seconds time elapsed ( +- 1.86% )
Performance counter stats for './phonebook_opt_bst' (100 runs):
97,979 cache-misses # 4.795 % of all cache refs ( +- 2.68% )
2,043,183 cache-references ( +- 1.37% )
971,374,950 instructions # 2.01 insns per cycle ( +- 0.00% )
482,156,625 cycles ( +- 1.04% )
0.198216919 seconds time elapsed ( +- 1.31% )
```
>上面這行是...?[name=課程助教][color=red]
>已刪除 [name=twzjwang]
![](https://i.imgur.com/rFDGhkX.png)
## 方法 2: 改用自訂義 entry_cmp
- 因為輸入檔字母按順序(a-z) => 產生 skewed BST
- 改用自訂 cmp => 先比長度
```C
int entry_cmp(char a[], char b[])
{
if(strlen(a) == strlen(b))
return strcmp(a, b);
return strlen(a) - strlen(b);
}
```
>> 沒必要多次呼叫 `strlen`,請避免 [name="jserv"]
```C
int entry_cmp(char a[], char b[])
{
int i = strlen(a) - strlen(b);
return i ? i : strcmp(a, b);
}
```
- 結果
- append 時間縮短為 version 1 的 0.47%
- findName 時間縮短為優化前 4%
- cache miss 約為 5.148%
```
Performance counter stats for './phonebook_orig' (100 runs):
74,670 cache-misses # 78.183 % of all cache refs ( +- 0.70% )
95,508 cache-references ( +- 1.00% )
12,061,787 instructions # 1.02 insns per cycle ( +- 0.02% )
11,865,179 cycles ( +- 1.95% )
0.005664565 seconds time elapsed ( +- 3.60% )
Performance counter stats for './phonebook_opt_bst' (100 runs):
134,457 cache-misses # 5.148 % of all cache refs ( +- 2.42% )
2,611,651 cache-references ( +- 0.88% )
449,076,938 instructions # 1.88 insns per cycle ( +- 0.15% )
238,755,941 cycles ( +- 1.50% )
0.098018759 seconds time elapsed ( +- 1.83% )
```
![](https://i.imgur.com/IXviDpN.png)
- 問題
- 能否再縮短 append 時間?
## 方法 3: 嘗試修改 entry_cmp
- 再修改用 cmp => 先比長度,再從結尾比大小(與 strcmp相反)
```C
int entry_cmp(char a[], char b[])
{
if (strlen(a) == strlen(b)) {
int l = strlen(a);
for (int i = l - 1; i >= 0; i--) {
if (*(a + i) != *(b + i))
return *(a + i) - *(b + i);
}
}
return strlen(a) - strlen(b);
}
```
>> 縮減 `int i` 和 `int l` 的 scope。
>> `strlen` 多次呼叫,可以簡化 [name="jserv"]
```C
int entry_cmp(char a[], char b[])
{
int m = strlen(a);
int n = m - strlen(b);
if (!n) {
while (m--) {
if (*(a + m) != *(b + m))
return *(a + m) - *(b + m);
}
}
return n;
}
```
- 結果
- append 時間縮短為 version 2 的 0.14%
- findName 時間縮短 < 優化前 1%
- cache miss 約為 44.641%
- 推測:
- 過程中採用 binary search tree 結構存資料
插入和搜尋中存取 cache 次數與 singly-linked list 不同
(cache-references 和 cache-misses 次數明顯不同)
- 已知
cache max size: 32 KB
cache block size: 64 B
entry size: 144 B
8-way set associative
- orig
- append : n * O(1)
16750筆資料 * 每筆 144 B = 2,412,000 B
2,412,000 B / 64 B = 37688 block
- findName : O(n)
16750筆資料 * 每筆 144 B = 2,412,000 B
2,412,000 B / 64 B = 37,688 block
- 37,688 + 37,688 = 75,376 次
- opt_bst
- append : O(nlogn)
16750 * log(16750) * 144 B = 33,844,878 B
33,844,878 B / 64 B = 528,826 block
- findName :O(n)
log(16750) * 144 B = 2,021 B
2,021 B / 64 B = 32 block
- 528,826 + 32 = 528,858 次
- 參考 [cache 原理和實驗](https://hackmd.io/s/S14L26-_l)
>> 如何解釋 cache miss 的變化? [name=jserv]
```
Performance counter stats for './phonebook_orig' (100 runs):
61,477 cache-misses # 74.388 % of all cache refs ( +- 1.54% )
82,643 cache-references ( +- 1.34% )
12,059,048 instructions # 1.26 insns per cycle ( +- 0.03% )
9,593,850 cycles ( +- 1.84% )
0.004231672 seconds time elapsed ( +- 2.34% )
Performance counter stats for './phonebook_opt_bst' (100 runs):
200,986 cache-misses # 44.641 % of all cache refs ( +- 1.99% )
450,229 cache-references ( +- 1.23% )
55,910,271 instructions # 0.95 insns per cycle ( +- 0.25% )
58,829,146 cycles ( +- 1.87% )
0.025019251 seconds time elapsed ( +- 2.06% )
```
![](https://i.imgur.com/mfSomHh.png)
更改字串 cmp 方法比較 (用`all-names.txt` find `zora`)
版本|方法 1 | 方法 2 | 方法 3
-|----------|-----------|------------
append|0.189335|0.089585|0.013539
findName|0.000022|0.000007|0.000001
- 改回`words.txt` find `zyxel`
- ![](https://i.imgur.com/t5x0jmT.png)
- 問題
- 除了改變 cmp 規則,有沒有辦法減少插入、搜尋的深度?
- balance BST
## 方法 4: 將 single-linked list 轉換成 Balanced BST
- 因為輸入字母已按順序排列
>> 可用 `sort -R` 建立新的輸入資料集 [name="jserv"]
- 參考 [Sorted Linked List to Balanced BST](http://www.geeksforgeeks.org/sorted-linked-list-to-balanced-bst/) Method 1
- 將 orig 的 singly_linked list 資料轉換成 balanced BST
```C
//search by bst
entry *findName(char lastName[], entry *pHead)
{
entry *temp = pHead;
int cmp;
while (temp) {
cmp = strcmp(lastName, temp->lastName);
if (cmp == 0)
return temp;
else if (cmp < 0)
temp = temp->pLeft;
else
temp = temp->pRight;
}
return NULL;
}
//orig append
entry *append(char lastName[], entry *e)
{
/* allocate memory for the new entry and put lastName */
e->pNext = (entry *) malloc(sizeof(entry));
e = e->pNext;
strcpy(e->lastName, lastName);
e->pNext = NULL;
e->pRight = NULL;
e->pLeft = NULL;
return e;
}
entry *new_balance_bst(entry *root, int num)
{
if(!num)
return NULL;
entry *bst_root = NULL;
bst_root = balance_bst(bst_root, num, root);
free(root);
return bst_root;
}
entry *balance_bst(entry *e, int num, entry *root)
{
static entry *bst_build_cur = NULL;
if(!num)
return NULL;
if(!bst_build_cur)
bst_build_cur = root;
if(!e){
e = (entry *) malloc(sizeof(entry));
e->pNext = NULL;
e->pRight = NULL;
e->pLeft = NULL;
}
int mid = (num +1)/2;
e->pLeft = balance_bst(e->pLeft, mid - 1, root);
strcpy(e->lastName, bst_build_cur->lastName);
bst_build_cur = bst_build_cur->pNext;
e->pRight = balance_bst(e->pRight, num - mid, root);
return e;
}
void free_bst(entry *e)
{
if (e) {
free_bst(e->pRight);
free_bst(e->pLeft);
free(e);
}
}
```
>> 上述的 `findName` 可改寫為以下:
>> ```C
>> {
>> entry *temp = pHead;
>> while (temp) {
>> if (!strcmp(lastName, temp->lastName))
>> return temp;
>> temp = (cmp < 0) ? temp->pLeft : temp->pRight;
>> }
>> return NULL;
>> }
>> ```
>> [name="jserv"]
- 結果
- append 時間降為優化前1.69倍
- findName 時間 < 優化前0.0178%
- cache-miss 約為 94.463%
```
Performance counter stats for './phonebook_opt_bst' (100 runs):
2,431,096 cache-misses # 94.463 % of all cache refs ( +- 0.10% )
2,573,593 cache-references ( +- 0.20% )
481,113,045 instructions # 1.34 insns per cycle ( +- 0.01% )
358,634,304 cycles ( +- 0.29% )
0.142062161 seconds time elapsed ( +- 0.62% )
```
![](https://i.imgur.com/k4XQ9gH.png)
- 問題
- 降低 cache-misses ?
- 直接建成 BST (不經過 singly-linked list )?
- 邊讀檔邊建 => 不知資料筆數 => 先讀過一次存下資料數
## 方法 5: 嘗試省略 singly-linked list ,直接建 Balanced BST
- 修改 versoin 4 先讀檔取得資料數量
- 直接讀檔建 balance BST
```C
while (fgets(line, sizeof(line), fp)) {
while (line[i] != '\0')
i++;
count_node++;
}
```
- 結果
- append 時間降為優化前 1.44 倍
- findName 時間 < 優化前 0.0169%
- cache-miss 約為 92.262%
```
Performance counter stats for './phonebook_opt_bst' (100 runs):
1,425,612 cache-misses # 92.262 % of all cache refs ( +- 0.04% )
1,545,179 cache-references ( +- 0.11% )
434,912,471 instructions # 1.38 insns per cycle ( +- 0.00% )
314,507,544 cycles ( +- 0.10% )
0.124844305 seconds time elapsed ( +- 0.54% )!
```
![](https://i.imgur.com/2DtIu9z.png)
- 問題
- 其他方法?
## binary search tree 小結
Algo.|優化前(singly-linked list)|優化後(balanced binary search tree)
-|-----|-----
space|O(n)|O(n)
insert|O(1)|O(n) (讀資料數) + O(log n) (插入)
search|O(n)|O(log n)
# 實驗二 - 縮減 entry 結構
- 程式中依據 `lastName` 插入及搜尋
- 將其他資訊定義在新的結構 `entry_info`
需要時再配置記憶體
```C
/* original version */
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
```
改成以下
```C
typedef struct __PHONE_BOOK_ENTRY_INFO {
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
} entry_info;
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
entry_info *pInfo;
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
```
- 結果
- append 時間為修改前 82%
- findName 時間為修改前 51%
- cache-miss 約為 68.577%
- cache miss 降低原因: size of entry 變小
```
Performance counter stats for './phonebook_orig' (100 runs):
2,042,217 cache-misses # 95.007 % of all cache refs ( +- 0.12% )
2,149,550 cache-references ( +- 0.16% )
262,784,639 instructions # 1.43 insns per cycle ( +- 0.02% )
183,943,466 cycles ( +- 0.37% )
0.073711683 seconds time elapsed ( +- 0.90% )
Performance counter stats for './phonebook_opt_struct' (100 runs):
378,199 cache-misses # 68.577 % of all cache refs ( +- 0.20% )
551,497 cache-references ( +- 1.13% )
245,480,764 instructions # 1.71 insns per cycle ( +- 0.02% )
143,394,756 cycles ( +- 1.12% )
0.056445606 seconds time elapsed ( +- 1.34% )
```
![](https://i.imgur.com/gURlVZX.png)
# 實驗三 - 結合實驗一(BST)與實驗二(縮減 entry 結構)
- 將前兩個方法結合
- 結果
- (優化前) original:
- 使用 single linked list 儲存資料
- append 時紀錄最後一個 node
插入linked list 時間不會太多 ( n*O(1) )
- findName 最糟的情況下必須從頭找到尾 ( O(n) )
- (實驗一) opt bst:
- 使用BST雖然 append 時間稍微增加 ( n * O(log n) )
- 但 findName 時間大幅減少 ( O(log n) )
- (實驗二) opt struct:
- 簡化 entry 結構降低 cache misses rate
- append 和 findName 時間因此減少
- (實驗三) optimized
- 使用 size 較小的資料結構讓 BST 插入、搜尋時間再縮短
```
Performance counter stats for './phonebook_orig' (100 runs):
1,987,476 cache-misses # 95.591 % of all cache refs ( +- 0.67% )
2,079,154 cache-references ( +- 0.70% )
262,484,896 instructions # 1.43 insns per cycle ( +- 0.02% )
184,098,351 cycles ( +- 0.58% )
0.084555741 seconds time elapsed ( +- 2.78% )
Performance counter stats for './phonebook_opt' (100 runs):
469,681 cache-misses # 78.364 % of all cache refs ( +- 0.35% )
599,356 cache-references ( +- 1.20% )
314,544,912 instructions # 1.49 insns per cycle ( +- 0.00% )
211,290,940 cycles ( +- 0.75% )
0.094401048 seconds time elapsed ( +- 2.10% )
Performance counter stats for './phonebook_opt_bst' (100 runs):
1,387,546 cache-misses # 90.434 % of all cache refs ( +- 0.14% )
1,534,318 cache-references ( +- 0.47% )
434,348,761 instructions # 1.37 insns per cycle ( +- 0.00% )
317,930,025 cycles ( +- 0.59% )
0.137107756 seconds time elapsed ( +- 1.86% )
Performance counter stats for './phonebook_opt_struct' (100 runs):
374,466 cache-misses # 71.274 % of all cache refs ( +- 0.25% )
525,392 cache-references ( +- 0.86% )
245,306,198 instructions # 1.78 insns per cycle ( +- 0.02% )
137,518,542 cycles ( +- 0.87% )
0.057796514 seconds time elapsed ( +- 2.39% )
```
![](https://i.imgur.com/HB2T2d2.png)
# changing git commit message
- 之前沒有注意 commit message 內容及格式
Author email也打錯...
經過提醒後嘗試修改 commit message
- `git rebase -i HEAD~5` 修改前五次 commit message
- 將要修改那行的 `pick` 改成 `edit`
- `git commit --amend`
Author 資訊錯誤也可在此更改
`git commit --amend --author="twzjwang <twzjwang@gmail.com>"`
- 開啟編輯,修改後儲存
- `git rebase --continue`
- 重複以上三步驟
- `git push --force-with-lease`
# 本例選用的 dataset 是否存在問題?
- 有代表性嗎?跟真實英文姓氏差異大嗎?
- 許多字不是名詞,甚至是連續而無意義
- 資料難道「數大就是美」嗎?
- 除了資料"量"外,"質"也很重要
- 如果內容大部分是無意義的資料,要從中篩選出有參考價值的資料
- 如果我們要建立符合真實電話簿情境的輸入,該怎麼作?
- 從已整理好的開放資料取得更常見的姓氏
- 要考慮姓名有可能重複的狀況
- 除了根據"姓名"搜尋資料外,根據 電話號碼 搜尋也是常見的方式,要重新設計演算法
- e.g. [whoscall](https://whoscall.com/zh-TW/)
# 參考資料
- [Linux 查看系統及硬體資訊](https://www.phpini.com/linux/linux-check-system-hardware-information)
- [Sorted Linked List to Balanced BST](http://www.geeksforgeeks.org/sorted-linked-list-to-balanced-bst/)
- [jkrvivian HackMD筆記](https://hackmd.io/s/SyOQgOQa#)
- [vic85821 HackMD筆記](https://hackmd.io/s/r1B6WwQT#)
- [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/)
- [changing-git-commit-message-after-push](http://stackoverflow.com/questions/8981194/changing-git-commit-message-after-push-given-that-no-one-pulled-from-remote)
- [修改 Git commits 的作者資訊](https://yulun.me/2014/git-tips-change-author-and-email-in-previous-commits/)
- [記憶體系統-快取記憶體](http://systw.net/note/af/sblog/more.php?id=252)
###### tags: `twzjwang`