<style>
h2.part{color:#000000;}
h3.part{color:#D92424;}
h4.part{color:#005BB0;}
h5.part{color:#FD6F0A;}
h6.part{color:#4400B0;}
</style>
# 2017q1 Homework1 (phonebook)
contributed by <bananahuang>
[作業要求](https://hackmd.io/s/rJYD4UPKe#)
## Reviewed by `claaaaassic`
* Github 上的 commit message 盡量打清楚這段程式碼修改的目的,使用 test 為標題會讓別人看不懂為何做這段修改,以及自己也會過一段時間就忘記原因
* 可以嘗試使用 hash function 改善效能,可使 findName() 時間大幅縮短
* 在你解決 memory leak 的地方有提出一段程式碼,這邊會把你原先建立出來的鏈接串列從頭開始,每一個不等於 NULL 的節點都會先讓它指向他的下一個節點,然後讓它執行 free(),建議 while 結束可以加上一行 `free(pHead);` ,才能讓所有 malloc 出來的都釋放
```
while (pHead->pNext) {
entry *next = pHead->pNext;
pHead->pNext = next->pNext;
free(next);
}
free(pHead);
```
## 開發環境
`$ cat /etc/lsb-release`
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
`$lscpu`
```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Model name: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
Stepping: 7
CPU MHz: 3245.917
CPU max MHz: 3400.0000
CPU min MHz: 1600.0000
BogoMIPS: 6186.21
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
```
## 了解電腦的系統架構圖
指令:
`$ sudo apt-get install hwloc`
`$ lstopo`

## 安裝相關開發工具
[2017 年春季作業說明](https://hackmd.io/s/Bk-1zqIte#)
## 工具熟悉: perf
#### step1:
檢查是否啟用 perf
==$ cat "/boot/config-`uname -r`" | grep "PERF_EVENT"==
終端機輸出:
```
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_HAVE_PERF_EVENTS=y`
CONFIG_PERF_EVENTS=y`
CONFIG_HAVE_PERF_EVENTS_NMI=y
```
#### step2:
安裝 perf
`$ sudo apt-get install linux-tools-common`
#### step3:
使用 perf list
`$ perf list`
```
List of pre-defined events (to be used in -e):
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
ref-cycles [Hardware event]
stalled-cycles-backend OR idle-cycles-backend [Hardware event]
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
alignment-faults [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
```
#### step4(法1):
使用圓周率程式測試 perf
注意:用原本 terminal 收尋完 pid 要開另外一個terminal,並且要在圓周率的 process 尚未結束前在新開的terminal打上`perf top -p pid`,如果怕來不及可以在圓周率程式裡增加迴圈
**原本 terminal**
指令:
`$ g++ -c perf_top_example.c`
`$ g++ perf_top_example.o -o example`
`$ ./example`

**新開 terminal**
指令:
`$perf top -p 8628`

錯誤版本(當process已經執行完了在新開的視窗再打`perf top -p pid`會無效)

#### step4(法2):
指令:
`$ ./ "執行檔案名稱" & sudo perf top -p $!`

打完 password 就會出現下圖

## reference
[Documents/Perf FAQ](http://kernel.taobao.org/index.php?title=Documents/Perf_FAQ)
[perf 原理和實務](https://hackmd.io/s/B11109rdg#)
## 工具熟悉: gnuplot(針對未改善前 phonebook 使用 gnuplot 練習)
`$ cd phonebook`
`$ make phonebook`
`$ make`(編譯)
`$ make run`(測試)
`$ make plot`
接下來進入 phonebook 就會看到用gnuplot建立的圖表

## 開發目標:
1.減少 cache miss 情形
keyword:cache line
[cache line](http://cenalulu.github.io/linux/all-about-cpu-cache/)
**改善方法1**:
將 lastname 提出來變成一個struct,收尋時先比對 lastname
## 閱讀程式碼
**main.c**
* IMPL
* diff_in_second:算時間比較精準
* assert:後面判斷為真繼續執行,為否停止執行
**reference**
[louielu](https://hackmd.io/CYJgZgHA7AnADAFgLQGMCmaCGSEGYCsyMARjPkgIwhS4gT4VwogJA===?view#)
## 效能分析(優化前)
提醒:在執行程式(phonebook_orig 和 phonebook_opt) 前,先清空 cache
##### 指令:`$ echo 1 | sudo tee /proc/sys/vm/drop_caches`
##### 執行:`$ ./phonebook_orig`
#### 顯示append()及findName()所執行時間
```
size of entry : 136 bytes
execution time of append() : 0.057090 sec
execution time of findName() : 0.005835 sec
```
#### 找 cache-misses 發現高達90.506%
`$ make cache-test`(makefile 有寫好)
```
Performance counter stats for './phonebook_orig' (100 runs):
203,8137 cache-misses # 90.506 % of all cache refs ( +- 0.06% )
225,1932 cache-references ( +- 0.03% )
2,6169,4498 instructions # 1.30 insns per cycle ( +- 0.02% )
2,0144,5323 cycles ( +- 0.09% )
0.064792767 seconds time elapsed ( +- 0.38% )
```
#### 鎖定想優化的目標可以使用 perf stat,下面的數字可以自行設定要 run 幾次
`$ perf stat --repeat 5 -e cache-misses,cache-references,instructions,cycles ./執行檔`
```
Performance counter stats for './phonebook_orig' (5 runs):
205,2030 cache-misses # 90.374 % of all cache refs ( +- 0.28% )
227,0589 cache-references ( +- 0.22% )
2,6144,2354 instructions # 1.29 insns per cycle ( +- 0.11% )
2,0344,6395 cycles ( +- 0.19% )
0.064507173 seconds time elapsed ( +- 1.21% )
```
#### 透過 perf 找熱點:`$./phonebook_orig & sudo perf top -p $!`
* findName 佔了24.67%而 append 佔了1.47%,但是 append 執行時間比 findName 長
```
24.67% phonebook_orig [.] findName
6.09% libc-2.23.so [.] _IO_fgets
5.82% libc-2.23.so [.] _int_malloc
5.54% phonebook_orig [.] main
3.39% libc-2.23.so [.] __memcpy_sse2
3.32% [kernel] [k] page_fault
2.36% [kernel] [k] handle_mm_fault
2.16% [kernel] [k] clear_page
1.55% [kernel] [k] unmap_page_range
1.53% [kernel] [k] free_hot_cold_page
1.47% phonebook_orig [.] append
1.29% libc-2.23.so [.] __strcpy_sse2_unaligned
1.23% [kernel] [k] policy_zonelist
1.12% libc-2.23.so [.] memchr
1.03% libc-2.23.so [.] _IO_getline_info
0.85% [kernel] [k] __do_page_fault
0.84% [kernel] [k] __rmqueue.isra.80
0.83% [kernel] [k] nv04_fifo_intr
0.79% [kernel] [k] __internal_add_timer
0.79% libc-2.23.so [.] __strcasecmp_avx
```
## 效能分析(改變 struct)
#### 方法1:更改原先 struct
```clike=
typedef struct __PHONE_BOOK_ENTRY {
//char lastName[MAX_LAST_NAME_SIZE];
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
struct __PHONE_BOOK_ENTRY *pNext;
} other_entry;
typedef struct entry{
char lastName[MAX_LAST_NAME_SIZE];
other_entry *information;
struct entry *pNext;
} entry
entry *findName(char lastName[], entry *pHead);
entry *append(char lastName[], entry *e);
```
執行:`$ ./phonebook_opt`
* 顯示 append()及 findName()所執行時間
```
size of entry : 32 bytes
execution time of append() : 0.049802 sec
execution time of findName() : 0.002554 sec
```
cache test:`$ make cache-test`
* 發現 cache-misses 降到57.459%
```
Performance counter stats for './phonebook_opt' (100 runs):
37,6305 cache-misses # 57.459 % of all cache refs ( +- 0.12% )
65,4911 cache-references ( +- 0.13% )
2,4415,7636 instructions # 1.71 insns per cycle ( +- 0.02% )
1,4276,3682 cycles ( +- 0.28% )
0.047078644 seconds time elapsed ( +- 0.72% )
```

#### entry size 縮小對 cache miss 影響
Lid cache size = 32kB, Cache line=64bytes
計算 Cache line個數
==32 * 1024 / 64=== 512
改善結構前,每筆 entry 有136bytes,所以一次能存入Ld1 cache size 只有
==64 * 512/ 136=== 大約 240筆 Entry
根據大小改善的原始結構後
每筆Entry剩下32Bytes,因此一次存入L1d的數量增加為
==64 * 512 / 32== 大約 1024筆 Entry
#### 觀察是否有資料留在heap上,造成memory leak
安裝工具:`$sudo apt install valgrind`
執行:`$ valgrind --leak-check=full ./phonebook_opt`
```
==3854== HEAP SUMMARY:
==3854== in use at exit: 11,196,768 bytes in 349,899 blocks
==3854== total heap usage: 349,906 allocs, 7 frees, 11,207,152 bytes allocated
==3854==
==3854== 11,196,768 (32 direct, 11,196,736 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 2
==3854== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3854== by 0x400D50: append (in /home/huangtingshieh/phonebook/phonebook_opt)
==3854== by 0x400AC8: main (in /home/huangtingshieh/phonebook/phonebook_opt)
==3854==
==3854== LEAK SUMMARY:
==3854== definitely lost: 32 bytes in 1 blocks
==3854== indirectly lost: 11,196,736 bytes in 349,898 blocks
==3854== possibly lost: 0 bytes in 0 blocks
==3854== still reachable: 0 bytes in 0 blocks
==3854== suppressed: 0 bytes in 0 blocks
==3854==
==3854== For counts of detected and suppressed errors, rerun with: -v
==3854== Use --track-origins=yes to see where uninitialised values come from
==3854== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)
```
##### 分析:
total heap usage:程式配置新記憶體片段349906次但只釋放7次,且有11,196,768 bytes 留在 heap 上
#### 解決memory leak
原本
```clike=
if (pHead->pNext) free(pHead->pNext);
free(pHead);
```
修改(尚未十分理解此為參考寫法) [<0140454>](https://hackmd.io/s/Bkx3cYX6#reviewed-bt-finalallpass)
```clike=
while(pHead->pNext) {
entry *next = pHead->pNext;
pHead->pNext = next->pNext;
free(next);
```
解決了大部份 memory leak
```
==4014== HEAP SUMMARY:
==4014== in use at exit: 32 bytes in 1 blocks
==4014== total heap usage: 349,906 allocs, 349,905 frees, 11,207,152 bytes allocated
==4014==
==4014== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4014== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4014== by 0x400A44: main (in /home/huangtingshieh/phonebook/phonebook_opt)
==4014==
==4014== LEAK SUMMARY:
==4014== definitely lost: 32 bytes in 1 blocks
==4014== indirectly lost: 0 bytes in 0 blocks
==4014== possibly lost: 0 bytes in 0 blocks
==4014== still reachable: 0 bytes in 0 blocks
==4014== suppressed: 0 bytes in 0 blocks
==4014==
==4014== For counts of detected and suppressed errors, rerun with: -v
==4014== Use --track-origins=yes to see where uninitialised values come from
==4014== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)
```
效能也有改善一點

**reference**
網站資源:
[淺談memory cache ](http://opass.logdown.com/posts/249025-discussion-on-memory-cache)
共筆閱讀:
[Hsiang](https://hackmd.io/s/HylfV2FFe#2017q1-homework1-phonebook)
[0140454](https://hackmd.io/EwMwhgDBDGBsYFoDsBGWAOBAWdSCsCAnCIQEYLAoRgAmApvHiAMylA==?view#)
## 效能分析(使用 hash function)
#### 新增phonebook_hash.c
```clike=
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "phonebook_hash.h"
/* FILL YOUR OWN IMPLEMENTATION HERE! */
/* original version */
entry *findName(char lastName[], entry *pHead[])
{
/* TODO: implement */
unsigned int key=BKDRHash(lastName)%3571;
while (pHead[key] != NULL) {
if (strcasecmp(lastName, pHead[key]->lastName) == 0){
printf("%15s is found\n",lastName);
return pHead[key];
}
pHead[key] = pHead[key]->pNext;
}
return NULL;
}
entry *append(char lastName[], entry *e[])
{
unsigned int key=BKDRHash(lastName)%3571;
/* allocate memory for the new entry and put lastName */
e[key]->pNext = (entry *) malloc(sizeof(entry));
e[key] = e[key]->pNext;
strcpy(e[key]->lastName, lastName);
e[key]->pNext = NULL;
return e[key];
}
unsigned int BKDRHash(char lastName[])
{
unsigned int seed = 131;
// 31 131 1313 13131 131313 etc..
unsigned int hash = 0;
while (*lastName) {
hash = hash * seed + (*lastName++);
}
return hash;
}
```
#### 新增phonebook_hash.h
```clike=
#ifndef _PHONEBOOK_H
#define _PHONEBOOK_H
#define MAX_LAST_NAME_SIZE 16
/* TODO: After modifying the original version, uncomment the following
* line to set OPT properly */
#define HASH 1
typedef struct __PHONE_BOOK_DETAIL {
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
} phonebookdetail;
typedef struct entry {
char lastName[MAX_LAST_NAME_SIZE]; //__PHONE_BOOK_ENTRY
phonebookdetail *detail;
struct entry *pNext;
} entry;
entry *findName(char lastName[], entry *pHead[]);
entry *append(char lastName[], entry *e[]);
unsigned int BKDRHash(char lastname[]);
#endif
```
#### 修改runtime.gp
```clike=
reset
set ylabel 'time(sec)'
set style fill solid
set title 'perfomance comparison'
set term png enhanced font 'Verdana,10'
set output 'runtime.png'
plot [:][:0.150]'output.txt' using 2:xtic(1) with histogram title 'original', \
'' using ($0-0.06):($2+0.001):2 with labels title ' ', \
'' using 3:xtic(1) with histogram title 'optimized' , \
'' using 4:xtic(1) with histogram title 'hash' , \
'' using ($0+0.3):($3+0.0015):3 with labels title ' ', \
'' using ($0+0.4):($4+0.0015):4 with labels title ' '
```
#### 修改main.c
#### 修改calculate.c
#### 修改Makefile
#### 效能改進

```
Performance counter stats for './phonebook_hash' (100 runs):
29,7280 cache-misses # 41.245 % of all cache refs ( +- 0.09% )
72,0775 cache-references ( +- 0.11% )
2,4553,5637 instructions # 1.51 insns per cycle ( +- 0.02% )
1,6262,9456 cycles ( +- 0.07% )
0.051042278 seconds time elapsed ( +- 0.61% )
```
**reference**
[hash function 觀念和實務](https://hackmd.io/s/HJln3jU_e#)
[Hash Function](https://hackmd.io/OwIwTALAHGwCYFoDMA2AnGhEQGMUKigkTSgFMBWARioAYBDOFAM1qA==?view#hash-function)
[為何hash seed 是131](https://www.zhihu.com/question/20507188)
[哈希表之bkdrhash算法解析及扩展 ](http://blog.csdn.net/wanglx_/article/details/40400693)
[Which hashing algorithm is best for uniqueness and speed?](http://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
[各种字符串Hash函数比较](https://www.byvoid.com/zhs/blog/string-hash-compare)
[非常重要:算法与数据结构,有HASH各種介紹](http://lib.csdn.net/datastructure/knowledge/817)
參考共筆:
[Hsiang](https://hackmd.io/OwIwTALAHGwCYFoDMA2AnGhEQGMUKigkTSgFMBWARioAYBDOFAM1qA==?view)
[changyuanhua](https:// "title")
## 未來工作
先閱讀共筆後,再進行更多實作
[陳品睿](https://embedded2016.hackpad.com/2016q1-Homework1-utBhpkDFVMh)
[yenWu](https://hackmd.io/s/BkN1JNQp#2016q3-homework1-phonebook)
[Hsiang ](https://hackmd.io/s/HylfV2FFe#2017q1-homework1-phonebook)
[twzjwang](https://hackmd.io/s/HJsOjpYKl#2017q1-homework1-phonebook)
[louielu](https://hackmd.io/CYJgZgHA7AnADAFgLQGMCmaCGSEGYCsyMARjPkgIwhS4gT4VwogJA===?view#2016q3-homework1-phonebook)
[0140454](https://hackmd.io/EwMwhgDBDGBsYFoDsBGWAOBAWdSCsCAnCIQEYLAoRgAmApvHiAMylA==?view#)
[Weinux](https://hackmd.io/s/SJOifEKte#2017q1-homework1-phonebook)
[hugikun999](https://hackmd.io/AwFgJgHATAxgpsAtMARgVgIyJAZhBRFAThzkQHY0w0AzMANhChBjCA==?view#2017q1-homework1-phonebook)
[ktvexe](https://hackmd.io/MYZgnCwGbALAtGArCADPWAjAhgRnpgOwAmY8UAbMJqsLgBxb2ZA=#2017q1-homework1-phonebook)
[0xff07](https://hackmd.io/s/SkpIRTvKe#)
[jeff60907](https://hackmd.io/s/HyA9r6CYx#2017q1-homework1-phonebook)
## 同步 Github fork 出来的分支
**step1**
先登入 github,對要 fork 的專案按下右上角的 fork

**step2**
打開 terminal,打上`$ git clone 剛剛fork的URL`
`$ git clone https://github.com/bananahuang/phonebook.git`
**step3**
`$ git remote -v `羅列遠端主機並顯示網址
```
origin https://github.com/bananahuang/phonebook.git (fetch)
origin https://github.com/bananahuang/phonebook.git (push)
```
**step4**
顯示當前狀態
`$git status`
顯示
```
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: phonebook_opt.c
modified: phonebook_opt.h
no changes added to commit (use "git add" and/or "git commit -a")
```
**step5**
`$ git add phonebook_opt.c`
`$ git add phonebook_opt.h`
顯示
```
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: phonebook_opt.c
modified: phonebook_opt.h
```
**step6**
`$ git commit`(local workspace)會觸發 git hooks astyle, cppcheck
test 是自己的註解,要打上做了什麼修改,打錯了可以用`$git commit --amend`修正
最後註解會上傳github
[How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/)
```
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Your branch is up-to-date with 'origin/master'.
#
# Changes to be committed:
# modified: Makefile
# modified: math-toolkit.h
#
# Untracked files:
# out.jpg
# output.jpg
# output.png
#
```
```
[master 8f2853d] test
2 files changed, 25 insertions(+), 5 deletions(-)
```
**step7**
`$ git status`
顯示
```
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working directory clean
```
**step8**
上傳到遠端儲存庫 `$ git push`
顯示
```
Username for 'https://github.com': brennenhuang
Password for 'https://brennenhuang@github.com':
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 760 bytes | 0 bytes/s, done.
Total 4 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/bananahuang/phonebook.git
88e1942..8f2853d master -> master
```
**reference**
[ssh key](https://git-scm.com/book/zh-tw/v1/%E4%BC%BA%E6%9C%8D%E5%99%A8%E4%B8%8A%E7%9A%84-Git-%E7%94%9F%E6%88%90-SSH-%E5%85%AC%E9%96%8B%E9%87%91%E9%91%B0)
[ssh key 觀念](http://dannysun-unknown.blogspot.tw/2016/08/git-ssh-keygen-github.html)
[30 天精通 Git 版本控管](https://github.com/doggy8088/Learn-Git-in-30-days/blob/master/zh-tw/README.md)
[10分鐘掌握Git常用命令,做一名有知識的程式設計師](https://kknews.cc/zh-tw/other/5lgy26.html)
[Git 教學(1) : Git 的基本使用](http://gogojimmy.net/2012/01/17/how-to-use-git-1-git-basic/)
[介绍如何使用git及gitlab](https://github.com/kanlidy/LearmGit/blob/master/README.md)
[Git](https://zlargon.gitbooks.io/git-tutorial/content/)
[Git遠程操作詳解](https://read01.com/BEDyKG.html)
[gnuplot](https://hackmd.io/IwQwzApgTFBGEFoCsAGYAOBAWA7DxsYSmEwScWISVsAJkA==#)
[How to use makefile](https://hackmd.io/CYDgZgjArCAMDGBaA7AIwGysQFhAJm0QE4iR0cwBmPdWbePMZKIA#)
[etc276](https://hackmd.io/s/H12bIgQ5e)
[如果再github建立一個空的resp,那麼再本地端要直接上傳檔案的話,用這個](http://stackoverflow.com/questions/4181861/src-refspec-master-does-not-match-any-when-pushing-commits-in-git)
## [code review]
### coding skill:
while()?
if(...==NULL)=>if(!...)
strcmp意義比較字串後是回傳>0=0<0可能是把ascII相減
### other
malloc可能配置不到記憶體位置要小心(可用man malloc去看)
c/C++已經大不同
hash function 有六種 數學意義??
=>加速資料查詢的速度?
http://blog.xuite.net/yh96301/blog/242333268-%E5%AE%89%E8%A3%9DUbuntu+16.04
http://marktwtn.blogspot.tw/2014/05/diy-win8-ubuntu.html
CPU cache miss wiki
**思維提升**:
反藍標示 可以變成orig hash bst全都標示在main.c變成一個struct