# 2016q3 Homework1 (phonebook)
contributed by <`HaoTse`>
---
### Reviewed by <`jkrvivian`>
* 可以嘗試實作不同hash function進行效能比較
* 沒有進一步分析使用hash function造成append()時間提高的原因
* 還未實作文末提及到的字串壓縮法及binary search tree
* [91f657b](https://github.com/HaoTse/phonebook/commit/91f657bf5d02dde3e3862d432e2b272ce3b539eb) 和[315f489](https://github.com/HaoTse/phonebook/commit/315f489aa5d5caffb6fa42da252d5ea886a388fe)的commit可以參考[ How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/)改善
* [64c7ccf](https://github.com/HaoTse/phonebook/commit/64c7ccf186efc20100da1bfdd76700ce3c139757) 沒有說明程式碼修改的原因
## 開發環境
- 在筆電上灌了 Ubuntu 16.04 LTS
- .vimrc參考[成大資工wiki]( http://wiki.csie.ncku.edu.tw/vim/vimrc)
- 安裝相關開發工具
- 基本編譯器
- 效能分析工具 (perf)
- astyle
- colordiff
- gnuplot
```
$ sudo apt-get update
$ sudo apt-get install build-essential
$ sudo apt-get install linux-tools-common linux-tools-generic
$ sudo apt-get install astyle colordiff gnuplot
```
----
- `$uname -a`
```shell
Linux tse-Aspire-V5-573G 4.4.0-38-generic
```
- 查看電腦 cache 大小 `$ lscpu | grep cache`
```shell
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
```
---
## 案例分析:Phone Book
----
### 未優化版本
- `$ ./phonebook_orig`
```
size of entry : 136 bytes
execution time of append() : 0.047622 sec
execution time of findName() : 0.005108 sec
```
一個entry(PhoneBook size)為 136bytes,L1 cache的size為 32KB = 32 * 1024bytes,所以(32 * 1024) / (136 * 8) = 30.12,只能存30筆左右。
----
- `$ make cache-test`
```
Performance counter stats for './phonebook_orig' (100 runs):
96,4439 cache-misses # 87.936 % of all cache refs ( +- 0.43% )
109,6754 cache-references ( +- 0.28% )
2,6120,5049 instructions # 1.50 insns per cycle ( +- 0.02% )
1,7371,1341 cycles ( +- 0.39% )
0.070385042 seconds time elapsed ( +- 0.57% )
```
----
- `$ ./phonebook_orig & sudo perf top -p $!`
> 使用perf尋找熱點[name=鄭皓澤]
```shell=
23.06% libc-2.23.so [.] __strcasecmp_l_avx
22.35% phonebook_orig [.] findName
8.66% phonebook_orig [.] main
5.41% libc-2.23.so [.] _int_malloc
5.21% [kernel] [k] clear_page_c_e
4.01% phonebook_orig [.] append
3.96% libc-2.23.so [.] __memcpy_sse2
3.65% libc-2.23.so [.] _IO_fgets
3.20% [unknown] [k] 0x00007ff9db93e02e
2.73% [kernel] [k] release_pages
2.17% libc-2.23.so [.] memchr
1.40% libc-2.23.so [.] __strcpy_sse2_unaligned
1.40% libc-2.23.so [.] _IO_getline_info
1.37% [kernel] [k] uncharge_list
1.36% [kernel] [k] page_remove_rmap
1.28% [kernel] [k] alloc_pages_vma
0.94% [kernel] [k] lru_cache_add_active_or_unevictable
0.81% [kernel] [k] __rmqueue.isra.79
0.71% [kernel] [k] get_page_from_freelist
0.71% libc-2.23.so [.] _IO_getline
0.71% [kernel] [k] pagevec_lru_move_fn
0.71% libc-2.23.so [.] malloc
0.70% libc-2.23.so [.] __strcasecmp_avx
0.70% phonebook_orig [.] strcasecmp@plt
0.70% [kernel] [k] __pagevec_lru_add_fn
0.69% [kernel] [k] free_hot_cold_page
0.68% [kernel] [k] unmap_page_range
0.67% [kernel] [k] free_pcppages_bulk
0.05% [kernel] [k] __perf_event_enable
0.00% [kernel] [k] nmi_handle
0.00% [kernel] [k] native_write_msr_safe
```
使用 perf 的結果中的第2行我們可以看出 findName() 佔了 22.35%,第4行 append() 則佔了4.01%,findName()為熱點,著手改善findName()的速度。
---
### 優化方法一:減少struct的大小
前面提到我的電腦的L1 cache最多只能放30個entry,但總共有35萬個單字,因此一定會發生頻繁的cache miss。
根據題目要求「有沒有找到 last name」,我們只需要搜尋last name,因此其他資訊是可以忽略不看的。所以在這裡我就另外宣告了一個struct儲存last name,將他取代原本的entry,另外將原本的entry改為detail。
```clike=
typedef struct __LAST_NAME_ENTRY{
char lastName[MAX_LAST_NAME_SIZE];
detail *detail;
struct __LAST_NAME_ENTRY *pNext;
} entry;
```
----
#### 效能分析
- `$ ./phonebook_opt`
```
size of entry : 32 bytes
execution time of append() : 0.053768 sec
execution time of findName() : 0.002357 sec
```
entry的大小很明顯的從 136bytes下降至 32bytes,cache就可以放入(32 * 1024) / (32 * 8) = 128筆資料,大大的增加cache hit的機會。
findName()的執行時間也明顯下降了。
----
- `$ make cache-test`
```
Performance counter stats for './phonebook_opt' (100 runs):
14,9402 cache-misses # 28.386 % of all cache refs ( +- 1.18% )
52,6332 cache-references ( +- 1.64% )
2,4425,4756 instructions # 1.49 insns per cycle ( +- 0.02% )
1,6354,4468 cycles ( +- 1.61% )
0.071050083 seconds time elapsed ( +- 1.84% )
```
cache miss 的機率明顯的從 87.936%下降至 23.386%。
----
- `$ ./phonebook_opt & sudo perf top -p $!`
```shell=
28.64% libc-2.23.so [.] __strcasecmp_l_avx
20.16% phonebook_opt [.] main
8.76% phonebook_opt [.] findName
8.33% libc-2.23.so [.] malloc
7.41% libc-2.23.so [.] __memcpy_sse2
4.36% libc-2.23.so [.] _IO_getline_info
2.83% libc-2.23.so [.] _IO_fgets
2.76% libc-2.23.so [.] _int_malloc
2.60% libc-2.23.so [.] memchr
2.59% phonebook_opt [.] append
2.50% libc-2.23.so [.] __strcasecmp_avx
1.43% [kernel] [.] native_irq_return_iret
1.34% libc-2.23.so [.] __strcpy_sse2_unaligned
1.29% [unknown] [k] 0x00007f9a8187d02e
1.26% [kernel] [k] do_exit
1.25% [kernel] [k] unmap_page_range
1.24% [kernel] [k] free_pages_prepare
1.24% [kernel] [k] free_pcppages_bulk
0.01% [kernel] [k] native_sched_clock
0.00% [kernel] [k] native_write_msr_safe
```
findName()所佔的效能也從 22.35%降至 8.76%。
----
- plot
![](https://i.imgur.com/hpWWzFi.png)
---
### 優化方法二:使用hash function
從方法一改變資料結構後已經可以發現findName()明顯的效能進步,不過考慮到單字量的大小,就想到可以從hash function著手,利用last name當作key,對hash table做搜尋一定比每次都要從頭搜尋還要快。
根據[參考資料](https://www.byvoid.com/blog/string-hash-compare)看出BKDRHash的平均表現是最好的,因此在這裡選擇使用它。
```clike=
// BKDR Hash Function
unsigned int BKDRHash(char *str)
{
unsigned int seed = 131; // 31 131 1313 13131 131313 etc..
unsigned int hash = 0;
while (*str)
{
hash = hash * seed + (*str++);
}
//HASH_TABLE_SIZE define in phonebook_hash.h
return (hash % HASH_TABLE_SIZE);
}
```
並先將table size設為42737,宣告方式變為
```clike=
entry *pHead[HASH_TABLE_SIZE], *e[HASH_TABLE_SIZE];
```
> 這裡出現了小bug,花了不少時間學習使用gdb。[name=鄭皓澤]
----
#### 效能分析
- `$ ./phonebook_hash`
```
size of entry : 32 bytes
execution time of append() : 0.079218 sec053768
execution time of findName() : 0.000000 sec
```
findName()的時間直接降至顯示0秒,但append()卻上升了0.02545秒。
----
- `$ make cache-test`
```
Performance counter stats for './phonebook_hash' (100 runs):
105,9511 cache-misses # 64.586 % of all cache refs ( +- 0.85% )
164,0460 cache-references ( +- 0.31% )
2,9978,3164 instructions # 1.13 insns per cycle ( +- 0.02% )
2,6638,1339 cycles ( +- 0.45% )
0.111340353 seconds time elapsed ( +- 0.52% )
```
cache miss 的機率從 23.386%上升至 64.586%。
----
- plot
![](https://i.imgur.com/JVc7vg3.png)
---
## 其他方法(待實驗)
- [ ] 字串壓縮法
- [ ] BST
---
## 參考資料
- [Linux 效能分析工具: Perf](http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial)
- [Makefile 簡介](http://tetralet.luna.com.tw/?op=ViewArticle&articleId=185)
- [各種hash function介紹](https://www.byvoid.com/blog/string-hash-compare)
- [BKDR hash介紹](http://blog.csdn.net/wanglx_/article/details/40400693)
- [GDB簡介](http://tetralet.luna.com.tw/?op=ViewArticle&articleId=187&blogId=1)
- [DADA筆記](https://embedded2016.hackpad.com/ep/pad/static/YkqjhwgnQcA)
- [勃興筆記](https://hackmd.io/s/Bkx3cYX6)
- [VVN筆記](https://hackmd.io/s/SyOQgOQa#優化)
###### tags: `HaoTse` `sysprog21` `phonebook` `hash function`