# 2020q3 Homework3 (dict)
contributed by < `nelsonlai1` >
## 視覺化 ternary search tree + bloom filter 的效能表現並分析
在 `test_common.c` 中可以發現預設的輸入檔案為 `cities.txt` 以及使用 REF 機制。
而測試用的指令只有 `-- bench`,也就是並沒有使用 bloom filter。為了測試使用 bloom filter 帶來的差異,在 `test_common.c` 中加入
```c=
if (argc == 3 && strcmp(argv[1], "--bloom_bench") == 0) {
int stat = bloom_bench_test(root, BENCH_TEST_FILE, LMAX, bloom);
tst_free(root);
free(pool);
bloom_free(bloom);
return stat;
}
```
並在 `bench.c` 加入 `bloom_bench_test` 這個 function。
而這個 function 與 `bench_test` 大致相同,只差在會先用 `bloom_test` 測試是否通過 bloom filter,再做 `tst_search_prefix`。
另外在 bench.c 中讀取資料使用 fscanf,會在空白處截斷,但 `test_common.c` 中建檔時使用 fgets,不會在空白處中斷,因此這裡也改用 fgets 讀取,讓 `bloom_test` 可以通過。
**bench input file : cities.txt**
![](https://i.imgur.com/mEvj1mu.png)
**bench input file : password.txt (常用密碼字典,與 cities.txt 完全不同)**
![](https://i.imgur.com/Tp2jz1A.png)
## 以 perf 分析 TST 程式碼的執行時期行為並且充分解讀
**bench input file : cities.txt**
`--bench CPY`
```
CPY mechanism
ternary_tree, loaded 206849 words in 0.097356 sec
Performance counter stats for './test_common --bench CPY':
205.15 msec task-clock # 0.999 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
7350 page-faults # 0.036 M/sec
8,4884,4762 cycles # 4.138 GHz
11,0056,7614 instructions # 1.30 insn per cycle
1,7759,3844 branches # 865.671 M/sec
436,0298 branch-misses # 2.46% of all branches
0.205407753 seconds time elapsed
0.205419000 seconds user
0.000000000 seconds sys
```
`--bench REF`
```
REF mechanism
ternary_tree, loaded 206849 words in 0.096256 sec
Performance counter stats for './test_common --bench REF':
209.29 msec task-clock # 0.999 CPUs utilized
1 context-switches # 0.005 K/sec
0 cpu-migrations # 0.000 K/sec
7169 page-faults # 0.034 M/sec
8,6639,8287 cycles # 4.140 GHz
10,7603,0564 instructions # 1.24 insn per cycle
1,7275,8733 branches # 825.442 M/sec
456,7932 branch-misses # 2.64% of all branches
0.209602805 seconds time elapsed
0.193478000 seconds user
0.016123000 seconds sys
```
`--bloom_bench CPY`
```
CPY mechanism
ternary_tree, loaded 206849 words in 0.097569 sec
Performance counter stats for './test_common --bloom_bench CPY':
214.67 msec task-clock # 0.999 CPUs utilized
1 context-switches # 0.005 K/sec
0 cpu-migrations # 0.000 K/sec
7349 page-faults # 0.034 M/sec
8,8872,2265 cycles # 4.140 GHz
11,4129,9229 instructions # 1.28 insn per cycle
1,8121,4640 branches # 844.150 M/sec
436,6378 branch-misses # 2.41% of all branches
0.214975275 seconds time elapsed
0.210907000 seconds user
0.004055000 seconds sys
```
`--bloom_bench REF`
```
REF mechanism
ternary_tree, loaded 206849 words in 0.095650 sec
Performance counter stats for './test_common --bloom_bench REF':
218.26 msec task-clock # 0.999 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
7168 page-faults # 0.033 M/sec
9,0654,7267 cycles # 4.154 GHz
11,1553,2711 instructions # 1.23 insn per cycle
1,7629,4874 branches # 807.732 M/sec
458,9250 branch-misses # 2.60% of all branches
0.218490144 seconds time elapsed
0.218502000 seconds user
0.000000000 seconds sys
```
雖然在圖上看不太出來,但透過 perf 指令依然可看出在每筆資料都會命中的情況下使用 bloom filter ,會比不使用還花時間。而 REF 也會比 CPY 來的久。
**bench input file : password.txt**
`--bench CPY`
```
CPY mechanism
ternary_tree, loaded 206849 words in 0.096618 sec
Performance counter stats for './test_common --bench CPY':
149.09 msec task-clock # 0.998 CPUs utilized
1 context-switches # 0.007 K/sec
0 cpu-migrations # 0.000 K/sec
7346 page-faults # 0.049 M/sec
6,1920,8149 cycles # 4.153 GHz
9,3016,3852 instructions # 1.50 insn per cycle
1,5460,1854 branches # 1036.972 M/sec
257,6981 branch-misses # 1.67% of all branches
0.149381557 seconds time elapsed
0.141297000 seconds user
0.008074000 seconds sys
```
`--bench REF`
```
REF mechanism
ternary_tree, loaded 206849 words in 0.096548 sec
Performance counter stats for './test_common --bench REF':
148.93 msec task-clock # 0.998 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
7171 page-faults # 0.048 M/sec
6,1528,0654 cycles # 4.131 GHz
9,0672,2462 instructions # 1.47 insn per cycle
1,4985,9561 branches # 1006.229 M/sec
282,7426 branch-misses # 1.89% of all branches
0.149161319 seconds time elapsed
0.145145000 seconds user
0.004031000 seconds sys
```
`--bloom_bench CPY`
```
CPY mechanism
ternary_tree, loaded 206849 words in 0.096723 sec
Performance counter stats for './test_common --bloom_bench CPY':
147.76 msec task-clock # 0.998 CPUs utilized
2 context-switches # 0.014 K/sec
0 cpu-migrations # 0.000 K/sec
7345 page-faults # 0.050 M/sec
6,0905,3410 cycles # 4.122 GHz
9,2034,7182 instructions # 1.51 insn per cycle
1,5179,2614 branches # 1027.282 M/sec
245,7345 branch-misses # 1.62% of all branches
0.148022654 seconds time elapsed
0.140008000 seconds user
0.008000000 seconds sys
```
`--bloom_bench REF`
```
REF mechanism
ternary_tree, loaded 206849 words in 0.096837 sec
Performance counter stats for './test_common --bloom_bench REF':
147.60 msec task-clock # 0.998 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
7168 page-faults # 0.049 M/sec
6,0931,8807 cycles # 4.128 GHz
8,9705,9714 instructions # 1.47 insn per cycle
1,4704,2041 branches # 996.217 M/sec
273,8430 branch-misses # 1.86% of all branches
0.147825302 seconds time elapsed
0.143846000 seconds user
0.003995000 seconds sys
```
時間上的差距比使用 `cities.txt` 還小很多,但在都會 miss 時,使用 bloom filter 有稍微快了一點。