contributed by < chasingfar >
請務必依循作業要求,把上面那行寫好!從細節就做好。
改這樣嗎?
chasingfar
0922james0922
每一個部份都個別放在branch裡
cache-test
hash-test
red-black-tree-test
chasingfar
YuanHaoHo
$ uname -a
Linux chasingjar-V5-591G 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ lscpu
Architecture: x86_64
CPU 作業模式: 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
每核心執行緒數:2
每通訊端核心數:4
Socket(s): 1
NUMA 節點: 1
供應商識別號: GenuineIntel
CPU 家族: 6
型號: 94
Model name: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
製程: 3
CPU MHz: 800.007
CPU max MHz: 3500.0000
CPU min MHz: 800.0000
BogoMIPS: 5183.88
虛擬: VT-x
L1d 快取: 32K
L1i 快取: 32K
L2 快取: 256K
L3 快取: 6144K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
$ free
total used free shared buff/cache available
Mem: 8029896 2599920 3639168 386196 1790808 4751816
置換: 3999740 0 3999740
Performance counter stats for './phonebook_orig' (100 runs):
4,456,791 cache-misses # 91.648 % of all cache refs ( +- 0.19% )
4,862,924 cache-references ( +- 0.13% )
261,860,427 instructions # 1.44 insns per cycle ( +- 0.02% )
182,196,930 cycles ( +- 0.11% )
0.056528421 seconds time elapsed ( +- 0.70% )
Performance counter stats for './phonebook_opt_256' (100 runs):
1,814,972 cache-misses # 73.200 % of all cache refs ( +- 0.16% )
2,479,458 cache-references ( +- 0.16% )
295,179,271 instructions # 1.78 insns per cycle ( +- 0.03% )
165,662,413 cycles ( +- 0.08% )
0.050684667 seconds time elapsed ( +- 0.17% )
Performance counter stats for './phonebook_opt_512' (100 runs):
1,723,911 cache-misses # 71.551 % of all cache refs ( +- 0.14% )
2,409,340 cache-references ( +- 0.23% )
294,702,925 instructions # 1.79 insns per cycle ( +- 0.03% )
164,437,002 cycles ( +- 0.07% )
0.050525250 seconds time elapsed ( +- 0.13% )
Performance counter stats for './phonebook_opt_1024' (100 runs):
1,673,682 cache-misses # 70.257 % of all cache refs ( +- 0.13% )
2,382,239 cache-references ( +- 0.24% )
294,768,040 instructions # 1.79 insns per cycle ( +- 0.03% )
164,291,731 cycles ( +- 0.08% )
0.050583163 seconds time elapsed ( +- 0.16% )
Performance counter stats for './phonebook_opt_2048' (100 runs):
1,650,038 cache-misses # 69.371 % of all cache refs ( +- 0.13% )
2,378,586 cache-references ( +- 0.35% )
294,483,815 instructions # 1.80 insns per cycle ( +- 0.03% )
163,949,313 cycles ( +- 0.08% )
0.050560016 seconds time elapsed ( +- 0.14% )
Performance counter stats for './phonebook_opt_4096' (100 runs):
1,638,272 cache-misses # 68.202 % of all cache refs ( +- 0.12% )
2,402,086 cache-references ( +- 0.32% )
294,757,134 instructions # 1.80 insns per cycle ( +- 0.03% )
163,483,033 cycles ( +- 0.08% )
0.050201784 seconds time elapsed ( +- 0.16% )
想法:
使用 djb2hash function 壓縮字串資料量加快比對速度
優點:
hash 後就只是 long int 比對速度極快
缺點:
hash會增加構建時間
實測
註1: djb2hash function 來自http://www.cse.yorku.ca/~oz/hash.html
Performance counter stats for './phonebook_orig' (100 runs):
4,757,940 cache-misses # 93.920 % of all cache refs ( +- 0.16% )
5,065,927 cache-references ( +- 0.26% )
261,406,649 instructions # 1.37 insns per cycle ( +- 0.02% )
190,460,703 cycles ( +- 0.27% )
0.063316735 seconds time elapsed ( +- 0.53% )
Performance counter stats for './phonebook_opt_256' (100 runs):
1,696,705 cache-misses # 76.264 % of all cache refs ( +- 0.17% )
2,224,792 cache-references ( +- 0.29% )
278,638,415 instructions # 1.64 insns per cycle ( +- 0.02% )
169,751,759 cycles ( +- 0.18% )
0.055284318 seconds time elapsed ( +- 0.23% )
Performance counter stats for './phonebook_opt_512' (100 runs):
1,587,984 cache-misses # 72.650 % of all cache refs ( +- 0.27% )
2,185,787 cache-references ( +- 0.56% )
278,485,265 instructions # 1.65 insns per cycle ( +- 0.02% )
169,072,254 cycles ( +- 0.14% )
0.055322586 seconds time elapsed ( +- 0.22% )
Performance counter stats for './phonebook_opt_1024' (100 runs):
1,538,045 cache-misses # 72.220 % of all cache refs ( +- 0.16% )
2,129,653 cache-references ( +- 0.32% )
278,190,421 instructions # 1.65 insns per cycle ( +- 0.02% )
168,256,201 cycles ( +- 0.13% )
0.054859117 seconds time elapsed ( +- 0.22% )
Performance counter stats for './phonebook_opt_2048' (100 runs):
1,512,210 cache-misses # 71.419 % of all cache refs ( +- 0.18% )
2,117,363 cache-references ( +- 0.41% )
278,144,942 instructions # 1.66 insns per cycle ( +- 0.02% )
167,913,330 cycles ( +- 0.11% )
0.054865124 seconds time elapsed ( +- 0.23% )
Performance counter stats for './phonebook_opt_4096' (100 runs):
1,495,944 cache-misses # 71.578 % of all cache refs ( +- 0.14% )
2,089,945 cache-references ( +- 0.25% )
278,056,698 instructions # 1.66 insns per cycle ( +- 0.02% )
167,666,340 cycles ( +- 0.10% )
0.054465060 seconds time elapsed ( +- 0.15% )
闡述動機和考量點,以及實務上你在什麼應用看到 red-black tree 呢?
因為第一反應這像是個 Map 結構,而 Java 的 TreeMap 就是用 red-black tree 實做的
所以嘗試自己平衡 binary-search tree 失敗後就先找了 red-black tree 的資料
另外實現似乎還有問題,我還沒確定問題所在,所以一直沒維護這邊
chasingfar
Performance counter stats for './phonebook_orig' (100 runs):
7,565,749 cache-misses # 1.952 % of all cache refs ( +- 4.17% )
387,559,508 cache-references ( +- 0.34% )
5,751,954,460 instructions # 2.93 insns per cycle ( +- 0.00% )
1,965,321,160 cycles ( +- 0.36% )
0.644600121 seconds time elapsed ( +- 0.48% )
Performance counter stats for './phonebook_opt' (100 runs):
154,832 cache-misses # 8.596 % of all cache refs ( +- 2.37% )
1,801,256 cache-references ( +- 7.61% )
146,623,518 instructions # 0.98 insns per cycle ( +- 0.01% )
149,276,315 cycles ( +- 0.22% )
0.049843064 seconds time elapsed