contributed by <heathcliffYang
>
hunng
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
Stepping: 9
CPU MHz: 1200.000
CPU max MHz: 3100.0000
CPU min MHz: 1200.0000
BogoMIPS: 4988.88
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
Handle 0x0001, DMI type 7, 19 bytes
Cache Information
Socket Designation: Unknown
Configuration: Enabled, Not Socketed, Level 1
Operational Mode: Write Back
Location: Internal
Installed Size: 32 kB
Maximum Size: 32 kB
Supported SRAM Types:
Asynchronous
Installed SRAM Type: Asynchronous
Speed: Unknown
Error Correction Type: Single-bit ECC
System Type: Other
Associativity: 16-way Set-associative
Performance counter stats for './phonebook_orig' (100 runs):
2,037,678 cache-misses # 95.827 % of all cache refs ( +- 0.65% )
2,126,412 cache-references ( +- 0.65% )
260,791,029 instructions # 1.44 insns per cycle ( +- 0.02% )
180,682,430 cycles ( +- 0.27% )
0.069068516 seconds time elapsed ( +- 1.85% )
entry
的大小簡化為基本,繼續之前沒能實作的 hash function 的方法。TABLE_SIZE
的值會影響到 append()
所花的時間與 cache miss Performance counter stats for './phonebook_opt' (100 runs):
412,103 cache-misses # 42.583 % of all cache refs ( +- 0.76% )
967,771 cache-references ( +- 0.26% )
233,042,376 instructions # 1.41 insns per cycle ( +- 0.02% )
165,053,962 cycles ( +- 0.34% )
0.058836808 seconds time elapsed ( +- 1.34% )
什麼是 the number of layers (Y軸)
課程助教
我喜歡他的圖亮谷
To 課程助教:因為想知道 word 被分配到各個 key 的情形楊惟晶
BKDR
cache-misses 52.208%
SDBM
cache-misses 59.723%
RS
cache-misses 57.723%
JS
49.795%
PJW
53.365%
ELF
50.140%
DJB
50.343%
AP
51.845%
/dictionary/words.txt -> 349900 個 words
/dictionary/all-names.txt -> 16750 個 names
因為 dataset 太小,append() 都沒有花很多時間;而就 findName()來說,只要 target 越前面,對 orig 版本越有利,而 opt 則是穩定的少。
此次嘗試只改變 opt 的搜尋方法,而不改變資料的結構
opt : BKDR + TABLE_SIZE 為 20000
append() + findName()
Performance counter stats for './phonebook_opt' (100 runs):
1,163,149 cache-misses # 64.363 % of all cache refs ( +- 0.16% )
1,807,180 cache-references ( +- 0.12% )
255,560,409 instructions # 1.18 insns per cycle ( +- 0.12% )
216,628,926 cycles ( +- 0.53% )
0.103203648 seconds time elapsed ( +- 1.99% )
只有append()
Performance counter stats for './phonebook_opt' (100 runs):
1,159,457 cache-misses # 63.565 % of all cache refs ( +- 0.19% )
1,824,048 cache-references ( +- 0.12% )
254,958,193 instructions # 1.19 insns per cycle ( +- 0.08% )
215,133,335 cycles ( +- 0.42% )
0.101813265 seconds time elapsed ( +- 2.16% )
Performance counter stats for './phonebook_orig' (100 runs):
908,096 cache-misses # 93.467 % of all cache refs ( +- 0.05% )
971,572 cache-references ( +- 0.10% )
201,187,670 instructions # 1.50 insns per cycle ( +- 0.03% )
133,950,753 cycles ( +- 0.20% )
0.052471375 seconds time elapsed ( +- 2.70% )
cache-misses 只少了 3692 次 (只佔約0.32%)影響很小
跟一開始的檔案系統設定的 block size = 4096 bytes 有不一樣嗎?
楊惟晶
這學期希望自己著重在實際把演算法變成程式碼,而不是找到可以用的複製貼上,寫程式能力太弱了需要練習。
從構思到 debug 完居然花了一整天。
/dictionary/all-names.txt , 找的字 = zuzana (最後一個)
append() 的時間多了整整 46%,而從 distribution 的狀況來看,可以隱約看出端倪,分佈的跨度非常大,極度不平均,這是因為英文上有些"聽起來很像"的名字,所以依照 soundex 會得到一樣的編碼。
Example:
find : abby
Similar answer:
abby, 100
abbie, 100
abbi, 100
abbey, 100
abbe, 100
TABLE_SIZE 的變化。
20000
Performance counter stats for './phonebook_opt' (100 runs):
41,700 cache-misses # 57.339 % of all cache refs ( +- 0.65% )
72,724 cache-references ( +- 2.26% )
38,520,996 instructions # 1.68 insns per cycle ( ± 0.01% )
22,983,744 cycles ( ± 1.21% )
0.008330655 seconds time elapsed ( ± 1.23% )
3000
Performance counter stats for './phonebook_opt' (100 runs):
25,337 cache-misses # 54.181 % of all cache refs ( +- 0.83% )
46,763 cache-references ( +- 2.82% )
21,227,533 instructions # 1.41 insns per cycle ( ± 0.01% )
15,037,622 cycles ( ± 1.18% )
0.005637389 seconds time elapsed ( ± 1.41% )
table size 愈小,出現愈高的峰值,某些"聽起來"可能的字愈多的名字,就可能會找到不省人事。