Try   HackMD

2018q1 Homework1 (phonebook)

contributed by < chasingfar >

請務必依循作業要求,把上面那行寫好!從細節就做好。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

改這樣嗎?
chasingfar

Reviewed by 0922james0922

  • 在紅黑樹的部份,可以再加強一些改善動機或是想法的細節,而不是直接跑出結論跟實驗結果,中間過程可以再加詳述。
  • 兩種方法的append的改善都不佳,可以試想想有沒有改善append的方法而不是只考量findname好就好
  • github上的opt的程式是否沒上傳成功,沒看到改寫過後的版本

GitHub

每一個部份都個別放在branch裡
cache-test
hash-test
red-black-tree-test
chasingfar

Reviewed by YuanHaoHo

  • 建議可用探討在使用各種方法後的成效,而非將成果丟出後不進行討論。
  • 中英文間記得加空白喔~

系統環境

$ uname -a
Linux chasingjar-V5-591G 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ lscpu
Architecture:          x86_64
CPU 作業模式:    32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
每核心執行緒數:2
每通訊端核心數:4
Socket(s):             1
NUMA 節點:         1
供應商識別號:  GenuineIntel
CPU 家族:          6
型號:              94
Model name:            Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
製程:              3
CPU MHz:             800.007
CPU max MHz:           3500.0000
CPU min MHz:           800.0000
BogoMIPS:              5183.88
虛擬:              VT-x
L1d 快取:          32K
L1i 快取:          32K
L2 快取:           256K
L3 快取:           6144K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
$ free
              total        used        free      shared  buff/cache   available
Mem:        8029896     2599920     3639168      386196     1790808     4751816
置換:     3999740           0     3999740

增加entry size利用率

  • 想法:
    鍵值分別存,以避免讀入無用資料進cache
    增加一次宣告的記憶體大小,一個entry放盡可能多筆資料
  • 優點:
    findName()時的比對近乎線性,應能減少cache miss
  • 缺點:
    鍵值分別需要兩個strcpy,增加append()時間
    一次宣告的記憶體太大時會浪費空間
  • 實測
    嘗試改變一個entry裡的資料數(opt後的數字),但似乎影響不大
    Image Not Showing Possible Reasons
    • The image file may be corrupted
    • The server hosting the image is unavailable
    • The image path is incorrect
    • The image format is not supported
    Learn More →
    ​​​​Performance counter stats for './phonebook_orig' (100 runs):
    
    ​​​​     4,456,791      cache-misses              #   91.648 % of all cache refs      ( +-  0.19% )
    ​​​​     4,862,924      cache-references                                              ( +-  0.13% )
    ​​​​   261,860,427      instructions              #    1.44  insns per cycle          ( +-  0.02% )
    ​​​​   182,196,930      cycles                                                        ( +-  0.11% )
    
    ​​​​   0.056528421 seconds time elapsed                                          ( +-  0.70% )
    
    ​​​​Performance counter stats for './phonebook_opt_256' (100 runs):
    
    ​​​​     1,814,972      cache-misses              #   73.200 % of all cache refs      ( +-  0.16% )
    ​​​​     2,479,458      cache-references                                              ( +-  0.16% )
    ​​​​   295,179,271      instructions              #    1.78  insns per cycle          ( +-  0.03% )
    ​​​​   165,662,413      cycles                                                        ( +-  0.08% )
    
    ​​​​   0.050684667 seconds time elapsed                                          ( +-  0.17% )
    
    ​​​​Performance counter stats for './phonebook_opt_512' (100 runs):
    
    ​​​​     1,723,911      cache-misses              #   71.551 % of all cache refs      ( +-  0.14% )
    ​​​​     2,409,340      cache-references                                              ( +-  0.23% )
    ​​​​   294,702,925      instructions              #    1.79  insns per cycle          ( +-  0.03% )
    ​​​​   164,437,002      cycles                                                        ( +-  0.07% )
    
    ​​​​   0.050525250 seconds time elapsed                                          ( +-  0.13% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_1024' (100 runs):
    
    ​​​​     1,673,682      cache-misses              #   70.257 % of all cache refs      ( +-  0.13% )
    ​​​​     2,382,239      cache-references                                              ( +-  0.24% )
    ​​​​   294,768,040      instructions              #    1.79  insns per cycle          ( +-  0.03% )
    ​​​​   164,291,731      cycles                                                        ( +-  0.08% )
    
    ​​​​   0.050583163 seconds time elapsed                                          ( +-  0.16% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_2048' (100 runs):
    
    ​​​​     1,650,038      cache-misses              #   69.371 % of all cache refs      ( +-  0.13% )
    ​​​​     2,378,586      cache-references                                              ( +-  0.35% )
    ​​​​   294,483,815      instructions              #    1.80  insns per cycle          ( +-  0.03% )
    ​​​​   163,949,313      cycles                                                        ( +-  0.08% )
    
    ​​​​   0.050560016 seconds time elapsed                                          ( +-  0.14% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_4096' (100 runs):
    
    ​​​​     1,638,272      cache-misses              #   68.202 % of all cache refs      ( +-  0.12% )
    ​​​​     2,402,086      cache-references                                              ( +-  0.32% )
    ​​​​   294,757,134      instructions              #    1.80  insns per cycle          ( +-  0.03% )
    ​​​​   163,483,033      cycles                                                        ( +-  0.08% )
    
    ​​​​   0.050201784 seconds time elapsed                                          ( +-  0.16% )
    

使用 hash function

  • 想法:
    使用 djb2hash function 壓縮字串資料量加快比對速度

  • 優點:
    hash 後就只是 long int 比對速度極快

  • 缺點:
    hash會增加構建時間

  • 實測
    註1: djb2hash function 來自http://www.cse.yorku.ca/~oz/hash.html

    Image Not Showing Possible Reasons
    • The image file may be corrupted
    • The server hosting the image is unavailable
    • The image path is incorrect
    • The image format is not supported
    Learn More →

    ​​​​ Performance counter stats for './phonebook_orig' (100 runs):
    
    ​​​​     4,757,940      cache-misses              #   93.920 % of all cache refs      ( +-  0.16% )
    ​​​​     5,065,927      cache-references                                              ( +-  0.26% )
    ​​​​   261,406,649      instructions              #    1.37  insns per cycle          ( +-  0.02% )
    ​​​​   190,460,703      cycles                                                        ( +-  0.27% )
    
    ​​​​   0.063316735 seconds time elapsed                                          ( +-  0.53% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_256' (100 runs):
    
    ​​​​     1,696,705      cache-misses              #   76.264 % of all cache refs      ( +-  0.17% )
    ​​​​     2,224,792      cache-references                                              ( +-  0.29% )
    ​​​​   278,638,415      instructions              #    1.64  insns per cycle          ( +-  0.02% )
    ​​​​   169,751,759      cycles                                                        ( +-  0.18% )
    
    ​​​​   0.055284318 seconds time elapsed                                          ( +-  0.23% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_512' (100 runs):
    
    ​​​​     1,587,984      cache-misses              #   72.650 % of all cache refs      ( +-  0.27% )
    ​​​​     2,185,787      cache-references                                              ( +-  0.56% )
    ​​​​   278,485,265      instructions              #    1.65  insns per cycle          ( +-  0.02% )
    ​​​​   169,072,254      cycles                                                        ( +-  0.14% )
    
    ​​​​   0.055322586 seconds time elapsed                                          ( +-  0.22% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_1024' (100 runs):
    
    ​​​​     1,538,045      cache-misses              #   72.220 % of all cache refs      ( +-  0.16% )
    ​​​​     2,129,653      cache-references                                              ( +-  0.32% )
    ​​​​   278,190,421      instructions              #    1.65  insns per cycle          ( +-  0.02% )
    ​​​​   168,256,201      cycles                                                        ( +-  0.13% )
    
    ​​​​   0.054859117 seconds time elapsed                                          ( +-  0.22% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_2048' (100 runs):
    
    ​​​​     1,512,210      cache-misses              #   71.419 % of all cache refs      ( +-  0.18% )
    ​​​​     2,117,363      cache-references                                              ( +-  0.41% )
    ​​​​   278,144,942      instructions              #    1.66  insns per cycle          ( +-  0.02% )
    ​​​​   167,913,330      cycles                                                        ( +-  0.11% )
    
    ​​​​   0.054865124 seconds time elapsed                                          ( +-  0.23% )
    
    
    ​​​​Performance counter stats for './phonebook_opt_4096' (100 runs):
    
    ​​​​     1,495,944      cache-misses              #   71.578 % of all cache refs      ( +-  0.14% )
    ​​​​     2,089,945      cache-references                                              ( +-  0.25% )
    ​​​​   278,056,698      instructions              #    1.66  insns per cycle          ( +-  0.02% )
    ​​​​   167,666,340      cycles                                                        ( +-  0.10% )
    
    ​​​​   0.054465060 seconds time elapsed                                          ( +-  0.15% )
    

使用紅黑樹

闡述動機和考量點,以及實務上你在什麼應用看到 red-black tree 呢?

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

因為第一反應這像是個 Map 結構,而 Java 的 TreeMap 就是用 red-black tree 實做的
所以嘗試自己平衡 binary-search tree 失敗後就先找了 red-black tree 的資料
另外實現似乎還有問題,我還沒確定問題所在,所以一直沒維護這邊
chasingfar

  • 想法:
    用二元搜尋樹減少比對次數
  • 優點:
    查詢速度極快
  • 缺點:
    構建時間極長
  • 實測
    註1:由於構建時間過長字典資料改為 all-names.txt
    註2:由於查詢速度過快改為測試查詢所有資料
    註3:紅黑樹的代碼來自Wikipedia
    ​​​​ Performance counter stats for './phonebook_orig' (100 runs):
    
    ​​​​     7,565,749      cache-misses              #    1.952 % of all cache refs      ( +-  4.17% )
    ​​​​   387,559,508      cache-references                                              ( +-  0.34% )
    ​​​​ 5,751,954,460      instructions              #    2.93  insns per cycle          ( +-  0.00% )
    ​​​​ 1,965,321,160      cycles                                                        ( +-  0.36% )
    
    ​​​​   0.644600121 seconds time elapsed                                          ( +-  0.48% )
    
    
    ​​​​Performance counter stats for './phonebook_opt' (100 runs):
    
    ​​​​       154,832      cache-misses              #    8.596 % of all cache refs      ( +-  2.37% )
    ​​​​     1,801,256      cache-references                                              ( +-  7.61% )
    ​​​​   146,623,518      instructions              #    0.98  insns per cycle          ( +-  0.01% )
    ​​​​   149,276,315      cycles                                                        ( +-  0.22% )
    
    ​​​​   0.049843064 seconds time elapsed