Try   HackMD

2016q3 Homework1 (clz)

contributed by <HaoTse>


目標

  • recursive version
  • iteration version
  • binary search technique
  • byte-shift version
  • Harley’s algorithm

Check clz

首先撰寫確認計算出來的值正確的程式,先使用 重新理解數值 中提到的 iteration version, binary search technique, byte-shift version 來實作,並用 gcc 提供的 int __builtin_clz (unsigned int x) 作為正確輸出。

注意. __builtin_clz (unsigned int x) 不能傳入 0,否則會是 undefined。鄭皓澤


結果輸出

  • $ make check
time ./test_iterator
Very Good
100.55user 0.01system 1:40.58elapsed 99%CPU (0avgtext+0avgdata 1252maxresident)k
0inputs+0outputs (0major+60minor)pagefaults 0swaps
time ./test_bin_search
Very Good
34.66user 0.01system 0:34.68elapsed 99%CPU (0avgtext+0avgdata 1168maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_byte_shift
Very Good
39.28user 0.05system 0:39.33elapsed 99%CPU (0avgtext+0avgdata 1256maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps

明顯看出 iterator 版本明顯慢很多。

time 指令的輸出格式可參照連結。鄭皓澤


Recursive 版本

加入以下程式碼

int recursive_clz(uint32_t x, int offset) { //when input = 0 if(x == 0) return 32; if(offset == 0) return 0; uint16_t upper = (x >> offset); uint16_t lower = (x & (0xFFFF >> (16 - offset))); return upper ? recursive_clz(upper, offset >> 1) : offset + recursive_clz(lower, offset >> 1); }

  • $ make check 輸出
time ./test_iterator
Very Good
101.70user 0.00system 1:41.71elapsed 99%CPU (0avgtext+0avgdata 1212maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps
time ./test_bin_search
Very Good
35.92user 0.00system 0:35.93elapsed 99%CPU (0avgtext+0avgdata 1268maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_byte_shift
Very Good
41.53user 0.09system 0:41.63elapsed 99%CPU (0avgtext+0avgdata 1252maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_recursive
Very Good
207.42user 0.00system 3:27.43elapsed 99%CPU (0avgtext+0avgdata 1236maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps

發現 recursive 跑的非常久,接下來使用 gnuplot 畫圖方便做比較。


  • gnuplot 輸出

tags: HaoTse sysprog21 clz