contributed by <HaoTse
>
首先撰寫確認計算出來的值正確的程式,先使用 重新理解數值 中提到的 iteration version
, binary search technique
, byte-shift version
來實作,並用 gcc 提供的 int __builtin_clz (unsigned int x)
作為正確輸出。
注意.
__builtin_clz (unsigned int x)
不能傳入 0,否則會是 undefined。鄭皓澤
$ make check
time ./test_iterator
Very Good
100.55user 0.01system 1:40.58elapsed 99%CPU (0avgtext+0avgdata 1252maxresident)k
0inputs+0outputs (0major+60minor)pagefaults 0swaps
time ./test_bin_search
Very Good
34.66user 0.01system 0:34.68elapsed 99%CPU (0avgtext+0avgdata 1168maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_byte_shift
Very Good
39.28user 0.05system 0:39.33elapsed 99%CPU (0avgtext+0avgdata 1256maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps
明顯看出 iterator 版本明顯慢很多。
time 指令的輸出格式可參照連結。鄭皓澤
加入以下程式碼
int recursive_clz(uint32_t x, int offset)
{
//when input = 0
if(x == 0)
return 32;
if(offset == 0)
return 0;
uint16_t upper = (x >> offset);
uint16_t lower = (x & (0xFFFF >> (16 - offset)));
return upper ? recursive_clz(upper, offset >> 1) : offset + recursive_clz(lower, offset >> 1);
}
$ make check
輸出time ./test_iterator
Very Good
101.70user 0.00system 1:41.71elapsed 99%CPU (0avgtext+0avgdata 1212maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps
time ./test_bin_search
Very Good
35.92user 0.00system 0:35.93elapsed 99%CPU (0avgtext+0avgdata 1268maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_byte_shift
Very Good
41.53user 0.09system 0:41.63elapsed 99%CPU (0avgtext+0avgdata 1252maxresident)k
0inputs+0outputs (0major+61minor)pagefaults 0swaps
time ./test_recursive
Very Good
207.42user 0.00system 3:27.43elapsed 99%CPU (0avgtext+0avgdata 1236maxresident)k
0inputs+0outputs (0major+63minor)pagefaults 0swaps
發現 recursive 跑的非常久,接下來使用 gnuplot 畫圖方便做比較。
HaoTse
sysprog21
clz