contributed by <Jing Zhou
>
ubuntu 16.04
sudo apt-get install gcc-avr binutils-avr gdb-avr avr-libc avrdude
$ make
gcc -c -O0 -std=gnu99 -Wall clz_function.c -o clz_function.o
clz_function.c:3:26: fatal error: avr/pgmspace.h: No such file or directory
compilation terminated.
Makefile:12: recipe for target 'clz_function.o' failed
make: *** [clz_function.o] Error 1
由高到低位元,計算在出現1前0的個數
32位元為例
0x00000006 為 0..0000110,前面0共出現29次
recursive version (修正後)
uint8_t clz_recursive(uint32_t x){
uint32_t upper = (x >> 16);
uint32_t lower = (x & 0x0000FFFF);
return upper ? -16 + clz_recursive(upper) : lower ? -1 + clz_recursive(lower >> 1) : 32;
}
iteration version
uint8_t clz_iteration(uint32_t x){
uint8_t n = 32, c = 16;
do {
uint32_t y = x >> c;
if (y) { n -= c; x = y; }
c >>= 1;
} while (c);
return (n - x);
}
binary search technique
uint8_t clz_binary_search(uint32_t x){
if (x == 0) return 32;
uint8_t n = 0;
if (x <= 0x0000FFFF) { n += 16; x <<= 16; }
if (x <= 0x00FFFFFF) { n += 8; x <<= 8; }
if (x <= 0x0FFFFFFF) { n += 4; x <<= 4; }
if (x <= 0x3FFFFFFF) { n += 2; x <<= 2; }
if (x <= 0x7FFFFFFF) { n += 1; x <<= 1; }
return n;
}
byte-shift version
uint8_t clz_byte_shift(uint32_t x){
if (x == 0) return 32;
uint8_t n = 1;
if ((x >> 16) == 0) { n += 16; x <<= 16; }
if ((x >> 24) == 0) { n += 8; x <<= 8; }
if ((x >> 28) == 0) { n += 4; x <<= 4; }
if ((x >> 30) == 0) { n += 2; x <<= 2; }
n = n - (x >> 31);
return n;
}
Harley’s algorithm (待測)
四個方法都正確,最後一排是各個的執行時間(sec)
參照compute-pi的方法用clock_gettime()
設計benchmark suite,各點為經過10000次loop的時間,發現數字小的話recursive的時間高出許多,整體效能binary search和byte-shift最優
多媒體的抄大規模信號整合技術,將資料壓縮
參考 Journal of Very Large Scale Integration Signal Processing Systems for Signal, Image, and Video Technology [Kluwer Academic Publishers, 2000]
implement Gosper's loop-detection algorithm
參考 Gosper, Bill (1972). "Loop detector"
The expression 16 − clz(x − 1)/2 is an effective initial guess for computing the square root of a 32-bit integer using Newton's method