2016q3 Homework1 (clz) === contributed by <`Jing Zhou`> ## 開發環境 ubuntu 16.04 ## 安裝 * AVR GCC ```shell sudo apt-get install gcc-avr binutils-avr gdb-avr avr-libc avrdude ``` * 執行依然產生錯誤 ```shell $ make gcc -c -O0 -std=gnu99 -Wall clz_function.c -o clz_function.o clz_function.c:3:26: fatal error: avr/pgmspace.h: No such file or directory compilation terminated. Makefile:12: recipe for target 'clz_function.o' failed make: *** [clz_function.o] Error 1 ``` ## count leading zero 由高到低位元,計算在出現1前0的個數 32位元為例 0x00000006 為 0..0000110,前面0共出現29次 ## 實做clz ### 方法 * recursive version (修正後) ```shell uint8_t clz_recursive(uint32_t x){ uint32_t upper = (x >> 16); uint32_t lower = (x & 0x0000FFFF); return upper ? -16 + clz_recursive(upper) : lower ? -1 + clz_recursive(lower >> 1) : 32; } ``` * iteration version ```shell uint8_t clz_iteration(uint32_t x){ uint8_t n = 32, c = 16; do { uint32_t y = x >> c; if (y) { n -= c; x = y; } c >>= 1; } while (c); return (n - x); } ``` * binary search technique ```shell uint8_t clz_binary_search(uint32_t x){ if (x == 0) return 32; uint8_t n = 0; if (x <= 0x0000FFFF) { n += 16; x <<= 16; } if (x <= 0x00FFFFFF) { n += 8; x <<= 8; } if (x <= 0x0FFFFFFF) { n += 4; x <<= 4; } if (x <= 0x3FFFFFFF) { n += 2; x <<= 2; } if (x <= 0x7FFFFFFF) { n += 1; x <<= 1; } return n; } ``` * byte-shift version ```shell uint8_t clz_byte_shift(uint32_t x){ if (x == 0) return 32; uint8_t n = 1; if ((x >> 16) == 0) { n += 16; x <<= 16; } if ((x >> 24) == 0) { n += 8; x <<= 8; } if ((x >> 28) == 0) { n += 4; x <<= 4; } if ((x >> 30) == 0) { n += 2; x <<= 2; } n = n - (x >> 31); return n; } ``` * Harley’s algorithm (待測) ```shell ``` ### 驗證正確性   四個方法都正確,最後一排是各個的執行時間(sec) ### 效能分析 參照compute-pi的方法用`clock_gettime()`設計benchmark suite,各點為經過10000次loop的時間,發現數字小的話recursive的時間高出許多,整體效能binary search和byte-shift最優  ## clz運用場合 * 多媒體的抄大規模信號整合技術,將資料壓縮 參考 Journal of Very Large Scale Integration Signal Processing Systems for Signal, Image, and Video Technology [Kluwer Academic Publishers, 2000] * implement Gosper's loop-detection algorithm 參考 [Gosper, Bill (1972). "Loop detector"](http://www.inwap.com/pdp10/hbaker/hakmem/flows.html#item132) * The expression 16 − clz(x − 1)/2 is an effective initial guess for computing the square root of a 32-bit integer using Newton's method #### 參考資料 * [Find first set](https://en.wikipedia.org/wiki/Find_first_set#cite_note-41) * [Google圖書](https://books.google.com.tw/books?id=5m1xAAAAIAAJ&q=Journal+of+Very+Large+Scale+Integration+Signal+Processing+Systems+for+Signal,+Image,+and+Video+Technology&dq=Journal+of+Very+Large+Scale+Integration+Signal+Processing+Systems+for+Signal,+Image,+and+Video+Technology&hl=zh-TW&sa=X&ved=0ahUKEwiF243M1bTPAhXEF5QKHVIfCh0Q6AEIHDAA)
×
Sign in
Email
Password
Forgot password
or
By clicking below, you agree to our
terms of service
.
Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet
Wallet (
)
Connect another wallet
New to HackMD?
Sign up