2017q1 Homework1 (clz)

# 2017q1 Homework1 (clz) contributed by < `twzjwang` > 作業說明: [B04: clz](https://hackmd.io/s/ry1u0uDFg) github: [twzjwang/clz-tests](https://github.com/twzjwang/clz-tests) # 開發環境 - Ubuntu Ubuntu 16.04.2 LTS Linux 4.8.0-36-generic - cpu version: Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 3072K - memory size: 4GiB # 開發前 - [重新理解數值](https://hackmd.io/s/Hy-pTh8Fl) - Leading Zeros - 在 `1` 前面(左邊)有幾個 `0` - 浮點數運算 # 實驗一 - 比較不同實作方法的差異 - 執行`make run` `make plot` 結果 ![](https://i.imgur.com/j1BnFkr.png) - recursive version ```C static const int mask[]={0,8,12,14}; static const int magic[]={2,1,0,0}; unsigned clz2(uint32_t x,int c) { if (!x && !c) return 32; uint32_t upper = (x >> (16>>c)); uint32_t lower = (x & (0xFFFF>>mask[c])); if (c == 3) return upper ? magic[upper] : 2 + magic[lower]; return upper ? clz2(upper, c + 1) : (16 >> (c)) + clz2(lower, c + 1); } ``` - 初始呼叫 `clz2(x,0)` - 若 x 為 0 回傳 32 - 將 x 分為 upper 跟 lower - c = 0 (第一次) x = abcdefgh~16~ - upper : abcd~16~ - lower : efgh~16~ - c = 1 (第二次) x = abcd~16~ - upper : ab~16~ - lower : cd~16~ - c = 2 (第三次) x = ab~16~ - upper : a~16~ - lower : b~16~ - c = 3 (第四次) x = a~16~ = bcde~2~ - upper : bc~2~ - lower : de~2~ - 若 upper 不為 0 遞迴呼叫 `clz2(upper, c+1)` ，否則 `clz2(lower, c+1)` - 建 `mask[]` 及 `magic[]` 協助計算 - iteration version ```C static inline __attribute((always_inline)) unsigned clz(uint32_t x) { int n = 32, c = 16; do { uint32_t y = x >> c; if (y) { n -= c; x = y; } c >>= 1; } while (c); return (n - x); } ``` - 初始呼叫 `clz(x)` - 類似 `clz2` 將 x 分為兩部分 - c = 16 x = abcdefgh~16~ - y = abcd~16~ - if(y) x = abcd~16~ - else x = efgh~16~ - ...... - c = 1 x = ab~2~ - y = ab~2~ - if(y) x = a~2~ - else x = b~2~ - N 紀錄目前 X 長度，包括 Leading 0 - X 最後剩下一位，可能為 0 或 1 - 回傳 N-X - binary search technique ```C static inline __attribute((always_inline)) unsigned clz(uint32_t x) { if (x == 0) return 32; int n = 0; if (x <= 0x0000FFFF) { n += 16; x <<= 16; } if (x <= 0x00FFFFFF) { n += 8; x <<= 8; } if (x <= 0x0FFFFFFF) { n += 4; x <<= 4; } if (x <= 0x3FFFFFFF) { n += 2; x <<= 2; } if (x <= 0x7FFFFFFF) { n += 1; x <<= 1; } return n; } ``` - 將迴圈拆開 - 改變判斷方式 - x = abcdefgh~16~ - if (x <= 0x0000FFFF) x = efgh0000~16~ - else x = abcdefgh~16~ - ... - byte-shift version ```C static inline __attribute((always_inline)) unsigned clz(uint32_t x) { if (x == 0) return 32; int n = 1; if ((x >> 16) == 0) { n += 16; x <<= 16; } if ((x >> 24) == 0) { n += 8; x <<= 8; } if ((x >> 28) == 0) { n += 4; x <<= 4; } if ((x >> 30) == 0) { n += 2; x <<= 2; } n = n - (x >> 31); return n; } ``` - 透過 shift x 判斷前幾 bits 是否為 0 - x = abcdefgh~16~ - if ((x >> 16) == 0) x = efgh0000~16~ - else x = abcdefgh~16~ - ... - Harley’s algorithm - 可以根據不同 table 實作 CLZ 或 CTZ # CLZ 應用 ## 資料壓縮 - 資料壓縮加密及解密過程與 leading zero 相關 - ![](https://i.imgur.com/1G2IePa.png) - [Advances in Databases and Information Systems：19th East European Conference, ADBIS 2015, Poitiers, France, September 8-11, 2015, Proceedings](https://books.google.com.tw/books?id=SohgCgAAQBAJ&pg=PA159&dq=count+leading+zero&hl=zh-TW&sa=X&redir_esc=y#v=onepage&q=count%20leading%20zero&f=false) - [Fast Integer Compression using SIMD Instructions](https://people.mpi-inf.mpg.de/~rgemulla/publications/schlegel10compression.pdf) ## 浮點數 - 在浮點數架構下，檢查 leading zero 是必要的 - e.g. fixed-point to floating-point format conversion - [High-Performance System Design: Circuits and Logic](https://books.google.com.tw/books?id=ybNarxOtoUAC&q=leading+zero&dq=leading+zero&hl=zh-TW&sa=X&ved=0ahUKEwicqNaZhr3SAhWDfLwKHRjGC9o4FBDoAQghMAE) # 參考 - [你所不知道的 C 語言：前置處理器應用篇](https://hackmd.io/s/S1maxCXMl) - [重新理解數值](https://hackmd.io/s/Hy-pTh8Fl) - [CLZ](https://en.wikipedia.org/wiki/Find_first_set#CLZ) - [ktvexe](https://hackmd.io/s/BJBZn6Q6) - [Advances in Databases and Information Systems：19th East European Conference, ADBIS 2015, Poitiers, France, September 8-11, 2015, Proceedings](https://books.google.com.tw/books?id=SohgCgAAQBAJ&pg=PA159&dq=count+leading+zero&hl=zh-TW&sa=X&redir_esc=y#v=onepage&q=count%20leading%20zero&f=false) - [Fast Integer Compression using SIMD Instructions](https://people.mpi-inf.mpg.de/~rgemulla/publications/schlegel10compression.pdf) - [Digital Signal Processing with Field Programmable Gate Arrays](https://books.google.com.tw/books?id=wzYuOF6HFX0C&pg=PA105&dq=leading+zero&hl=zh-TW&sa=X&ved=0ahUKEwiMkcnWh73SAhUFhbwKHUMVAE04FBDoAQhdMAk#v=onepage&q=leading%20zero&f=false)