# 2017q1 Homework1 (clz)
contributed by < `twzjwang` >
作業說明: [B04: clz](https://hackmd.io/s/ry1u0uDFg)
github: [twzjwang/clz-tests](https://github.com/twzjwang/clz-tests)
# 開發環境
- Ubuntu Ubuntu 16.04.2 LTS
Linux 4.8.0-36-generic
- cpu
version: Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
- memory
size: 4GiB
# 開發前
- [重新理解數值](https://hackmd.io/s/Hy-pTh8Fl)
- Leading Zeros
- 在 `1` 前面(左邊)有幾個 `0`
- 浮點數運算
# 實驗一 - 比較不同實作方法的差異
- 執行`make run` `make plot` 結果
![](https://i.imgur.com/j1BnFkr.png)
- recursive version
```C
static const int mask[]={0,8,12,14};
static const int magic[]={2,1,0,0};
unsigned clz2(uint32_t x,int c)
{
if (!x && !c) return 32;
uint32_t upper = (x >> (16>>c));
uint32_t lower = (x & (0xFFFF>>mask[c]));
if (c == 3) return upper ? magic[upper] : 2 + magic[lower];
return upper ? clz2(upper, c + 1) : (16 >> (c)) + clz2(lower, c + 1);
}
```
- 初始呼叫 `clz2(x,0)`
- 若 x 為 0 回傳 32
- 將 x 分為 upper 跟 lower
- c = 0 (第一次) x = abcdefgh~16~
- upper : abcd~16~
- lower : efgh~16~
- c = 1 (第二次) x = abcd~16~
- upper : ab~16~
- lower : cd~16~
- c = 2 (第三次) x = ab~16~
- upper : a~16~
- lower : b~16~
- c = 3 (第四次) x = a~16~ = bcde~2~
- upper : bc~2~
- lower : de~2~
- 若 upper 不為 0 遞迴呼叫 `clz2(upper, c+1)` ,否則 `clz2(lower, c+1)`
- 建 `mask[]` 及 `magic[]` 協助計算
- iteration version
```C
static inline __attribute((always_inline))
unsigned clz(uint32_t x)
{
int n = 32, c = 16;
do {
uint32_t y = x >> c;
if (y) { n -= c; x = y; }
c >>= 1;
} while (c);
return (n - x);
}
```
- 初始呼叫 `clz(x)`
- 類似 `clz2` 將 x 分為兩部分
- c = 16 x = abcdefgh~16~
- y = abcd~16~
- if(y) x = abcd~16~
- else x = efgh~16~
- ......
- c = 1 x = ab~2~
- y = ab~2~
- if(y) x = a~2~
- else x = b~2~
- N 紀錄目前 X 長度,包括 Leading 0
- X 最後剩下一位,可能為 0 或 1
- 回傳 N-X
- binary search technique
```C
static inline __attribute((always_inline))
unsigned clz(uint32_t x)
{
if (x == 0) return 32;
int n = 0;
if (x <= 0x0000FFFF) { n += 16; x <<= 16; }
if (x <= 0x00FFFFFF) { n += 8; x <<= 8; }
if (x <= 0x0FFFFFFF) { n += 4; x <<= 4; }
if (x <= 0x3FFFFFFF) { n += 2; x <<= 2; }
if (x <= 0x7FFFFFFF) { n += 1; x <<= 1; }
return n;
}
```
- 將迴圈拆開
- 改變判斷方式
- x = abcdefgh~16~
- if (x <= 0x0000FFFF) x = efgh0000~16~
- else x = abcdefgh~16~
- ...
- byte-shift version
```C
static inline __attribute((always_inline))
unsigned clz(uint32_t x)
{
if (x == 0) return 32;
int n = 1;
if ((x >> 16) == 0) { n += 16; x <<= 16; }
if ((x >> 24) == 0) { n += 8; x <<= 8; }
if ((x >> 28) == 0) { n += 4; x <<= 4; }
if ((x >> 30) == 0) { n += 2; x <<= 2; }
n = n - (x >> 31);
return n;
}
```
- 透過 shift x 判斷前幾 bits 是否為 0
- x = abcdefgh~16~
- if ((x >> 16) == 0) x = efgh0000~16~
- else x = abcdefgh~16~
- ...
- Harley’s algorithm
- 可以根據不同 table 實作 CLZ 或 CTZ
# CLZ 應用
## 資料壓縮
- 資料壓縮加密及解密過程與 leading zero 相關
- ![](https://i.imgur.com/1G2IePa.png)
- [Advances in Databases and Information Systems:19th East European Conference, ADBIS 2015, Poitiers, France, September 8-11, 2015, Proceedings](https://books.google.com.tw/books?id=SohgCgAAQBAJ&pg=PA159&dq=count+leading+zero&hl=zh-TW&sa=X&redir_esc=y#v=onepage&q=count%20leading%20zero&f=false)
- [Fast Integer Compression using SIMD Instructions](https://people.mpi-inf.mpg.de/~rgemulla/publications/schlegel10compression.pdf)
## 浮點數
- 在浮點數架構下,檢查 leading zero 是必要的
- e.g. fixed-point to floating-point format conversion
- [High-Performance System Design: Circuits and Logic](https://books.google.com.tw/books?id=ybNarxOtoUAC&q=leading+zero&dq=leading+zero&hl=zh-TW&sa=X&ved=0ahUKEwicqNaZhr3SAhWDfLwKHRjGC9o4FBDoAQghMAE)
# 參考
- [你所不知道的 C 語言:前置處理器應用篇](https://hackmd.io/s/S1maxCXMl)
- [重新理解數值](https://hackmd.io/s/Hy-pTh8Fl)
- [CLZ](https://en.wikipedia.org/wiki/Find_first_set#CLZ)
- [ktvexe](https://hackmd.io/s/BJBZn6Q6)
- [Advances in Databases and Information Systems:19th East European Conference, ADBIS 2015, Poitiers, France, September 8-11, 2015, Proceedings](https://books.google.com.tw/books?id=SohgCgAAQBAJ&pg=PA159&dq=count+leading+zero&hl=zh-TW&sa=X&redir_esc=y#v=onepage&q=count%20leading%20zero&f=false)
- [Fast Integer Compression using SIMD Instructions](https://people.mpi-inf.mpg.de/~rgemulla/publications/schlegel10compression.pdf)
- [Digital Signal Processing with Field Programmable Gate Arrays](https://books.google.com.tw/books?id=wzYuOF6HFX0C&pg=PA105&dq=leading+zero&hl=zh-TW&sa=X&ved=0ahUKEwiMkcnWh73SAhUFhbwKHUMVAE04FBDoAQhdMAk#v=onepage&q=leading%20zero&f=false)