2023q1 Homework2 (quiz2)

# 2023q1 Homework2 (quiz2) contributed by < `chiacyu` > # 測驗 `1` 考慮 `next_pow2` 可針對給定無號 `64` 位元數值 `x`，找出最接近且大於等於 `2` 的冪的值 ```c uint64_t next_pow2(uint64_t x) { x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 8; x |= x >> 16; x |= x >> 32; return (x+1); } ``` 程式碼其假設的情境是在 Big-Endian 的情境下，是找出最接近且大於等於等於 `2` 的冪的值。假設 `x = 0x70000000` ，以二進位來表示則為 `0x0111 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000` ，經過一系列運算後則為： ```c uint64_t next_pow2(uint64_t x) { //0x0111 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1111 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1111 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1111 1100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 1; //0x0111 1111 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 8; //0x0111 1111 1111 1111 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 x |= x >> 16; //0x0111 1111 1111 1111 1111 1111 1111 1111 1110 0000 0000 0000 0000 0000 0000 0000 x |= x >> 32; //0x0111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 return (x+1); ////0x1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 } ``` ## `__builtin_clzl` 改寫從 [`__builtin_clzl`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) 可以看到回傳值為 `Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.` 舉例來說假設輸入值為 `16`，以 `64bit` 的數值系統，其二進位的表達式則為 `00000000 00000000 00000000 00000000 00000000 00000000 00000000 00010000`，回傳值則為 `59`，因此可以改寫為以下 ```c uint64_t next_pow2(uint64_t x) { // If x is 0, the result is undefined. if (x == 0) return 1; int leadin_z_count = __builtin_clzl(x); return (1 << (64 - leadin_z_count)) } ``` 這邊需要注意的地方是 `__builtin_clzl()` 在輸入值為 `0 ` 的時候的結果是 `undefined` 因此需要特別注意。 ## Linux 核心使用案例可以看到在 [log2.h](https://github.com/torvalds/linux/blob/master/include/linux/log2.h) 可以看到有相關的實作，如： ```c unsigned long __roundup_pow_of_two(unsigned long n) { return 1UL << fls_long(n - 1); } ``` :::warning `lab0-c` 的程式碼 [log2_lshift16.h](https://github.com/sysprog21/lab0-c/blob/master/log2_lshift16.h) 則有另一項實作，應用於 [shannon_entropy.c](https://github.com/sysprog21/lab0-c/blob/master/shannon_entropy.c)，可一併對照討論。 :notes: jserv ::: ## x86對應指令 --- # 測驗 `2` ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; int len = 0; /* the bit length to be shifted */ /* use long here as it potentially could overflow for int */ long ans = 0; for (int i = 1; i <= n; i++) { /* removing the rightmost set bit * e.g. 100100 -> 100000 * 000001 -> 000000 * 000000 -> 000000 * after removal, if it is 0, then it means it is power of 2 * as all power of 2 only contains 1 set bit * if it is power of 2, we increase the bit length */ if (!(i & (i-1))) len++; ans = (i | (ans << len)) % M; } return ans; } ``` 觀察數值可以看到當數值為 `2` 的冪，`bit` 的位數則會加一，例如： ```c 0 //0 1 //1 2 //10 3 //11 4 //100 ``` 所以可以透過 `!(i & (i-1))` 來去判斷 `i` 是否為 2 的冪。每次的迭代須將結果往 `most significant bit` 位移 `len bit`。假設輸入值為 2，迴圈由 `i = 1` 開始。 ```c i = 1 // ans = 1 i = 2 // ans = 110 ``` ## `__builtin_{clz,ctz,ffs}` 改寫從範例可以知道 `ans` 在每一次的迭代需要預留下一個 `i` 值的空間。例如 `i = 2` 以二進位來表示為 `10` 因此 `ans` 需要往左位移兩個`bit`來容納`10`，因此我們可以善用`built in function`來改寫為 ```c int concatenatedBinary(int n) { long ans = 1; for (int i = 2; i <= n; i++) { ans = (i | (ans << (64 - __builtin_clzl(i)))); } return ans; } ``` 透過 `__builtin_clzl` 來知道 `i` 往最高位元數有幾個 `0` ，而 `64 - __builtin_clzl(i)` 就可以知道 `i` 所佔位元數。例如 `__builtin_clzl(2)` 會回傳 `62`， 64 - 62 = 2，就剛好是`10`所佔用的位元數。 --- # 測驗 `3` ```c size_t swar_count_utf8(const char *buf, size_t len) { const uint64_t *qword = (const uint64_t *) buf; const uint64_t *end = qword + len >> 3; size_t count = 0; for (; qword != end; qword++) { const uint64_t t0 = *qword; const uint64_t t1 = ~t0; const uint64_t t2 = t1 & 0x04040404040404040llu; const uint64_t t3 = t2 + t2; const uint64_t t4 = t0 & t3; count += __builtin_popcountll(t4); } count = (1 << 3) * (len / 8) - count; count += (len & 0x7) ? count_utf8((const char *) end, len & 0x7) : 0; return count; } ``` 這邊花了一點時間才了解其中的技巧，需要注意的是原本傳入的 `buf` 的資料結構是`char *`，而每一個 `char` 只佔用一個 `byte`，在 `* qword` 這邊做強制轉型成 `unsigned int 64` 佔用8個`byte`。因為做了資料轉型，原先的 `len` 也要做出對應的調整。 `>> 3` 可以當成除以 `8`。假設原先`len = 8` ，需要從`buf[0].....buf[7]`才可以看完所有的資料。轉型之後只需 `qword[0]` 便可以處理完。這邊有趣的地方在於 `count` 在不同的 `function scope` 代表不同的意思。 ```c count += __builtin_popcountll(t4); ``` - count : number of continuation bytes ```c count = (1 << 3) * (len / 8) - count; ``` - count total bytes - continuation bytes !!!missing a colon :, confusing to some people. by Urbaner3 ## Linux UTF-8 和 Unicode 討論最後有個有趣的技巧，當 `len` 不為`8`的倍數時，需要而外處理剩餘的部分。例如： `len = 9`，除完之後還會剩下 `1`，因此剩餘的部分用 `count_utf8()` 方法去處理。對 `8` 在 `mod` 運算可以用對 `0x7` 做 `and` 運算。 --- # 測驗 `4` ```c #include <stdint.h> #include <stdbool.h> bool is_pattern(uint16_t x) { if (!x) return 0; for (; x > 0; x <<= 1) { if (!(x & 0x8000)) return false; } return true; } bool is_pattern(uint16_t x) { const uint16_t n = ~x + 1; return (n ^ x) < x; } ``` 從題目的要求與敘述，該程式所驗證於一個 16 位元的數值，從 `MSB` 開始是否為連續的 `1`。 ```c //MSB LSB 1000 0000 0000 0000 1100 0000 0000 0000 1110 0000 0000 0000 ... ... ... 1111 1111 1111 1111 ``` 但如果是從 `LSB` 開始即便中間無間隔任何 0 ，也不符合要求，例如： ```c 0000 0000 0000 0001 0000 0000 0000 0011 ... ... ... ``` 這樣子是不符合要求的。這邊的做法是先將 `x` 值做 `~` 運算加上一，如果是有號數`(signed number)`等同於取負值。在與原先的數字做 `exlusive or` ，注意這邊是 `uint16_t` 因此數字中`1`的位數越多則數值越大。因此若是不符合我們要求的 `pattern` 在做完運算後反而會比遠先的數值更大。因此可以利用此特性來作為條件判斷。 ## Linux bitmask 探討應用