2023q1 Homework2 (quiz2)

# 2023q1 Homework2 (quiz2) [quiz2](https://hackmd.io/@sysprog/linux2023-quiz2) ## 測驗 1 ### 解釋上述程式碼原理假設輸入 `x` ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|x|...|x"]; } ``` 目標是把 `x` 最高位元的 set 之後的位元都轉為 set ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|1|...|1"]; } ``` 最後返回 `x + 1` ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|1|0|0|...|0"]; } ``` 輸入 `x` 執行一次 `x |= x >> 1;` 後, 便把最高位元的 set 之後 1 個位元轉為 set ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|1|x|...|x"]; } ``` 執行 7 次後, 把最高位元的 set 之後 7 個位元轉為 set ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|1|1|1|1|1|1|1|x|...|x"]; } ``` 執行 `x |= x >> 8;` 後, 把最高位元的 set 之後 7 + 8 個位元轉為 set。如此類推, 共把最高位元的 set 之後 7 + 8 + 16 + 32 = 63 個位元轉為 set。所以此算法能把 64 bit 寬的無號整數 `x` 最高位元 set 之後的位元都轉為 set。 ### 用 __builtin_clzl 改寫 ```c uint64_t next_pow2(uint64_t x) { int clz = __builtin_clz(x); return 1 << (64 - clz); } ``` ### 當上述 clz 內建函式已運用時，編譯器能否產生對應的 x86 指令？找到 bsr 指令如下: ```shell XPS:~/linux2023/quiz2$ cat 1.s | grep -n5 bsr 47- .type next_pow2, @function 48-next_pow2: 49-.LFB14: 50- .cfi_startproc 51- endbr64 52: bsrl %edi, %edi 53- movl $64, %ecx 54- movl $1, %eax 55- xorl $31, %edi 56- subl %edi, %ecx 57- sall %cl, %eax ``` ## 測驗 3 ### 解釋上述程式碼運作原理觀察 UTF-8 位元組字元規則後, 目標是找出共有多少個後續位元組 (continuation byte(s)), 然後輸出總長度減去後續位元組的數量。 #### 找出 continuation byte 的方法只觀察 bit7 和 bit6 共有以下可能組合: `B0` ```graphviz digraph structs { node [shape=record] struct1 [label="0|0|x...x"] struct2 [label="0|1|x...x"] subgraph cluster_c { label="continuation byte" struct3 [label="1|0|x...x"] } struct4 [label="1|1|x...x"] } ``` 反向運算 (not bit6): `B1 = ~B0` ```graphviz digraph structs { node [shape=record] struct1 [label="1|1|x...x"] struct2 [label="1|0|x...x"] subgraph cluster_c { label="continuation byte" struct3 [label="0|1|x...x"] } struct4 [label="0|0|x...x"] } ``` 隔離反向的第 6 個位元: `B2 = B1 & 0x40` ```graphviz digraph structs { node [shape=record] struct1 [label="0|1|0...0"] struct2 [label="0|0|0...0"] subgraph cluster_c { label="continuation byte" struct3 [label="0|1|0...0"] } struct4 [label="0|0|0...0"] } ``` 對第 6 個位元，向左位移 1 個位元 (移至第 7 個位元的位置): `B3 = B2 + B2` ```graphviz digraph structs { node [shape=record] struct1 [label="1|0|0...0"] struct2 [label="0|0|0...0"] subgraph cluster_c { label="continuation byte" struct3 [label="1|0|0...0"] } struct4 [label="0|0|0...0"] } ``` 進行 not bit6, and bit7: `B4 = B0 & B3` ```graphviz digraph structs { node [shape=record] struct1 [label="0|0|0...0"] struct2 [label="0|0|0...0"] subgraph cluster_c { label="continuation byte" struct3 [label="1|0|0...0"] } struct4 [label="0|0|0...0"] } ``` 最後只剩下 continuation byte 的 population count 為 1, 其餘為 0。 #### 結合 SWAR 運用於 64 位元微處理器架構程式碼以 8 個位元組為一組進行上述 bitwise 操作, 並以 `__builtin_popcountll` 得到 population count, 累加算出共有多少 continuation byte。 ### 比較 SWAR 和原本的實作效能落差 ## 測驗 4 ```c bool is_pattern(uint16_t x) { const uint16_t n = ~x + 1; return (n ^ x) < x; } ``` ### 解釋上述程式碼運作原理 #### 符合 pattern 的 `x` 模樣 ```graphviz digraph structs { node [shape=record]; struct1 [label="1|...|1|0|...|0"]; } ``` `~x` ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|...|1"]; } ``` `n = ~x + 1` ```graphviz digraph structs { node [shape=record]; clear [shape=none,label="進位"] clear -> struct1:f1 struct1 [label="0|...|<f1>1|0|...|0"]; } ``` `n ^ x` ```graphviz digraph structs { node [shape=record] clear [shape=none] clear -> struct1:f1 struct1 [label="1|...|<f1>0|0|...|0"]; } ``` `(n ^ x) < x` 為 Ture #### 不符合 pattern 的 `x` 模樣 ```graphviz digraph structs { node [shape=record]; struct1 [label="1|...|1|0|...|1|..."]; } ``` `~x` ```graphviz digraph structs { node [shape=record]; struct1 [label="0|...|0|1|...|0|..."]; } ``` `n = ~x + 1` ```graphviz digraph structs { node [shape=record]; clear [shape=none,label="進位或 set"] clear -> struct1:f1 struct1 [label="0|...|0|1|...|<f1>1|..."]; } ``` `n ^ x` ```graphviz digraph structs { node [shape=record] clear [shape=none] clear -> struct1:f1 set [shape=none] set -> struct1:f2 struct1 [label="1|...|1|<f2>1|...|<f1>0|..."]; } ``` `(n ^ x) < x` 為 False