2023q1 Homework2 (quiz2)

--- tags: linux2023 --- # 2023q1 Homework2 (quiz2) contributed by < `linhoward0522` > > [2023q1 第 2 週測驗題](https://hackmd.io/@sysprog/linux2023-quiz2) ## 測驗 `1` 給定無號 64 位元數值 x，找出最接近且大於等於 2 的冪的值想法是我們要先找到數值 x 最高位元的 1 的位置，他的下一個高位元就會是最接近且大於等於 2 的冪的值 - 我們透過將當前最高位元以下的位元都 set 1 - 接著再將此結果加一即是我們要的答案 ```c uint64_t next_pow2(uint64_t x) { x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> AAAA; x |= x >> 16; x |= x >> BBBB; return CCCC; } ``` 透過上述程式碼可知，我們將最高位元的 1 ，依序向右移動並和當前的數值做 or，以此類推最後可得最高位元以下的位元都為 1 而我們知道題目是給定無號 64 位元的數值 x，所以最多只要做 63 次右移即可完成，所以 - `AAAA` 應為 `8` - `BBBB` 應為 `32` - `CCCC` 應為 `x + 1` ### 用 `__builtin_clzl` 改寫 >`int __builtin_clz (unsigned int x)` Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. >`int __builtin_clzl (unsigned long)` >Similar to __builtin_clz, except the argument type is unsigned long. ```c uint64_t next_pow2(uint64_t x) { if (!x) return 1; int clz = __builtin_clzl (x); return (1 << (64 - clz)); } ``` - 首先必須注意 `If x is 0, the result is undefined`，所以要先設定好 `x == 0` 的條件 - 接著參考 [你所不知道的 C 語言: bitwise 操作](https://hackmd.io/@sysprog/c-bitwise#%E4%BD%A0%E6%89%80%E4%B8%8D%E7%9F%A5%E9%81%93%E7%9A%84-C-%E8%AA%9E%E8%A8%80-bitwise-%E6%93%8D%E4%BD%9C) 當中 [`Set a bit`](https://hackmd.io/@sysprog/c-bitwise#Set-a-bit) 的操作 - 將 1 左移至正確位置但根據題目所述==找出最接近且大於等於 2 的冪的值==，所以當我們輸入的數值已經是 2 的冪，則應該就要回傳該值 ```c if (!(x & (x - 1))) return x; ``` 所以應要加上該條件才正確 ### 在 Linux 核心原始程式碼找出類似的使用案例並解釋在 [`include/linux/log2.h`](https://github.com/torvalds/linux/blob/master/tools/include/linux/log2.h) 中的 [`__roundup_pow_of_two`](https://github.com/torvalds/linux/blob/master/tools/include/linux/log2.h#L47) ```c /* * round up to nearest power of two */ static inline __attribute__((const)) unsigned long __roundup_pow_of_two(unsigned long n) { return 1UL << fls_long(n - 1); } ``` 此函式的作用是將傳入的無號整數 n 轉換為最接近且大於等於 2 的冪的值其中 [`fls_long`](https://github.com/torvalds/linux/blob/master/tools/include/linux/bitops.h#L73) 定義在 [`linux/include/linux/bitops.h`](https://github.com/torvalds/linux/blob/master/tools/include/linux/bitops.h) 中 ```c static inline unsigned fls_long(unsigned long l) { if (sizeof(l) == 4) return fls(l); return fls64(l); } ``` :::warning 要探討其應用場景，比照 [課前測驗參考解答: Q1](https://hackmd.io/@sysprog/bitwise-reverse) 的風格。 :notes: jserv ::: [`fls`](https://github.com/torvalds/linux/blob/master/include/asm-generic/bitops/fls.h) 用於 `unsigned int` 為 4 bytes ，而 [`fls64`](https://github.com/torvalds/linux/blob/master/include/asm-generic/bitops/fls64.h) 用於 `unsigned long` 為 8 bytes ```c static __always_inline int fls(unsigned int x) { int r = 32; if (!x) return 0; if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; } if (!(x & 0xff000000u)) { x <<= 8; r -= 8; } if (!(x & 0xf0000000u)) { x <<= 4; r -= 4; } if (!(x & 0xc0000000u)) { x <<= 2; r -= 2; } if (!(x & 0x80000000u)) { x <<= 1; r -= 1; } return r; } ``` :::warning 使用 4 個空白字元，而非 tab 來排版，善用 [.clang-format](https://github.com/sysprog21/lab0-c/blob/master/.clang-format) :notes: jserv ::: 用於計算一個無號整數中最高位元為 1 之位置 <s> >應用場景 : 詢問 `chatgpt` ，回答 : >在網絡編程中，用於調整網絡緩衝區的大小； >在記憶體管理中，用於調整物理頁框的大小； >在虛擬內存管理中，用於調整虛擬頁框的大小等。 </s> :::warning 可詢問 ChatGPT，但你要驗證，而且舉出實際的 Linux 原始程式碼來討論。 :notes: jserv ::: ### 當上述 clz 內建函式已運用時，編譯器能否產生對應的 x86 指令？ ```c next_pow2: .LFB0: .cfi_startproc endbr64 movl $1, %eax testq %rdi, %rdi je .L1 leaq -1(%rdi), %rdx movq %rdi, %rax testq %rdi, %rdx je .L1 bsrq %rdi, %rax movl $64, %ecx xorq $63, %rax subl %eax, %ecx movl $1, %eax sall %cl, %eax cltq ``` 執行 `cc -O2 -std=c99 -S quiz2-1.c` 後觀察產生的 `quiz2-1.s` 檔案，確認有 `bsrq` 指令 > [程式碼](https://github.com/linhoward0522/linux2023-quiz/blob/master/quiz2/quiz2-1.c) --- ## 測驗 `2` 給定一整數 n ，回傳將 1 到 n 的二進位表示法依序串接在一起所得到的二進位字串，其所代表的十進位數字 $mod \ 10^9+7$ 之值。 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; int len = 0; /* the bit length to be shifted */ /* use long here as it potentially could overflow for int */ long ans = 0; for (int i = 1; i <= n; i++) { /* removing the rightmost set bit * e.g. 100100 -> 100000 * 000001 -> 000000 * 000000 -> 000000 * after removal, if it is 0, then it means it is power of 2 * as all power of 2 only contains 1 set bit * if it is power of 2, we increase the bit length */ if (!(DDDD)) len++; ans = (i | (EEEE)) % M; } return ans; } ``` - 透過註解 `if it is power of 2, we increase the bit length` 可知，下面的 `if` 條件式應是要判斷是否為 2 的冪，若為 2 的冪就會增加 `bit length` - 參考 [你所不知道的 C 語言: bitwise 操作](https://hackmd.io/@sysprog/c-bitwise#%E4%BD%A0%E6%89%80%E4%B8%8D%E7%9F%A5%E9%81%93%E7%9A%84-C-%E8%AA%9E%E8%A8%80-bitwise-%E6%93%8D%E4%BD%9C) 當中 [`運用 bit-wise operator`](https://hackmd.io/@sysprog/c-numerics#%E9%81%8B%E7%94%A8-bit-wise-operator) > C 語言中，x & (x - 1) == 0 的數學意義 > power of two > signed v.s. unsigned - 所以 `DDDD` 應為 `i & (i - 1)` - 每次迭代都要將當前結果往左移，左移的 `bit` 數為 `len` 的長度，才能達到串聯的效果 - 所以 `EEEE` 應為 `ans << len` ### 嘗試使用 [`__builtin_{clz,ctz,ffs}`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) 改寫 - `__builtin_clz()` >Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. - `__builtin_ctz()` >Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. - `__builtin_ffs()` >Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero. 使用 `__builtin_clz()` 來作改寫，可以算出需要左移多少位元並且因為不用判斷是否為 2 的冪，所以可以省去 `if` 條件式所造成的分支。 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; long ans = 0; for (int i = 1; i <= n; i++) ans = (i | (ans << (32 - __builtin_clz(i)))) % M; return ans; } ``` --- ## 測驗 `3` - UTF-8 的多位元組字元是由一個首位元組和後續位元組所構成 - 後續位元組的最高 2 個位元會設定為 `10` - 對於首位元組，最高的 2 個位元始終為 `11` | **ASCII** | Bytes 1 | Bytes 2 | Bytes 3 | Bytes 4 | |------------|---------|---------|---------|---------| | 1 Bytes | 0xxx.xxxx |||| | 2 Bytes | 110x.xxxx | 10xx.xxxx ||| | 3 Bytes | 1110.xxxx | 10xx.xxxx | 10xx.xxxx || | 4 Bytes | 1111.0xxx | 10xx.xxxx | 10xx.xxxx | 10xx.xxxx| ```c for (size_t i = 0; i < len; i++) { /* -65 is 0b10111111, anything larger in two-complement's should start * new code point. */ if (p[i] > -65) counter++; } ``` `if` 條件式是用來判斷有多少個首位元組，若大於 `-65(0b10111111)` ，即表示最高的 2 個位元始必為 `11` ，就可以得到有多少個字元 - SWAR 實作: ```c const uint64_t *qword = (const uint64_t *) buf; const uint64_t *end = qword + len >> 3; ``` `len` 為位元組數量，`*end = qword + len >> 3` 表示將 `end` 指標指向不足 8 個 `byte` 的起始位置 ```c for (; qword != end; qword++) { const uint64_t t0 = *qword; const uint64_t t1 = ~t0; const uint64_t t2 = t1 & 0x04040404040404040llu; const uint64_t t3 = t2 + t2; const uint64_t t4 = t0 & t3; count += __builtin_popcountll(t4); } ``` 我們要來判斷後續位元組，已知後續位元組的最高 2 個位元會設定為 `10`，所以 - 先進行反向運算 - 隔離反向的第 6 個位元 - 對第 6 個位元，向左位移 1 個位元 (移至第 7 個位元的位置) - 進行 `not bit6 and bit7` - 計算有多少位元組裡頭的 `1` ```c count = (1 << AAAA) * (len / 8) - count; ``` 這邊是要計算有整除 8 的位元組數量，並求出其字元數量，所以 `AAAA` 應為 `3` ```c count += (len & BBBB) ? count_utf8((const char *) end, len & CCCC) : DDDD ``` - 這邊是要計算沒有整除 8 的位元組數量，用 `(len & BBBB)` 來找出，所以 `BBBB` 應為 `7` - 若成立，就呼叫 `count_utf8()` 來處理 - `end` 指標指向不足 8 個 `byte` 的起始位置 - `len & CCCC` 表示還沒計算出來的位元組數量，所以 `CCCC` 應為 `7` - 若不成立，則表示都計算完了，所以 `DDDD` 應為 `0` --- ## 測驗 `4` 判定 16 位元無號整數是否符合特定樣式 (pattern): ``` pattern: 8000 (0b1000000000000000) pattern: c000 (0b1100000000000000) pattern: e000 (0b1110000000000000) pattern: f000 (0b1111000000000000) . . . pattern: ffff (0b1111111111111111) ``` 可以看出上面這些樣式的二進位表示法從最高位元開始有一段連續的 1，其餘位元皆為 0 ```c for (; x > 0; x <<= 1) { if (!(x & 0x8000)) return false; } return true; ``` 若 x 小於等於 0 或最高位元為 0 ，則會離開迴圈。否則會將 x 左移，並檢查最高位元是否為 1 所以可用來判定符合從最高位元開始有一段連續的 1，其餘位元皆為 0 改寫上述程式碼，使其達到等價行為，但更精簡有效。 ```c bool is_pattern(uint16_t x) { const uint16_t n = -x; return (n ^ x) < x; } ``` - 若 `x` 符合樣式 - 對 `x` 取二補數，也就是最靠近低位元的 1 不會改變，仍然會是 1 - 這時候再跟 `x` 做 `XOR` ，會讓最靠近低位元的 1 變成 0 - 此結果就會比 `x` 還要小首先我們先對 `x` 取二補數 ``` x = 1111000000000000 -x = 0001000000000000 ``` 接著觀察 `x ^ -x` ，發現 `x ^ -x` 會使最靠近低位元的 1 變成 0，所以 `x ^ -x < x` ``` x = 1111000000000000 -x = 0001000000000000 x ^ -x = 1110000000000000 ``` - 若 `x` 不符合樣式，則 `x ^ -x > x` ``` x = 1110100000000000 -x = 0001100000000000 x ^ -x = 1111000000000000 ```