# 2023q1 Homework2 (quiz2) contributed by <[vax-r](https://github.com/vax-r)> ## 測驗一 ```c uint64_t next_pow2(uint64_t x) { x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> AAAA; x |= x >> 16; x |= x >> BBBB; return CCCC; } ``` ### 程式碼原理 上述函式需要返回大於等於x且最接近x的2的指數 以64位元16進位的方式思考此函式功能如下 ```graphviz digraph next_pow2{ node[shape=record] input [label="<f0> x|<f1> 0x0010100011110000"]; function [label="{\n next_pow2 \n\n}" shape=Mrecord]; output [label="<f0> next_pow2(x)|<f1> 0x0100000000000000"]; input:f1 -> function; function -> output; } ``` 也就是將x的msb的左邊一個bit變成1,並把剩下所有bit都設為0 依照題目給的程式碼,可以將上述next_pow2再細分成以下過程 ```graphviz digraph next_pow2{ node[shape=record] input [label="<f0> x|<f1> 0x0010100011110000"]; function [label="{<f0>next_pow2 |<f1> 0x0011111111111111 |<f2> 0x0011111111111111 + 1}" shape=Mrecord]; output [label="<f0> next_pow2(x)|<f1> 0x0100000000000000"]; input:f1 -> function; function -> output; } ``` 由此可知總共有16個bits,已經set 7個bits了,AAAA = 8, BBBB = 32, CCCC = x + 1 ### 用 __builtin_clzl()改寫 此函式`__builtin_clzl()`會回傳MSB左方有幾個0 所以若 `y = __builtin_clzl(x)` 則可以知道 `z = 0xffffffffffff >> y` 就會是next_pow2當中的第一部份 接著再回傳 `z+1` 即可 ```c uint64_t next_pow2(uint64_t x) { int zeroes = __builtin_clzl(x); x = 0xffffffffffffffff >> zeroes; return x + 1; } ``` ### 編譯器對應之x86指令 ``` next_pow2: .LFB1: .cfi_startproc endbr64 bsrq %rdi, %rcx movq $-1, %rax xorq $63, %rcx shrq %cl, %rax addq $1, %rax ret .cfi_endproc ``` ## 測驗二 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; int len = 0; /* the bit length to be shifted */ /* use long here as it potentially could overflow for int */ long ans = 0; for (int i = 1; i <= n; i++) { /* removing the rightmost set bit * e.g. 100100 -> 100000 * 000001 -> 000000 * 000000 -> 000000 * after removal, if it is 0, then it means it is power of 2 * as all power of 2 only contains 1 set bit * if it is power of 2, we increase the bit length */ if (!(DDDD)) len++; ans = (i | (EEEE)) % M; } return ans; } ``` ### 程式碼原理 DDDD是用來檢查將i移除lsb之後是否為0 思考如何利用bitwise operation移除lsb 觀察規律之後可以發現, `i & (i-1)` 可以達到這樣的效果 所以 DDDD: `i & (i-1)` 要將i串接在ans最右邊, 所以ans需要進行shift, shift的長度正好是len 所以 EEEE: `ans << len` ### 改寫 查閱 https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html :::info Built-in Function: int __builtin_ffs (int x) Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero. Built-in Function: int __builtin_clz (unsigned int x) Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. Built-in Function: int __builtin_ctz (unsigned int x) Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. ::: 根據以上定義和程式碼需求可以將程式碼改寫成以下 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; int len = 0; /* the bit length to be shifted */ /* use long here as it potentially could overflow for int */ long ans = 0; for (int i = 1; i <= n; i++) { /* removing the rightmost set bit * e.g. 100100 -> 100000 * 000001 -> 000000 * 000000 -> 000000 * after removal, if it is 0, then it means it is power of 2 * as all power of 2 only contains 1 set bit * if it is power of 2, we increase the bit length */ int ffs = __builtin_ffs(i); if (!(i >> ffs)) len++; ans = (i | (ans << len)) % M; } return ans; } ``` ## 測驗三 判斷 2 個 32 位元寬度的整數是否都是奇數 (odd),可能會這樣撰寫: ```c #include <stdint.h> bool both_odd(uint32_t x, uint32_t y) { return (x & 1) && (y & 1); } ``` 因為若num是奇數,則第0位元會是1, 所以才比較兩個數字的第0位元是否都是1 題目中提到 若輸入的字串是一個有效的 UTF-8 序列,則計**算其中的後續位元組數量,並將此數字從總字串長度中減去**,即可確定字元數量。 ```c size_t swar_count_utf8(const char *buf, size_t len) { const uint64_t *qword = (const uint64_t *) buf; const uint64_t *end = qword + len >> 3; size_t count = 0; for (; qword != end; qword++) { const uint64_t t0 = *qword; const uint64_t t1 = ~t0; const uint64_t t2 = t1 & 0x04040404040404040llu; const uint64_t t3 = t2 + t2; const uint64_t t4 = t0 & t3; count += __builtin_popcountll(t4); } count = (1 << AAAA) * (len / 8) - count; count += (len & BBBB) ? count_utf8((const char *) end, len & CCCC) : DDDD; return count; } ``` ### 程式碼原理 可以看到在for loop的部份是在計算此字串中的continuous bytes的數量 算完後需要將此數字從總長度中扣除 在一開始宣告的qword可以知道, for loop計算的時候每次存取8個byte 有可能送入的字串長度不被8整除, 所以for loop實際上只處理到能被8整除的字串長度 剩下沒有被處理到的字元需要另外處理 ```c count = (1 << 3) * (len / 8) - count; count += (len & 7) ? count_utf8((const char *) end, len & 7) : 0; ``` 在上述程式碼中 `(1 << 3) * (len / 8)` 是將len當中除以8之後的餘數給消除 而下一行當中的 `(len & 7)` 是用來檢查len是否被8整除 ## 測驗四 以下程式碼只讓特定規律的數字通過 ```c #include <stdint.h> #include <stdbool.h> bool is_pattern(uint16_t x) { if (!x) return 0; for (; x > 0; x <<= 1) { if (!(x & 0x8000)) return false; } return true; } ``` 觀察可得知數字規律如下 ``` 1000000000000000 1100000000000000 1110000000000000 . . . 1111111111110000 ``` ### 程式碼運作原理 ```c bool is_pattern(uint16_t x) { const uint16_t n = EEEE; return (n ^ x) < FFFF; } ``` 思考後可知 EEEE: `~x + 1` FFFF: `x` 參考: https://hackmd.io/@hankTaro/quiz2#%E6%B8%AC%E9%A9%97%E5%9B%9B