2023q1 Homework2 (quiz2)

# 2023q1 Homework2 (quiz2) contributed by < [yeiogy123](https://github.com/yeiogy123) > ## 測驗一原始程式碼為: ```c uint64_t next_pow2(uint64_t x) { x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> 1; x |= x >> AAAA;// AAAA = 8 x |= x >> 16; x |= x >> BBBB;// BBBB = 32 return CCCC;// x + 1 } ``` 其目的為找出最接近且大於等於本身的`2 的冪次方值`，因此程式原理為：假設目前輸入為 16 bits ，為1100 0000 0000 0000 而此程式碼以 binary search 為基礎，減少單次 shift 的重複次數則 shift 1 bit 且與自身 or for 15次，(1+2+4+8) 可得到 ```c 0100 0000 0000 0000 ==> 0110 0000 0000 0000 ==> 0111 0000 0000 0000 ==> 0111 1000 0000 0000 ==> 0111 1100 0000 0000 ==> 0111 1110 0000 0000 ==>0111 1111 0000 0000 ==> shift 8 bit ==> 0111 1111 1111 1111 ==> 再加上1 得到最接近的數 ==> 1000 0000 0000 0000，其為2的冪次方。 ```` 可以改寫成 ```c uint64_t next_pow2(uint64_t x) { x |= x >> 1; x |= x >> 2; x |= x >> 4; x |= x >> 8; x |= x >> 16; x |= x >> 32; return x + 1; } ``` 更可以用 `__builtin_clz(x)` 簡化 ```c uint64_t next_pow2(uint_t x) { int x = __builtin_clz(x); return 1 << (64 - x); } ``` 因 `__builtin_clz(x)` 可獲得 0 所占位元數，使用 64 扣除含 0 位元數可得到含1位元數，設為 x, 再對 `1` 向左位移 ?? 位元，可得到答案。 :::warning 修正內文，使描述符合程式碼。 :notes: jserv ::: --- ## 測驗二 ### 說明 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; int len = 0; /* the bit length to be shifted */ /* use long here as it potentially could overflow for int */ long ans = 0; for (int i = 1; i <= n; i++) { // removing the rightmost set bit // e.g. 100100 -> 100000 // 000001 -> 000000 // 000000 -> 000000 // after removal, if it is 0, then it means it is power of 2 // as all power of 2 only contains 1 set bit // if it is power of 2, we increase the bit length if (!(DDDD)) len++; ans = (i | (EEEE)) % M; } return ans; } ``` len 在這裡的用意為移動的位元數，因為在不同的位元總數下，所領導的第一位 leftmost 位元位置不同，所需要移動的位元數也不同。且若 i 為 2 的冪次，則我們不需要移動，此外則需要增加 len 的值。因此 DDD 為 i &(i - 1) ， EEE 為 ans << len 。 ## 使用 __builtin_ctz 改寫 ```c int concatenatedBinary(int n) { const int M = 1e9 + 7; long ans = 0; for(int i = 1 ; i <= n ; i++) ans = i | (ans << (64 - __builtin_ctz(i))); return ans; } ``` 透過 `__builtin_ctz` 獲得 0 所占位元數，而透過 64 - 其含 0 位元數得到含 1 的位元數。 ### 測驗三 ### 說明 ```c size_t swar_count_utf8(const char *buf, size_t len) { const uint64_t *qword = (const uint64_t *) buf; const uint64_t *end = qword + len >> 3; size_t count = 0; for (; qword != end; qword++) { const uint64_t t0 = *qword; const uint64_t t1 = ~t0; const uint64_t t2 = t1 & 0x04040404040404040llu; const uint64_t t3 = t2 + t2; const uint64_t t4 = t0 & t3; count += __builtin_popcountll(t4); } count = (1 << AAAA) * (len / 8) - count; count += (len & BBBB) ? count_utf8((const char *) end, len & CCCC) : DDDD; return count; } ``` ```c const uint64_t *qword = (const uint64_t *) buf; const uint64_t *end = qword + len >> 3; ``` 首先這裡使用了轉型，將 UTF-8 的 buf 使用了一個指標轉型成 uint64_t 的型態，並存入 qword 中，並使用 end 作為結尾，其結尾為 `len >> 3` ，原理為 `len/8` ，也就是一次移動 8 個位元組，再加上去 qword 的初始值，而得到結尾位置。 ```c size_t count = 0; for (; qword != end; qword++) { const uint64_t t0 = *qword; const uint64_t t1 = ~t0; const uint64_t t2 = t1 & 0x04040404040404040llu; const uint64_t t3 = t2 + t2; const uint64_t t4 = t0 & t3; count += __builtin_popcountll(t4); } ``` 以老師提供的例子： ```c 1.輸入的位元組 t0 = [00xx.xxxx|01xx.xxxx|10xx.xxxx|11xx.xxxx] ^^^^^^^^^ continuation byte ``` ```c 2.反向運算 (not bit6) t1 = ~t0 t1 = [11xx.xxxx|10xx.xxxx|01xx.xxxx|00xx.xxxx] ``` ```c 3.隔離反向的第 6 個位元 t2 = t1 & 0x40404040 t2 = [0100.0000|0000.0000|0100.0000|0000.0000] ``` ```c 4.對第 6 個位元，向左位移 1 個位元 (移至第 7 個位元的位置) t3 = t2 + t2 t3 = [1000.0000|0000.0000|1000.0000|0000.0000] ``` ```c 5.進行 not bit6 and bit7 t4 = t0 & t3 t4 = [00xx.xxxx|01xx.xxxx|10xx.xxxx|11xx.xxxx] & [1000.0000|0000.0000|1000.0000|0000.0000] = [0000.0000|0000.0000|1000.0000|0000.0000] ^^^^^^^^^ the only non-zero byte ``` ```c 6.以 __builtin_popcountll(t4) 計算，將結果加入 count裡。 ``` ```c 7.重複步驟直到迴圈結束，最後得到的 count 則為 continuation bytes 數量。 ``` 在藉由迴圈運算，其中 AAAA 為 3，整行程式碼為 `count = (1 << 3) * (len / 8) - count;` ，將 len 除以 8 後，再乘以 8 ，即可以得到以 8 為倍數所組成的位元組數量，是為了與 `uint64_t` align ，並扣除後續位元組的數量，即可得到 UTF-8 字元的數量。而實際應用上， UTF-8 的總位元組數量不一定以 8 的倍數構成，所以需要計算 len % 8 的值來判斷 end 是否有指向他剩餘的尾端字元組。因此 BBBB 為 7 ， CCCC 為 7 ， DDDD 為 0。整行程式碼為 `count += (len & 7) ? count_utf8((const char *) end, len & 7) : 0;` ### 測驗四 ```c bool is_pattern(uint16_t x) { const uint16_t n = EEEE; return (n ^ x) < FFFF; } ``` 輸入： ```c pattern: 8000 (32768) = (1000 0000 0000 0000) pattern: c000 (49152) = (1000 0000 0000 0000) pattern: e000 (57344) = (1110 0000 0000 0000) pattern: f000 (61440) = (1111 0000 0000 0000) pattern: f800 (63488) = (1111 1000 0000 0000) pattern: fc00 (64512) = (1111 1100 0000 0000) pattern: fe00 (65024) = (1111 1110 0000 0000) pattern: ff00 (65280) = (1111 1111 0000 0000) pattern: ff80 (65408) = (1111 1111 1000 0000) pattern: ffc0 (65472) = (1111 1111 1100 0000) pattern: ffe0 (65504) = (1111 1111 1110 0000) pattern: fff0 (65520) = (1111 1111 1111 0000) pattern: fff8 (65528) = (1111 1111 1111 1000) pattern: fffc (65532) = (1111 1111 1111 1100) pattern: fffe (65534) = (1111 1111 1111 1110) pattern: ffff (65535) = (1111 1111 1111 1111) ``` 我們由 pattern可以發現，他在二進制中有個特性。其為從 `leftmost` 開始後皆含有連續的 1 ，精簡版程式碼的 EEEE 為 ~x+1， FFFF 為 x 。理由為， `n = ~x + 1` 將自己 not 後加上 1 ，並與自己做 xor 在比較是否比自己大，即 (n ^ x) < x 。 ```c 例子1: 較多的1 x = 1111 1111 1111 1100 n = ~x + 1 = 0000 0000 0000 0011 + 1 = 0000 0000 0000 0100 x ^ n = 1111 1111 1111 1100 xor 0000 0000 0000 0100 -------------------------- 1111 1111 1111 1000 < 1111 1111 1111 1100 (x) ==> 回傳 true 例子2: 較少的1 x = 1100 0000 0000 0000 n = ~x + 1 = 0011 1111 1111 1111 + 1 = 0100 0000 0000 0100 x ^ n = 1100 0000 0000 0000 xor 0100 0000 0000 0100 -------------------------- 1000 0000 0000 0000 < 1111 1111 1111 1100 (x) ==> 回傳 true ```