2020q3 Homework3 (quiz3)

# 2020q3 Homework3 (quiz3) contributed by <`hsiehong`> ###### tags: `進階電腦系統理論與實作` ### [題目](https://hackmd.io/@sysprog/2020-quiz3) ## Outline [TOC] ## 測驗1 對有號整數的跨平台 ASR 實作 ```c= int asr_i(signed int m, unsigned int n) { const int logical = (((int) -1) OP1) > 0; unsigned int fixu = -(logical & (OP2)); int fix = *(int *) & fixu; return (m >> n) | (fix ^ (fix >> n)); } ``` `OP1 = >> 1`, `OP2 = m < 0` * `logical` 是用來判斷系統負數的位移是補 0 或 1，若系統是補0 ， -1 右移 1 後會是 `0x7FFFFFFF`，若系統是補 1，-1 右移 1 後會是 `0xFFFFFFFF`，所以 `logical` 是 1 若系統是位移是補 0，反之 `logical` 是 0 系統是補 1 * 這裡期望負數的話系統補 1 ，在這情況下因為 `logical` 為 0， `fixu` 和 `fix` 皆為 0，所以 `return (m >> n)` 就會輸出預期的結果 * 若系統負數的位移是補 0 的話，我們需要 `fix` 這個 mask 來做修正 * `m` 為負數的話，我們會希望 mask 的格式是 `1...10...0`，1 的個數是位移的個數，所以 OP2 = `m < 0`，紀錄 m 是正數還是負數，如此 `(fix ^ (fix >> n))` 可以得出我們上面要的 mask ## 測驗2 __builtin_ctz 來實作 [LeetCode 342. Power of Four](https://leetcode.com/problems/power-of-four/) ```c= bool isPowerOfFour(int num) { return num > 0 && (num & (num - 1))==0 && !(__builtin_ctz(num) OPQ); } ``` `OPQ = & 0x1` * ctz 和 clz 是 [GCC extension](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) 所提供 > Built-in Function: int __builtin_ctz (unsigned int x) > - Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. > >Built-in Function: int __builtin_clz (unsigned int x) > - Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. 一個數 `num` 為 4 的冪次方，必須滿足 * num > 0 * num 必須為 2 的冪次方，因為 $4^n$ = $2^{2n}$，二進制形式為 `0...010...0`，所以 `num & num - 1 = 0` * 因為 $4^n$ = $2^{2n}$，`num` 只可能是 2 的 0,2,4... 偶數次，所以從 LSB 開始數的 0 的個數為偶數個，`__builtin_ctz(num) & 1` 可以檢查是否為偶數 * LeetCode 練習 [1009. Complement of Base 10 Integer](https://leetcode.com/problems/complement-of-base-10-integer/) ```c= class Solution { public: int bitwiseComplement(int N) { return N ? N ^ (0xffffffff >> __builtin_clz(N)) : 1; } }; ``` [41. First Missing Positive](https://leetcode.com/problems/first-missing-positive/) ```c= class Solution { public: void swap(int &a, int &b) { int t = a; a = b; b = t; } int firstMissingPositive(vector<int>& nums) { int numSize = nums.size(); for (int i = 0; i < numSize; ++i) { if (nums[i] > 0 && nums[i] < numSize && nums[i] != nums[nums[i] - 1]) { swap (nums[i], nums[nums[i] - 1]); --i; } } for (int i = 0; i < numSize; ++i) { if (nums[i] != i + 1) return i + 1; } return numSize + 1; } }; ``` > 不太明白哪裡可以使用到 cltz ## 測驗3 利用 GCC builtin func 實作 LeetCode [1342. Number of Steps to Reduce a Number to Zero](https://leetcode.com/problems/number-of-steps-to-reduce-a-number-to-zero/) ```c= int numberOfSteps (int num) { return num ? AAA + 31 - BBB : 0; } ``` `AAA = __builtin_popcount(num)`, `BBB = __builtin_clz(num)` 思路 : 二進制表示法中，每一個 `1` 代表至少需要做 1 次(奇數減1的行為)，接著需要把最靠近 MSB 的 `1` 移到 LSB，所需的步數是 `31 - __builtin_clz(num) ` > TODO : 研讀 `unsigned popcnt_branchless(unsigned v)` 數學原理 ## 測驗4 運用 bitwise operation 完成 GCD 的實作 ```c= #include <stdint.h> uint64_t gcd64(uint64_t u, uint64_t v) { if (!u || !v) return u | v; int shift; for (shift = 0; !((u | v) & 1); shift++) { u /= 2, v /= 2; } while (!(u & 1)) u /= 2; do { while (!(v & 1)) v /= 2; if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (XXX); return YYY; } ``` `XXX = v`, `YYY = u << shift` 程式運作原理: ```c=5 for (shift = 0; !((u | v) & 1); shift++) { u /= 2, v /= 2; } ``` 若 `u` 和 `v` 皆為偶數的話，LSB 必為 0 ，`& 1` 後取 not 可以判斷兩數是否都是偶數，是的話可以把 2 提出來，這樣只要計算 gcd(u/2 , v/2) 就好，迭代運算下去直到其中一數變成奇數，並用 `shift` 計算提了幾個 2 ，最後再乘回去 ```c=8 while (!(u & 1)) u /= 2; ``` 若 `u` 是是偶數，則可以把 `u` 的 2 的因數全數提出，因為 `u` , `v`必定不會有公因數 2， ```c=10 do { while (!(v & 1)) v /= 2; if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (XXX); ``` 第 11 行 while loop 原理同上述，將 `v` 的 2 的因數全部提出，但是下面會做兩數的相減，有可能奇數會變成偶數，所以必須放在 do while loop 內每一輪都要檢查。剩下的部分是在做輾轉相除法，並確保每一輪都是大的數 - 小的數，且進入下一輪的 `v` 總是大數減小數的結果，若 `u == v`，`v` 會等於 0 且結束計算。 ```c=21 return YYY; // YYY = u << shift ``` 最後 `u` 必須左移修正一開始提出的公因數 2 的個數 ## 測驗5 將 bit array 程式碼利用 clz 改寫 ```c= #include <stddef.h> size_t improved(uint64_t *bitmap, size_t bitmapsize, uint32_t *out) { size_t pos = 0; uint64_t bitset; for (size_t k = 0; k < bitmapsize; ++k) { bitset = bitmap[k]; while (bitset != 0) { uint64_t t = KKK; int r = __builtin_ctzll(bitset); out[pos++] = k * 64 + r; bitset ^= t; } } return pos; } ``` `KKK = bitset & -bitset` > TODO，暫無想法