2020q3 Homework3 (quiz3)

contributed by < shauming1020 >

tags: `homework`

測驗 1







int asr_i(signed int m, unsigned int n)
{
    const int logical = (((int) -1) OP1) > 0;
    unsigned int fixu = -(logical & (OP2));
    int fix = *(int *) &fixu;
    return (m >> n) | (fix ^ (fix >> n));
}

解釋上述程式運作原理

$- 5 {>>}^{a r i t h} 1 = - 3$ ，1111…1011 算術右移 1 得 [1]111…1101，[] 為右移後補上的值，而在只有unsigned的系統上，[] 內總是為 0
從 return 回推，透過 (fix ^ (fix >> n)) 造出 mask ，並讓 (m >> n) 所產生的 [] 可以根據不同系統而對應到正確的值
const int logical = (((int) -1) OP1) > 0;
- 透過右移
  $(i n t) - 1 = 0 x F F F F F F F F$ 來判斷是 unsigned 或 signed 系統
- 如果為 unsigned 系統，則 [] 值為 0，其右移後的值大於 0 ，logical 被設為 0x00000001
- 因此 OP1 為 >> 1 即可
unsigned int fixu = -(logical & (OP2));
- 考慮 unsigned 系統右移後，[] 需要補 1 的情形，即 m < 0
- 例如 0x1000 >> 2 = 0x[00]10，而與 mask 做 | 運算後要為 0x[11]10，因此 mask 可以是 1100
- 試著驗證看看，當 m < 0 且 logical 為 true 時，fix 為 0x1111
- fix >> 2 結果為 0x[00]11，再與 fix 做
  $\oplus$ 運算得 0x[11]00，為我們預期的 mask
- 而在 signed 或是 m >= 0 的情況下，fix 總是為 0，得到的 mask 仍為 0，與 (m >> n) 做 | 運算後仍可保持預期結果
- 因此 OP2 為 m < 0 即可

測驗 2





bool isPowerOfFour(int num)
{
    return num > 0 && (num & (num - 1))==0 &&
           !(__builtin_ctz(num) OPQ);  
}

Built-in Function: int __builtin_ctz (unsigned int x)
Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

直接帶入 4 的次方數字去回推，例如 num 為 4 = 0x00000100、16 = 0x00010000、64 = 0x01000000
(num & (num - 1))==0
- 0x0100 & 0x0011 為 0，從操作看就是去判斷 num 中是否只有一個 1 bit
!(__builtin_ctz(num) OPQ)
- 可以觀察到 4 的次方數值尾數都是偶數個 0，判斷是否為偶數可以透過與 mask 0x1 做 & 運算來判斷，如果不為偶數的話，則 & 0x1 的結果為 1，因此需要取完 not 後才是為偶數個 0 的情況
OPQ 為 & 0x1

測驗 3

Leetcode 1342. Number of Steps to Reduce a Number to Zero

Given a non-negative integer num, return the number of steps to reduce it to zero. If the current number is even, you have to divide it by 2, otherwise, you have to subtract 1 from it.




int numberOfSteps (int num)
{
    return num ? AAA + 31 - BBB : 0;
}

Built-in Function: int __builtin_clz (unsigned int x)

Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.

population count 簡稱 popcount 或叫 sideways sum，是計算數值的二進位表示中，有多少位元是 1

從題目給定的描述，如果當前 num 為偶數，則 / 2 (>> 1)，否則減 1
根據題目的測資觀察，Input num = 8 = 0x0000100[0]
- LSB 為 0 ，即偶數，>> 1 得 0x0000010[0]，直到 LSB 不為 0，num = 0x0000000[1]，減去 1
- 共 4 次操作
因此透過 >> 1 逐一把 LSB 設成 0，而要將 LSB 設成 0 的情況只有在 LSB 為 1 的時候，所以我們只要知道說至少要右移幾次，而每次右移後又至少額外要執行減 1 的動作
- 31 - __builtin_clz(num) 可以知道至少要右移幾次
- __builtin_popcount(num) 表示有幾個 1 要被減成 0

善用 clz

LeetCode 1009. Complement of Base 10 Integer

For a given number N in base-10, return the complement of it's binary representation as a base-10 integer.

Example 1:
Input: 5
Output: 2
Explanation: 5 is "101" in binary, with complement "010" in binary, which is 2 in base-10.







int bitwiseComplement(int N){
    if (N == 0) return 1;
    int sum = 0, sr = 31 - __builtin_clz(N);
    for (int i = 0; i < sr; i++, N >>= 1)
        sum += !(N & 0x1) << i;
    return sum;
}

這題和測驗 2 很像，先利用 31 - __builtin_clz(N) 計算出需要右移的次數，接著找出 N 當中所有為 0 的 bit，計算在 complement 時所應表示的值，加總起來即為結果
- 例如
  $N = 101$ ，
  $N^{'} = 010$ ，N 的 0 在補數 N' 中表示為
  $2^{1}$
- 因此透過對 N 右移 sr 個 bit，判斷當前的 LSB 是否為 0，如果為 0 的話則在 complement 中表示為
  $2^{s r}$
- 將所有
  $2^{s r}$ 加總起來即為結果
而當 N 為 0，編譯器會先出現 runtime error: passing zero to clz(), which is not a valid argument (solution.c) 的警示，因此需要額外考慮 N = 0 的情況

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

41. First Missing Positive

Given an unsorted integer array, find the smallest missing positive integer.

Example 1:
Input: [1,2,0]
Output: 3

Example 2:
Input: [3,4,-1,1]
Output: 2

Example 3:
Input: [7,8,9,11,12]
Output: 1

測驗 4






















#include <stdint.h>
uint64_t gcd64(uint64_t u, uint64_t v) {
    if (!u || !v) return u | v;
    int shift;
    for (shift = 0; !((u | v) & 1); shift++) {
        u /= 2, v /= 2;
    }
    while (!(u & 1))
        u /= 2;
    do {
        while (!(v & 1))
            v /= 2;
        if (u < v) {
            v -= u;
        } else {
            uint64_t t = u - v;
            u = v;
            v = t;
        }
    } while (XXX);
    return YYY;
}

參考Binary GCD Algorithm 和 Greatest Common Divisor 特性和實作考量

```
  for (shift = 0; !((u | v) & 1); shift++) {
      u /= 2, v /= 2;
  }
```
- 每執行一次 for 迴圈就從 u 和 v 中提出 2，而最後會提出
  $2^{s h i f t}$ 之後再做輾轉相除法
- 直到 u 或 v 當中有一個不為偶數

If x and y are both even, gcd(x, y) = 2*gcd(x/2, y/2) because 2 is a common divisor.
Multiplication with 2 can be done with bitwise shift operator.

```
   while (!(u & 1))
      u /= 2; 
  do {
      while (!(v & 1))
       v /= 2;
```
- 考慮剩下的三種情形即特性
1. If x is even and y is odd, gcd(x, y) = gcd(x/2, y).
2. On similar lines, if x is odd and y is even, then
  gcd(x, y) = gcd(x, y/2). It is because 2 is not a common divisor.
3. If both x and y are odd, then gcd(x, y) = gcd(|x-y|/2, y)
```
      if (u < v) {
          v -= u;
      } else {
          uint64_t t = u - v;
          u = v;
          v = t;
      }
  } while (XXX);
```
- 取出 u 和 v 的差值 t，而如果 v 本身比 u 還大的話，減去 u 將自己當作差值
- 直到差值 t 為 0 (u = v or u = 0)，而差值 t 由變數 v 所存，因此XXX 為 v
Repeat steps until x = y, or until x = 0.
return YYY;
- 由於在前面已先提出共同有的
  $2^{s h i f t}$ ，因此最後回傳 u 時還要在乘上
  $2^{s h i f t}$ ，故 YYY 為 u << shift
the GCD is power(2, k) * y, where power(2, k) is 2 raise to the power of k and k is the number of common factors of 2 found in step 2.

測驗 5














#include <stddef.h>
size_t naive(uint64_t *bitmap, size_t bitmapsize, uint32_t *out)
{
    size_t pos = 0;
    for (size_t k = 0; k < bitmapsize; ++k) {
        uint64_t bitset = bitmap[k];
        size_t p = k * 64;
        for (int i = 0; i < 64; i++) {
            if ((bitset >> i) & 0x1)
                out[pos++] = p + i;
        }
    }
    return pos;
}

可用 ctz 改寫為效率更高且等價的程式碼:
















#include <stddef.h>
size_t improved(uint64_t *bitmap, size_t bitmapsize, uint32_t *out)
{
    size_t pos = 0;
    uint64_t bitset;
    for (size_t k = 0; k < bitmapsize; ++k) {
        bitset = bitmap[k];
        while (bitset != 0) {
            uint64_t t = KKK;
            int r = __builtin_ctzll(bitset);
            out[pos++] = k * 64 + r;
            bitset ^= t;                           
        }
    }
    return pos;
}

先觀察 naive
- 假設 bitset = 0x00101100 = bitmap[k=1]，p = 1 * 64
- for (int i = 0; i < 64; i++) { 搜尋整個 bitset
- if ((bitset >> i) & 0x1) 不斷將 bitset 右移判斷，如果 bitset 的 LSB 為 1
- out[pos++] = p + i; 則將該位置 p + i 放到 out 中
- 所以該函式在找 bitmap 中所有 bit 為 1 的位置
接下來觀察 ctz 版本
- 宣告 uint64_t bitset; 一次存放一組 bit array，當 bitset 不為 0，表示當中存在 bit 1
- 假設 bitset = 0x00001010，則 int r = __builtin_ctzll(bitset); 會計算 bitset 尾端有幾個 0，此時 r = 1，即可知道位於 k * 64 + 1 的位置有 bit 1，而矩陣 index 從 0 起算，因此尾端有幾個 0 恰可直接當作 index
- bitset ^= t; 將 bitset 與 mask t 進行
  $\oplus$ 運算，而先前已經計算過 [] 中的 1 (
  $b i t s e t = 0 x 000010 [1] 0$ )，因此希望造出的 mask 可讓 [] 設成 0，並保留其他地方，如此一來
  $b i t s e t^{'} = 0 x 000010 [0] 0$ ，即可用 ctz 計算出下一個 1 的位置
- 這樣的 mask 有很多種，考慮
  $\oplus$ 與 0 運算可保留原有的 bit，而與 1 運算則是對 bit 取 not 的特性，期望的 mask t 可以為
  $0 x 000000 [1] 0$
- 在測驗 3 的描述中提到 "x & -x，將 x 的 LSB 取出 (isolate LSB)"，這裡應是想表達取出 bit 1 當中最低位者，例如
  $0 x 00001010 & 0 x 11110110 = 0 x 00000010$
- 因此透過 bitset & -bitset 即可造出預期的 mask t

2020q3 Homework3 (quiz3)

tags: homework

測驗 1

解釋上述程式運作原理

測驗 2

測驗 3

善用 clz

LeetCode 1009. Complement of Base 10 Integer

41. First Missing Positive

測驗 4

測驗 5

Read more

2020q3 Homework (quiz8)

Assignment3: SoftCPU

2020q3 Homework (quiz5)

Assignment2: RISC-V Toolchain

tags: `homework`