[2020q3](http://wiki.csie.ncku.edu.tw/sysprog/schedule) Homework3 (quiz3)

--- tags: sysprog2020 --- # [2020q3](http://wiki.csie.ncku.edu.tw/sysprog/schedule) Homework3 (quiz3) contributed by < `nickchenchj` > > related links: [2020q3 第 3 週測驗題](https://hackmd.io/@sysprog/2020-quiz3) ## Problem 1 ### Task * Given two numbers, a signed integer `m` and an unsigned integer `n`, write a function that performs [Arithmetic Shift Right (ASR)](https://en.wikipedia.org/wiki/Arithmetic_shift) on `m` by the value of `n`. According to [ISO/IEC 9899:TC2](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf) (C99 standard), section 6.5.7 (page 84, 85): > “The result of E1 >> E2 is E1 right-shifted E2 bit positions. … If E1 has a signed type and a negative value, the resulting value is ==implementation-defined==.” * There are two ways to deal with right shifting **negative signed integers**: * Arithmetic Shift Right (ASR): After right shifting an integer, the empty position in the most significant bit (MSB) is filled with a copy of the original MSB. (the original MSB is `1` for negative integer) * Logical Shift Right (LSR): After right shifting an integer, the empty position in the MSB is filled with `0`. * For example, we assume `m = -1`, `n = 6`, and we do `m >> n`, the results of ASR and LSR are as follows: | Original | ASR | LSR | |:------------------:|:------------------:|:------------------:| | 1111 1111 ... 1111 | 1111 1111 ... 1111 | 0000 0011 ... 1111 | * Below is the implementation of a cross-platform ASR function: ```c= int asr_i(signed int m, unsigned int n) { const int logical = (((int) -1) OP1) > 0; unsigned int fixu = -(logical & (OP2)); int fix = *(int *) &fixu; return (m >> n) | (fix ^ (fix >> n)); } ``` #### OP1 = ? * `(a)` `<< 1` * `(b)` `>> 1` * `(c)` `* m + n` * `(d)` `+ m` * `(e)` `+ n` * `(f)` `- m` * `(g)` `- n` #### OP2 = ? * `(a)` `m` * `(b)` `n` * `(c)` `m < 0` * `(d)` `m == 0` * `(e)` `m > 0` * `(f)` `n < 0` * `(g)` `n == 0` * `(h)` `n > 0` ### Solution * This is a cross-platform ASR, which means that the result of `m >> n` will follow the rule of ASR no matter what right shift rule is implemented. * By observing both results of ASR and LSR, it is clear that the only difference between these two results is the zeros starting at the MSB of the number, and rest of the bit sequences are identical. * We can also see that the number of zeros is `n`, which is `6` in this example. * Therefore, from line 6 of the code snippet (shown below), if `m < 0`, `fix ^ (fix >> n)` must be a 32-bit sequence with `n` ones starting at the MSB. Otherwise, if `m >= 0`, the results of `m >> n` are identical for ASR and LSR, so `fix ^ (fix >> n)` must be a 32-bit sequence of zeros. ```c return (m >> n) | (fix ^ (fix >> n)); ``` | Type | integer value | `fix ^ (fix >> n)` | |:----:|:-------------:|:----------------------:| | ASR | $m \geq 0$ | 0000 0000 ... 0000 | | ASR | $m < 0$ | 0000 0000 ... 0000 | | LSR | $m \geq 0$ | 0000 0000 ... 0000 | | LSR | $m < 0$ | ==1111 11==00 ... 0000 | ###### Note: `n = 6` in this example * We can deduce `fix` from the table above: | Type | integer value | `fix` | |:----:|:-------------:|:------------------:| | ASR | $m \geq 0$ | 0000 0000 ... 0000 | | ASR | $m < 0$ | 0000 0000 ... 0000 | | LSR | $m \geq 0$ | 0000 0000 ... 0000 | | LSR | $m < 0$ | 1111 1111 ... 1111 | * It is also possible to figure out `logical & OP2` from `fix` (converted from `fixu`): | Type | integer value | `logical & OP2` | |:----:|:-------------:|:----------------------:| | ASR | $m \geq 0$ | 0000 0000 ... 0000 | | ASR | $m < 0$ | 0000 0000 ... 0000 | | LSR | $m \geq 0$ | 0000 0000 ... 0000 | | LSR | $m < 0$ | 0000 0000 ... 000==1== | * From the result, we can conclude that both `logical` and `OP2` must be `1` if the platform uses LSR and `m <= 0`. * Therefore, it is clear that `logical` is used to determine which type of right shift is used, and `OP2` is used for checking whether `m` is a negative value or not. * So, the best choices that match the result are `(b)` `>> 1` and `(c)` `m < 0`. :::success **Answers** * OP1: `>> 1` * OP2: `m < 0` ::: --- ## Problem 2 ### Task * Given an integer, write a function to determine if it is a power of four. There are two built-in functions provided by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html): > Built-in Function: int __builtin_clz (unsigned int x) > * Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. > > Built-in Function: int __builtin_ctz (unsigned int x) > * Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. * Below is the implementation of the function: ```c= bool isPowerOfFour(int num) { return num > 0 && (num & (num - 1))==0 && !(__builtin_ctz(num) OPQ); } ``` #### OPQ = ? * `(a)` `> 0` * `(b)` `> 31` * `(c)` `>> 0x2` * `(d)` `>> 0x1` * `(e)` `| 0x1` * `(f)` `& 0x1` * `(g)` `^ 0x1` ### Solution * `num & (num - 1) == 0` in line 3 is used to check if `num` in binary form contains only one `1` in it. * If an integer is a power of four, `(__builtin_ctz(num) OPQ)` must be `false`, and the index of the 1-bit must be an even number (except for 0). Therefore, `__builtin_ctz(num)` will be an even number. * To check if a number is even or not, perform `& 0x1` on it, and the result will be `false` if the number is even. * In conclusion, the answer to this problem is `(f)` `& 0x1` :::success **Answer** * OPQ: `& 0x1` ::: --- ## Problem 3 ### Task * Given a non-negative integer `num`, write a function that returns the number of steps to reduce it to zero. If the current number is even, you have to divide it by 2, otherwise, you have to subtract 1 from it. (LeetCode: [1342. Number of Steps to Reduce a Number to Zero](https://leetcode.com/problems/number-of-steps-to-reduce-a-number-to-zero/) * Below is the implementation of the function: ```c= int numberOfSteps (int num) { return num ? AAA + 31 - BBB : 0; } ``` #### AAA = ? * `(a)` `__builtin_popcount(num)` * `(b)` `__builtin_clz(num)` #### BBB = ? * `(a)` `__builtin_popcount(num)` * `(b)` `__builtin_clz(num)` ### Solution According to [GCC Built-in Functions](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html): > Built-in Function: int __builtin_popcount (unsigned int x) > * Returns the number of 1-bits in x. > > Built-in Function: int __builtin_clz (unsigned int x) > * Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. #### Idea 1. If the current number is even (`LSB = 0`), shift the number right by 1. 2. If the current number is odd (`LSB = 1`), subtract 1 from it. * To shift a certain bit in a non-zero integer to the LSB, the system needs to right shift the bit by the value of its index. * To shift the most significant 1-bit to the LSB, shift the non-zero integer right by the value of `31 - (__builtin_clz(num))`. Therefore, the number of steps of division is `31 - (__builtin_clz(num))` * All 1-bits in a non-zero integer will eventually be shifted to the position of the LSB (`index = 0`), when each of them reaches the the position of the LSB, one step of subtraction is needed. Hence, the number of steps of subtraction is the number of 1-bits in the number, which is equivalent to the value of `__builtin_popcount(num)`. * To summarize, the total number of steps is `(__builtin_popcount(num)) + 31 - (__builtin_clz(num))`, so answers to this problem are `(a)` `__builtin_popcount(num)` and `(b)` `__builtin_clz(num)`. :::success **Answers** * AAA: `__builtin_popcount(num)` * BBB: `__builtin_clz(num)` ::: --- ## Problem 4 ### Task * Given two numbers of type 64-bit unsigned integer, write a function that calculates the greatest common divisor (GCD) of the two numbers. * Below is the implementation of the GCD function (original): ```c= #include <stdint.h> uint64_t gcd64(uint64_t u, uint64_t v) { if (!u || !v) return u | v; while (v) { uint64_t t = v; v = u % v; u = t; } return u; } ``` * and this is an equivalent function without modulo operator (`%`): ```c= #include <stdint.h> uint64_t gcd64(uint64_t u, uint64_t v) { if (!u || !v) return u | v; int shift; for (shift = 0; !((u | v) & 1); shift++) { u /= 2, v /= 2; } while (!(u & 1)) u /= 2; do { while (!(v & 1)) v /= 2; if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (XXX); return YYY; } ``` #### XXX = ? * `(a)` `u` * `(b)` `v` * `(c)` `u + v` * `(d)` `u - v` * `(e)` `u >> 1` * `(f)` `v >> 1` #### YYY = ? * `(a)` `u` * `(b)` `v` * `(c)` `u << v` * `(d)` `v << u` * `(e)` `u << shift` * `(f)` `v << shift` ### Solution * `shift` is used for storing the number of right shift performing on both `u` and `v` (`u >>= shift` and `v >>= shift`). * In the `do-while` loop, `v` is always no less than `u` at the end of each loop. However, there is one exception, when `v` is equal to `u`, `v` becomes `0` (`v = t = u - v`), and `u` is the value of `GCD >> shift` ($\frac{GCD}{2^{shift}}$). * Therefore, when the function is about to return a value, it needs to shift the `u` left by the value of `shift` (`u << shift`). :::success **Answers** * XXX: `v` * YYY: `u << shift` ::: --- ## Problem 5 ### Task * Given a bitmap stored in a 64-bit unsigned integer array, write a function that stores the indices of each 1-bit in an 32-bit unsigned integer array and returns the number of 1-bits. * Below is the implementation of the function: ```c= #include <stddef.h> size_t naive(uint64_t *bitmap, size_t bitmapsize, uint32_t *out) { size_t pos = 0; for (size_t k = 0; k < bitmapsize; ++k) { uint64_t bitset = bitmap[k]; size_t p = k * 64; for (int i = 0; i < 64; i++) { if ((bitset >> i) & 0x1) out[pos++] = p + i; } } return pos; } ``` * use `__builtin_clzll` to improve the efficiency of this function: The details of `__builtin_clzll` can be find in [GCC Built-in Functions](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html): > Built-in Function: int __builtin_clzll (unsigned long long) > * Similar to __builtin_clz, except the argument type is unsigned long long. ```c= #include <stddef.h> size_t improved(uint64_t *bitmap, size_t bitmapsize, uint32_t *out) { size_t pos = 0; uint64_t bitset; for (size_t k = 0; k < bitmapsize; ++k) { bitset = bitmap[k]; while (bitset != 0) { uint64_t t = KKK; int r = __builtin_ctzll(bitset); out[pos++] = k * 64 + r; bitset ^= t; } } return pos; } ``` #### KKK = ? * `(a)` `bitset` * `(b)` `bitset & 0x1` * `(c)` `bitset | 0x1` * `(d)` `bitset ^ 0x1` * `(e)` `bitset & -bitset` ### Solution * `r` is the index of the least significant 1-bit in `bitset`. After storing its index in `out`, it is no longer needed, so we have to find a way to eliminate it, and this is exactly what `bitset ^= t` does. * To eliminate the least significant 1-bit, `t` must contain only one 1-bit in it, and the index of the 1-bit must be the same as that of the bit we want to eliminate. | `bitset` | `t` | `bitset ^ t` | |:-----------------:|:-----------------:|:-----------------:| | 1011 10==1==0 | 0000 00==1==0 | 1011 10==0==0 | ###### Note: 8-bit unsigned integer is used for this demonstration * After observation, we know the answer to this problem is `(e)` `bitset & -bitset`. | `bitset` | `-bitset` | `bitset & -bitset` | |:-------------:|:-------------:|:------------------:| | 0110 100==1== | 1001 011==1== | 0000 000==1== | | 1011 10==1==0 | 0100 01==1==0 | 0000 00==1==0 | ###### Note: `bitset` is treated as signed integer when performing negative sign on it :::success **Answer** * KKK: `bitset & -bitset` :::