2022q1 Homework3 (quiz3)

# 2022q1 Homework3 (quiz3) contributed by < [`KJay221`](https://github.com/KJay221) > ## 測驗 `1` `GENMASK` 巨集，微處理器架構為 64 位元 ```c #define GENMASK(h, l) \ (((~0UL) >> (63-h)) & ((~0UL) >> (l) << (l))) ``` + LEFT = `63-h` + RIGHT = `l` ### 1. 解釋上述程式碼運作原理以 `GENMASK(6, 4)` 產生`01110000`為例，首先可以先將 `GENMASK` 看成 `a & b` `a` 的部份只有`11111111`向右位移，可推測 `a` 為`01111111`，所以 `LEFT` 為 `(8-(6+1))` ，若是64位元的話則為 `64-(h+1) = 63-h` 若 `a & b` 要得到正確答案`01110000`推測 `b` 為`11110000`，所以 `b` 要先右移 `l` 再左移 `l` 才會得到正確答案 ### 2. 比較 Linux 核心 `GENMASK` 巨集的實作，闡述其額外的考量 ### 3. 舉出 Linux 核心原始程式碼中二處 `GENMASK` 巨集和 [include/linux/bitfield.h](https://github.com/torvalds/linux/blob/master/include/linux/bitfield.h) 的應用案例 --- ## 測驗 `2` 考慮以下程式碼: ```c struct foo; struct fd { struct foo *foo; unsigned int flags; }; enum { FOO_DEFAULT = 0, FOO_ACTION, FOO_UNLOCK, } FOO_FLAGS; static inline struct fd to_fd(unsigned long v) { return (struct fd){(struct foo *)(v & ~3), v & 3}; } ``` `EXP1` = `(struct foo *)(v & ~3)` ### 1. 解釋上述程式碼運作原理若 `struct fd` 中的 `struct foo *foo` 要對 4 個位元組進行向下對齊，則其 `bit0、bit1` 必為 0，當傳入的地址 `v` 要向下對齊時，只要用 `v & ~3` 就能清空 `bit0、bit1` 達到對齊效果，又因應回傳型態的要求，要在前面加上 `(struct foo *)` 來強制轉型結果才會是正確的 ### 2. 在 Linux 核心原始程式碼找出類似技巧的實作 (例如檔案系統) 並解說 ## 測驗 `3` 考慮以下程式碼，能將輸入的 8 位元無號整數逐一位元反轉 ```c #include <stdint.h> uint8_t rev8(uint8_t x) { x = (x >> 4) | (x << 4); x = ((x & 0xCC) >> 2) | ((x & 0x33) << 2); x = ((x & 0xAA) >> 1) | ((x & 0x55) << 1); return x; } ``` `EXP2` = `(x & 0x33) << 2` `EXP3` = `(x & 0x55) << 1` ### 1. 解釋上述程式碼運作原理參考 [< hsuedw >](https://hackmd.io/@hsuedw/linux2022-quiz3#%E5%BB%B6%E4%BC%B8%E5%95%8F%E9%A1%8C-1-%E8%A7%A3%E9%87%8B%E4%B8%8A%E8%BF%B0%E7%A8%8B%E5%BC%8F%E7%A2%BC%E9%81%8B%E4%BD%9C%E5%8E%9F%E7%90%862) 的圖示和 [你所不知道的 C 語言：數值系統篇](https://hackmd.io/@sysprog/c-numerics#%E7%9C%81%E5%8E%BB%E8%BF%B4%E5%9C%88) + 原理就是先將前後 `4bits` 切成兩組並左右交換，接著這兩組 `4bits` 又各自再切成兩組 `2bits` 然後也左右交換，一直重複做到切成 `1bits` 為止 + 若是 `32bits` 則一開始則切成兩個 `16bits` 依此類推 ### 2. 在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景 ## 測驗 `4` 延伸[〈你所不知道的 C 語言：前置處理器應用篇〉](https://hackmd.io/@sysprog/c-preprocessor)，我們嘗試將 [foreach 巨集](https://en.wikipedia.org/wiki/Foreach_loop) 予以擴充，使得支援以下寫法: ```c int i; foreach_int(i, 0, -1, 100, 0, -99) { printf("i is %i\n", i); } const char *p; foreach_ptr(p, "Hello", "world") { printf("p is %s\n", p); } ``` 預期輸出如下: ``` i is 0 i is -1 i is 100 i is 0 i is -99 p is Hello p is world ``` 對應的巨集定義: ```c #include <assert.h> #define _foreach_no_nullval(i, p, arr) \ assert((i) >= sizeof(arr) / sizeof(arr[0]) || (p)) #define foreach_int(i, ...) \ for (unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0); \ _foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int); \ (i) = ((int[]){__VA_ARGS__, 0})[++_foreach_i]) #define foreach_ptr(i, ...) \ for (unsigned _foreach_i = \ (((i) = (void *) ((typeof(i)[]){__VA_ARGS__})[0]), 0); \ (i); (i) = (void *) ((typeof(i)[]){__VA_ARGS__, \ NULL})[++_foreach_i], \ _foreach_no_nullval(_foreach_i, i, \ ((const void *[]){__VA_ARGS__}))) ``` `EXP4` = `++_foreach_i` `EXP5` = `++_foreach_i` ### 1. 解釋上述程式碼運作原理參考 [< hsuedw >](https://hackmd.io/jpQ7AlnXRcKZ7HRqx5acfA?view#%E5%BB%B6%E4%BC%B8%E5%95%8F%E9%A1%8C-1-%E8%A7%A3%E9%87%8B%E4%B8%8A%E8%BF%B0%E7%A8%8B%E5%BC%8F%E7%A2%BC%E9%81%8B%E4%BD%9C%E5%8E%9F%E7%90%863) 對於 `__VA_ARGS__` 的解釋得出 `foreach_int(i, ...)` 主要的三個動作 + `unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0);` 將 `_foreach_i` 初始化為零，並將 `i` 設成輸入的第 `0` 個元素 + `_foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int);` 為測試 `_foreach_i` 是否超出範圍 + `(i) = ((int[]){__VA_ARGS__, 0})[++_foreach_i])` 的功能為將 `i` 設成當前索引值所對應的元素並將 `_foreach_i` 加一，另外用 `++_foreach_i` 則是為了先將 `_foreach_i` 加一，再更新 `i` 的內容 ### 2. 在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景 ## 測驗 `5` 針對 LeetCode [29. Divide Two Integers](https://leetcode.com/problems/divide-two-integers/)，以下是可能的實作: ```c #include <limits.h> int divide(int dividend, int divisor) { int signal = 1; unsigned int dvd = dividend; if (dividend < 0) { signal *= -1; dvd = ~dvd + 1; } unsigned int dvs = divisor; if (divisor < 0) { signal *= -1; dvs = ~dvs + 1; } int shift = 0; while (dvd > (EXP6)) shift++; unsigned int res = 0; while (dvd >= dvs) { while (dvd < (dvs << shift)) shift--; res |= (unsigned int) 1 << shift; EXP7; } if (signal == 1 && res >= INT_MAX) return INT_MAX; return res * signal; } ``` `EXP6` = `dvs << shift` `EXP7` = `dvd -= dvs << shift` ### 1. 解釋上述程式碼運作原理，指出可改進之處，並予以實作上述的程式碼運作邏輯與[硬體除法器](https://blog.csdn.net/m0_49540263/article/details/111354093)的概念相似，首先在 4~15 行的部分，先將正號決定好，並將負數轉成正數以便後面實現正數除法，17~27 行的部分則是做除法，並將結果儲存於 `res` 中，最後將 `res` 與 `signal` 相乘即是結果 ### 2. 在 Linux 核心原始程式碼中找出針對整數/浮點數/定點數除法特別撰寫的程式碼，並予以討論 ## 測驗 `6` 針對 LeetCode [149. Max Points on a Line](https://leetcode.com/problems/max-points-on-a-line/)，以下是可能的實作: ```c #include <stdbool.h> #include "list.h" struct Point { int x, y; }; struct point_node { int p1, p2; struct list_head link; }; static bool can_insert(struct list_head *head, int p1, int p2) { struct point_node *pn; list_for_each_entry (pn, head, link) return EXP8; return true; } static int gcd(int x, int y) { while (y) { int tmp = y; y = x % y; x = tmp; } return x; } static int maxPoints(struct Point *points, int pointsSize) { if (pointsSize <= 2) return pointsSize; int i, j, slope_size = pointsSize * pointsSize / 2 + 133; int *dup_cnts = malloc(pointsSize * sizeof(int)); int *hori_cnts = malloc(pointsSize * sizeof(int)); int *vert_cnts = malloc(pointsSize * sizeof(int)); int *slope_cnts = malloc(slope_size * sizeof(int)); memset(hori_cnts, 0, pointsSize * sizeof(int)); memset(vert_cnts, 0, pointsSize * sizeof(int)); memset(slope_cnts, 0, slope_size * sizeof(int)); for (i = 0; i < pointsSize; i++) dup_cnts[i] = 1; struct list_head *heads = malloc(slope_size * sizeof(*heads)); for (i = 0; i < slope_size; i++) INIT_LIST_HEAD(&heads[i]); for (i = 0; i < pointsSize; i++) { for (j = i + 1; j < pointsSize; j++) { if (points[i].x == points[j].x) hori_cnts[i]++, hori_cnts[j]++; if (points[i].y == points[j].y) vert_cnts[i]++, vert_cnts[j]++; if (points[i].x == points[j].x && points[i].y == points[j].y) dup_cnts[i]++, dup_cnts[j]++; if (points[i].x != points[j].x && points[i].y != points[j].y) { int dx = points[j].x - points[i].x; int dy = points[j].y - points[i].y; int tmp = gcd(dx, dy); dx /= tmp; dy /= tmp; int hash = dx * dy - 1333 * (dx + dy); if (hash < 0) hash = -hash; hash %= slope_size; if (can_insert(&heads[hash], i, j)) { struct point_node *pn = malloc(sizeof(*pn)); pn->p1 = i; pn->p2 = j; EXP9; } } } } for (i = 0; i < slope_size; i++) { int index = -1; struct point_node *pn; list_for_each_entry (pn, &heads[i], link) { index = pn->p1; slope_cnts[i]++; } if (index >= 0) slope_cnts[i] += dup_cnts[index]; } int max_num = 0; for (i = 0; i < pointsSize; i++) { if (hori_cnts[i] + 1 > max_num) max_num = hori_cnts[i] + 1; if (vert_cnts[i] + 1 > max_num) max_num = vert_cnts[i] + 1; } for (i = 0; i < slope_size; i++) { if (slope_cnts[i] > max_num) max_num = slope_cnts[i]; } return max_num; } ``` `EXP8` = `ans` `EXP9` = `ans` ### 1. 解釋上述程式碼運作原理，指出可改進之處，並予以實作 ### 2. 擴充 LeetCode 題目，考慮更多座標點的輸入 (例如超過 10 萬個) 時，設計有效的資料結構以降低記憶體開銷，並確保快速的執行