# 2022q1 Homework3 (quiz3)
contributed by < [`KJay221`](https://github.com/KJay221) >
## 測驗 `1`
`GENMASK` 巨集,微處理器架構為 64 位元
```c
#define GENMASK(h, l) \
(((~0UL) >> (63-h)) & ((~0UL) >> (l) << (l)))
```
+ LEFT = `63-h`
+ RIGHT = `l`
### 1. 解釋上述程式碼運作原理
以 `GENMASK(6, 4)` 產生`01110000`為例,首先可以先將 `GENMASK` 看成 `a & b`
`a` 的部份只有`11111111`向右位移,可推測 `a` 為`01111111`,所以 `LEFT` 為 `(8-(6+1))` ,若是64位元的話則為 `64-(h+1) = 63-h`
若 `a & b` 要得到正確答案`01110000`推測 `b` 為`11110000`,所以 `b` 要先右移 `l` 再左移 `l` 才會得到正確答案
### 2. 比較 Linux 核心 `GENMASK` 巨集的實作,闡述其額外的考量
### 3. 舉出 Linux 核心原始程式碼中二處 `GENMASK` 巨集和 [include/linux/bitfield.h](https://github.com/torvalds/linux/blob/master/include/linux/bitfield.h) 的應用案例
---
## 測驗 `2`
考慮以下程式碼:
```c
struct foo;
struct fd {
struct foo *foo;
unsigned int flags;
};
enum {
FOO_DEFAULT = 0,
FOO_ACTION,
FOO_UNLOCK,
} FOO_FLAGS;
static inline struct fd to_fd(unsigned long v)
{
return (struct fd){(struct foo *)(v & ~3), v & 3};
}
```
`EXP1` = `(struct foo *)(v & ~3)`
### 1. 解釋上述程式碼運作原理
若 `struct fd` 中的 `struct foo *foo` 要對 4 個位元組進行向下對齊,則其 `bit0、bit1` 必為 0,當傳入的地址 `v` 要向下對齊時,只要用 `v & ~3` 就能清空 `bit0、bit1` 達到對齊效果,又因應回傳型態的要求,要在前面加上 `(struct foo *)` 來強制轉型結果才會是正確的
### 2. 在 Linux 核心原始程式碼找出類似技巧的實作 (例如檔案系統) 並解說
## 測驗 `3`
考慮以下程式碼,能將輸入的 8 位元無號整數逐一位元反轉
```c
#include <stdint.h>
uint8_t rev8(uint8_t x)
{
x = (x >> 4) | (x << 4);
x = ((x & 0xCC) >> 2) | ((x & 0x33) << 2);
x = ((x & 0xAA) >> 1) | ((x & 0x55) << 1);
return x;
}
```
`EXP2` = `(x & 0x33) << 2`
`EXP3` = `(x & 0x55) << 1`
### 1. 解釋上述程式碼運作原理
參考 [< hsuedw >](https://hackmd.io/@hsuedw/linux2022-quiz3#%E5%BB%B6%E4%BC%B8%E5%95%8F%E9%A1%8C-1-%E8%A7%A3%E9%87%8B%E4%B8%8A%E8%BF%B0%E7%A8%8B%E5%BC%8F%E7%A2%BC%E9%81%8B%E4%BD%9C%E5%8E%9F%E7%90%862) 的圖示和 [你所不知道的 C 語言:數值系統篇](https://hackmd.io/@sysprog/c-numerics#%E7%9C%81%E5%8E%BB%E8%BF%B4%E5%9C%88)
+ 原理就是先將前後 `4bits` 切成兩組並左右交換,接著這兩組 `4bits` 又各自再切成兩組 `2bits` 然後也左右交換,一直重複做到切成 `1bits` 為止
+ 若是 `32bits` 則一開始則切成兩個 `16bits` 依此類推
### 2. 在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景
## 測驗 `4`
延伸[〈你所不知道的 C 語言:前置處理器應用篇〉](https://hackmd.io/@sysprog/c-preprocessor),我們嘗試將 [foreach 巨集](https://en.wikipedia.org/wiki/Foreach_loop) 予以擴充,使得支援以下寫法:
```c
int i;
foreach_int(i, 0, -1, 100, 0, -99) {
printf("i is %i\n", i);
}
const char *p;
foreach_ptr(p, "Hello", "world") {
printf("p is %s\n", p);
}
```
預期輸出如下:
```
i is 0
i is -1
i is 100
i is 0
i is -99
p is Hello
p is world
```
對應的巨集定義:
```c
#include <assert.h>
#define _foreach_no_nullval(i, p, arr) \
assert((i) >= sizeof(arr) / sizeof(arr[0]) || (p))
#define foreach_int(i, ...) \
for (unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0); \
_foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int); \
(i) = ((int[]){__VA_ARGS__, 0})[++_foreach_i])
#define foreach_ptr(i, ...) \
for (unsigned _foreach_i = \
(((i) = (void *) ((typeof(i)[]){__VA_ARGS__})[0]), 0); \
(i); (i) = (void *) ((typeof(i)[]){__VA_ARGS__, \
NULL})[++_foreach_i], \
_foreach_no_nullval(_foreach_i, i, \
((const void *[]){__VA_ARGS__})))
```
`EXP4` = `++_foreach_i`
`EXP5` = `++_foreach_i`
### 1. 解釋上述程式碼運作原理
參考 [< hsuedw >](https://hackmd.io/jpQ7AlnXRcKZ7HRqx5acfA?view#%E5%BB%B6%E4%BC%B8%E5%95%8F%E9%A1%8C-1-%E8%A7%A3%E9%87%8B%E4%B8%8A%E8%BF%B0%E7%A8%8B%E5%BC%8F%E7%A2%BC%E9%81%8B%E4%BD%9C%E5%8E%9F%E7%90%863) 對於 `__VA_ARGS__` 的解釋得出 `foreach_int(i, ...)` 主要的三個動作
+ `unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0);` 將 `_foreach_i` 初始化為零,並將 `i` 設成輸入的第 `0` 個元素
+ `_foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int);` 為測試 `_foreach_i` 是否超出範圍
+ `(i) = ((int[]){__VA_ARGS__, 0})[++_foreach_i])` 的功能為將 `i` 設成當前索引值所對應的元素並將 `_foreach_i` 加一,另外用 `++_foreach_i` 則是為了先將 `_foreach_i` 加一,再更新 `i` 的內容
### 2. 在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景
## 測驗 `5`
針對 LeetCode [29. Divide Two Integers](https://leetcode.com/problems/divide-two-integers/),以下是可能的實作:
```c
#include <limits.h>
int divide(int dividend, int divisor)
{
int signal = 1;
unsigned int dvd = dividend;
if (dividend < 0) {
signal *= -1;
dvd = ~dvd + 1;
}
unsigned int dvs = divisor;
if (divisor < 0) {
signal *= -1;
dvs = ~dvs + 1;
}
int shift = 0;
while (dvd > (EXP6))
shift++;
unsigned int res = 0;
while (dvd >= dvs) {
while (dvd < (dvs << shift))
shift--;
res |= (unsigned int) 1 << shift;
EXP7;
}
if (signal == 1 && res >= INT_MAX)
return INT_MAX;
return res * signal;
}
```
`EXP6` = `dvs << shift`
`EXP7` = `dvd -= dvs << shift`
### 1. 解釋上述程式碼運作原理,指出可改進之處,並予以實作
上述的程式碼運作邏輯與[硬體除法器](https://blog.csdn.net/m0_49540263/article/details/111354093)的概念相似,首先在 4~15 行的部分,先將正號決定好,並將負數轉成正數以便後面實現正數除法,17~27 行的部分則是做除法,並將結果儲存於 `res` 中,最後將 `res` 與 `signal` 相乘即是結果
### 2. 在 Linux 核心原始程式碼中找出針對整數/浮點數/定點數除法特別撰寫的程式碼,並予以討論
## 測驗 `6`
針對 LeetCode [149. Max Points on a Line](https://leetcode.com/problems/max-points-on-a-line/),以下是可能的實作:
```c
#include <stdbool.h>
#include "list.h"
struct Point {
int x, y;
};
struct point_node {
int p1, p2;
struct list_head link;
};
static bool can_insert(struct list_head *head, int p1, int p2)
{
struct point_node *pn;
list_for_each_entry (pn, head, link)
return EXP8;
return true;
}
static int gcd(int x, int y)
{
while (y) {
int tmp = y;
y = x % y;
x = tmp;
}
return x;
}
static int maxPoints(struct Point *points, int pointsSize)
{
if (pointsSize <= 2)
return pointsSize;
int i, j, slope_size = pointsSize * pointsSize / 2 + 133;
int *dup_cnts = malloc(pointsSize * sizeof(int));
int *hori_cnts = malloc(pointsSize * sizeof(int));
int *vert_cnts = malloc(pointsSize * sizeof(int));
int *slope_cnts = malloc(slope_size * sizeof(int));
memset(hori_cnts, 0, pointsSize * sizeof(int));
memset(vert_cnts, 0, pointsSize * sizeof(int));
memset(slope_cnts, 0, slope_size * sizeof(int));
for (i = 0; i < pointsSize; i++)
dup_cnts[i] = 1;
struct list_head *heads = malloc(slope_size * sizeof(*heads));
for (i = 0; i < slope_size; i++)
INIT_LIST_HEAD(&heads[i]);
for (i = 0; i < pointsSize; i++) {
for (j = i + 1; j < pointsSize; j++) {
if (points[i].x == points[j].x)
hori_cnts[i]++, hori_cnts[j]++;
if (points[i].y == points[j].y)
vert_cnts[i]++, vert_cnts[j]++;
if (points[i].x == points[j].x && points[i].y == points[j].y)
dup_cnts[i]++, dup_cnts[j]++;
if (points[i].x != points[j].x && points[i].y != points[j].y) {
int dx = points[j].x - points[i].x;
int dy = points[j].y - points[i].y;
int tmp = gcd(dx, dy);
dx /= tmp;
dy /= tmp;
int hash = dx * dy - 1333 * (dx + dy);
if (hash < 0)
hash = -hash;
hash %= slope_size;
if (can_insert(&heads[hash], i, j)) {
struct point_node *pn = malloc(sizeof(*pn));
pn->p1 = i;
pn->p2 = j;
EXP9;
}
}
}
}
for (i = 0; i < slope_size; i++) {
int index = -1;
struct point_node *pn;
list_for_each_entry (pn, &heads[i], link) {
index = pn->p1;
slope_cnts[i]++;
}
if (index >= 0)
slope_cnts[i] += dup_cnts[index];
}
int max_num = 0;
for (i = 0; i < pointsSize; i++) {
if (hori_cnts[i] + 1 > max_num)
max_num = hori_cnts[i] + 1;
if (vert_cnts[i] + 1 > max_num)
max_num = vert_cnts[i] + 1;
}
for (i = 0; i < slope_size; i++) {
if (slope_cnts[i] > max_num)
max_num = slope_cnts[i];
}
return max_num;
}
```
`EXP8` = `ans`
`EXP9` = `ans`
### 1. 解釋上述程式碼運作原理,指出可改進之處,並予以實作
### 2. 擴充 LeetCode 題目,考慮更多座標點的輸入 (例如超過 10 萬個) 時,設計有效的資料結構以降低記憶體開銷,並確保快速的執行