2021q1 Homework2 (quiz2)

# 2021q1 Homework2 (quiz2) contributed by < `secminhr` > > [題目](https://hackmd.io/@sysprog/linux2021-quiz2) ## 問題 ### 1. COND1 = ? 出現在這個函式裡面： ```c static list_ele_t *get_middle(struct list_head *list) { struct list_head *fast = list->next, *slow; list_for_each (slow, list) { if (COND1 || COND2) break; fast = fast->next->next; } return list_entry(TTT, list_ele_t, list); } ``` 從名字我們可以知道這個函式的用處就是要取得 `list` 中間的元素。看一下 `list_for_each` 可以知道這裡是要以 `slow` 為變數走訪 `list` 。並且在後面的 `fast = fast->next->next;` 我們知道 `fast` 會一次經過兩個。也就是說，當 `slow` 走過的數目會是 `fast` 的一半。那麼，當 `fast` 走到最尾端時，`slow` 將會是在中間。所以這裡的跳出條件會是「當 `fast` 到達尾端時」。因為 `fast` 一次經過兩個的關係，尾端會有兩種表示，對應 COND1 和 COND2 1. 奇數情況 ```graphviz digraph { rankdir=LR list->a:h a[shape=record, label="{<h>|<n>}"] b[shape=record, label="{<h>|<n>}"] a:n->b:h c[shape=record, label="{<h>|<n>}"] b:n->c:h d[shape=record, label="{<h>|<n>}"] c:n->d:h e[shape=record, label="{<h>|<n>}"] d:n->e:h e:n->a:h fast->e:h } ``` 這種狀況下，我們可以構造出判斷式 `fast->next == list` 。 2. 偶數情況 ```graphviz digraph { rankdir=LR list_head->a:h a[shape=record, label="{<h>|<n>}"] b[shape=record, label="{<h>|<n>}"] a:n->b:h c[shape=record, label="{<h>|<n>}"] b:n->c:h d[shape=record, label="{<h>|<n>}"] c:n->d:h d:n->a:h fast->c:h } ``` 因為一次經過兩個的關係，`fast` 不會到達最後一個。這種狀況可以構造出 `fast->next->next == list` 。於是我們知道 COND1 和 COND2 應寫 `fast->next == list` 和 `fast->next->next == list` 兩個判斷。因此，我們得到 COND1 = `fast->next == list` COND2 = `fast->next->next == list` （這兩者順序可以互換）答案為 `(c)` ### 2. COND2 = ? 上面已說明，答案為 `(b)` ### 3. MMM = ? MMM 出現的位置是作為 `list_cut_position` 的參數 ```c void list_merge_sort(queue_t *q) { if (list_is_singular(&q->list)) return; queue_t left; struct list_head sorted; INIT_LIST_HEAD(&left.list); list_cut_position(&left.list, &q->list, MMM); list_merge_sort(&left); list_merge_sort(q); list_merge(&left.list, &q->list, &sorted); INIT_LIST_HEAD(&q->list); list_splice_tail(&sorted, &q->list); } ``` `list_cut_position` 告訴我們最後一個參數的用途是決定切斷的點。 ``` @node: pointer to the node in which defines the cutting point ``` 根據 merge sort 的做法，我們要先把一個串列分成兩半。我們知道中間的元素是：`get_middle(&q->list)` 因此可以知道 MMM = `&get_middle(&q->list)->list` 。答案為 `(e)` ### 4. TTT = ? TTT 出現的函式 ```c static list_ele_t *get_middle(struct list_head *list) { struct list_head *fast = list->next, *slow; list_for_each (slow, list) { if (COND1 || COND2) break; fast = fast->next->next; } return list_entry(TTT, list_ele_t, list); } ``` 從第一題的說明中我們可以知道這時在中間的是 `slow` ，而這個函式就要要回傳在中間的那一個。因此， TTT = `slow` 。答案為 `(a)` ### 5. X = ? ### 6. Y = ? ### 7. Z = ? ```c uint16_t func(uint16_t N) { /* change all right side bits to 1 */ N |= N >> 1; N |= N >> X; N |= N >> Y; N |= N >> Z; return (N + 1) >> 1; } ``` 以二進位的表示來觀察這個函式應有的行為： | 輸入 | 回傳 | | -------- | -------- | | 1 | 1 | | 10| 10| |11 | 10| |100 | 100| |101 | 100| |110 | 100| |111 | 100| 可以發現就是留下第一個1。而要讓這個函式的最後一行 `(N + 1) >> 1` 達到上面的效果，就要讓右邊都變成 1 ，這樣加 1 個時候就會進一位，再右移一個就達到目標。例如： `101 -> 111 -> 1000 -> 100` 由此知道下面四行的作用就是讓右邊都變成 1 。 ```c N |= N >> 1; N |= N >> X; N |= N >> Y; N |= N >> Z; ``` 我們考慮最多0的狀況： 1000000000000000 經過 `N |= N >> 1` 變成： 1100000000000000 為了能盡量有效的填寫 1 ，我們讓 `N |= N >> 2` ： 1111000000000000 同上，讓 `N |= N >> 4` ： 1111111100000000 讓 `N |= N >> 8` ： 1111111111111111 可以發現目標已經達到。代表 X = 2 Y = 4 Z = 8 可以把右邊填滿 1 ，但是還需驗證這樣的操作會不會影響左邊的 0 。考慮一個左邊有 0 的數，例如： 0000000000000101 當右移一位時： 0000000000000010 可以注意到並不會污染左邊的 0 。我們可以說明這件事：如果在右移後會汙染最左邊的 1 左邊的 0 ，代表在最左邊的 1 「左邊還要有 1 」，如此一來，自相矛盾。因此左邊的 0 不會被影響。確定之後，我們便可得 5~7 題結果： 5. `(b)` 6. `(d)` 7. `(h)` ### 8. RRR = ? 出現的部分： ```c uint8_t data = *source++; size_t bitsize = (count > 8) ? 8 : count; if (read_lhs > 0) { RRR; if (bitsize > read_rhs) data |= (*source >> read_rhs); } ``` `read_lhs > 0` 表示要開始讀的位置不在一個 byte 的開頭例如下面是 `read_lhs` 為 2 的情況（ * 是要開始讀取的 bit ）： | | |* | | | | | | |--|--|--|--|--|--|--|--| |B0|B1|B2|B3|B4|B5|B6|B7| 可以注意到 `uint8_t data = *source++;` 這行已經讓 `data` 讀取了，從 `source` 開始讀 8 個 bit 。以例子來說現在 `data` 裡面每個 bit 的情況是 | | | | | | | | | |--|--|--|--|--|--|--|--| |B0|B1|B2|B3|B4|B5|B6|B7| 但是我們想要的資料應該從 B3 開始，而這裡的 RRR 在修正這件事。所以很顯然的，我們需要把 `data` 左移 `read_lhs` 位。 RRR = `data <<= read_lhs;` 答案為 `(a)` ### 9. DDD = ? ```c uint8_t original = *dest; uint8_t mask = read_mask[write_lhs]; if (bitsize > write_rhs) { /* Cross multiple bytes */ *dest++ = (original & mask) | (data >> write_lhs); original = *dest & write_mask[bitsize - write_rhs]; *dest = original | (data << write_rhs); } else { // Since write_lhs + bitsize is never >= 8, no out-of-bound access. DDD; *dest++ = (original & mask) | (data >> write_lhs); } ``` `write_rhs` 表示的是這個 byte 寫入點右邊有多少 bits 。以下面的例子來說， `write_rhs` 為 5 | | | |* | | | | | |--|--|--|--|--|--|--|--| |B0|B1|B2|B3|B4|B5|B6|B7| `bitsize` 則是這次要寫入的 bits 數。在 `else` 狀況中， `bitzize <= write_rhs` 也就是說這次的寫入不會跨越 bytes 。我們假設 `bitzise` 為 3 ，並省略 DDD 以判斷其用途。這個情況下，結果應該變成： | | | |* | | | | | |--|--|--|--|--|--|--|--| |B0|B1|B2|B3'|B4'|B5'|B6|B7| 但是最後一行 `*dest++ = (original & mask) | (data >> write_lhs);` 只會保護前 3 位，並把 `data` 放在正確位置，後面的 B6 B7 會被 mask 清除： | | | |* | | | | | |--|--|--|--|--|--|--|--| |B0|B1|B2|B3'|B4'|B5'|0|0| DDD 就是為了修正對後面的保護。原來的 mask 為：`11100000` 為了讓 `original & mask` 時能夠保持後面，我們期望 mask 為： `11100011` ，也就是我們希望 `mask |= 00000011` 。推廣成變數表示就是要讓 `mask |= ` 一個左邊有 `(write_lhs + bitsize)` 個 0 且右邊為 1 的 mask。 `write_mask` 提供了這樣的特性，所以 DDD = `mask |= write_mask[write_lhs+bitsize]` 答案為 `(b)` ### 10. CCC = ? ```c if (s->type == CSTR_ONSTACK) { int i = CCC; while (i < CSTR_STACK_SIZE - 1) { s->cstr[i] = *str; if (*str == 0) return s; ++s->hash_size; ++str; ++i; } s->cstr[i] = 0; } ``` 這裡的 `i` 作為放 `*str` 入 `s->cstr` 的 index，直接構造出 `int i = strlen(s->cstr);` ，但是這不在選項內。再觀察可以發現 `hash_size` 這個成員在 `CSTR_ONSTACK` 狀態時存放著 `cstr` 的長度。所以 CCC = `s->hash_size;` 。答案為 `(c)` ## 延伸 ### 測驗 1 這裡的 doubly linked list 將以這樣的圖表示： ```graphviz digraph { a[shape=record, label="\<prev\>|Data|\<next\>"] } ``` #### 運作原理： `list_merge_sort` **list_splice_tail** ```c /** * list_splice_tail() - Add list nodes from a list to end of another list * @list: pointer to the head of the list with the node entries * @head: pointer to the head of the list */ static inline void list_splice_tail(struct list_head *list, struct list_head *head) { struct list_head *head_last = head->prev; struct list_head *list_first = list->next, *list_last = list->prev; if (list_empty(list)) return; head->prev = list_last; list_last->next = head; list_first->prev = head_last; head_last->next = list_first; } ``` 全是 linked list 的操作，看圖就可以了。起始： ```graphviz digraph { rankdir=LR head -> head_st l2[shape=record, label="{<h>|H2|<t>}"] head_st[shape=record, label="{<h>|H1|<t>}"] head_st:t -> l2:h [dir=both] head_c[shape=none, label="..."] l2:t -> head_c [dir=both] tail_h[shape=record, label="{<h>|Hn|<t>}"] head_c -> tail_h:h [dir=both] tail_h:t -> head_st:h [dir=both] list -> list_st l[shape=record, label="{<h>|L2|<t>}"] list_st[shape=record, label="{<h>|L1|<t>}"] list_st:t -> l:h [dir=both] list_c[shape=none, label="..."] l:t -> list_c [dir=both] tail[shape=record, label="{<h>|Ln|<t>}"] list_c -> tail:h [dir=both] tail:t -> list_st:h [dir=both] } ``` 結束： ```graphviz digraph { rankdir=LR head -> head_st l2[shape=record, label="{<h>|H2|<t>}"] head_st[shape=record, label="{<h>|H1|<t>}"] head_st:t -> l2:h [dir=both] head_c[shape=none, label="..."] l2:t -> head_c [dir=both] tail_h[shape=record, label="{<h>|Hn|<t>}"] head_c -> tail_h:h [dir=both] tail:t -> head_st:h[dir=both] list -> list_st l[shape=record, label="{<h>|L2|<t>}"] list_st[shape=record, label="{<h>|L1|<t>}"] list_st:t -> l:h [dir=both] list_c[shape=none, label="..."] l:t -> list_c [dir=both] tail[shape=record, label="{<h>|Ln|<t>}"] list_c -> tail:h [dir=both] tail_h:t -> list_st:h [dir=both] } ``` 兩個串列被連接在一起。 **list_merge** ```c static void list_merge(struct list_head *lhs, struct list_head *rhs, struct list_head *head) { INIT_LIST_HEAD(head); if (list_empty(lhs)) { list_splice_tail(lhs, head); return; } if (list_empty(rhs)) { list_splice_tail(rhs, head); return; } while (!list_empty(lhs) && !list_empty(rhs)) { char *lv = list_entry(lhs->next, list_ele_t, list)->value; char *rv = list_entry(rhs->next, list_ele_t, list)->value; struct list_head *tmp = strcmp(lv, rv) <= 0 ? lhs->next : rhs->next; list_del(tmp); list_add_tail(tmp, head); } list_splice_tail(list_empty(lhs) ? rhs : lhs, head); } ``` 前面兩個狀況是當有一邊是 empty 時，只要跟頭接起來就好： ```c if (list_empty(lhs)) { list_splice_tail(lhs, head); return; } if (list_empty(rhs)) { list_splice_tail(rhs, head); return; } ``` 接下來就是比較哪個 `value` 比較小，從原來的串列移除後，放`head`：（小的先放 => 小的在前面） ```c while (!list_empty(lhs) && !list_empty(rhs)) { char *lv = list_entry(lhs->next, list_ele_t, list)->value; char *rv = list_entry(rhs->next, list_ele_t, list)->value; struct list_head *tmp = strcmp(lv, rv) <= 0 ? lhs->next : rhs->next; list_del(tmp); list_add_tail(tmp, head); } ``` 最後把有剩的那邊，剩下的接起來： ```c list_splice_tail(list_empty(lhs) ? rhs : lhs, head); ``` merge 就完成了。 **list_merge_sort** ```c void list_merge_sort(queue_t *q) { if (list_is_singular(&q->list)) return; queue_t left; struct list_head sorted; INIT_LIST_HEAD(&left.list); list_cut_position(&left.list, &q->list, &get_middle(&q->list)->list); list_merge_sort(&left); list_merge_sort(q); list_merge(&left.list, &q->list, &sorted); INIT_LIST_HEAD(&q->list); list_splice_tail(&sorted, &q->list); } ``` 最剛開始是 base case ，當只有一個元素的時候不用排。其他狀況我們就開始排序。 merge sort 首先把原來的串列切成兩半，這裡由 `list_cut_position(&left.list, &q->list, &get_middle(&q->list)->list);` 達成，分別為 `left` 和 `q` 。 [`get_middle()` 的說明](https://hackmd.io/1Qv-3uXtTkGb8VQPsCVsyQ?both#1-COND1--) 分成兩半之後，讓兩邊也排序好： ```c list_merge_sort(&left); list_merge_sort(q); ``` 之後將兩半 merge 起來結果是 `sorted` ： ```c list_merge(&left.list, &q->list, &sorted); ``` 最後把 `sorted` 弄到 `q` 上： ```c INIT_LIST_HEAD(&q->list); list_splice_tail(&sorted, &q->list); ``` --- #### 改進：關於 `queue_t` **1. 評估** 仔細看看 `queue_t` ： ```c typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; size_t size; struct list_head list; } queue_t; ``` 可以發現除了 `size` 之外，其他東西都可以由 `list_head` 提供。檢查使用 `size` 的地方： ```c static queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = q->tail = NULL; q->size = 0; INIT_LIST_HEAD(&q->list); return q; } ``` ```c bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; char *new_value = strdup(s); if (!new_value) { free(newh); return false; } newh->value = new_value; newh->next = q->head; q->head = newh; if (q->size == 0) q->tail = newh; q->size++; list_add_tail(&newh->list, &q->list); return true; } ``` 雖然有在維護 `size` 但是使用的地方只有用來判斷這個 `q` 是不是空的。如果我們能夠將 `q->size == 0` 的判斷取代，`queue_t` 將能被完全移除。 **2. 修改 - 移除 `size`** `q->size == 0` 這裡的判斷意義跟 `list_empty` 是一樣的，我們更改成： ```c if (list_empty(&(q->list))) q->tail = newh; ``` 並且移除 `size` 及其對應的操作。 ```c typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; struct list_head list; } queue_t; ``` ```c static queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = q->tail = NULL; INIT_LIST_HEAD(&q->list); return q; } ``` ```c bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; char *new_value = strdup(s); if (!new_value) { free(newh); return false; } newh->value = new_value; newh->next = q->head; q->head = newh; if (list_empty(&(q->list))) q->tail = newh; list_add_tail(&newh->list, &q->list); return true; } ``` 如此一來， `queue_t` 提供的東西將可以完全由 `list_head` 提供。便可移除 `queue_t` 。 [commit 0fda187903](https://github.com/secminhr/quiz2/tree/0fda1879039eaecceede4a6cce92df42a05a72b7) **3. 修改 - 移除 `queue_t`** 為了移除 `queue_t` 我們要找到每個成員如何用 `list_head` 對應。 `queue_t.list` 不需對應，我們將直接使用這個來取代 `queue_t` `queue_t.head` -> `list_entry(list.next, list_ele_t, list)` `queuet.tail` -> `list_entry(list.prev, list_ele_t, list)` 除了 `q_insert_head` 之外的修改都能用上面的對應。 `q_insert_head` 則多處理了第一個插入。 ```c bool q_insert_head(struct list_head *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; char *new_value = strdup(s); if (!new_value) { free(newh); return false; } newh->value = new_value; newh->next = list_empty(q) ? NULL : list_entry(q->next, list_ele_t, list); newh->list.next = q->next; q->next->prev = &(newh->list); newh->list.prev = q; q->next = &(newh->list); return true; } ``` 當 `newh` 是第一個的時候，他的 `next` 會是 NULL 。不能使用 `list_entry` 的原因是此時 `q->next` 還是自己。 ```c newh->next = list_empty(q) ? NULL : list_entry(q->next, list_ele_t, list); ``` [commit 05109a1738](https://github.com/secminhr/quiz2/tree/05109a173890a01eca6da8cb598e6620f49139ac) #### 改進：關於 `list_ele_t` `list_ele_t` 裡面有個成員是 `next` ，然而這個成員的功能（連結）與 `list_head` 的功能重複。將其移除後 `list_ele_t` 將變得更單純，其連結行為將完全由 `list_head` 代理。在 `q_free` 裡面使用 `next` 來走訪整個串列，我們以 `list_head.next` 取代。 ```c static void q_free(struct list_head *q) { if (!q) return; struct list_head *current = q->next; while (current != q) { list_ele_t *tmp = list_entry(current, list_ele_t, list); current = current->next; free(tmp->value); free(tmp); } free(q); } ``` `q_insert_head` 裡面維護 `next` 的部分拔除。 ```c bool q_insert_head(struct list_head *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; char *new_value = strdup(s); if (!new_value) { free(newh); return false; } newh->value = new_value; newh->list.next = q->next; q->next->prev = &(newh->list); newh->list.prev = q; q->next = &(newh->list); return true; } ``` [commit 0543e3f446](https://github.com/secminhr/quiz2/tree/0543e3f4469ff75ad5b6c5299ba9469d460f4a1a) ### 測驗二 #### 運作原理比較詳細的原理（包含數字的選擇等等）請看 [5~8 題](https://hackmd.io/1Qv-3uXtTkGb8VQPsCVsyQ?both#5-X--) ```c uint16_t func(uint16_t N) { /* change all right side bits to 1 */ N |= N >> 1; N |= N >> 2; N |= N >> 4; N |= N >> 8; return (N + 1) >> 1; } ``` power-of-2 在二進位的表現形式裡面一定只有一個 1 ，所以要回傳小於或等於 N 的 power-of-2 就是要保持 N 在二進位表示的第一個 1 並讓其他位都是 0 。這題的做法是讓第一個 1 的所有位變成 1 ，所以加 1 會進一位右邊全部變成 0，再右移一位就讓進位的那個 1 回到本來最左邊 1 的位置了。但是這裡的運算中間有一步會進一位，有沒有可能故意讓進位的 1 overflow 而結果變成 0？ ```c uint16_t func(uint16_t N) { /* change all right side bits to 1 */ N |= N >> 1; N |= N >> 2; N |= N >> 4; N |= N >> 8; printf("%d\n", (N+1) >> 1); return (N + 1) >> 1; } int main() { func(0xFFFF); return 0; } ``` ```shell $ ./main 32768 ``` 然而並沒有發生。 #### 進位問題先看看組語 *註：只截取 func 部分* *使用 objdump 以 intel 語法呈現* ```assembly 0000000000001149 <func>: 1149: f3 0f 1e fa endbr64 114d: 55 push rbp 114e: 48 89 e5 mov rbp,rsp 1151: 48 83 ec 10 sub rsp,0x10 1155: 89 f8 mov eax,edi 1157: 66 89 45 fc mov WORD PTR [rbp-0x4],ax 115b: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 115f: 66 d1 e8 shr ax,1 1162: 66 09 45 fc or WORD PTR [rbp-0x4],ax 1166: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 116a: 66 c1 e8 02 shr ax,0x2 116e: 66 09 45 fc or WORD PTR [rbp-0x4],ax 1172: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 1176: 66 c1 e8 04 shr ax,0x4 117a: 66 09 45 fc or WORD PTR [rbp-0x4],ax 117e: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 1182: 66 c1 e8 08 shr ax,0x8 1186: 66 09 45 fc or WORD PTR [rbp-0x4],ax 118a: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 118e: 83 c0 01 add eax,0x1 1191: d1 f8 sar eax,1 1193: 89 c6 mov esi,eax 1195: 48 8d 3d 68 0e 00 00 lea rdi,[rip+0xe68] # 2004 <_IO_stdin_used+0x4> 119c: b8 00 00 00 00 mov eax,0x0 11a1: e8 aa fe ff ff call 1050 <printf@plt> 11a6: 0f b7 45 fc movzx eax,WORD PTR [rbp-0x4] 11aa: 83 c0 01 add eax,0x1 11ad: d1 f8 sar eax,1 11af: c9 leave 11b0: c3 ret ``` 這裡有一些奇怪的地方： 1. N 放在 eax `movzx eax,WORD PTR [rbp-0x4]` 的用途是把 or 的結果放在 `eax` 跟我們預期的大小 (16 bits) 不一樣。我們可以發現使用 eax 是為了下面一點。 2. add 使用 eax ```assembly add eax,0x1 sar eax,1 ``` 這個選擇非常奇怪，對於 16 bits 的數字來說，應該使用 ax 就可以了。我們可以把位元數往上加，看看結果怎麼樣。 ```c uint32_t func(uint32_t N) { /* change all right side bits to 1 */ N |= N >> 1; N |= N >> 2; N |= N >> 4; N |= N >> 8; N |= N >> 16; printf("%d\n", (N+1) >> 1); return (N + 1) >> 1; } int main() { func(0xFFFFFFFF); return 0; } ``` ```shell $ ./main 0 ``` overflow 了。再看其組語： ```assembly 0000000000001149 <func>: 1149: f3 0f 1e fa endbr64 114d: 55 push rbp 114e: 48 89 e5 mov rbp,rsp 1151: 48 83 ec 10 sub rsp,0x10 1155: 89 7d fc mov DWORD PTR [rbp-0x4],edi 1158: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 115b: d1 e8 shr eax,1 115d: 09 45 fc or DWORD PTR [rbp-0x4],eax 1160: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 1163: c1 e8 02 shr eax,0x2 1166: 09 45 fc or DWORD PTR [rbp-0x4],eax 1169: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 116c: c1 e8 04 shr eax,0x4 116f: 09 45 fc or DWORD PTR [rbp-0x4],eax 1172: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 1175: c1 e8 08 shr eax,0x8 1178: 09 45 fc or DWORD PTR [rbp-0x4],eax 117b: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 117e: c1 e8 10 shr eax,0x10 1181: 09 45 fc or DWORD PTR [rbp-0x4],eax 1184: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 1187: 83 c0 01 add eax,0x1 118a: d1 e8 shr eax,1 118c: 89 c6 mov esi,eax 118e: 48 8d 3d 6f 0e 00 00 lea rdi,[rip+0xe6f] # 2004 <_IO_stdin_used+0x4> 1195: b8 00 00 00 00 mov eax,0x0 119a: e8 b1 fe ff ff call 1050 <printf@plt> 119f: 8b 45 fc mov eax,DWORD PTR [rbp-0x4] 11a2: 83 c0 01 add eax,0x1 11a5: d1 e8 shr eax,1 11a7: c9 leave 11a8: c3 ret ``` 這次則確實使用了符合的大小 32 bits (`eax`) 再提升到 64 bits 也可以發現使用符合的大小 (`rax`) 這讓 16 bits 的情況更像一個刻意的選擇。在 c 的規格裡可以發現為何會放在 32 bits 的空間裡。我們看看選擇的原因：`(N+1) >> 1` 首先我們看看 N+1 的這個 1 是什麼型態。 6.4.4.1 > The type of an integer constant is the **first** of the corresponding list in which its value can be represented. > | Suffix | Decimal Constant | Octal or Hexadecimal Constant | | -------- | -------- | -------- | | none | int long int long long int | int unsigned int long int unsigned long int long long int unsigned long long int | ...( 後面還有很長 ) 由此可知 1 的型態是 int (32 bit) 現在我們知道 + 兩邊的型態是 uint16_t 跟 int 再看 6.3.1.8 > Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type. 因為 int 可以表現所有 uint16_t 能表現的值，所以 uint16_t 會被轉成 int 來做運算。既然如此，進位沒有 overflow 就很正常了。 #### is_power_of_2 這個函式會回傳 n 是不是 2 的冪（規定 0 不是 2 的冪）： ```c /** * is_power_of_2() - check if a value is a power of two * @n: the value to check * * Determine whether some value is a power of two, where zero is * *not* considered a power of two. * Return: true if @n is a power of 2, otherwise false. */ static inline __attribute__((const)) bool is_power_of_2(unsigned long n) { return (n != 0 && ((n & (n - 1)) == 0)); } ``` 因為是二的冪所以二進位裡面只有一個 1 而且右邊全部是 0 ，當 n - 1 的時候會把原本的 1 變成 0 。所以如果 n 是二的冪的話 `n & (n - 1)` 會讓所有位都變成 0 。 #### __roundup_pow_of_two 這個函式會找到一個大於或等於 n 且最靠近的 2 的冪。 ```c /** * __roundup_pow_of_two() - round up to nearest power of two * @n: value to round up */ static inline __attribute__((const)) unsigned long __roundup_pow_of_two(unsigned long n) { return 1UL << fls_long(n - 1); } ``` `fls_long` 這個函式的用途是找到「從左邊開始第一個 1 」的位置。 MSB 的編號隨著情況改變，以 unsigned long 的長度而定， LSB 的編號則是 1 。另外如果數字是 0 則會回傳 0 。例如對於 `12`(`1100`) 他將回傳 4 。這裏要處理的問題是 1 應該向左移幾位。因為要大於或等於的關係，所以 ``` 如果本身是 2 的冪左移到最左邊的 1 的位置反之左移到最左邊的 1 的左邊一位 ``` 而這裡用了 `fls_long` 來解決這個問題。我們可以觀察到 `1 << fls_long(x)` 總是會讓 1 左移到最左邊的 1 的左邊一位。例如 `1100` -> `10000` `100` -> `1000` 但是這讓是 2 的冪的數很困擾，所以選擇用 `fls_long(n-1)` 。當是 n 是 2 的冪的時候， `-1` 會讓 n 退一位，這樣左移的操作就正好會到原來位置。反之當 n 不是 2 的冪的時候，`-1` 不會造成退位， `fls_long(n) == fls_long(n-1)` ，於是可以達到原本「最左邊的 1 的左邊一位」的效果。 #### __rounddown_pow_of_two ```c /** * __rounddown_pow_of_two() - round down to nearest power of two * @n: value to round down */ static inline __attribute__((const)) unsigned long __rounddown_pow_of_two(unsigned long n) { return 1UL << (fls_long(n) - 1); } ``` 既然是找小於或等於的，那左移的策略就變成左移到最左邊的 1 。上面已經觀察到 `1 << fls_long(x)` 的行為是讓 1 左移到最左邊的 1 的左邊一位，那我們用 `fls_long(n) - 1` 就可以移到最左邊的 1 那位了。