2018q3 Homework3 (list)

# 2018q3 Homework3 (list) contributed by < `ofAlpaca` > ###### tags: `CSIE5006` `HW` ## 自我檢查事項 #### 為何 Linux 採用 macro 來實作 linked list？一般的 function call 有何成本？ * 使用 macro 可以省去多餘的 function call 且執行速度較快，因為 preprocessor 會將 macro 展開至呼叫 macro 的程式碼內。 * 使用 function call 最大的成本就是需要**呼叫堆疊** (Stack) ，所以在執行時期較花費時間。 * 使用 macro 不會呼叫堆疊，但會展開至程式碼內，因此執行速度較 function call 快，屬於空間換取時間。 * 不過 function call 的優點是會做 **type checking** ，而 macro 不會。參考 : [Macro vs Function](https://www.geeksforgeeks.org/macros-vs-functions/) [Macro vs Function(二)](https://dotblogs.com.tw/ace_dream/2016/01/16/function_macro) --- #### Linux 應用 linked list 在哪些場合？舉三個案例並附上對應程式碼，需要解說，以及揣摩對應的考量 * linux kernel 中的 timer 使用了 linked list 來動態儲存計時器 * 計時器用於指定某段時間後才執行特定函式 * `timer_list` 計時器的結構，儲存著 timeout 與執行函式的 function pointer 等資訊 * 使用 doubly-linked list 來記錄 timer 除了能動態的增刪以外，還能根據 `expires` 的先後來排序 ```clike struct timer_list { struct list_head entry; unsigned long expires; struct tvec_base *base; void (*function)(unsigned long); unsigned long data; int slack; }; ``` 參考: [Timers](http://www.science.unitn.it/~fiorella/guidelinux/tlk/node121.html) --- #### GNU extension 的 [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) 有何作用？在程式碼中扮演什麼角色？ * `typeof` 是 operator ，用於回傳 object 的型別，是 gcc 的 extension。 * 常用於讓 macro 在傳遞參數時能動態接受多種型別，可以讓程式碼更加有彈性。例如 : ```clike #define max(a,b) \ ({ typeof (a) _a = (a); \ typeof (b) _b = (b); \ _a > _b ? _a : _b; }) ``` --- #### 解釋以下巨集的原理 ```Clike #define container_of(ptr, type, member) \ __extension__({ \ const __typeof__(((type *) 0)->member) *__pmember = (ptr); \ (type *) ((char *) __pmember - offsetof(type, member)); \ }) ``` 此 macro 的目的是為了從 struct 中的 member 推算出原本 struct 的位址。以下逐一分析 : * 先透過 `__typeof__` 得到 type 中的成員 `member` 的型別，並產生一個 pointer to 給該型別的 `__pmember` 。 * 將 `ptr` assign 給 `__pmember` 。 * `__pmember` 目前指向的是 `member` 的位址。 * `offsetof(type, member)` 可以算出 `member` 在 `type` 這個 struct 中的位移量， offset 。 * 將絕對位址 `(char*) __pmember` 減去 `offsetof(type, member)` ，可以得到 struct 的起始位址。 * 最後 `(type*)` 再將起始位置轉型為 pointer to `type` 。參考 : [你所不知道的 C 語言 : 指標篇](https://hackmd.io/s/HyBPr9WGl#C-%E8%AA%9E%E8%A8%80-offsetof-%E5%B7%A8%E9%9B%86%E7%9A%84%E5%AE%9A%E7%BE%A9) --- #### 除了你熟悉的 add 和 delete 操作，`list.h` 還定義一系列操作，為什麼呢？這些有什麼益處？ * 主要的目的是為了提供 linked list 的 Linux Kernel API 。 * 在 `list.h` 中的各種操作都是只針對 linked list 本身而已，完全不影響資料的部份。 * 透過這種寫法，只要將 `list_head` 包含進 structure ，就可以讓它也能隨心所欲地放資料，同時又不用自己維護一套 doubly linked list 。 ![](https://i.imgur.com/d3bG8t6.png) 參考 : [Linux kernel api](https://kernel.readthedocs.io/en/sphinx-samples/kernel-api.html#doubly-linked-lists) [運作流程](https://myao0730.blogspot.com/2016/12/linux.html) --- #### `LIST_POISONING` 這樣的設計有何意義？ * 根據 `list.h` 內的 `list_del()` 描述 : ``` * LIST_POISONING can be enabled during build-time to provoke an invalid memory * access when the memory behind the next/prev pointer is used after a list_del. * This only works on systems which prohibit access to the predefined memory * addresses. ``` 意思是當 next/prev pointer 所指向的記憶體空間被 `list_del()` 刪除後，為了避免 dangling pointer 的問題，所以讓 pointer 指向違法的記憶體位址，之後如果 pointer 被 reference 時就會觸發 page fault 。 * 將 pointer 指向會產生 page fault 的位址。`linux/poison.h` 有提到也可用 `0xdead000000000000` ，這樣就可以明顯地看出是 poisoning 。 ```clike #ifdef LIST_POISONING node->prev = (struct list_head *) (0x00100100); node->next = (struct list_head *) (0x00200200); #endif ``` :::info 位址 `0x00100100` 與 `0x00200200` 有什麼特別意思嗎？ ::: - [x] 根據 [這裡](https://wiki.litmus-rt.org/litmus/KernelDebugging)，這兩個位址都是 Well-known address ，在 linux kernel 中是常數位址，為了能讓人發現到有錯誤發生而故意挑選的位址。參考 : [pointer poison](https://stackoverflow.com/questions/27801360/what-is-the-meaning-of-0xdead000000000000/27806246) [dangling pointer](https://en.wikipedia.org/wiki/Dangling_pointer) [poison.h](https://github.com/torvalds/linux/blob/master/include/linux/poison.h) --- #### linked list 採用環狀是基於哪些考量？根據 Linux Device Driver Chapter 11 Linked lists 中講到 : > Since Linux lists are circular, the head of the list is not generally different from any other entry. 如果 linked list 是環狀結構，那麼每個節點的操作都是一樣的，因此不需要再特別去維護 head / tail pointer ，從任何節點都可以尋訪整個 linked list 。參考 : [linux device driver 3rd-edition](https://bootlin.com/doc/books/ldd3.pdf) [Quora : Why are all the linked lists circular in the Linux Kernel?](https://www.quora.com/Why-are-all-the-linked-lists-circular-in-the-Linux-Kernel) --- #### `list_for_each_safe` 和 `list_for_each` 的差異在哪？"safe" 在執行時期的影響為何？ * `list_for_each_safe` 和 `list_for_each` 的差異在於 `safe` 可以接受刪除節點。 * 先講結論，`safe` 主要是為了防止刪除當下的這個節點後會遺失其 next pointer 因而出現錯誤，所以多增加了一個 safe pointer 去保存它，這樣無論怎麼修改那個節點都不會影響到下一個節點。 ```clike #define list_for_each(node, head) \ for (node = (head)->next; node != (head); node = node->next) #define list_for_each_safe(node, safe, head) \ for (node = (head)->next, safe = node->next; node != (head); \ node = safe, safe = node->next) ``` --- #### for_each 風格的開發方式對程式開發者的影響為何？提示：對照其他程式語言，如 Perl 和 Python * for_each 迴圈風格最大的特點在於沒有明顯的 counter ，基本上就是把集合中的所有元素都跑過一遍。 * 可以避免 off-by-one error (OBOE) ，指的是執行迴圈時的邊界條件判斷寫錯，導致迴圈多跑或少跑一次。 * for_each in python : ```python set = [1,3,7,5,11,13,17,23] s = 0 for item in set: s = s + item ``` 與 C 語言的 for 迴圈比起來精簡很多，增加了程式碼的易讀性，不用特別去宣告 iterator 或是 counter ，也不用去擔心條件判斷是否正確。參考 : [Foreach_loop](https://en.wikipedia.org/wiki/Foreach_loop) --- #### 程式註解裡頭大量存在 `@` 符號，這有何意義？你能否應用在後續的程式開發呢？提示: 對照看 Doxygen * Doxygen 是一套自動生成文件的程式，能夠解析原始碼的註解後產生文件，而 `@` 就是其所有指令的開頭。 > All commands in the documentation start with a backslash (\) or an at-sign (@). * 在 `list.h` 中的 `@` 用的指令是 `param` ，就是告訴 Doxygen 描述這個 function 參數的用途。 > Starts a parameter description for a function parameter with name <parameter-name>, followed by a description of the parameter. * 如果之後要和他人進行協同開發作業，各函式的註解就是必要的了，一來可以幫助他人快速理解自己的程式碼內容，二來可以幫助未來的自己回想。 * 可以學習像 `kernel-doc` 風格註解 : ``` /** * function_name() - Brief description of function. * @arg1: Describe the first argument. * @arg2: Describe the second argument. * One can provide multiple line descriptions * for arguments. * * A longer description, with more discussion of the function function_name() * that might be useful to those using or modifying it. Begins with an * empty comment line, and may include additional embedded empty * comment lines. * * The longer description may have multiple paragraphs. * * Context: Describes whether the function can sleep, what locks it takes, * releases, or expects to be held. It can extend over multiple * lines. * Return: Describe the return value of foobar. * * The return value description can also have multiple paragraphs, and should * be placed at the end of the comment block. */ ``` 參考 : [Doxygen 筆記](https://blog.longwin.com.tw/2011/04/doxygen-document-generator-2011/) [Doxygen Manual - cmdparam](http://www.stack.nl/~dimitri/doxygen/manual/commands.html#cmdtparam) [Function documentation](https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation) --- #### `tests/` 目錄底下的 unit test 的作用為何？就軟體工程來說的精神為何？ * unit test 是用於檢驗程式的某一功能是否正確的測試方法。 * 在 `tests/` 底下的 `.c` 檔是用於檢驗 `list.h` 中的 linked list 操作是否正確的 unit test 。 * 在 make 時期會將 unit test 編譯後並執行，若是 linked list 的操作不符合預期，則會 **throw assertion** 並中止程式。 ```shell $ make CC tests/list_add_tail.o LD tests/list_add_tail *** Validating tests/list_add_tail *** list_add_tail: tests/list_add_tail.c:34: main: Assertion `i == 1' failed. Aborted (core dumped) Makefile:49: recipe for target 'tests/list_add_tail.ok' failed make: *** [tests/list_add_tail.ok] Error 134 ``` 參考 : [Unit testing](https://en.wikipedia.org/wiki/Unit_testing) --- #### `tests/` 目錄的 unit test 可如何持續精進和改善呢？ ![](https://i.imgur.com/nBD17BY.png) --- ## Make `lab0-c` to `lab-list` ### 改寫目標 * 學習 linux-list 裡頭的技巧與 unit test 的設計 * 改寫為 doubly-linked list 且考慮到 circular * `qtest` 也需要修改至可支援 --- ### `queue.h` * 資料結構調整 * 增加 `list_head` 來作為 doubly-linked list 的主要部分，包含 `next` 與 `prev` 指標。 * 將原本 `list_ele_t` 的 `next` 指標改為 `list_head` 。 * 將原本 `queue_t` 內 `list_ele_t` 的 `head` 與 `tail` 指標移除，改為 `list_head` 。 * 結構如下圖 : ![](https://i.imgur.com/8ipW5oS.png) * 透過 linux-kernel 風格的 linked list 可以不用再特別處理 head 或 tail 指標，因為大家都是相同的。 ```clike= struct list_head { struct list_head *prev; struct list_head *next; }; typedef struct ELE { char *value; struct list_head list; } list_ele_t; typedef struct { size_t size; // size of the linked list struct list_head head; /* Doubly Linked list of elements */ } queue_t; ``` * 新增 macro * 為了能夠從 `list_head` 找到 `list_ele_t` 的 entry ，所以增加 `container_of` 巨集。 * 增加 `list_for_each_safe` 巨集來尋訪 linked list 。 ```clike= #define container_of(ptr, type, member) \ __extension__({ \ const __typeof__(((type *) 0)->member) *__pmember = (ptr); \ (type *) ((char *) __pmember - offsetof(type, member)); \ }) #define list_for_each_safe(node, safe, head) \ for (node = (head)->next, safe = node->next; node != (head); \ node = safe, safe = node->next) ``` * 新增 linked list 的操作函式，源自於 [linux-list](https://github.com/sysprog21/linux-list) 的 `list.h` ```clike= static inline void list_add(struct list_head *node, struct list_head *entry) { struct list_head *next = entry->next; next->prev = node; node->next = next; node->prev = entry; entry->next = node; } static inline void list_add_tail(struct list_head *node, struct list_head *head) { struct list_head *prev = head->prev; prev->next = node; node->next = head; node->prev = prev; head->prev = node; } static inline void list_del(struct list_head *node) { struct list_head *next = node->next; struct list_head *prev = node->prev; next->prev = prev; prev->next = next; } static inline void INIT_LIST_HEAD(struct list_head *head) { head->next = head; head->prev = head; } ``` --- ### `queue.c` * `q_new()` * 新增初始化 `list_head` ，使得 `next` 與 `prev` 都指向自己的 `list_head` 。 ```clike= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (q == NULL) return NULL; INIT_LIST_HEAD(&q->head); q->size = 0; return q; } ``` * `q_free()` * 透過 `list_for_each_safe` 尋訪各個 `list_head` 並處理好要刪除節點的指標。 * 再使用 `container_of` 來取得 `list_ele_t` 的 entry 並釋放其記憶體。 ```clike= void q_free(queue_t *q) { if (q != NULL) { struct list_head *li = NULL, *lis = NULL; list_ele_t *rm_node; list_for_each_safe(li, lis, &q->head) { list_del(li); rm_node = container_of(li, list_ele_t, list); free(rm_node); } free(q); } } ``` * `q_insert_head()` * 產生新的 `list_ele_t * newh` ，並初始化其 `list_head` 。 * 從 `q` 內的 `list_head head` 的 next 插入。 * 示意圖，此為 circular 只是沒畫出指標。 ![](https://i.imgur.com/g0DSZia.png) ```clike= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh; if (q != NULL) { newh = malloc(sizeof(list_ele_t)); if (newh != NULL) { newh->value = strdup(s); INIT_LIST_HEAD(&newh->list); list_add(&newh->list, &q->head); q->size++; return true; } } return false; } ``` * `q_insert_tail()` * 原理和 `q_insert_head` 相同，只差在是由 prev 那端插入。 * 示意圖，此為 circular 只是沒畫出指標。 ![](https://i.imgur.com/UXXIYau.png) ```clike= bool q_insert_tail(queue_t *q, char *s) { list_ele_t *newh; if (q != NULL) { newh = malloc(sizeof(list_ele_t)); if (newh != NULL) { newh->value = strdup(s); INIT_LIST_HEAD(&newh->list); list_add_tail(&newh->list, &q->head); q->size++; return true; } } return false; } ``` * `q_remove_head()` * 所要刪除的節點位於 `q` 的下個節點。 * 使用 `container_of` 後得到刪除節點的 entry ，並在 `list_del` 刪除它。 * 最後在釋放其記憶體。 * 在刪除之後， `queue_t` 的 next 與 prev 指標會指向自己的 `list_head` 。 ![](https://i.imgur.com/mVHrCGq.png) ```clike= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (q != NULL) { if (q->head.next != &q->head) { list_ele_t *rm_node = container_of(q->head.next, list_ele_t, list); if (sp != NULL) { memcpy(sp, rm_node->value, bufsize); // copy the removed string to sp sp[bufsize - 1] = '\0'; // add terminator at the end } list_del(q->head.next); free(rm_node); // free the removed node q->size--; return true; } } return false; } ``` * `q_reverse()` * 使用 `list_for_each_safe` 尋訪每個節點，並將其 next 與 prev 指標對調。 * `list_for_each_safe` 的 `head` 不會走訪到，因為判斷式 `node != (head)` 會在走回 `head` 時跳出迴圈。 ```clike #define list_for_each_safe(node, safe, head) \ for (node = (head)->next, safe = node->next; node != (head); \ node = safe, safe = node->next) ``` * 所以還需要做最後一次指標對調才得以完成反轉佇列。 ```clike= void q_reverse(queue_t *q) { if (q != NULL) { struct list_head *li = NULL, *lis = NULL, *tmp = NULL; list_for_each_safe(li, lis, &q->head) { tmp = li->next; li->next = li->prev; li->prev = tmp; } tmp = li->next; li->next = li->prev; li->prev = tmp; } ``` --- ### `qtest.c` * 原本的 `qtest.c` 可以直接透過 `list_ele_t` 來獲得 value 。 * 但是由於修改為 doubly-linked list ，鏈接的部分不再是 `list_ele_t` 而是 `list_head` 。 * 這也是大部分 `qtest.c` 需要修改的部分，以 `show_queue()` 下要印出 queue 內容的部分為例。 * 修改前 : (透過走訪 `list_ele_t` 來印出 value ) ```clike= list_ele_t *e = q->head; if (exception_setup(true)) { while (ok && e && cnt < qcnt) { if (cnt < big_queue_size) report_noreturn(vlevel, cnt == 0 ? "%s" : " %s", e->value); e = e->next; cnt++; ok = ok && !error_check(); } } ``` * 需要先使用 `container_of` 來得知 `list_ele_t` 的 entry 位址才能取得其 value 。 * 修改後 : (透過 `list_head` 來走訪，使用 `container_of` 來取得 entry ) ```clike= struct list_head *e = &q->head; if (exception_setup(true)) { list_ele_t *ent; while (ok && cnt < qcnt) { e = e->next; ent = container_of(e, list_ele_t, list); if (cnt < big_queue_size) report_noreturn(vlevel, cnt == 0 ? "%s" : " %s", ent->value); cnt++; ok = ok && !error_check(); } } ``` * 其中的 while 迴圈內 `ok && e && cnt < qcnt` 判斷式改為 `ok && cnt < qcnt` ，原因是在 circular 的結構下， `list_head * e` 只可能會指回自己，不可能指向 `NULL` 。 * 在原 `show_queue()` 的 447 行，將 `if (e == NULL) ... else` 拿掉，原因是 `e` 在 circular 的結構下是不會為 `NULL` 的，若不移除，一定會進入 else 而產生 ERROR 。 ```clike=447 if (e == NULL) { if (cnt <= big_queue_size) report(vlevel, "]"); else report(vlevel, " ... ]"); } else { report(vlevel, " ... ]"); report( vlevel, "ERROR: Either list has cycle, or queue has more than %d elements", qcnt); ok = false; } ``` --- ### 結果 * `make test` 也可以達到 ==100/100== * 但是相較於原本的 `lab0-c` 除了程式碼更加精簡外，效率不進反退，也可能是我實作的問題 :confused: * `lab0-c` 的 `perf stat ./scripts/driver.py` ``` Performance counter stats for './scripts/driver.py': 514.600727 task-clock (msec) # 1.001 CPUs utilized 33 context-switches # 0.064 K/sec 0 cpu-migrations # 0.000 K/sec 99,410 page-faults # 0.193 M/sec 1,835,221,189 cycles # 3.566 GHz 4,152,494,737 instructions # 2.26 insn per cycle 803,315,987 branches # 1561.047 M/sec 1,004,655 branch-misses # 0.13% of all branches 0.513896153 seconds time elapsed ``` * `lab-list` 的 `perf stat ./scripts/driver.py` ``` Performance counter stats for './scripts/driver.py': 574.978929 task-clock (msec) # 1.001 CPUs utilized 39 context-switches # 0.068 K/sec 8 cpu-migrations # 0.014 K/sec 115,044 page-faults # 0.200 M/sec 2,110,763,505 cycles # 3.671 GHz 4,498,759,005 instructions # 2.13 insn per cycle 831,566,851 branches # 1446.256 M/sec 1,089,808 branch-misses # 0.13% of all branches 0.574192893 seconds time elapsed ```