2019q1 Homework1 (list)

# 2019q1 Homework1 (list) contributed by <[`ndsl7109256`](https://github.com/ndsl7109256)> :::danger 注意作業要求，符合格式規範 :notes: jserv ::: ## 作業要求 - 回答「Linux 風格的 linked list」裡頭「自我檢查清單」的所有問題，需要附上對應的參考資料和必要的程式碼 - 從 linux-list 學習裡頭的技巧，包含裡頭 unit test 的設計，透過給定的 linked list 操作，實作出 merge sort * 附上你的修改過程，也需要讓`qtest` 得以支援 * 可將 dict 裡頭的測試資料拿來作效能分析的輸入 * 思考提升排序效率的方法，需要做平均和最壞狀況的分析 ## 自我檢查事項: - [x] 為何 Linux 採用 macro 來實作 linked list？一般的 function call 有何成本？ - [ ] Linux 應用 linked list 在哪些場合？舉三個案例並附上對應程式碼，需要解說，以及揣摩對應的考量 - [x] GNU extension 的 [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) 有何作用？在程式碼中扮演什麼角色？ - [x] 解釋以下巨集的原理 ```Clike #define container_of(ptr, type, member) \ __extension__({ \ const __typeof__(((type *) 0)->member) *__pmember = (ptr); \ (type *) ((char *) __pmember - offsetof(type, member)); \ }) ``` - [x] 除了你熟悉的 add 和 delete 操作，`list.h` 還定義一系列操作，為什麼呢？這些有什麼益處？ - [x] `LIST_POISONING` 這樣的設計有何意義？ - [x] linked list 採用環狀是基於哪些考量？ - [ ] 什麼情境會需要對 linked list 做排序呢？舉出真實世界的應用，最好在作業系統核心出現 - [ ] 什麼情境會需要需要找到 [第 k 大/小元素](https://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array/) 呢？又，你打算如何實作？ - [x] `list_for_each_safe` 和 `list_for_each` 的差異在哪？"safe" 在執行時期的影響為何？ - [x] for_each 風格的開發方式對程式開發者的影響為何？ * 提示：對照其他程式語言，如 Perl 和 Python - [x] 程式註解裡頭大量存在 `@` 符號，這有何意義？你能否應用在後續的程式開發呢？ * 提示: 對照看 Doxygen - [x] `tests/` 目錄底下的 unit test 的作用為何？就軟體工程來說的精神為何？ - [ ] `tests/` 目錄的 unit test 可如何持續精進和改善呢？ ## 為何 Linux 採用 macro 來實作 linked list？一般的 function call 有何成本？先看一下 gcc 裡對 macro 的定義 > A macro is a fragment of code which has been given a name. Whenever the name is used, it is replaced by the contents of the macro. > [Macros](https://gcc.gnu.org/onlinedocs/cpp/Macros.html)[color=#9d31ea] preprocessor 會在 compile 前將每個 macro 替換成事先 define 好的內容。程式可以直接循序執行，不像 function 要做 push 或 pop 的動作。如此一來便可節省時間。但 macro 的缺點便是因為將每個 macro 展開運作，不像 function 使用固定的 memory ，會隨著使用次數的增長增加占用的 memory space。 - Memory test 寫了段簡單的 [程式(memory.c)](https://gist.github.com/ndsl7109256/44959f81e20f8e452ced6b1375d5c1a6) 並分別運行 macro 和 function 版本，看他們使用了多少 memory。 ``` PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 4570 os2018 20 0 4376 824 760 R 97.0 0.0 19:56.08 macro 4585 os2018 20 0 4376 792 728 R 89.4 0.0 18:16.96 function ``` 藉由 ==top== 查看 process 使用 memory 的情形，便可發現 macro 版的比 fuction 版的多用了點 memory - Time test [程式(time.c)](https://gist.github.com/ndsl7109256/44959f81e20f8e452ced6b1375d5c1a6) ``` $time ./macro real 0m0.014s user 0m0.003s sys 0m0.000s $time ./function real 0m0.039s user 0m0.014s sys 0m0.000s ``` 下 ==time== 執行兩個程式發現 function 運行時間比 macro 多了 2 到 3 倍。 ## Linux 應用 linked list 在哪些場合？舉三個案例並附上對應程式碼，需要解說，以及揣摩對應的考量 ## GNU extension 的 typeof 有何作用？在程式碼中扮演什麼角色 typeof 用於索引變數的 type ，可以吃兩種參數。 - expression - - type - [Typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) ## 解釋以下巨集的原理 ```c= #define container_of(ptr, type, member) \ __extension__({ \ const __typeof__(((type *) 0)->member) *__pmember = (ptr);\ (type *) ((char *) __pmember - offsetof(type, member));\ }) ``` - \_\_extension\_\_ 此 macro 主要用於防止 gcc 產生警告，雖然我用 7.3.0 版本的 gcc ，就算把__extension__拿掉也不會產生警告。 - ((type *) 0)->member ((type *) 0) 用於得到一個 type 型態的 pointer ，其中的 0 換成其他**整數**也可以。再指向 member - offsetof定義在 [<linux/stddef.h>](https://elixir.bootlin.com/linux/latest/source/tools/include/linux/kernel.h#L22) 中，用來計算某一個struct結構的成員相對於該結構起始位址的 offset 。 >#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) 將 member 的 addrss 減去 offset 便可得到 struct 的起始位置。 ```c= #include <stdlib.h> #include <stdio.h> #include <string.h> #define offsetof(TYPE,MEMBER) ((size_t)&((TYPE *)0)->MEMBER) #define container_of(ptr, type, member) \ __extension__({ \ const __typeof__(((type *) 0)->member) *__pmember = (ptr);\ (type *) ((char *) __pmember - offsetof(type, member)); }) typedef struct { char name[16]; int weight; char wife1[12]; char wife2[12]; char wife3[12]; } weeb; int main (void) { weeb nerd = {"your_name",87,"Miku","Kotori","Umi"}; printf("address of struct %p\n", &nerd);//0x7ffc3584dc00 printf("address of wife3 %p\n", &nerd.wife3);//0x7ffc3584dc2c printf("id offset %ld\n", offsetof(weeb,wife3));//output is 44(16 + 4 + 12 +12) printf("offset of id %ld\n", (long)&nerd.wife3 - (long)&nerd ); printf("get address of struct from that of member %p\n", container_of(&(nerd.wife3),weeb, wife3));//0x7ffc3584dc00 return 0; } ``` ## 除了你熟悉的 add 和 delete 操作，`list.h` 還定義一系列操作，為什麼呢？這些有什麼益處？ [==list.h==](https://elixir.bootlin.com/linux/latest/source/tools/firewire/list.h#L2) ## `LIST_POISONING` 這樣的設計有何意義？ ``` ``` 確保不會使用到還沒初始化的 entry ，如果被 reference 將會產生 page fault [==poison.h==](https://elixir.bootlin.com/linux/latest/source/include/linux/poison.h) [Linux鏈結串列struct list_head 研究](https://myao0730.blogspot.com/2016/12/linux.html) ## linked list 採用環狀是基於哪些考量？在一個特定的排序中不斷循環時就需要使用到 circular linked list，在 CPU 排程就會使用到可以快速的對尾端進行操作。若操作不慎使得中間的 link 斷開，有機會復原。可以實做一些環狀或是沒有明確頭尾的資料結構。 ## 什麼情境會需要對 linked list 做排序呢？舉出真實世界的應用，最好在作業系統核心出現 ## 什麼情境會需要需要找到 [第 k 大/小元素](https://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array/) 呢？又，你打算如何實作？ ## list_for_each_safe` 和 `list_for_each` 的差異在哪？"safe" 在執行時期的影響為何？ >#define list_for_each(pos, head) \\ 　　　　for (pos = (head)->next; pos != (head); pos = pos->next) 如果 list_for_each 裡面做刪除節點，可能連 head 本身都刪掉了，==pos != (head)==就無法做判斷了。 >#define list_for_each_safe(pos, n, head) \\ for (pos = (head)->next, n = pos->next; pos != (head); \\ pos = n, n = pos->next) 而 safe 版本的多了個 ==n== 指向 pos 的 next ，確保， n 無效的話就會少走一次 ## for_each 風格的開發方式對程式開發者的影響為何？不用了解 array 或 list 的實際物件數也可以做 traverse ，可讀性也比較好一點。 ## 程式註解裡頭大量存在 @符號，這有何意義？你能否應用在後續的程式開發呢？(提示: 對照看 Doxygen） Doxygen 用於產生程式的說明文件，@有幾種常見用法。 - @file 檔案的註解說明 - @author 作者的資訊 - @brief 用於 class 或 function 的註解中，後面為 class 或 function 的簡易說明 - @param @param arg_name 參數說明主要用於函式說明中，後面接參數的名字，然後再接關於參數的說明。 - @return 後面接函數回傳值的說明。用於 function 的註解中。說明該函數的傳回值。 - @retval @retval value 回傳說明主要用於函式說明中，說明特定傳回值的意義。後面接回傳值，再接該回傳值的說明。 ## `tests/` 目錄底下的 unit test 的作用為何？就軟體工程來說的精神為何？針對 function 逐個做測試，屬於軟體工程裡 ==Implementation== 的過程，確認每個 function 皆正常執行後才能進到下個階段。在每個 unit test 裡便用了許多 `assert` 確定每步皆正常執行。 ## `tests/` 目錄的 unit test 可如何持續精進和改善呢？ ###### tags: `sysprog`,`2019spring`