2019q1 Homework1 (list)

contributed by < NoahNut >

實驗環境

$ uname -a
Linux noah-UX330UA 4.15.0-43-generic #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ gcc -version
gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010

Linux 中的 Linked list

Linux 中的 Linked list 是用 doubly linked list 所實作，並且用到大量的 macro 跟 inline function。

注意書寫方式，doubly linked list 才是正確術語
jserv

macro
macro 中的內容會在編譯前就將其內容展開，這樣可以讓程式碼整潔且多個地方都可以使用，且可以撰寫成類似函式的形式，這樣不用花費跳到呼叫新函式的代價。
inline function
inline 會要求編譯器將這個函式展作為最佳化，但編譯器可以自行決定是否要採納。

注意看 C99/C11 規格書對於 inline 的描述，特別是 6.7.4 一節。不要只是憑記憶/憑感覺紀錄，找出第一手材料並解讀
jserv

Macro vs Function

在 list.h 中可以看到有大量的 for 迴圈是用 macro
而使用 macro 來代替函式，從函式本身運作的方式就知道還要將值存入特地的暫存器然後在記憶體跳轉，函式結束後 return 回去後還要將上一個函式的值從暫存器中拿回來，所以使用別的函式本身的 cost 就會比較高，而 macro 就會像是函式的存在，但是並不會進行類似函式的操作，而在程式碼中直接展開。

Linked list used in Linux kernel

`epoll`

epoll 在 linux manpage 中的描述，是作為監控多個檔案描述器，看哪個是否能夠作 I/O，其功能跟 select 很類似，在 linux manpage 中也是監控多個檔案描述器，等待某個檔案描述器準備好作 I/O 而兩者的差別有下列幾點。

select 是搜尋過每個 I/O 檔案描述器，來看看哪個能夠被使用，而 epoll 只需要搜尋加入 Ready queue 的檔案描述器。
select 在每個使用這個 syscall 的時候都會需要將檔案描述器在 User space 跟 kernel space 之間複製傳遞，而 epoll 並不會重複複製。

struct epitem 是如果有檔案描述器使用 epoll 這個系統呼叫都會有這個結構的 entry ，而這個結構，利用 Linked list 將檔案描述符與 epoll 的 ready list 跟 waiting queue 還有 file 連接起來。

    /* List header used to link this structure to the eventpoll ready list */
    struct list_head rdllink;
    ...
    /* List containing poll wait queues */
    struct list_head pwqlist;
    ...
    /* List header used to link this item to the "struct file" items list */
    struct list_head fllink;

~~個人認為~~在這邊利用 linked list 是將其他資料結構連接起來，這樣就只需要有一個 ready list 跟 wait queues 並且可以利用在list.h中的 container_of 讀取其中的值，這樣可以讓型態多變，不用兩邊的 struct 型態都需要改變。

不需要在共筆提及「個人認為」，可改說「推論」，並附上足以佐證的材料。共筆當然是你的所見所聞所感。
jserv

ep_scan_ready_list 這個函式是 epoll 在搜尋 ready list 中可用 I/O 的函式。

GNU extension typeof

typeof 在編譯時期會將在其中值的型態提取出來，例如

typeof(10) -> int
typeof(10.0) -> float

typeof 就會直接推導在括號中的值為 int 或 float 等型態，利用這樣的方式就可以藉由傳入任意的值，來作新變數的型態宣告。
在 linux/include/linux/kernel.h 中比較兩個不為零的數就有利用到這個技術。

#define min_not_zero(x, y) ({			\
	typeof(x) __x = (x);			\
	typeof(y) __y = (y);			\
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })

藉由 typeof 來判斷傳入 macro 的值來的型態，來新宣告兩個變數，根據 GNU onlinedoc 這樣新宣告變數的技巧，讓值的操作只限制在括號中的這個 scope 這樣可以預防未來可能發生的變數衝突。

解釋以下巨集的原理

#define container_of(ptr, type, member)                            \
    __extension__({                                                \
        const __typeof__(((type *) 0)->member) *__pmember = (ptr); \
        (type *) ((char *) __pmember - offsetof(type, member));    \
    })

這個巨集是作為利用結構中成員的記憶體位址，找到結構最初的記憶體位址。
在這邊我們藉由簡單的程式，印出 struct 和 member 的記憶體位置還有作 container_of 之後的結果。

typedef struct test_one 
{
    int number;
    int number_two;
    char new[24];
} t_o;

#define OFFSET(type, member) (((type *) 0)->member)

int main()
{
   t_o new = {0, 1, "aaaaaa"};
   printf("offset of number %ld\n" ,offsetof(t_o, number));
   printf("offset of number_two %ld\n" ,offsetof(t_o, number_two));
   printf("offset of new %ld\n" ,offsetof(t_o, new));
   printf("add of t_o struct %p\n", &new);
   printf("add of number %p\n", &new.number);
   printf("add of number_two %p\n", &new.number_two);
   printf("add of new %p\n", &new.new);
   printf("Container find the struct start address %p\n", container_of(&new.new, t_o, new));
   
   printf("(((type *) 0)->member) %d\n", OFFSET(t_o, number));
   return 0;
}

(type *) 0 是一個被轉型為 type 型態的 NULL pointer，在這邊是作為 typeof 取出 member 型態所必須，因為 (type *) -> member，在語法上是不允許的。
根據實驗如果對於 (((type *) 0)->member) 作操作會得到其 member 在 struct offset
執行結果

(((type *) 0)->member) member_one 0
(((type *) 0)->member) member_two 4

利用 typeof 來提取 member 的型態，利用這個型態宣告一個 pointer 指向 (ptr) 這個 member variable。
得到這個 member variable 的位址後，就可以利用位址減掉 member 在 struct 中的 offset 就得到這個 struct 的起始位址。
而為何要將 __pmember 先作 (char *) 的轉型後在跟 offset 作運算？根據實驗的結果就算不作 (char *) 的轉換也能得到相同的記憶體位置。

offset of number 0
offset of number_two 4
offset of new 8
add of t_o struct 0x7ffea78c0ba0  
add of number 0x7ffea78c0ba0
add of number_two 0x7ffea78c0ba4
add of new 0x7ffea78c0ba8
Container find the struct start address 0x7ffea78c0ba0

`list.h` 中的函式或巨集

除了 delete 跟 add 之外還有許多對 linked list 操作的函式或巨集:
- list_is_singular
- list_splice(struct list_head *list, struct list_head *head)
  - 將 head 的所有 node 接到 list 後
- list_splice_tail(struct list_head *list, struct list_head *head)
  - 將 list 所有的 node 接到 head 的後面
- list_splice_init
- list_splice_tail_init
- list_cut_position
- list_move
- list_move_tail
- list_for_each
- list_for_each_safe
- list_for_each_entry
- list_for_each_entry_safe

`LIST_POISONING` 設計理念

LIST_POISONING 定義在 list_del 這個函式中，功能是當 list node 被 delete 後會將這個 node 的 prev 跟 next 分別指向特定的記憶體位置
```
    node->prev = (struct list_head *) (0x00100100);
    node->next = (struct list_head *) (0x00200200);
```
在 list_del 的註解上表示這個刪除並不是將節點中的值清空或者將記憶體釋放，利用 GDB 去追蹤後，可以發現就算使用了 list_del 那個被移出 node 其 prev 跟 next 還是指向其原本所指的記憶體位置
- LIST_POISIONING 是根據使用者自身決定是否使用，不開啟的情形下，如果再次被使用到就會是個 uninitialized 的 node。
  (感覺這邊就有漏洞可以被利用，只要得到某個被刪除 node 的記憶體位置就能夠拿到整個 linked list 或破壞)
將 LIST_POISIONING define 後使用 list_for_each 這個函式去使用 list_del 將整個 linked list 刪除，會產生 Segmentation fault，實際用 GDB 去追蹤其記憶體的變化後，可以發現在 delete 後 prev 跟 next 分別被換成 0x00100100 跟 0x00200200，在 list_for_each 函式中
```
 node = node -> next;     
```
node 就會指向一個非法的記憶體位置，造成錯誤
在 list.h 除了 list_for_each 之外還有定義 list_for_each_safe ，這個函式就是來處理上述的問題，會先將現在 node 的 next 保存下來這樣就不會有刪除到現在 node 然後又使用它造成錯誤
```
#define list_for_each_safe(node, safe, head)                     \
  for (node = (head)->next, safe = node->next; node != (head); \
       node = safe, safe = node->next)
```

Merge sort

Reference
Why this 0 in ((type*)0)->member in C?