# 2019q1 Homework1 (list)
contributed by < `wang0630` >

#### Q1: Why does Linux implement its linked list with macros? What is the cost of an ordinary function call?

Instead of using a function, using a macro has several advantages, including:

1. The program is slightly faster. Function calls often require additional work, such as **storing the caller's context in registers for later return** and **copying arguments and return values**. Note: this overhead can also be mitigated with the **inline** keyword introduced in C99.
2. Macros accept arguments of almost any type, whereas a function only takes the particular types declared in its prototype, making functions less generic than macros.

#### Q2: Where does Linux apply linked lists? Give three cases with the corresponding code, explain them, and consider the design decisions behind them.

An interesting point is made by https://notes.shichao.io/lkd/ch6/

We tend to define a node like this:
```clike
struct mynode {
    int mydata; // some data
    struct mynode *prev;
    struct mynode *next;
};
```
but the Linux kernel defines its linked list as follows:
```clike
struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

struct mynode {
    int mydata; // some data
    struct list_head head;
};
```
The Linux version embeds one small, uniform structure into every element, so a single set of list operations works for every node type.

##### Usage:

1. task_struct

An iconic usage of linked lists is `task_struct`, which represents a task (a process). Linux uses linked lists to maintain the parent/child/sibling relationships among processes. They are represented as follows:
```clike
struct task_struct {
    // ......
    struct list_head children;
    struct list_head sibling;
    // ......
};
```
The relationship can be described with an example:
suppose process a has processes b and c as children.
```clike=
a.children.next = &b.sibling;
b.sibling.next = &c.sibling;
c.sibling.next = &a.children;
// and the prev fields are linked backward
```
To iterate through the whole list of the example above, we use the macros defined in `linux/list.h`:
```clike
struct list_head *iterator, *start = &a.children;

list_for_each(iterator, start) {
    struct task_struct *child = list_entry(iterator, struct task_struct, sibling);
    // do something with child, one of process a's children
}
```

2. Scheduler in Linux

Linux uses the **Completely Fair Scheduler (CFS)** as its scheduling algorithm. Instead of using strict rules to define the time quantum for each task, CFS assigns each task a proportion of CPU time. This proportion is derived from the **nice value** assigned to each task, via the following weight table:
```clike
static const int prio_to_weight[40] = {
    /* -20 */ 88761, 71755, 56483, 46273, 36291,
    /* -15 */ 29154, 23254, 18705, 14949, 11916,
    /* -10 */  9548,  7620,  6100,  4904,  3906,
    /*  -5 */  3121,  2501,  1991,  1586,  1277,
    /*   0 */  1024,   820,   655,   526,   423,
    /*   5 */   335,   272,   215,   172,   137,
    /*  10 */   110,    87,    70,    56,    45,
    /*  15 */    36,    29,    23,    18,    15,
};
```
How does CFS decide which task deserves the CPU next? It tracks each task's progress in the **vruntime** field of `struct sched_entity`:
```clike
struct sched_entity {
    struct load_weight load;   /* for load-balancing */
    struct rb_node run_node;
    struct list_head group_node;
    unsigned int on_rq;

    u64 exec_start;
    u64 sum_exec_runtime;
    u64 vruntime;
    // additional fields....
};
```
According to **Operating System Concepts, Ninth Edition**, `sched_entity` is the basic scheduling unit in the Linux CFS algorithm; it tracks how long a task has been running by storing the **virtual run time** in the `vruntime` field. The virtual run time is scaled by a decay factor based on the priority of the task: lower-priority tasks have a higher decay factor, so their vruntime grows more rapidly than that of higher-priority tasks.
For example, if a task with the default nice value of zero has run for 200 ms, its vruntime will also be 200 ms. However, if a lower-priority task runs for 200 ms, its vruntime will be more than 200 ms, and vice versa.

As for the data structures, CFS keeps all runnable tasks in a red-black tree **keyed by vruntime**. To select the next task to run, CFS simply chooses the task with the smallest vruntime, which is the leftmost node in the RB-tree. Moreover, since walking down to the leftmost node costs **O(log n)**, Linux caches a reference to the task with the smallest vruntime in the field `rb_leftmost`.

Linux keeps the `sched_entity` objects inside a run queue called `cfs_rq`, and uses `__pick_next_entity()` to pick the next entity to run:
```clike
static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
{
    // rb_leftmost is the cache mentioned above
    struct rb_node *left = cfs_rq->rb_leftmost;

    if (!left) // no task
        return NULL;

    // rb_entry() is the container_of() macro: it returns a pointer to
    // the sched_entity containing rb_leftmost; run_node is the rb-tree
    // linkage inside struct sched_entity
    return rb_entry(left, struct sched_entity, run_node);
}
```
A new task is enqueued with `__enqueue_entity()`:
```clike
static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
    struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
    struct rb_node *parent = NULL;
    struct sched_entity *entry;
    // macro expanding to se->vruntime - cfs_rq->min_vruntime
    s64 key = entity_key(cfs_rq, se);
    // is the new node the leftmost node?
    int leftmost = 1;

    /*
     * Find the right place in the rbtree:
     */
    while (*link) {
        parent = *link;
        // get the containing sched_entity
        entry = rb_entry(parent, struct sched_entity, run_node);
        /*
         * We dont care about collisions. Nodes with
         * the same key stay together.
         */
        if (key < entity_key(cfs_rq, entry)) {
            link = &parent->rb_left;
        } else {
            link = &parent->rb_right;
            /*
             * If traversal ever goes right once, the new node
             * can never be the leftmost node, so the leftmost
             * flag is cleared.
             */
            leftmost = 0;
        }
    }

    /*
     * Maintain a cache of leftmost tree entries (it is frequently
     * used):
     */
    if (leftmost)
        cfs_rq->rb_leftmost = &se->run_node;

    rb_link_node(&se->run_node, parent, link);
    rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
}
```

3.