# 2023q1 Homework1 (lab0) contributed by < [Raymonna](https://github.com/komark06) > ## Environment Information ```shell $ gcc --version gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 6 On-line CPU(s) list: 0-5 Thread(s) per core: 1 Core(s) per socket: 6 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 158 Model name: Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz Stepping: 10 CPU MHz: 800.062 CPU max MHz: 4000.0000 CPU min MHz: 800.0000 BogoMIPS: 5599.85 Virtualization: VT-x L1d cache: 192 KiB L1i cache: 192 KiB L2 cache: 1.5 MiB L3 cache: 9 MiB ``` ## Implementation of `queue.c` ### q_new ```c struct list_head *q_new() { struct list_head *newh = malloc(sizeof(struct list_head)); if(!newh){ free(newh); return NULL; } INIT_LIST_HEAD(newh); return newh; } ``` In `q_new` function, two things need to be consider: - Satisfy the structure of circular linked list - The first list_head object `head` is not a member of element_t, which indicates that it contains no value, only consisting of `list_head *next` and `list_head *prev` ### q_free ```c void q_free(struct list_head *l) { if (!l) return; element_t *cur, *safe; list_for_each_entry_safe (cur, safe, l, list) { q_release_element(cur); } free(l); } ``` Let's look at what the macro `list_for_each_entry_safe` looks like. ```c for(cur = container_of(l->next, element_t, list), safe = container_of(cur->list.next, element_t, list); &cur->list != l; cur = safe, safe = container_of(safe->list.next, element_t, list)) ``` - The macro `container_of` could give you the address of the structure(`element_t`) while giving it the information of the member of this structure(`list_head *`). For more details, please refer to [the lecture provided by Professor Huang](https://hackmd.io/@sysprog/linux-macro-containerof#container_of-%E5%B7%A8%E9%9B%86%E4%BD%9C%E7%82%BA%E8%B3%87%E6%96%99%E5%B0%81%E8%A3%9D%E7%9A%84%E5%9F%BA%E7%A4%8E) - If you look at the first time, you may wonder why it need another list_head `safe` for iteration. Remember that we've mentioned the first list_head structure `head` is not a member of `element_t` structure; therefore, if we ignore it, segmentation fault would occur! - While taking a look at the `q_release_element`, we could dive into the details of the implementation of `test_free` and see how the magic number has been used.([Details of test_free implementation in the lecture note](https://hackmd.io/@sysprog/linux2022-lab0#%E8%BF%BD%E8%B9%A4%E8%A8%98%E6%86%B6%E9%AB%94%E9%85%8D%E7%BD%AE%E5%92%8C%E9%87%8B%E6%94%BE%E7%9A%84%E7%8B%80%E6%B3%81)) ### q_insert_head and q_insert_tail In the insert and remove sections, I refer to the implementation provided by [chiangkd](https://hackmd.io/2ug0Xq4bTNij2eWQgYR4mg?view#q_insert_head-%E5%92%8C-q_insert_tail-%E5%AF%A6%E4%BD%9C) ```c static inline element_t *new_element(const char *s) { if (!s) return NULL; element_t *new_e = malloc(sizeof(element_t)); if (!new_e) return NULL; size_t len = strlen(s); new_e->value = malloc(len + 1); if (!new_e->value) { free(new_e); return NULL; } memcpy(new_e->value, s, len); new_e->value[len] = '\0'; return new_e; } ``` One thing needs to be noticed is that the length of `new_e` needs to be `strlen(s) + 1` for putting the **'\0'** at the end of the string. ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *ne = new_element(s); if (!ne) return false; list_add(&ne->list, head); return true; } ``` ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *ne = new_element(s); if (!ne) return false; list_add_tail(&ne->list, head); return true; } ``` ### q_remove_head and q_remove_tail ```c static inline element_t *remove_element(element_t *obj, char *sp, size_t bufsize) { if (!obj) return NULL; if (sp && bufsize != 0) { size_t len = strlen(obj->value); if (len > bufsize - 1) len = bufsize - 1; memcpy(sp, obj->value, len); sp[len] = '\0'; } list_del(&obj->list); return obj; ``` Notice that 1. since we need to return the obj after calling `q_remove_*`, we wouldn't release `q_release_element(obj)` here. 2. we need to check the size between len and bufsize in order to avoid from buffer overflow. ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; return remove_element(list_first_entry(head, element_t, list), sp, bufsize); } ``` Be aware of the `list_empty(head)`. The lack of it may cause segmentation fault. ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; return remove_element(list_last_entry(head, element_t, list), sp, bufsize); } ``` ### q_delete_mid ```c bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *tail = head->prev, *fast = head->next, *slow = fast; while (fast != tail && fast->next != tail) { fast = fast->next->next; slow = slow->next; } list_del(slow); q_release_element(container_of(slow, element_t, list)); return true; } ``` Use the property that size of `fast` would always 2 times larger than `slow`, which indicates the `slow` would be the middle list_head when `fast` reaches the end of the list. ### q_delete_dup Before designing the code, we have to notice that in this problem, we need to delete all duplicated nodes, which includes the node itself. For example, original is [1, 2, 2, 3, 3, 5], and the one after deleting the repeated nodes is [1, 5]. ```c= bool q_delete_dup(struct list_head *head) { if (!head) return false; struct list_head *cur = head->next; while(cur != head && cur->next != head){ struct list_head *cur_next = cur->next; element_t *cur_e = container_of(cur, element_t, list); element_t *next_e = container_of(cur->next, element_t, list); bool is_dup = false; while(!strcmp(cur_e->value, next_e->value)){ is_dup = true; struct list_head *tmp_next = next_e->list.next; list_del(&next_e->list); q_release_element(next_e); if(tmp_next == head) break; next_e = container_of(tmp_next, element_t, list); } cur_next = cur->next; if(is_dup){ list_del(&cur_e->list); q_release_element(cur_e); } cur = cur_next; } return true; } ``` `q_delete_dup` code explanation: 1. Line 7: Check if the unvisited list contains 0 or 1 node 2. Line 12-16: If the current string equals to the next undeleted string, we need to delete the **next element**(not current one). And we get a new `next_e`. 3. Notice that the `cur_next` save the information of the newlest next list_head of `cur`. We couldn't ignore it since we may need to delete `cur_e`(the element_t structure containing `cur`), which may cause we couldn't use `cur->next` ### q_swap In this function, we have to notice that the `cur` is an pointer to pointer. If we use pointer in this case, either you would inevitably change the value inside the `cur` or you have to do extra work on dealing with the boundary condition. ```c= void q_swap(struct list_head *head) { if (!head) return; struct list_head **cur = &head->next; for (; *cur != head && (*cur)->next != head; cur = &(*cur)->next->next) { struct list_head *first = *cur, *second = first->next; first->next = second->next; second->next = first; second->prev = first->prev; first->prev = second; *cur = second; } } ``` ### q_reverse ```c= void q_reverse(struct list_head *head) { if (!head) return; struct list_head *cur, *safe; list_for_each_safe (cur, safe, head) { list_move(cur, head); } } ``` ### q_reverseK This is the code I saw from [ chiangkd ](https://hackmd.io/duPg38XOTL-XKa7_h0Ielw?view#q_reverseK), which is clear and uses the macros wisely. ```c= void q_reverseK(struct list_head *head, int k) { if (!head || k <= 1) return; struct list_head *cur, *safe, *tmp = head; LIST_HEAD(dummy); int count = 0; list_for_each_safe (cur, safe, head) { count++; if (count == k) { list_cut_position(&dummy, tmp, cur); q_reverse(&dummy); list_splice_init(&dummy, tmp); count = 0; tmp = safe->prev; } } } ``` `q_reverse` code explanation: 1. In line 12, it split the list into 2 sub lists, which sets `dummy` and `tmp` as the first list_head, respectively. 2. In line 13, everse the sub list beginning with `dummy` 3. In line 14, combine the two sub lists. Notice that when looking at the implementation of `list_spice_init`, the `dummy` has been initialized as a legal circular linked list. ### q_sort It is nothing but a typical divide and conquer algorithm ```c= void q_sort(struct list_head *head) { if (!head || head->next == head->prev) return; head->prev->next = NULL; head->next = merge_sort(head->next); struct list_head *ptr = head; for (; ptr->next; ptr = ptr->next) ptr->next->prev = ptr; ptr->next = head; head->prev = ptr; } ``` ```c= struct list_head *merge_two_queue(struct list_head *L1, struct list_head *L2) { struct list_head *head = NULL, **ptr = &head; for (struct list_head **cur = NULL; L1 && L2; *cur = (*cur)->next) { if (strcmp(container_of(L1, element_t, list)->value, container_of(L2, element_t, list)->value) >= 0) cur = &L2; else cur = &L1; *ptr = *cur; ptr = &(*ptr)->next; } *ptr = (struct list_head *) (void *) ((uintptr_t) (void *) L1 | (uintptr_t) (void *) L2); return head; } ``` In `merge_sort`, we firstly need to find the middle list_head(in line 7-9), and combine it(call `merge_two_queue`) ```c= //A process of dividing static struct list_head *merge_sort(struct list_head *head) { if (!head || !head->next) return head; struct list_head *slow = head, *fast = head->next; for (; fast && fast->next; fast = fast->next->next, slow = slow->next) ; fast = slow->next; slow->next = NULL; return merge_two_queue(merge_sort(head), merge_sort(fast)); } ``` ### q_descend ```c int q_descend(struct list_head *head) { if (!head) return 0; int len = 0; struct list_head *cur = head->prev; while (cur != head) { len++; if (cur->prev == head) break; element_t *c = container_of(cur, element_t, list), *p = container_of(cur->prev, element_t, list); if (strcmp(c->value, p->value) > 0) { list_del(&p->list); q_release_element(p); len--; } else { cur = cur->prev; } } return len; } ``` ### Current Grade There is a strange situation. When I use `make test`, it sometimes shows 95/100, sometimes shows 100/100. While I use `make valgrind`, it always shows 95/100. According to the error message, it should be some memory leakage in the function `q_insert_tail`, but I'm still working on it, trying to deal with the solution. ## Add new command -- Shuffle ```c= //in queue.c void q_shuffle(struct list_head *head) { if (!head || list_empty(head)) return; int size = q_size(head); // shuffle for (int i = 0; i < size;) { // not iterate i , iterate size struct list_head *start = head->next; int rand_idx = rand() % size; // random number in range 0~ (size-1) for (int j = 0; j < rand_idx; j++) { start = start->next; } list_del(start); list_add_tail(start, head); size--; } return; } ``` ```c= //in queue.h void q_shuffle(struct list_head *head); ``` ```c= //in qtest.c static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } if (!l_meta.l) report(3, "Warning: Try to access null queue"); error_check(); set_noallocate_mode(true); if (exception_setup(true)) q_shuffle(l_meta.l); exception_cancel(); set_noallocate_mode(false); show_queue(3); return !error_check(); } .... static void console_init() { ADD_COMMAND(shuffle, "shuffle", ""); } ``` ## Performance analysis for list_sort ### Modify the source code in [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) to embed it to lab0 - Step 1: add three functions.(Notice that we have remove the unnecessary variables, `cmp` and `priv`) - `static struct list_head *merge(struct list_head *a, struct list_head *b)` - `static void merge_final(struct list_head *head, struct list_head *a, struct list_head *b)` - `void list_sort(struct list_head *head)` - Step 2: add compare function ```c= static int cmp(struct list_head *a, struct list_head *b) { return strcmp(container_of(a, element_t, list)->value, container_of(b, element_t, list)->value); } ``` - Step 3: add necessary macros, `likely` and `unlikely` ```c= # define likely(x) __builtin_expect(!!(x), 1) # define unlikely(x) __builtin_expect(!!(x), 0) ``` - Step 4: add the new command for `list_sort`(just like we've done for `shuffle` command) ### Performance between q_sort and list_sort - For q_sort, ```shell= $perf stat --repeat 5 -e instructions,cycles ./qtest -f traces/trace-15-perf.cmd Performance counter stats for './qtest -f traces/trace-15-perf.cmd' (5 runs): 7,5665,9045 instructions # 0.43 insn per cycle ( +- 0.01% ) 17,6220,2427 cycles ( +- 0.19% ) 0.45850 +- 0.00111 seconds time elapsed ( +- 0.24% ) ``` - For list_sort, ```shell= $ perf stat --repeat 5 -e instructions,cycles ./qtest -f traces/trace-15-perf-lsort.cmd Performance counter stats for './qtest -f traces/trace-15-perf-lsort.cmd' (5 runs): 7,0725,6998 instructions # 0.51 insn per cycle ( +- 0.01% ) 13,8684,4861 cycles ( +- 0.64% ) 0.36797 +- 0.00706 seconds time elapsed ( +- 1.92% ) ``` ### <s>Time complexity</s> analysis between q_sort and list_sort :::warning We don't tend to analyze the time complexity. Instead, we care about the measured time and the factors to make programs run faster. :notes: jserv :::