# 2021q1 Homework1 (quiz1) contributed by < `Billy4195` > ###### tags: `linux2021` > [2021q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2021-quiz1) - [x] Explain how the source code works. - [x] Use Graphviz to visualize the linked-list quiz1 on HackMD - [x] The [random()](https://linux.die.net/man/3/random) is used in the test program, but there is some pattern for the generated number when we execute it several times. Try to fix it and use some other [Pseudorandom number generator](https://en.wikipedia.org/wiki/Pseudorandom_number_generator) - [x] Refer to [Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/), use non-recursive method to rewrite the quick sort function. - [ ] The linked-list is implemented in Linux kernel as well, but it use circulr doubly-linked list. [linux-list](https://github.com/sysprog21/linux-list) is an simplified implementation of linked-list in Linux kernel. The implementation of quick sort is provided in examples/ directory. Please analyze the difference betweeen two implementations, and try to modify the quick sort in [linux-list](https://github.com/sysprog21/linux-list) to avoid using recursive call. Reference: [The Linux Kernel API - List Management Functions](https://www.kernel.org/doc/html/latest/core-api/kernel-api.html) - [ ] Read "A Comparative Study of Linked List Sorting Algorithms" written by Ching-Kuang Shene [(冼鏡光)](https://zh.wikipedia.org/zh-tw/%E5%86%BC%E9%8F%A1%E5%85%89),think about the high efficient linked-list sorting algorithm and implemented it in [linux-list](https://github.com/sysprog21/linux-list). ## Explain how the source code works [GitHub Link](https://github.com/Billy4195/linked-list) ### Q1 LLL From the naming of list_concat and the argument name, we could guess that the list_concat() is used to concatenate two linked-list, more precisely, list_concat() concatenates the `right` list to the last element of `left`. To find the last element in the `left` list, we need to traverse through the `left` list, the last element is the one that the next ptr stores `NULL`. As a result, the while loop is to traverse the elements in `left` list, to find out the last element that should point to the head of `right` list. One point is that the `left` is the pointer of pointer of `node_t`, so during the traversal it store the address of next pointer of the current node. The answer should be ( c ). #### Traverse left list (1) ```graphviz digraph { node [shape=box] //All nodes will this shape and colour l [label="left"] r [label="right"] l -> A -> B -> C r -> D -> E -> F {rank=same;A B C D E F} // Put them on the same level } ``` #### Traverse left list (2) ```graphviz digraph { node [shape=box] //All nodes will this shape and colour l [label="left"] r [label="right"] l -> B A -> B -> C r -> D -> E -> F {rank=same;A B C D E F} // Put them on the same level } ``` #### Traverse left list (3) ```graphviz digraph { node [shape=box] //All nodes will this shape and colour l [label="left"] r [label="right"] l -> C A -> B -> C r -> D -> E -> F {rank=same;A B C D E F} // Put them on the same level } ``` #### Concatenate left and right ```graphviz digraph { node [shape=box] //All nodes will this shape and colour l [label="left"] r [label="right"] l -> C A -> B -> C -> D r -> D -> E -> F {rank=same;A B C D E F} // Put them on the same level } ``` ### Q2 AAA & Q3 BBB According to the requirement, the sorted list should be ascending order. We use pivot to divide the given `list` into two list, the right one should be larger than left one. So when the `n->value > value`, the n should be add to `right` list(AAA), then BBB is `left` list. #### Select the first element as pivot ```graphviz digraph { node [shape=box] //All nodes will this shape and colour left right pivot -> 7 7 -> 1 -> 10 -> 13 -> 9 -> 4 -> 3 {rank=same; 1 3 4 7 9 10 13} } ``` #### Traverse each element in list to split them into `left` and `right` lists ```graphviz digraph { node [shape=box] //All nodes will this shape and colour left right list -> 1 pivot -> 7 1 -> 10 -> 13 -> 9 -> 4 -> 3 {rank=same; 1 3 4 7 9 10 13} } ``` #### Node value(1) less than pivot value(7) should be added to `left` list ```graphviz digraph { node [shape=box] //All nodes will this shape and colour left -> 1 right list -> 10 pivot -> 7 10 -> 13 -> 9 -> 4 -> 3 {rank=same; 1 3 4 7 9 10 13} } ``` #### Node value(10) greater than pivot value(7) should be added to `right` list ```graphviz digraph { node [shape=box] //All nodes will this shape and colour left -> 1 right -> 10 list -> 13 pivot -> 7 13 -> 9 -> 4 -> 3 {rank=same; 1 3 4 7 9 10 13} } ``` #### Node value(13) greater than pivot value(7) should be added to `right` list ```graphviz digraph { node [shape=box] //All nodes will this shape and colour left -> 1 right -> 10 -> 13 list -> 9 pivot -> 7 9 -> 4 -> 3 {rank=same; 1 3 4 7 9 10 13} } ``` ### Q4 CCC Quick sort is an divide-and-conquer method, the sub-problem should be merged. We need to merge the sorted `left` list, pivot and the sorted `right` list in order. So the answer will be ( b ) ```graphviz digraph { node [shape=box] //All nodes will this shape and colour result left -> 1 pivot -> 7 right -> 9 1 -> 3 -> 4 9 -> 10 -> 13 {rank=same; 1 3 4 7 9 10 13} } ``` #### After combine all sorted sublist ```graphviz digraph { node [shape=box] //All nodes will this shape and colour result -> 1 left -> 1 pivot -> 7 right -> 9 1 -> 3 -> 4 9 -> 10 -> 13 4 -> 7 7 -> 9 {rank=same; 1 3 4 7 9 10 13} } ``` ```c typedef struct __node { int value; struct __node *next; } node_t; ``` ```c static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } static inline void list_concat(node_t **left, node_t *right) { while (*left) //LLL; left = &((*left)->next); *left = right; } void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? AAA : BBB, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); CCC; *list = result; } static bool list_is_ordered(node_t *list) { bool first = true; int value; while (list) { if (first) { value = list->value; first = false; } else { if (list->value < value) return false; value = list->value; } list = list->next; } return true; } ``` ```c static void list_display(node_t *list) { printf("%s IN ORDER : ", list_is_ordered(list) ? " " : "NOT"); while (list) { printf("%d ", list->value); list = list->next; } printf("\n"); } int main(int argc, char **argv) { size_t count = 20; node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); list_display(list); quicksort(&list); list_display(list); if (!list_is_ordered(list)) return EXIT_FAILURE; list_free(&list); return EXIT_SUCCESS; } ``` ## Improve random() ### Problem According to the manual page of [`random()`](https://linux.die.net/man/3/random), the generated random number is controlled by an seed value. It could be specified by using `srandom()`. If the srand() is not used,`random()` is automatically seeded with value 1. ### Solution For each time we execute the program, the seed pass through `srandom()` should be different. According to [DDerveialm's work](https://hackmd.io/PbxsLqSzSfqp_mgBmzGIBg?view) and [man page](https://linux.die.net/man/2/time), the simplest method to make `random()` better is just pass the current time to the `srandom()`, e.g. `srandom(time(NULL))` ```shell $ bin/main NOT IN ORDER : 950 799 230 229 317 508 564 605 987 966 68 208 43 431 640 734 473 162 214 534 IN ORDER : 43 68 162 208 214 229 230 317 431 473 508 534 564 605 640 734 799 950 966 987 $ bin/main NOT IN ORDER : 657 52 691 139 844 456 489 168 944 800 809 89 564 368 873 934 264 875 265 451 IN ORDER : 52 89 139 168 264 265 368 451 456 489 564 657 691 800 809 844 873 875 934 944 $ bin/main NOT IN ORDER : 992 620 919 851 253 668 34 612 346 258 997 593 821 201 600 889 130 339 148 489 IN ORDER : 34 130 148 201 253 258 339 346 489 593 600 612 620 668 821 851 889 919 992 997 $ bin/main NOT IN ORDER : 992 620 919 851 253 668 34 612 346 258 997 593 821 201 600 889 130 339 148 489 IN ORDER : 34 130 148 201 253 258 339 346 489 593 600 612 620 668 821 851 889 919 992 997 $ bin/main NOT IN ORDER : 276 352 573 423 290 8 844 464 375 1014 787 98 40 64 637 265 455 878 326 997 IN ORDER : 8 40 64 98 265 276 290 326 352 375 423 455 464 573 637 787 844 878 997 1014 ``` But as we execute the program twice less than one second, we could see the same result again(in 3rd and 4th execution). > TODO solve same random value problem when running the program in short time. ## Non-recursive quick sort According to "[Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/)", the author use array as the stack to implement an non recursive version of quick sort. But when we wish to apply the same method to sort singly linked-list, it has an problem that the linked-list can't support random access. Inspired by [林韋翰](https://www.facebook.com/groups/system.software2021/permalink/857571485085909/), we can store all nodes in an array, sorts it and then link them together. My implementation is done in this [commit](https://github.com/Billy4195/linked-list/commit/b1c47b28f5a84a1bdb91d3a21b0ca25f0badea0b) ### Explain #### quicksort() `quicksort()` is the entry point to sort an given array, but the linked-list is not support random access. We firstly allocate an array to store all elements for random access(Line 5-26), then we use `__quicksort()` to execute the actual sorting algorithm. The sorted elements are stored in the array, so we need to reconstruct the linked-list(Line 28-31). ```c= void quicksort(node_t **list) { int capacity = 0, size = 0, i; node_t *cur = *list; node_t **list_arr = malloc(sizeof(node_t*) * 16); if (NULL == list_arr) goto malloc_err; capacity = 16; while (cur) { if (size == capacity) { node_t **new_arr = malloc(sizeof(node_t*) * capacity * 2); if (NULL == new_arr) goto malloc_err; memcpy((void*)new_arr, (void*)list_arr, sizeof(node_t*)*capacity); free(list_arr); list_arr = new_arr; capacity *= 2; } list_arr[size] = cur; cur = cur->next; list_arr[size]->next = NULL; size++; } __quicksort(list_arr, size); for (i=0; i < size; i++) { *list = list_arr[i]; list = &((*list)->next); } free(list_arr); return; malloc_err: printf("Error: malloc fail capacity:%d!\n", capacity); return; } ``` `__quicksort()` is real implementation of the non-recursive quick sort. The `left_stk` and `right_stk` array is used to store the left index and right index for the elements to be sorted. 1. Select the left most elements as pivot, and backup its value. (Line 16) 2. Check from right index to find any value less than pivot (Line 18-21) 3. Check from left index to find any value greater than pivot (Line 22-25) 4. Split to two sub-problem and store in stack (Line 28-30) 5. Choose the smaller part to solve (Line 32-39) ```c= static void __quicksort(node_t **list, int size) { if (!*list) return; int left_stk[MAX_STACK_SIZE], right_stk[MAX_STACK_SIZE]; int stk_size = 0; int left_idx, right_idx, swap; node_t *pivot; left_stk[stk_size] = 0, right_stk[stk_size] = size; while (stk_size >= 0) { left_idx = left_stk[stk_size]; right_idx = right_stk[stk_size] - 1; if (left_idx < right_idx) { pivot = list[left_idx]; while (left_idx < right_idx) { while (left_idx < right_idx && list[right_idx]->value >= pivot->value) right_idx--; if (left_idx < right_idx) list[left_idx++] = list[right_idx]; while (left_idx < right_idx && list[left_idx]->value <= pivot->value) left_idx++; if (left_idx < right_idx) list[right_idx--] = list[left_idx]; } list[left_idx] = pivot; left_stk[stk_size+1] = left_idx+1; right_stk[stk_size+1] = right_stk[stk_size]; right_stk[stk_size] = left_idx; if ((right_stk[stk_size] - left_stk[stk_size]) < (right_stk[stk_size+1] - left_stk[stk_size+1])) { swap = left_stk[stk_size]; left_stk[stk_size] = left_stk[stk_size+1]; left_stk[stk_size+1] = swap; swap = right_stk[stk_size]; right_stk[stk_size] = right_stk[stk_size+1]; right_stk[stk_size+1] = swap; } stk_size++; } else { stk_size--; } } } ``` ### Pros & Cons #### Pros * An non-recursive implementation could prevent the stack overflow #### Cons * Extra memory usage, as we need to allocate an array to store all elements ## Difference between linux-list and linux linked-list ## Non recursive quick sort in linux-list