# Link List Sort - 1 contributed by < `jhan1998` > ## Functions works ### 1. `list_add_node_t` ```c static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` The purpose of this function is to add a node_t node in front of `*list`, and finally make `**list` point to node_t to update `*list`, we can see that the parameters are passed by pointer to pointer Come in, so in the end, as long as the dereference list is used, the node can be updated directly without returning parameters. #### Flowchart ```graphviz digraph list { rankdir=LR; node[shape=record] struct1 [label="{ node_t}"]; struct2 [label="{**list}"]; struct3 [label="{*list}"]; struct4 [label="{node}"]; struct5 [label="{node}"]; struct1 struct2 -> struct3 -> struct4 -> struct5; } ``` * node_t->next = *list; ```graphviz digraph list { rankdir=LR; node[shape=box] struct1 [label="node_t"]; struct2 [label="**list"]; struct3 [label="*list"]; struct4 [label="node"]; struct5 [label="node"]; {rank="same";struct1-> struct3;} struct2 -> struct3 -> struct4 -> struct5; } ``` * *list = node_t ```graphviz digraph list { rankdir=LR; node[shape=box] struct1 [label="*list(node_t)"]; struct2 [label="**list"]; struct3 [label="*list(old)"]; struct4 [label="node"]; struct5 [label="node"]; struct2 -> struct1 -> struct3 -> struct4 -> struct5; } ``` ### 2. `list_concat` ```c static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next); *left = right; } ``` The function of list_concat is to connect two different lists, left and right. #### Flowchart The while loop will first visit all the left nodes, and finally stay on NULL 1. ```graphviz digraph list { rankdir=LR; node[shape=box] right_head [label="right"]; right_node1 [label="right_node"]; right_node2 [label="right_node"]; right_head -> right_node1 -> right_node2; struct1 [label="*left(head)"]; struct2 [label="**left"]; struct3 [label="node"]; struct4 [label="node"]; struct5 [label="node"]; struct2 -> struct1 -> struct3 -> struct4 -> struct5; } ``` 2. ```graphviz digraph list { rankdir=LR; node[shape=box] right_head [label="right"]; right_node1 [label="right_node"]; right_node2 [label="right_node"]; right_head -> right_node1 -> right_node2; struct1 [label="*left(old)(head)"]; struct2 [label="**left"]; struct3 [label="*left"]; struct4 [label="node"]; struct5 [label="NULL"]; struct1 -> struct3 -> struct4 -> struct5; {rank="same";struct2->struct3} } ``` 3. ```graphviz digraph list { rankdir=LR; node[shape=box] right_head [label="right"]; right_node1 [label="right_node"]; right_node2 [label="right_node"]; right_head -> right_node1 -> right_node2; struct1 [label="node(head)"]; struct2 [label="**left"]; struct3 [label="*left(old)"]; struct4 [label="*left"]; struct5 [label="NULL"]; struct1 -> struct3 -> struct4 -> struct5; {rank="same";struct2->struct4} } ``` 4. ```graphviz digraph list { rankdir=LR; node[shape=box] right_head [label="right"]; right_node1 [label="right_node"]; right_node2 [label="right_node"]; right_head -> right_node1 -> right_node2; struct1 [label="node(head)"]; struct2 [label="node"]; struct3 [label="node"]; struct4 [label="node"]; struct5 [label="NULL(*left)"]; struct1 -> struct3 -> struct4 -> struct5; {rank="same";struct2->struct5} } ``` 5. Finally `*left = right`, so the last element of left will point to the head of right, thus completing the concatenation of the two heads. ```graphviz digraph list { rankdir=LR; node[shape=box] struct1 [label="node(head)"]; struct2 [label="node"]; struct3 [label="node"]; struct4 [label="node"]; struct5 [label="right(*left)"]; right_node1 [label="right_node"]; right_node2 [label="right_node"]; struct1 -> struct3 -> struct4 -> struct5; {rank="same";struct2->struct5} struct5 -> right_node1 -> right_node2; } ``` ### 3. `quicksort` ```c void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; } ``` `quicksort()` will initially set `pivot` to the initial node ```c node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; ``` Next, the list will be divided, and the nodes of `< pivot` and `> pivot` will be connected to the left and right respectively. ```c node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } ``` Then pass back to the left half to do the same thing, and then pass back to the right half. ```c quicksort(&left); quicksort(&right); ``` The termination condition is when the list is split to the point where it cannot be split anymore. ```c if (!*list) return; ``` The next step is to merge the split matrix. We can see that the `pivot` node is separated, so when concat finishes the left half of the list, we must first connect the `pivot` to continue on the right half. do the same thing. ```c node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; ``` Finally, we will get a sorted list. ### 4. Execution #### list.c ```c #include <stdbool.h> #include <stdio.h> #include <stdlib.h> typedef struct __node { int value; struct __node *next; } node_t; static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next); *left = right; } void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; } static bool list_is_ordered(node_t *list) { bool first = true; int value; while (list) { if (first) { value = list->value; first = false; } else { if (list->value < value) return false; value = list->value; } list = list->next; } return true; } static void list_display(node_t *list) { printf("%s IN ORDER : ", list_is_ordered(list) ? " " : "NOT"); while (list) { printf("%d ", list->value); list = list->next; } printf("\n"); } static void list_make_node_t(node_t **list, int n) { node_t *nnode = malloc(sizeof(node_t)); nnode->next = *list; nnode->value = n; *list = nnode; } static void list_free(node_t **list) { while ((*list)->next) { node_t *tmp = *list; list = &tmp->next; free(tmp); } free(*list); } int main(int argc, char **argv) { size_t count = 20; node_t *list = NULL; while (count--) list_make_node_t(&list, random() % 1024); list_display(list); quicksort(&list); list_display(list); if (!list_is_ordered(list)) return EXIT_FAILURE; list_free(&list); return EXIT_SUCCESS; } ``` #### outcome: ```shell $ ./list NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 ``` --- ## random The test program uses random, but after running it a few times, we will find that the output is similar ```shell $ ./list NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 $ ./list NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 $ ./list NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 ``` It can be known from the description of [`rand(3)`](https://linux.die.net/man/3/random) > The random() function uses a nonlinear additive feedback random number generator employing a default table of size 31 long integers to return successive pseudo-random numbers in the range from 0 to RAND_MAX. The period of this random number generator is very large, approximately $16 * (2^{31} - 1)$. > > The srandom() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random(). `These sequences are repeatable by calling srandom() with the same seed value.` If no seed value is provided, the random() function is automatically seeded with a value of 1. random() will get the value from the random number table, because we didn't use srandom() to set any seed value, so the resulting sequence will be the same. Here we set srandom(time(NULL)) --- [time(2)](https://linux.die.net/man/2/time) > time() returns the time as the number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC). ### PRNG Next we try to introduce a pseudo random number generator. The algorithm I implemented here is [Mersenne twister](https://en.wikipedia.org/wiki/Mersenne_Twister) (Mersenne twist method) #### 原理 參考 [Mersenne twister 論文](http://www.math.sci.hiroshima-u.ac.jp/m-mat/MT/ARTICLES/mt.pdf) Mersenne twister was developed by by Makoto Matsumoto and Takuji Nishimura in 1997. It is a new algorithm to solve the low quality of random numbers generated by PRNG in the past. Among them, the algorithm uses the most stringent index k-distributed to v-bit accuracy of PRNG. Assume that a PRNG can generate a $w\ bits$ random number sequence $x_i$ with a period of $P$. If the following conditions are met, it will be called k-distributed to v-bit accuracy. The number formed by the first $v\ bits$ of $\vec{x}$ is called $trunc_v(\vec{x})$. Now consider the $kv-bits$ binary sequence with period $P$ $$ (trunc_v(\vec{x_i}), trunc_v(\vec{x}_{i+1}),\cdots,trunc_v(\vec{x}_{i+k-1}))_{(1 \le i \lt P)} $$ This sequence will produce $2^{kv}$ different values, and all values except $0$ have the same probability of appearing in $P$ time, all are $\dfrac{P}{2^{kv}}$, $0 $ is $\dfrac{P}{2^{kv}} - 1$ . Considering that $k$ has an upper limit, for each $v = 1,2,\cdots,w$ there must be a maximum $k=k(v)$ that makes it k-distributed to v-bit accuracy. According to the definition, in the period of $P$, there will be at most $P$ combinations, so according to the definition, $2^{k(v)v} - 1 \le P$ can be known. From the definition point of view, the MT algorithm is very good. As a 32 bits PRNG, its period is $P = 2^{19937} - 1$, 623-distributed to 32-bit accuracy. From the above definition, $\dfrac{19937}{32} = 623$ has reached the theoretical limit $k(v)$. The mathematical definition is mainly divided into two parts, rotation (twist) and extraction (tempering). The rotation part mainly operates according to the following formula: $$ \vec{x}_{k+n} = \vec{x}_{k+m} \oplus(\vec{x}^{(u)}_{k}|\vec{x}^{(l)}_{k+1}){\bf A} $$ $\vec{x}^{(u)}$ is composed of the highest $w - r$ bits of $\vec{x}$ $\vec{x}^{(l)}$ is the lowest $r$ bits composition of $\vec{x}$ ${\bf A}$ is defined in the paper as $$ \begin{pmatrix} & 1\\ & & 1\\ & & &\ddots\\ & & & &1\\ a_{w-1} & a_{w-2} & a_{w-3} & \cdots & a_0 \end{pmatrix} $$ So we can know $$ \vec{x}{\bf A} = \left\{ \begin{array}{r} \vec{x} >> 1 \ \ \ \ \ \ \ \ \ \ \ \ if\ x_0 = 0 \\ (\vec{x} >> 1)\oplus \vec{a} \ \ \ if\ x_0 = 0 \end{array} \right. $$ The meaning of being called rotation is that the above formula can be regarded as a series of Linear Feedback Shifting Registers (LFSR for short), which simply means `given the output of the previous state, the linear function of the output is used as the shift of the input Scratch register`, for example, suppose a 4-level LFSR takes the result from the lowest bit every time, and then XORs with the highest bit and the second lowest bit to fill in the shifted highest bit ![](https://i.imgur.com/SwijHoW.png) If the initial state is $(1000)_2$ ``` 1000 1100 1110 0111 0011 0001 1000 # 復原 ``` This constructs an LFSR with a period length of 6. And this kind of bit operation cycle is like rotation, the above formula is equivalent to an LFSR so this is why it is called rotation. References: * [Mersenne Twister Algorithm](https://liam.page/2018/01/12/Mersenne-twister/) Rotation gives us a long cycle, and then we can extract (tempering) output from the result of each rotation. Just multiply the output obtained by each rotation to the right by an invertible matrix. The approach in the paper is $$ \begin{split} &{\bf y} \impliedby {\bf x} \oplus({\bf x >> u}) \\ &{\bf y} \impliedby {\bf y} \oplus(({\bf y << s})\ {\bf AND}\ {\bf b})\\ &{\bf y} \impliedby {\bf y} \oplus(({\bf y << t})\ {\bf AND}\ {\bf c})\\ &{\bf z} \impliedby {\bf y} \oplus({\bf y >> l}) \end{split} $$ In this way, the output $z$ of the current rotation can be extracted, where $u, s, t, l$ are integer parameters, and $b, c$ are 32-bits bit masks. #### Implementation Initialize the generator first, and the seed will be the first item ```c // Mersenne twister void init(int seed) { mt[0] = seed; for (int i = 1; i < MT_SIZE; i++) { mt[i] = (int32_t)1812433253 * (mt[i - 1] ^ mt[i - 1] >> 30) + i; } } ``` The rotating part is: ```c if (mti == 0) { //twist for (int i = 0; i < MT_SIZE; i++) { int y = (int32_t) (mt[i] & 0x80000000) + (mt[(i + 1) % 624] & 0x7fffffff); mt[i] = (y >> 1) ^ mt[(i + 397) % 624]; if (y % 2 != 0) mt[i] = mt[i] ^ 0x9908b0df; } } ``` The extracted part is: ```shell int32_t res = mt[mti]; res = res ^ res >> 11; res = res ^ res << 7 & 2636928640; res = res ^ res << 15 & 4022730752; res = res ^ res >> 18; ``` #### Outcome: ```shell $ ./list seed: 1615270328 NOT IN ORDER : 988 226 495 854 553 558 852 210 389 962 921 986 718 964 332 576 1002 621 505 74 IN ORDER : 74 210 226 332 389 495 505 553 558 576 621 718 852 854 921 962 964 986 988 1002 ``` #### Flaws Since Mersenne twister is not a cryptographically secure random number (CSPRNG), the algorithm of rotation and extraction is based on bit wise operation and can be regarded as an LFSR, so I speculate that it should be possible to push back the current state by using bit wise operation. Use state to get the next sequence. ### CryptMT Here is a brief description of the steps in [CryptMT](https://www.ecrypt.eu.org/stream/ciphers/cryptmtfubuki/cryptmtfubuki.pdf) on how to generate random numbers: 1. Generate one pseudorandom word $gen_rand$ by MT. 2. Multiply it to accumulate: $$ accum ← accum × (gen\_rand\ | \ 1). $$ 3. Output the most significant 8 bits of $accum$. Go to Step 1 Among them, $accum$ will be initialized to 1 at the beginning, and $(gen\_rand\ | \ 1)$ This step is to make the multiplier an odd number, otherwise $accum$ will become 0 after several calculations, and for For added security, the first 64 bytes of output will be discarded. Since Mersenne twister is a PRNG based on LFSR, it can still calculate the law when there are enough samples. In order to achieve the strength of CSPRNG in CryptMT, the previous $\frac34$ bits are discarded in the last step because If the full length of $accum$ is used, the attacker can still find out the law of MT in it by judging the change of $accum$, so as to judge the number that will be generated in the future. Using the MSB as the output is also an important step, since the LSB is always 1, and the accumulated second bit will be consistent with the sum of the second bit output so far, the attacker can use this to deduce the internal state of the MT. From the above, one can know the changes made to achieve CSPRNG. #### Implementation Since the output of CryptMT is only 8 bits I will combine the four outputs to create a brand new 32 bits unsigned integer. The steps are very simple, first `init_cryptMT()` discards the first 64 outputs ```c void init_cryptMT() { for(int i = 0; i < 64; i++){ accum *= (exract_random() | 0x1); } } ``` Immediately afterwards, each output will be stored in the last 8 bits of the variable `num`, and `num` will be the result we want after 4 consecutive times. ```c uint32_t getCryptMT() { uint32_t num = 0; for(int i = 0; i < 4; i++){ accum *= (exract_random() | 0x1); num = num << 8; num |= (accum >> 24); } return num; } ``` **Outcome** ```c jhan1998@jhan1998-MacBookPro:~/linux2021/linked-list-sort$ ./list NOT IN ORDER : 212 741 358 410 250 117 876 302 627 715 534 983 748 299 87 49 26 992 691 782 IN ORDER : 26 49 87 117 212 250 299 302 358 410 534 627 691 715 741 748 782 876 983 992 ``` ## Locality When watching [你所不知道的 C 語言:函式呼叫篇](https://hackmd.io/@sysprog/c-function?type=view), it mentioned that the program will be configured more than required when malloc There is also a large space to prevent errors, so I am curious whether this will affect the sorting time of the linked list. Let’s take a look at the difference between the size of `node_t` in `list.c` and the actual configuration space: **sizeof(node_t) :** ```c sizeof(node_t) : 16 ``` **Allocated size :** ```c NOT IN ORDER : 986 0x5643c388a8d0 817 0x5643c388a8b0 404 0x5643c388a890 136 0x5643c388a870 541 0x5643c388a850 136 0x5643c388a830 479 0x5643c388a810 745 0x5643c388a7f0 437 0x5643c388a7d0 315 0x5643c388a7b0 345 0x5643c388a790 45 0x5643c388a770 628 0x5643c388a750 865 0x5643c388a730 48 0x5643c388a710 708 0x5643c388a6f0 192 0x5643c388a6d0 942 0x5643c388a6b0 898 0x5643c388a690 667 0x5643c388a670 ``` Since each node has a difference of 0x20, 16 bits of space will be allocated to the node, so there is Internal Fragmentation. Change to the space required for a configuration and then use the indicator operation to initialize the node: ```c NOT IN ORDER : 15 0x55963b731260 217 0x55963b731270 259 0x55963b731280 773 0x55963b731290 278 0x55963b7312a0 700 0x55963b7312b0 229 0x55963b7312c0 936 0x55963b7312d0 662 0x55963b7312e0 364 0x55963b7312f0 176 0x55963b731300 902 0x55963b731310 192 0x55963b731320 904 0x55963b731330 331 0x55963b731340 465 0x55963b731350 519 0x55963b731360 296 0x55963b731370 598 0x55963b731380 226 0x55963b731390 ``` The beginning of each node is exactly 16bits apart, which is exactly the size of a node. It should be noted here that because all the space is configured at one time, you cannot free the nodes one by one slowly. You must first remember that the starting position is the last time to free all the configured spaces, otherwise a double free error will occur. Let's test to see if there is any difference between the two: ![](https://i.imgur.com/hoNzpsR.png) There is actually no difference in the test, because the space of Fragmentation is not large, so the gap caused by locality will not be very significant, so in fact, just follow the usual habits to malloc. ## Optimized QuickSort — C Implementation (Non-Recursive) The code implemented in the article is as follows: ```c void quickSort(int *arr, int elements) { #define MAX_LEVELS 300 int piv, beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R, swap ; beg[0]=0; end[0]=elements; while (i>=0) { L=beg[i]; R=end[i]-1; if (L<R) { piv=arr[L]; while (L<R) { while (arr[R]>=piv && L<R) R--; if (L<R) arr[L++]=arr[R]; while (arr[L]<=piv && L<R) L++; if (L<R) arr[R--]=arr[L]; } arr[L]=piv; beg[i+1]=L+1; end[i+1]=end[i]; end[i++]=L; if (end[i]-beg[i]>end[i-1]-beg[i-1]) { swap=beg[i]; beg[i]=beg[i-1]; beg[i-1]=swap; swap=end[i]; end[i]=end[i-1]; end[i-1]=swap; }} else { i--; }}} ``` Analyzing the following code, we can find that in order not to perform recursive calls, two new arrays, beg and end, have been added to represent the state of the stack. Each loop is equivalent to a new layer of calls. After knowing the current state from beg and end, proceed A new round of sorting, and finally when the stack is finished, the sorting is complete. Since the article is implemented with a sorted array, two arrays are needed to record the start and end, but linked-list can use traversal to know the end, so the implementation here can only use one array to throw the list into it. This is the implementation: ```c #define MAX_LEVELS 300 void quicksort_itr(node_t **list) { if (!*list || !(*list)->next) return; node_t *stack[MAX_LEVELS]; int stack_top = 0; stack[stack_top] = *list; node_t *result = NULL, *L = NULL, *R = NULL; while (stack_top >= 0) { L = stack[stack_top]; R = get_list_tail(&L); if(L != R){ node_t *pivot = L; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value >= value ? &right : &left, n); } stack[stack_top++] = left; stack[stack_top++] = pivot; stack[stack_top] = right; } else { if (L) list_add_node_t(&result, L); stack_top--; } } *list = result; } ``` The program uses the stack array to store the sorted state to avoid recursive calls, and initially puts the list into stack[0]. ```c node_t *stack[MAX_LEVELS]; int stack_top = 0; stack[stack_top] = *list; node_t *result = NULL; ``` The next division is the same as the original quicksort. Here we put the left under the stack first, so that we can prioritize the list on the right when processing, and we can also put the pivot here to effectively reduce the consumption of concat. ```c node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value >= value ? &right : &left, n); } stack[stack_top++] = left; stack[stack_top++] = pivot; stack[stack_top] = right; ``` Finally, judge whether there is an element left on the right, and if so, start to collect the elements that have been divided in the stack by result. ```c else { if (L) list_add_node_t(&result, L); stack_top--; } ``` If we want to do the part on the left side of the list first, we can also exchange the order of the satck: ```c stack[stack_top++] = right; stack[stack_top++] = pivot; stack[stack_top] = left; ``` Let left be done first on the stack, and also change `list_add_node_t(&result, L)` to `list_concat(&result, L)`. Now use three different quicksorts for analysis: ![](https://i.imgur.com/zWAb3WR.png) ![](https://i.imgur.com/b7lGUDO.png) We can see that doing the right half first will have a little better performance than ordinary quicksort, but doing the right half first will seriously affect the performance. I think the reason is the relationship between `list_add_node_t(&result, L)` and `list_concat(&result, L)`, from the code point of view: **list_add_node_t** ```c static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` **list_concat** ```c static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next); *left = right; } ``` We can clearly know that `list_add_node_t()` is placed in the front, so the time complexity is $O(1)$, but the time complexity of `list_concat()` will reach $O(N)$, so there is the above gap. ## Difference between linux's linked list and linux-list ### linux-list The list in linux-list is a two-way circular list implementation that starts with a head and connects the elements below to the element at the end, and then connects back to the head. ```c static inline void list_add(struct list_head *node, struct list_head *head) { struct list_head *next = head->next; next->prev = node; node->next = next; node->prev = head; head->next = node; } ``` It can be seen from `list_add()` that if the list is empty, adding an element will be connected back to the head. Next look at quicksort itself: ```c static void list_qsort(struct list_head *head) { struct list_head list_less, list_greater; struct listitem *pivot; struct listitem *item = NULL, *is = NULL; if (list_empty(head) || list_is_singular(head)) return; INIT_LIST_HEAD(&list_less); INIT_LIST_HEAD(&list_greater); pivot = list_first_entry(head, struct listitem, list); list_del(&pivot->list); list_for_each_entry_safe (item, is, head, list) { if (cmpint(&item->i, &pivot->i) < 0) list_move_tail(&item->list, &list_less); else list_move(&item->list, &list_greater); } list_qsort(&list_less); list_qsort(&list_greater); list_add(&pivot->list, head); list_splice(&list_less, head); list_splice_tail(&list_greater, head); } ``` The part of the algorithm is the same as the quicksort of quiz1. Here, `list_less` and `list_greater` are equivalent to left and right. They are first divided and then recursively called to merge. The explanation of the api is as follows: 1. list_move_tail : To move an element to the back of another list, the element will be removed from the original list first. 2. list_move: To move an element to the front of another list, the element will be removed from the original list first. 3. list_splice: Move a list to the front of another list. 4. list_splice_tail: Move a list to the back of another list. ### Linux's linked list Referring to [linux/include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h), The implementation in it is actually similar to linux-list, But the api in list.h will wrap many layers at a time, often the top-level api will then call the lower api to operate according to the situation, for example: ```c static inline void __list_add(struct list_head *new, struct list_head *prev, struct list_head *next) { if (!__list_add_valid(new, prev, next)) return; next->prev = new; new->next = next; new->prev = prev; WRITE_ONCE(prev->next, new); } ``` An add action will implement the basic `__list_add()` top-level other add apis to call `list_add` and `list_add_tail` according to different needs. ```c static inline void list_add(struct list_head *new, struct list_head *head) { __list_add(new, head, head->next); } static inline void list_add_tail(struct list_head *new, struct list_head *head) { __list_add(new, head->prev, head); } ``` There is also the application of WRITE_ONCE, according to the annotations of [linux/include/asm-generic/rwonce.h](https://github.com/torvalds/linux/blob/master/include/asm-generic/rwonce.h) > Prevent the compiler from merging or refetching reads or writes. The compiler is also forbidden from reordering successive instances of READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some particular ordering. One way to make the compiler aware of ordering is to put the two invocations of READ_ONCE or WRITE_ONCE in different C statements. These two macros will also work on aggregate data types like structs or unions. Their two major use cases are: (1) Mediating communication between process-level code and irq/NMI handlers, all running on the same CPU, and (2) Ensuring that the compiler does not fold, spindle, or otherwise mutilate accesses that either do not require ordering or that interact with an explicit memory barrier or atomic instruction that provides the required ordering. ### 改寫 linux-list Rewrite based on optimize quicksort ```c #define MAX_LEVEL 300 static void list_qsort(struct list_head *head) { if (list_empty(head) || list_is_singular(head)) return; struct list_head list_less, list_greater; struct list_head stack[MAX_LEVEL]; struct listitem *pivot; int top = 0; for(int i = 0; i < MAX_LEVEL; i++) INIT_LIST_HEAD(&stack[i]); list_splice_init(head, &stack[top]); struct list_head stack_top; INIT_LIST_HEAD(&stack_top); INIT_LIST_HEAD(&list_less); INIT_LIST_HEAD(&list_greater); while (top >= 0) { list_splice_init(&stack[top], &stack_top); if (!list_empty(&stack_top) && !list_is_singular(&stack_top)) { pivot = list_first_entry(&stack_top, struct listitem, list); list_del(&pivot->list); struct listitem *item = NULL, *is = NULL; list_for_each_entry_safe (item, is, &stack_top, list) { if (cmpint(&item->i, &pivot->i) <= 0) list_move(&item->list, &list_less); else list_move(&item->list, &list_greater); } list_splice_init(&list_less, &stack[top++]); list_move_tail(&pivot->list, &stack[top++]); list_splice_init(&list_greater, &stack[top]); INIT_LIST_HEAD(&stack_top); } else { list_splice_init(&stack_top, head); top--; } } } ``` Basically, the above writing method is the same. What should be noted here is the importance of `init`. After passing the element to other lists, we must `init` first, otherwise a segmentation fault will occur in the next operation. This is really a debug for a while. In addition, since `pivot` is a node and not a head, here we use `list_move_tail()` to connect to the stack, otherwise the pivot will be lost. **Outcome:** ```c CC tests/qsort_itr.o LD tests/qsort_itr *** Validating tests/qsort_itr *** in order [ Verified ] ```