# 2021q1 Homework1 (quiz1)
contributed by < `tigger12613` >
> [2021q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2021-quiz1)
> [GitHub](https://github.com/tigger12613/quiz1)
## 作業要求
- [x] 1. 解釋程式運作原理,搭配 Graphviz,比照 Linked List 練習題 在 HackMD 筆記上視覺化展現
- [x] 1.1 測試程式使用到 random,多執行幾次可發現輸出結果相仿,請修正,並引入其他 Pseudorandom number generator
- [x] 2. 參考 [Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/) 並重寫上述 quick sort 程式碼,避免使用遞迴呼叫
- [ ] 3. Linux 核心內部也有 linked list 實作,但是 circulr doubly-linked list,[linux-list](https://github.com/sysprog21/linux-list) 仿效 Linux 核心的實作並予以簡化,在 examples/ 目錄提供 quick sort 實作,請探討 Linux 的 linked list 和上述程式碼的落差,並改寫 linux-list 的 quick sort 範例,避免使用遞迴呼叫
- [ ] 4. 研讀 Ching-Kuang Shene (冼鏡光) 教授撰寫的 A Comparative Study of Linked List Sorting Algorithms,思考高效率的 linked list 排序演算法,並落實於上述 (3) 的程式碼中
## 在電腦上重現程式
quiz 缺少兩個函式,予以實作
```cpp
static node_t *list_make_node_t(node_t *list, long num) {
node_t *tmp = malloc(sizeof(node_t));
if (!tmp) {
return list;
}
tmp->value = num;
tmp->next = list;
return tmp;
}
static void list_free(node_t **list) {
node_t *tmp;
while (*list) {
tmp = *list;
*list = (*list)->next;
free(tmp);
}
}
```
運作正常
``` shell
$ valgrind --leak-check=full ./link_list
==45637== Memcheck, a memory error detector
==45637== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==45637== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==45637== Command: ./link_list
==45637==
NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359
IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016
==45637==
==45637== HEAP SUMMARY:
==45637== in use at exit: 0 bytes in 0 blocks
==45637== total heap usage: 21 allocs, 21 frees, 1,344 bytes allocated
==45637==
==45637== All heap blocks were freed -- no leaks are possible
==45637==
==45637== For lists of detected and suppressed errors, rerun with: -s
==45637== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
```
## 程式運作過程
### Main
產生長度為 20 的 linked list 並隨機給予 0 ~ 1023 的數值,列印 linked list ,排序,再列印 linked list ,確認排序是否正確,free linked list
```cpp
int main(int argc, char **argv) {
size_t count = 20;
node_t *list = NULL;
while (count--)
list = list_make_node_t(list, random() % 1024);
list_display(list);
quicksort(&list);
list_display(list);
if (!list_is_ordered(list))
return EXIT_FAILURE;
list_free(&list);
return EXIT_SUCCESS;
}
```
### List add node
將節點插入 list 的頭部
```cpp
static inline void list_add_node_t(node_t **list, node_t *node_t) {
node_t->next = *list;
*list = node_t;
}
```
```graphviz
digraph a {
rankdir=LR;
node[shape=box];
"node"->"head"
"head"->"..."
"..."->"tail"
}
```
### List concat
將 right list 鏈結在 left list 後面
```cpp
static inline void list_concat(node_t **left, node_t *right) {
while (*left)
left = &((*left)->next);
*left = right;
}
```
```graphviz
digraph a {
rankdir=LR;
node[shape=box];
node1 [label = "..."]
"left"->"..."
"..."->"left tail"
"left tail" -> "right"[color = red label="concat"]
"right"->node1
node1->"right tail"
}
```
### Quick sort
```cpp=
void quicksort(node_t **list)
{
if (!*list)
return;
node_t *pivot = *list;
int value = pivot->value;
node_t *p = pivot->next;
pivot->next = NULL;
node_t *left = NULL, *right = NULL;
while (p) {
node_t *n = p;
p = p->next;
list_add_node_t(n->value > value ? &right : &left, n);
}
quicksort(&left);
quicksort(&right);
node_t *result = NULL;
list_concat(&result, left);
list_concat(&result, pivot); list_concat(&result, right);
*list = result;
}
```
這段用於把 pivot 拆出來
```cpp=6
node_t *pivot = *list;
int value = pivot->value;
node_t *p = pivot->next;
pivot->next = NULL;
```
while loop 把比 pivot 大的放到 right list 上面,小或等於的放到 list list 上面
```cpp=12
while (p) {
node_t *n = p;
p = p->next;
list_add_node_t(n->value > value ? &right : &left, n);
}
```
```graphviz
digraph sort{
rankdir=LR;
node[shape=box];
right
left
pivot
n1 [label = "> pivot"]
n2 [label = " <= pivot"]
right-> n1
left->n2
}
```
接下來排序 left list and right list ,然後接起來
```cpp=18
quicksort(&left);
quicksort(&right);
node_t *result = NULL;
list_concat(&result, left);
list_concat(&result, pivot); list_concat(&result, right);
*list = result;
```
```graphviz
digraph sort{
rankdir=LR;
node[shape=box];
right
result
pivot
{
rank = same
left -> lleft
}
lright [label = " sorted and > pivot"]
lleft [label = " sorted and <= pivot"]
{
rank = same
right-> lright
}
lleft->pivot [label = "concat" color = "red"]
pivot->lright [label = "concat" color = "red"]
result->lleft
}
```
### 隨機種子
加上隨機種子在使用 random() 之前就可以每一次都產生不同的隨機數
```cpp
srand(time(NULL));
```
PRNG 使用初始值產生一串帶有隨機性質的數字序列,如果初始值相等,序列也會相等,因此不是真隨機。在使用時使用 time 作為初始值可以讓初始值看似隨機數,產生的數字序列也會看起來隨機。
:::warning
倘若 time 函式得到的時間數值是可預測的話,PRNG 的輸出也可預測,該如何改進?留意到 entropy 相關討論
:notes: jserv
:::
## Iterative quicksort
新增一個 stack 用於紀錄還未排序的 linked list
sorted 紀錄已經排序好的範圍
每次 loop 從 stack 中最右邊的 linked list 進行切分,如果只有一個節點,代表這個節點是 stack 上數值最大的節點,並且小於等於 sorted 鏈上的每一個節點,所以將其放到 sorted 鏈 的第一個上,然後從 stack 上移除($i--$),持續重複這個過程直到 stack 為空。
```cpp
void quicksort_iterative(node_t **list)
{
#define MAX_LEVELS 1000
if (!*list)
return;
int i = 0, j = 0;
node_t *stack[MAX_LEVELS];
node_t *sorted;
sorted = NULL;
stack[0] = *list;
while(i>=0 && i<MAX_LEVELS){
if(!stack[i]){
i--;
continue;
}
if(!stack[i]->next){
stack[i]->next = sorted;
sorted = stack[i];
i--;
continue;
}
node_t *pivot = stack[i];
int value = pivot->value;
node_t *p = pivot->next;
pivot->next = NULL;
node_t *left = NULL, *right = NULL;
while (p) {
node_t *n = p;
p = p->next;
list_add_node_t(n->value > value ? &right : &left, n);
}
stack[i] = left;
stack[i+1] = pivot;
stack[i+2] = right;
i +=2;
}
*list = sorted;
}
```
接著對效能測試,重複 5000 次對長度 1000 的 list 做排序
iterative 版本快了46% 的速度
| recursive version | iterative version |
| ----------------- | ----------------- |
| 0.832864 sec | 0.567636 sec |
`MAX_LEVELS`的數值選擇應該考慮到 worst case 的狀況,如果每一次的 pivot 都選到最小的那一個 也就是說 left 都為 NULL ,也就是需要 $2 \times (n-1)$ 個位置去儲存 linked list node
```graphviz
digraph {
node [color=black shape=none margin=0 height=.3 ]
values [label=<
<table border="0" cellborder="1" cellspacing="0">
<tr>
<td port="f0" bgcolor = "lightyellow"> NULL </td>
<td port="f1" bgcolor = "lightyellow"> pivot1 </td>
<td port="f2" bgcolor = "lightyellow"> NULL </td>
<td port="f3" bgcolor = "lightyellow"> pivot2 </td>
<td port="f4" bgcolor = "lightyellow"> ... </td>
<td port="f5" bgcolor = "lightyellow"> NULL </td>
<td port="f6" bgcolor = "lightyellow"> pivot_end </td>
<td port="f7" bgcolor = "lightyellow"> right_end </td>
</tr>
</table>
>];
"stack"->values:f0
{ rank=same; "stack"; values }
}
```
:::warning
`MAX_LEVELS` 數值設定的依據為何?做數學分析來指定更適切的數值。
:notes: jserv
:::