# 2022q1 Homework1 (lab0)
contribute by <`chiangkd`>
###### tags: `Linux Kernel 2022 spring` `linux2022`
## 實驗環境
```
$gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
$lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping: 9
CPU MHz: 2800.000
CPU max MHz: 3800.0000
CPU min MHz: 800.0000
BogoMIPS: 5599.85
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1 MiB
L3 cache: 6 MiB
NUMA node0 CPU(s): 0-7
```
## 作業要求
根據以提供界面完成`queue.c`
[K01: lab0](https://hackmd.io/@sysprog/linux2022-lab0#K01-lab0)
- [x] 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 lab0-c 專案,比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差
- [x] 在 `qtest` 提供新的命令 `shuffle`,允許藉由 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法,對佇列中所有節點進行洗牌 (shuffle) 操作
- [ ] 在 `qtest` 提供新的命令 `web`,提供 web 伺服器功能,注意: web 伺服器運作過程中,`qtest` 仍可接受其他命令
- [ ] 可依據前述說明,嘗試整合 [tiny-web-server](https://github.com/7890/tiny-web-server)
---
- [x] `q_new`: 建立新的「空」佇列;
- [x] `q_free`: 釋放佇列所佔用的記憶體;
- [x] `q_insert_head`: 在佇列開頭 (head) 插入 (insert) 給定的新節點 (以 LIFO 準則);
- [x] `q_insert_tail`: 在佇列尾端 (tail) 插入 (insert) 給定的新節點 (以 FIFO 準則);
- [x] `q_remove_head`: 在佇列開頭 (head) 移去 (remove) 給定的節點;
- [x] `q_release_element`: 釋放特定節點的記憶體空間;
- [x] `q_size`: 計算目前佇列的節點總量;
- [x] `q_delete_mid`: 移走佇列中間的節點
- [x] `q_delete_dup`: 在已經排序的狀況,移走佇列中具備重複內容的節點
- [x] `q_swap`: 交換佇列中鄰近的節點
- [x] `q_reverse`: 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列節點,換言之,它只能重排已經存在的節點
- [x] `q_sort`: 以==遞增順序==來排序鏈結串列的節點
---
## 開發過程
**結構體定義**
>queue.h
```c
typedef struct {
char *value;
struct list_head list;
} element_t;
```
>list.h
```c
struct list_head {
struct list_head *prev;
struct list_head *next;
};
```
### q_new()
```c
struct list_head *q_new()
{
struct list_head* q_empty = malloc(sizeof(struct list_head)); // Initialization
if(!q_empty) // allocate failed
return NULL;
q_empty->prev = q_empty; // point to itself
q_empty->next = q_empty; // point to itself
return q_empty;
}
```
根據C語言規格書==7.20.3 Memory management function==
>If the space cannot be allocated, a null pointer isreturned. If the size of the space requested is zero, the behavior is [implementation-defined](https://en.wikipedia.org/wiki/Unspecified_behavior#Implementation-defined_behavior): either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
[Under what circumstances can malloc return NULL?](https://stackoverflow.com/questions/9101597/under-what-circumstances-can-malloc-return-null) 中討論到在實作嵌入式系統時會有 exhaust memory 問題,在 [C Program on Linux to exhaust memory](https://stackoverflow.com/questions/1865501/c-program-on-linux-to-exhaust-memory) 有討論到 Linux 的 exhaust memory 操作
所以需要加上判斷是否分配成功的判斷式
### q_free()
```c
void q_free(struct list_head *l)
{
if (!l)
return;
struct list_head *next_node = l->next;
while (next_node->next != l) {
element_t *l_ele = container_of(next_node, element_t, list);
next_node = next_node->next;
q_release_element(l_ele);
}
free(l);
}
```
上述這段程式碼回報會有2個block沒有清乾淨
```
ERROR: Freed queue, but 2 blocks are still allocated
```
在閱讀了[這篇文章](https://blog.csdn.net/weixin_43329614/article/details/83420435)發現結構體的記憶體配置問題
看起來是新定義的 `next_node` 沒有 free 乾淨,重新檢閱程式碼之後發現 next_node 沒有必要性且增加清除記憶體的困難,改成以下
```c
void q_free(struct list_head *l)
{
if (!l)
return;
// struct list_head *next_node = l->next;
while (l->next != l) {
element_t *l_ele = container_of(l->next, element_t, list);
l->next = l->next->next;
q_release_element(l_ele);
}
free(l);
}
```
即可正常運行。
#### 利用巨集撰寫程式碼(較為簡潔)
```c
void q_free(struct list_head *l)
{
if (!l)
return;
// struct list_head *next_node = l->next;
element_t *curr, *next;
list_for_each_entry_safe(curr, next, l, list) {
list_del(&curr->list);
q_release_element(curr);
}
free(l); // free it because in function q_new has malloc
}
```
### q_insert_head()
```c
bool q_insert_head(struct list_head *head, char *s)
{
if (!head) {
return false;
}
element_t *new_ele = malloc(sizeof(element_t)); // element insert initial
if (!new_ele) { // allocate failed
return false;
}
size_t len = strlen(s) + 1;
new_ele->value = malloc(len * sizeof(char)); // plus 1 for '\0'
if (!new_ele->value) { // allocate failed
free(new_ele);
return false;
}
memcpy(new_ele->value, s, len);
list_add(&new_ele->list,head);
return true;
}
```
上述這段程式碼在 commit 的時候會產生 Memory leak 的 error
```
queue.c:56:5: error: Memory leak: new_ele.value [memleak]
return true;
^
```
將new_ele新增free之後即可解決問題
```c
if (!new_ele->value) { // allocate failed
free(new_ele);
return false;
}
```
:::info
此部份尚待釐清,暫時認為是這樣可以避免存取權遺失
:::
將冗長部份以 `list.h` 的函式取代
```c
list_add(&new_ele->list,head);
```
### q_insert_tail
```c
bool q_insert_tail(struct list_head *head, char *s)
{
if (!head)
return false;
element_t *new_ele = malloc(sizeof(element_t));
if (!new_ele) {
free(new_ele);
return false;
}
size_t len = strlen(s) + 1;
new_ele->value = malloc(len * sizeof(char));
memcpy(new_ele->value, s, len);
// insert tail
struct list_head *last_node = head->prev; // store last point
head->prev = &new_ele->list;
new_ele->list.prev = last_node;
new_ele->list.next = head;
last_node->next = &new_ele->list;
return true;
}
```
和 q_insert_head 概念差不多。
將
```c
head->prev = &new_ele->list;
new_ele->list.prev = last_node;
new_ele->list.next = head;
last_node->next = &new_ele->list;
```
改為
```c
list_add_tail(&new_ele->list, head);
```
### q_remove_head()
```c
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (!head)
return NULL;
sp = malloc(bufsize); // allocate
element_t* re_ele = container_of(head->next,element_t,list);
memcpy(sp, re_ele->value, bufsize);
head->next = head->next->next;
head->next->prev = head;
return re_ele;
}
```
上述程式碼雖然可以成功移除 head,不過在只剩下一個元素時使用 `rh` 指令會產生error
`ERROR: Failed to store removed value`
檢閱 `qtest.c` 程式碼 `do_remove` 的判斷條件
```c
if (!is_null) {
// q_remove_head and q_remove_tail are not responsible for releasing
// node
q_release_element(re);
removes[string_length + STRINGPAD] = '\0';
if (removes[0] == '\0') {
report(1, "ERROR: Failed to store removed value");
ok = false;
}
```
將程式碼改為使用 `strncpy` 並新增跳出條件避免程式 crash
```c
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || (head->next == head || head->prev == head))
return NULL;
element_t *re_ele = container_of(head->next, element_t, list);
strncpy(sp, re_ele->value, bufsize - 1);
sp[bufsize - 1] = '\0';
head->next = head->next->next;
head->next->prev = head;
return re_ele;
}
```
用 `list_del` 取代冗長部份並新增 `bufsize` 條件
```c
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || list_empty(head))
return NULL;
element_t *re_ele = container_of(head->next, element_t, list);
list_del(head->next);
if(bufsize) {
strncpy(sp, re_ele->value, bufsize - 1);
sp[bufsize - 1] = '\0';
}
return re_ele;
}
```
### q_remove_tail()
和 `q_remove_head()` 概念雷同
```c
element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || (head->next == head || head->prev == head))
return NULL;
element_t *re_ele = container_of(head->prev, element_t, list);
strncpy(sp, re_ele->value, bufsize - 1);
sp[bufsize - 1] = '\0';
head->prev = head->prev->prev;
head->prev->next = head;
return re_ele;
}
```
### q_size()
透過 while loop 走訪整個 queue 計算
```c
int q_size(struct list_head *head)
{
int num = 0;
if (!head)
return num;
struct list_head *next = head->next;
while(next != head) {
next = next->next;
num++;
}
return num;
}
```
### q_delete_mid()
利用快慢指標以及 indirect pointer ,在找到終點後跳出 while 迴圈,藉由 while 迴圈下一行跳過將 next 的指向跳過中間點
```c
bool q_delete_mid(struct list_head *head)
{
if (!head || list_empty(head))
return false;
struct list_head **indir;
indir = &head;
struct list_head *fast = head;
while(!(fast->next == head || fast->next->next == head)) {
indir = &(*indir)->next;
fast = fast->next->next;
}
q_release_element(list_entry((*indir)->next, element_t, list));
(*indir)->next = (*indir)->next->next;
return true;
}
```
#### Iteration Process
```graphviz
digraph del_mid {
node [shape=box]
rankdir=LR
// backbone
node1 -> node2 -> node3 -> node4
head->node1
// mark
indir [shape=plaintext;label="indir"]
indir->head
fast [shape=plaintext;label="fast"]
fast->node1
}
```
```graphviz
digraph del_mid {
node [shape=box]
rankdir=LR
// backbone
node1 -> node2 -> node3 -> node4
head->node1
// mark
indir [shape=plaintext;label="indir"]
indir->node1
fast [shape=plaintext;label="fast"]
fast->node3
}
```
#### Segmentation fault occurred
```
cmd> dm
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
Aborted (core dumped)
```
找了好久都沒有發現錯誤
直到看到[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list?fbclid=IwAR2S6mLPVT7P8hu13lvSeurE2K86offyDvGjP2yYAYc57VOmkC9ForhizYM#%E4%BD%A0%E6%89%80%E4%B8%8D%E7%9F%A5%E9%81%93%E7%9A%84-C-%E8%AA%9E%E8%A8%80-linked-list-%E5%92%8C%E9%9D%9E%E9%80%A3%E7%BA%8C%E8%A8%98%E6%86%B6%E9%AB%94)以及臉書社團都有人討論,根據討論先用一個 `temp` 指向 `indir` 的next在進行 `list_del` 以及 `q_release_element`
```c
bool q_delete_mid(struct list_head *head)
{
if (!head || list_empty(head))
return false;
struct list_head **indir;
indir = &head;
struct list_head *fast = head;
while (fast->next!= head && fast->next->next != head) {
indir = &(*indir)->next;
fast = fast->next->next;
}
struct list_head* temp = (*indir)->next;
list_del((*indir)->next);
q_release_element(list_entry(temp, element_t, list));
return true;
// https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/
}
```
### q_delete_dup()
```c
bool q_delete_dup(struct list_head *head)
{
if (!head)
return false;
bool is_dup = false;
element_t *curr_ele; // element_t curr
element_t *next_ele; // next element_t
// define list_for_each_entry_safe(entry, safe, head, member)
list_for_each_entry_safe(curr_ele,next_ele,head,list) {
if (curr_ele->list.next != head) { // not recycle
if (strcmp(curr_ele->value, next_ele->value) == 0) {
list_del(&curr_ele->list);
q_release_element(curr_ele);
is_dup = true;
} else if (is_dup) {
list_del(&curr_ele->list);
q_release_element(curr_ele);
is_dup = false;
}
}
}
// https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
return true;
}
```
一開始的寫法透過掃過整個 list 並釋放 list 以及 element_t
上述程式碼會產生 sementation fault
```
egmentation fault occurred. You dereferenced a NULL or invalid pointer
```
最後才發現這樣寫的邏輯是
1. 如果沒有繞完一圈(第一個if)
2. 判斷是否一樣(第二個if)並移除
3. 移除2.中遺留的最後一點
2.3.重複刪除了
把第二個 if 改為 else if 即解決
```c
else if(is_dup) {
list_del(&curr_ele->list);
q_release_element(curr_ele);
is_dup = false;
}
```
==修正==
上述寫法如果在 duplicate 出現在尾端的時候會出現錯誤(因為無法進到 `else if` 的區域),因此將第一個 if 和第二個 if 合併判斷
### q_swap()
```c
void q_swap(struct list_head *head)
{
if(!head)
return;
struct list_head *second;
second = head->next;
while(second!=head && second->next != head) {
struct list_head *temp = second;
list_move(second, temp->next);
second = second->next;
}
// https://leetcode.com/problems/swap-nodes-in-pairs/
}
```
原本寫的亂糟糟,參考了 [laneser](https://hackmd.io/@laneser/linux2022_lab0) 之後覺的想法很棒,又有看到 `list.h` 中有 `list_move` 可以取代 `list_del` 加上 `list_add` 的組合,故上方的 code 誕生
### q_reverse()
```c
void q_reverse(struct list_head *head)
{
if(!head)
return;
//struct list_head *temp;
// reverse prev and next;
struct list_head *node, *temp;
struct list_head *safe;
list_for_each_safe(node, safe, head) {
temp = node->next;
node->next = node->prev;
node->prev = temp;
}
// head is not reverse yet
temp = head ->next;
head->next = head->prev;
head->prev = temp;
}
```
將每個節點的 `next` 以及 `prev` 對調,起先用 `list_for_each` 函式進行結果一直有問題,之後發現是在迴圈內部交換節點會影響到下一圈迴圈的迭代,改用 `safe` 版本就可以解決
### q_sort()
這裡採用mergesort進行實做,參照了 [leetcode#148](https://leetcode.com/problems/sort-list/)
考慮到這次是雙向環狀 Linked list ,且多了一個 head (一開始寫的時候忘記有個 head ,一直抓不到 bugQQ )
將雙向環狀結構斷開,轉變為非環狀的結構,並且傳入 `head->next` 進去 `mergesort` 排序,如此一來寫法就和 [leetcode#148](https://leetcode.com/problems/sort-list/) 一樣了,只是在最後記得把雙向以及環狀的結構補回去。
**merge_recur**
- 利用快慢指標將 list 的中點找出來並用 recursion 遞迴到最後
```c
struct list_head* merge_recur(struct list_head *head)
{
if(!head->next)
return head;
struct list_head* slow = head;
// split list
for(struct list_head* fast = head->next;fast && fast->next; fast = fast->next->next) {
slow = slow->next;
}
struct list_head* mid = slow->next; // the start node of right part
slow->next = NULL;
struct list_head* left = merge_recur(head);
struct list_head* right = merge_recur(mid);
return merge_two_list(left, right);
}
```
**merge_two_list**
- 將兩個 list 排序並結合
- 一開始我在此處先進行 `prev` 的修補,但是如果這樣操作的話在下一圈的遞迴會進入 infinite loop ,所以改為在 `q_sort` 中做 `prev` 以及 circular structure 的修補,一方面增加可讀性,一方面避免遞迴錯誤
```c
struct list_head* merge_two_list(struct list_head* left, struct list_head* right)
{
struct list_head head;
struct list_head* h = &head;
if(!left && !right){
return NULL;
}
while(left && right) {
if(strcmp(list_entry(left, element_t, list)->value,
list_entry(right, element_t, list)->value) < 0) {
h->next = left;
left = left->next;
h = h ->next;
} else {
h->next = right;
right = right->next;
h = h ->next;
}
}
// after merge, there are still one node still not connect yet
if (left) {
h->next = left;
} else if (right) {
h->next = right;
}
return head.next;
}
```
**q_sort本人**
傳入 `head->next` 進行 merge sort 可以保留原本的 `head` ,在排序過後在進行 `prev` 以及 `circular` 的修補
```c
void q_sort(struct list_head *head)
{
// Leetcode #148 Sort List
// https://leetcode.com/problems/sort-list/
if (!head || head->next == head)
return;
// disconnect the circular structure
head->prev->next = NULL;
head->next = merge_recur(head->next);
// reconnect the list (prev and circular)
struct list_head *c = head, *n = head->next;
while(n) {
n->prev = c;
c = n;
n = n->next;
}
c->next = head;
head->prev = c;
}
```
### 操作修正&優化
完整寫完之後發現總分只有 `71/100`
```
# Test of insert_head, insert_tail, remove_head, remove_tail, and delete_mid
ERROR: Freed queue, but 4 blocks are still allocated
--- trace-02-ops 0/6
# Test of insert_head, insert_tail, delete duplicate, and sort
ERROR: Duplicate strings are in queue or distinct strings are not in queue
--- trace-06-ops 0/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-09-robust 0/6
# Test operations on NULL queue
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-10-robust 0/6
+++ TESTING trace trace-17-complexity:
# Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant
Probably constant time
Probably constant time
Testing remove_head...(0/10)
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-17-complexity 0/5
--- TOTAL 71/100
```
#### 修正紀錄
**trace-02-ops**
`delete_mid` 沒有清理乾淨,根據[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list?fbclid=IwAR2S6mLPVT7P8hu13lvSeurE2K86offyDvGjP2yYAYc57VOmkC9ForhizYM#%E4%BD%A0%E6%89%80%E4%B8%8D%E7%9F%A5%E9%81%93%E7%9A%84-C-%E8%AA%9E%E8%A8%80-linked-list-%E5%92%8C%E9%9D%9E%E9%80%A3%E7%BA%8C%E8%A8%98%E6%86%B6%E9%AB%94)中的討論修正完成
**trace-06-ops 0/6**
修正 `if` 判斷條件修正完成
**trace-09-robust**
缺少判斷 `bufsize` 不存在的狀態,加上條件後修正完成
**trace-17-complexity**
在修正 `q_remove_head` 以及 `q_remove_tail` 之後自動修正完成
**trace-10-robust**
一開始找不到哪個 function 出問題去看了 traces 的檔案之後跟據 `trace-10-robust.cmd` 的指令逐行輸入,發現是 `reverse` , `size` , `sort` 會在佇列是空的時候讓程式 crash, 增加 `NULL` 跳出的條件後修正完成
==100分拉~~==
```
--- TOTAL 100/100
```
### Memory leak Problem
在完成之後要進行 git commit 時報出在 `q_insert_head` 以及 `q_insert_tail` 有 memory leak 發生
將 `q_insert_head` 以及 `q_insert_tail` 中冗長部份以 `list_add` 以及 `list_add_tail` 代替即可解決
==原因尙不清楚==
## 研讀 Linux 核心原始程式碼的 [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c)
詳細筆記寫在[這裡](https://hackmd.io/erWlfVMfQqyUe9JVbOLlBA)
## 引入 [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 到 lab0-c 專案
將 [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 以及 [list_sort.h](https://github.com/torvalds/linux/blob/master/include/linux/list_sort.h) 放入專案資料夾
在 `Makefile` 中新增 `list_sort.o` 到 `OBJ`,並嘗試編譯檢閱少了哪些檔案
```
make
CC qtest.o
CC list_sort.o
list_sort.c:3:10: fatal error: linux/bug.h: No such file or directory
3 | #include <linux/bug.h>
| ^~~~~~~~~~~~~
compilation terminated.
```
發現在 `list_sort.c` 中 `#include` 了很多沒接觸過的 header file,考慮到大部分的函式並不會用到,因此從[筆記](https://hackmd.io/erWlfVMfQqyUe9JVbOLlBA)中查找有使用到的函式並統一複製至 `list_sort.h`,在 `list_sort.c` 源碼中 `line 56` 手動將 `u8` 改為 `uint8_t` 並在 `list_sort.h` 中 `#include <stdint.h>`
再次嘗試編譯
```
make
CC qtest.o
CC queue.o
CC dudect/constant.o
CC list_sort.o
list_sort.c: In function ‘merge_final’:
list_sort.c:84:7: error: implicit declaration of function ‘unlikely’ [-Werror=implicit-function-declaration]
84 | if (unlikely(!++count))
| ^~~~~~~~
list_sort.c: In function ‘list_sort’:
list_sort.c:217:7: error: implicit declaration of function ‘likely’ [-Werror=implicit-function-declaration]
217 | if (likely(bits)) {
| ^~~~~~
list_sort.c: At top level:
list_sort.c:248:1: error: data definition has no type or storage class [-Werror]
248 | EXPORT_SYMBOL(list_sort);
| ^~~~~~~~~~~~~
list_sort.c:248:1: error: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’ [-Werror=implicit-int]
list_sort.c:248:1: error: parameter names (without types) in function declaration [-Werror]
```
發現缺失的函式包含 `unlikely`、`likely`、`EXPORT_SYMBOL`
由[筆記](https://hackmd.io/erWlfVMfQqyUe9JVbOLlBA)中從各項 header file 的定義 copy-paste 到 `list_sort.h` 之中。
而 `EXPORT_SYMBOL` 的用途也可在筆記中有解釋,對於目前的專案來說暫時用不到,故直接刪除即可。
目前可以成功編譯!
```
$ make
CC qtest.o
CC queue.o
CC dudect/constant.o
CC list_sort.o
LD qtest
```
把 `list_sort` 嵌入 `do_sort` 中
```c
if (exception_setup(true)) {
#if (LINUX_SORT)
list_sort(NULL, l_meta.l, cmp);
#else
q_sort(l_meta.l);
#endif
}
```
剩下 `cmp` function 未定義,根據`list_sort.c`中的註解
>The comparison function @cmp must return > 0 if @a should sort after @b (“@a > @b” if you want an ascending sort), and <= 0 if @a should sort before @b or their original order should be preserved. It is always called with the element that came first in the input in @a, and list_sort is a stable sort, so it is not necessary to distinguish the @a < @b and @a == @b cases.
- 如果 a 要排在 b 後面,則回傳大於 0
- 如果 a 要排在 b 前面,則回傳小於等於 0
根據 `list_cmp_func_t` 的 prototype 撰寫 `cmp` function
```c
__attribute__((nonnull(2,3)))
int cmp (void *priv, const struct list_head *a, const struct list_head *b)
{
element_t* a_ele = list_entry(a, element_t, list); // get mother element
element_t* b_ele = list_entry(b, element_t, list);
return strcmp(a_ele->value, b_ele->value) < 0 ? 0 : 1;
}
```
在 `make test` 得時候遇到 `trace-10-robust 0/6` 也就是突然 trace-10-robust 過不了了。
參考 [Risheng1128](https://hackmd.io/@Risheng/linux2022-lab0#%E5%BC%95%E5%85%A5-Linux-%E6%A0%B8%E5%BF%83%E5%8E%9F%E5%A7%8B%E7%A2%BC%E5%88%B0-lab0-c) 同學的筆記
>採雷小紀錄: 修改 `list_sort()` 的 `prototype` ,移除 `__attribute__((nonnull(2,3)))`
原因: 在測資 `trace-10-robust.cmd` , 會輸入 `NULL` 給 `head` ,原本在 `list_sort` 裡新增以下程式,但編譯器會編不過
```c
if (!head)
return
```
>解法: 移除 `__attribute__((nonnull(2,3)))` 即可
這裡要注意 `list_sort.h` 以及 `list_sort.c` 中的 `__attribute__((nonnull(2,3)))` 都要移除
一直以為 `cmp` 一定在某個 header file 中有定義,經過同學提醒之後才知道要自己寫
參考 [Risheng1128](https://hackmd.io/@Risheng/linux2022-lab0#%E5%BC%95%E5%85%A5-Linux-%E6%A0%B8%E5%BF%83%E5%8E%9F%E5%A7%8B%E7%A2%BC%E5%88%B0-lab0-c) 同學的筆記
在適當的地方使用巨集
```c
#define LINUX_SORT 1
```
藉由改變巨集即可決定使用 `list_sort` 或是 `q_sort`
## 比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差
檢視 `Makefile` 運作流程
```
check: qtest
./$< -v 3 -f traces/trace-eg.cmd
```
自行新增一個 `efficiency`
```
efficiency: qtest
./$< -v 3 -f traces/trace-sort-efficiency.cmd
```
並且撰寫 `cmd` file
```
option malloc 0
option fail 0
option error 0
new
it RAND 100000
time
sort
time
it RAND 200000
shuffle
time
sort
time
it RAND 200000
shuffle
time
sort
time
```
來進行 `q_sort` 以及 `list_sort` 的比較,時間單位為 (s)
`q_sort`
| 數據數量 | No1 | No2 | No3 | No4 | No5 | Average |
| -------- | ----- | ----- | ----- | ----- | ----- | ------- |
| 100000 | 0.086 | 0.086 | 0.088 | 0.084 | 0.086 | 0.086 |
| 200000 | 0.347 | 0.330 | 0.333 | 0.316 | 0.336 | 0.3324 |
| 500000 | 0.647 | 0.645 | 0.673 | 0.626 | 0.636 | 0.6454 |
`list_sort`
| 數據數量 | No1 | No2 | No3 | No4 | No5 | Average |
| -------- | ----- | ----- | ----- | ----- | ----- | ------- |
| 100000 | 0.076 | 0.073 | 0.078 | 0.074 | 0.075 | 0.0752 |
| 200000 | 0.316 | 0.263 | 0.265 | 0.279 | 0.265 | 0.2776 |
| 500000 | 0.463 | 0.470 | 0.467 | 0.472 | 0.467 | 0.4678 |
| 數據數量 | q_sort_avg | list_sort_avg | efficiency |
| -------- | ---------- | ------------- | ---------- |
| 100000 | 0.086 | 0.0752 | 12.56% |
| 200000 | 0.3324 | 0.2776 | 16.49% |
| 500000 | 0.6454 | 0.4678 | 27.52% |
## 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤
參考 [lanscer](https://hackmd.io/@laneser/linux2022_lab0#%E9%81%8B%E7%94%A8-Valgrind-%E6%8E%92%E9%99%A4-qtest-%E5%AF%A6%E4%BD%9C%E7%9A%84%E8%A8%98%E6%86%B6%E9%AB%94%E9%8C%AF%E8%AA%A4) 筆記利用 `valgrind` 執行 `qtest`
```
valgrind ./qtest -f ./traces/trace-01-ops.cmd
```
發現並沒有問題,看了 github 的 commit 之後發現這個問題已經被
[AmyLin0210 - Fix memory leak in qtest.c](https://github.com/sysprog21/lab0-c/commit/d39497d0cf908012ccae660b459d1a4b0655b076)
給 patch 掉了,為了練習先把 patch 的部份拿掉來 trace code.
把 `qtest.c` 中的 `if (!infile_name)` 拿掉之後再次執行
發現問題
```
valgrind -q --leak-check=full ./qtest -f traces/trace-01-ops.cmd
cmd> # Test of insert_head and remove_head
cmd> option fail 0
cmd> option malloc 0
cmd> new
l = []
cmd> ih gerbil
l = [gerbil]
cmd> ih bear
l = [bear gerbil]
cmd> ih dolphin
l = [dolphin bear gerbil]
cmd> rh dolphin
Removed dolphin from queue
l = [bear gerbil]
cmd> rh bear
Removed bear from queue
l = [gerbil]
cmd> rh gerbil
Removed gerbil from queue
l = []
cmd>
Freeing queue
==16717== 5 bytes in 1 blocks are still reachable in loss record 1 of 3
==16717== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==16717== by 0x4A483BE: strdup (strdup.c:42)
==16717== by 0x110CC0: linenoiseHistoryAdd (linenoise.c:1236)
==16717== by 0x111853: linenoiseHistoryLoad (linenoise.c:1325)
==16717== by 0x10CAD5: main (qtest.c:1042)
==16717==
==16717== 76 bytes in 19 blocks are still reachable in loss record 2 of 3
==16717== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==16717== by 0x4A483BE: strdup (strdup.c:42)
==16717== by 0x110C34: linenoiseHistoryAdd (linenoise.c:1236)
==16717== by 0x111853: linenoiseHistoryLoad (linenoise.c:1325)
==16717== by 0x10CAD5: main (qtest.c:1042)
==16717==
==16717== 160 bytes in 1 blocks are still reachable in loss record 3 of 3
==16717== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==16717== by 0x110C80: linenoiseHistoryAdd (linenoise.c:1224)
==16717== by 0x111853: linenoiseHistoryLoad (linenoise.c:1325)
==16717== by 0x10CAD5: main (qtest.c:1042)
```
可以看到 `still reachable` 這類型的問題 發生在 `linenoiseHistoryAdd` 和 `linenoiseHistoryLoad` 之中
參考 [valgrind 文件](https://valgrind.org/docs/manual/faq.html#faq.deflost) 提到有 5 種 memoryleak
>"definitely lost" means your program is leaking memory -- fix those leaks!
>
>"indirectly lost" means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the "definitely lost" leaks, the "indirectly lost" leaks should go away.
>
>"possibly lost" means your program is leaking memory, unless you're doing unusual things with pointers that could cause them to point into the middle of an allocated block; see the user manual for some possible causes. Use --show-possibly-lost=no if you don't want to see these reports.
>
>"still reachable" means your program is probably ok -- it didn't free some memory it could have. This is quite common and often reasonable. Don't use --show-reachable=yes if you don't want to see these reports.
>
>"suppressed" means that a leak error has been suppressed. There are some suppressions in the default suppression files. You can ignore suppressed errors.
根據說明 `still reachable` 是有東西沒有 free 乾淨,而在程式中是由 `linenoiseHistoryLoad` 呼叫 `linenoiseHistoryAdd` 造成的問題。
檢視 `console.c` 中的程式碼
```c
if (!has_infile) {
char *cmdline;
while ((cmdline = linenoise(prompt)) != NULL) {
interpret_cmd(cmdline);
linenoiseHistoryAdd(cmdline); /* Add to the history. */
linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */
linenoiseFree(cmdline);
while (buf_stack && buf_stack->fd != STDIN_FILENO)
cmd_select(0, NULL, NULL, NULL, NULL);
has_infile = false;
}
} else {
while (!cmd_done())
cmd_select(0, NULL, NULL, NULL, NULL);
}
```
如果有傳入檔案,則 `has_infile` 為 `true`,此時會進入 `else` 區域,也就不會執行到 `linenoise.c` 中的
```c
if (!atexit_registered) {
atexit(linenoiseAtExit);
atexit_registered = 1;
}
```
且 `linenoiseAtExit()` 中含有釋放 `History` 的函式 `freehistory`
```c
static void linenoiseAtExit(void)
{
disableRawMode(STDIN_FILENO);
freeHistory();
}
```
參考 [2020leon](https://hackmd.io/@6649/linux2022-lab0#%E4%BF%AE%E6%AD%A3%E9%8C%AF%E8%AA%A4%EF%BC%88-Valgrind-%EF%BC%89) 同學的解法(最新版本以在 github 上被 commit 了)
>1. 無論如何強制呼叫 atexit
>2. 在沒有輸入檔的情況下不在主函式呼叫類 linenoiseHistoryLoad 的函式
```c
if (!infile_name) {
/* Trigger call back function(auto completion) */
linenoiseSetCompletionCallback(completion);
linenoiseHistorySetMaxLen(HISTORY_LEN);
linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */
}
```
在沒有輸入檔案的時候進行會進入 `if` 就會執行到含有 `atexit()` 的部份,即可解決。
**小筆記:**
如果系統檔出現錯誤,或是本來可以過的突然再次測試卻有問題, `make clean` 之後在重新編譯
## 透過 Massif 視覺化 “simulation” 過程中的記憶體使用量
先編譯出 massif 檔案
```
valgrind -v --tool=massif ./qtest -f traces/trace-sort-efficiency.cmd
```
在用 `massif-visualizer` 進行查看
```
massif-visualizer ./massif.out.26695
```
這邊測試 `trace-sort-efficiency.cmd` 此為在跑 `list_sort` 以及 `q_sort` 的效率檔案

可以看到最後還剩下一點點東西,==原因尚未釐清==
## 檢視命令直譯器的運作流程
詳細研讀 `qtest` 命令直譯器寫在[這邊](https://hackmd.io/reqCMFvVSzaNt6BE9Ek3Bw?view)
## 在 qtest 提供新的命令 shuffle
根據 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 實做
**q_shuffle**
```c
void q_shuffle(struct list_head *head)
{
if (!head || list_empty(head))
return;
int size = q_size(head);
// shuffle
for(int i =0; i < size;) { // not iterate i , iterate size
struct list_head *start = head->next;
int rand_idx = rand() % size; // random number in range 0~ (size-1)
for (int i = 0; i < rand_idx; i++) {
start = start->next;
}
list_del(start);
list_add_tail(start, head);
size--;
}
return;
}
```
**q_shuffle 過程**
| 隨機範圍 | 隨機數 | 原始數據 | 結果 |
| -------- | ------ | -------- | ---- |
| | | 12345 | |
| 1~5 | 4 | 1235 | 4 |
| 1~4 | 3 | 125 | 34 |
| 1~3 | 1 | 25 | 134 |
| 1~2 | 1 | 5 | 2134 |
**do_shuffle**
```c
tatic bool do_shuffle(int argc, char *argv[])
{
if (argc != 1) {
report(1, "%s takes no argu,emts", argv[0]); //return function error
return false;
}
if (!l_meta.l)
report(3, "Warning: Calling free on null queue");
error_check();
if (exception_setup(true))
q_shuffle(l_meta.l);
exception_cancel();
set_noallocate_mode(false);
show_queue(3);
return !error_check();
}
```
### shadow variable
`git commit` 時出現如下警告訊息
```
Make sure you indent as the following:
clang-format -i queue.c
Following files need to be cleaned up:
qtest.c
queue.c
queue.c:299:18: style: Local variable 'i' shadows outer variable [shadowVariable]
for (int i = 0; i < rand_idx; i++) {
^
queue.c:296:14: note: Shadowed declaration
for (int i = 0; i < size;) { // not iterate i , iterate size
^
queue.c:299:18: note: Shadow variable
for (int i = 0; i < rand_idx; i++) {
^
Fail to pass static analysis.
```
查找定義 [variable shadowing](https://en.wikipedia.org/wiki/Variable_shadowing)
避免使用在 `for` 迴圈中使用重複的變數名稱
改成這樣就 ok 了
```c
for (int i = 0; i < size;) { // not iterate i , iterate size
struct list_head *start = head->next;
int rand_idx = rand() % size; // random number in range 0~ (size-1)
for (int j = 0; j < rand_idx; j++) {
start = start->next;
}
list_del(start);
list_add_tail(start, head);
size--;
}
```
## linenoise 原始碼分析
在整合 [tiny-web-server](https://github.com/7890/tiny-web-server) 之前先研究了一下 `linenoise` 的運作流程,詳細過程寫在 [linenoise 原始碼分析](https://hackmd.io/0uh0RmCJTHKnoD1V1iH-HA)
## 整合 [tiny-web-server](https://github.com/7890/tiny-web-server)
首先比對 [tiny-web-server](https://github.com/7890/tiny-web-server) 和 lab0-c 的 `Makefile` 並整合到一起
```diff=
OBJS := qtest.o report.o console.o harness.o queue.o \
random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
- lintnoise.o
+ linenoise.o list_sort.o tiny.o
```
為了符合整個 lab0-c 的模式,將 `tiny.c` 拆解成 `tiny.c` 以及 `tiny.h`
:::spoiler `tiny.h`
```c
#ifndef LAB0_TINY_H
#define LAB0_TINY_H
#include <arpa/inet.h> /* inet_ntoa */
#include <signal.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#define LISTENQ 1024 /* second argument to listen() */
#define MAXLINE 1024 /* max length of a line */
#define RIO_BUFSIZE 1024
#ifndef DEFAULT_PORT
#define DEFAULT_PORT 9999 /* use this port if none given as arg to main() */
#endif
#ifndef FORK_COUNT
#define FORK_COUNT 4
#endif
#ifndef NO_LOG_ACCESS
#define LOG_ACCESS
#endif
typedef struct {
int rio_fd; /* descriptor for this buf */
int rio_cnt; /* unread byte in this buf */
char *rio_bufptr; /* next unread byte in this buf */
char rio_buf[RIO_BUFSIZE]; /* internal buffer */
} rio_t;
/* Simplifies calls to bind(), connect(), and accept() */
typedef struct sockaddr SA;
typedef struct {
char filename[512];
off_t offset; /* for support Range */
size_t end;
} http_request;
typedef struct {
const char *extension;
const char *mime_type;
} mime_map;
//https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
#endif
```
:::
剩下的部份即為 `tiny.c`
先將供同使用到的變數定義於 `linenoise.h`
```c
struct sockaddr_in clientaddr;
socklen_t clientlen;
int listenfd;
bool noise;
```
接著照著 [lab0](/dPYu4WH8TuqXnAJvfxm-JA) 的步驟
- 修改 `linenoise.c`
- 修改 `run_console()`
>透過 HTTP 的 GET method 來傳送想要執行的 function,`process()` 處理 URL 字串並將 function name 與 parameter 以跟 `cmdline` 一樣的格式回傳 (`[function name][space][parameter]`):
新增 `process()` 函式
```c
char *process(int fd, struct sockaddr_in *clientaddr) {
#ifdef LOG_ACCESS
printf("accept request, fd is %d, pid is %d\n", fd, getpid());
#endif
http_request req;
parse_request(fd, &req);
int status = 200;
handle_request(fd, req.function_name);
char *p = req.function_name;
/* Change '/' to ' ' */
while (*p && (*p) != '\0') {
++p;
if (*p == '/') {
*p = ' ';
}
}
#ifdef LOG_ACCESS
log_access(status, clientaddr, &req);
#endif
char *ret = malloc(strlen(req.function_name) + 1);
strncpy(ret, req.function_name, strlen(req.function_name) + 1);
return ret;
}
```
- 這個 `process()` 可以直接修改位於 `tiny.c` 的 `process()` 並將回傳類型改為 `char*`
在 `console.c` 中的 `init_cmd()` 加上 `web` 指令
```c
ADD_COMMAND(web, " | web server");
```
在首次 `make` 的時候 `do_web` 一直出現
```
console.c: In function ‘do_web’:
console.c:404:16: error: implicit declaration of function ‘socket_init’ [-Werror=implicit-function-declaration]
404 | listenfd = socket_init();
| ^~~~~~~~~~~
```
原來是在將 [tiny-web-server](https://github.com/7890/tiny-web-server) 拆成 `tiny.c` 和 `tiny.h` 沒有將 header file 裡面的東西補完善
- 低級錯誤浪費好多時間(要檢討一下)
將函式宣告補上並把所有的 `static` 關鍵字移除,並在新增 `socket_init()` 函式
```c
int socket_init() {
return open_listenfd(DEFAULT_PORT);
}
```
至此 `tiny web` 整合完畢
在整合完畢後發現在普通的 `qtest` 中輸入指令會重複顯示(正常應該只有一次)
```
cmd> new
cmd> new
l = []
cmd> it a
cmd> it a
l = [a]
```
這邊目前推測是沒有設定 `non-blocking` 的問題
```c
/* Set socket to non-blocking */
int flags = fcntl(listenfd, F_GETFL);
fcntl(listenfd, F_SETFL, flags | O_NONBLOCK);
```
不過在加上這行之後出現
```
console.c:80:13: error: initializer element is not constant
80 | int flags = fcntl(listenfd, F_GETFL);
| ^~~~~
console.c:81:16: error: expected ‘)’ before numeric constant
81 | fcntl(listenfd, F_SETFL, flags | O_NONBLOCK);
| ^
| )
```
==**此問題尚待解決**==
### Commit
- 這邊 `commit` 時顯示 `tiny.c` 含有 dangerous function detected ,參閱 [Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml)
>**sprintf**
Just as the previous functions, sprintf does not check the buffer boundaries and is vulnerable to overflows.
這邊提到 `sprintf` 不會判斷邊界,有可能會 overflow.
把 `sprintf` 替換成 `snprintf` 有 size 就給 size, 沒有就給 max
- `%lu` -> `%zu` (`%zu` is for `size_t`)
- `%d` -> `%u` (`%u` is for `usigned int`)
- sscanf
- 參考閱讀 [Is sscanf considered safe to use?](https://stackoverflow.com/questions/5873402/is-sscanf-considered-safe-to-use)
>sscanf() without field width limits can crash with huge input data
這邊設定為 `%1023d` ,因為 `MAXLINE` 是 1024 (留一格來 handle null-terminator)
- The scope of the variable ' ' can be reduced.
- 參考閱讀 [CppCheck. The scope of the variable can be reduced (and loop)](https://stackoverflow.com/questions/23604699/cppcheck-the-scope-of-the-variable-can-be-reduced-and-loop)
- shadow variable
- `print_help` never use
- 在 `console.c` 新增了一個 command `web_help` 來使用 `printf_help`
```c
static bool do_web_help() {
print_help();
return true;
}
```