# 2024q1 Homework1 (lab0)
contributed by < [`patri27826`](https://github.com/patri27826) >
## 開發環境
```shell
$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i7-13700
CPU family: 6
Model: 183
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 20%
CPU max MHz: 5200.0000
CPU min MHz: 800.0000
BogoMIPS: 4224.00
```
## 指定的佇列操作
### `q_insert_head`
:::danger
列出程式碼之前,要闡述你的想法、考量因素,以及你的做法是否會有副作用 (side effect)。
:::
<s>初步想法</s> ???
```c
bool q_insert_head(struct list_head *head, char *s)
{
if(!head)
return false;
element_t *new = malloc(sizeof(element_t));
if (!new)
return false;
new->value = strdup(s);
list_add(&new->list, head);
return true;
}
```
在參考 [willwillhi1](https://hackmd.io/@willwillhi/lab0-2023#%E5%AE%8C%E6%88%90-queuec) 後,發現還會需要去檢查`new-value` 是不是 `NULL`,所以修改成如下
```diff
bool q_insert_head(struct list_head *head, char *s)
if (!new)
return false;
new->value = strdup(s);
+ if (!new->value) {
+ free(new);
+ return false;
+ }
list_add(&new->list, head);
return true;
}
```
然後我就有點好奇會需要去排除 s 是 `NULL` 的情況嗎?,我就去看了`strdup` 裡面使用到`strlen`,
:::danger
無論標題和內文中,中文和英文字元之間要有空白字元 (對排版和文字搜尋有利);
:::
<s>參考 cppreference.com 的 `strlen`說明,發現s在`null`的情況下,`strlen`會中止
size_t strlen( const char *str );
1) Returns the length of the given **null-terminated** byte string, that is, the number of characters in a character array whose first element is pointed to by str up to and not including the first null character.
</s>
:::danger
查閱 C 語言規格書和 Linux man page,而非 cppreference 網站。
:::
在查閱 [C 語言規格書](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)和 [Linux man page](https://linux.die.net/man/3/strlen)後,發現他們對於 `strlen` 都只有介紹用法,沒有提及特殊值的情況,因此這邊我還是先暫時維持檢查 s 不為 `null` 的條件
:::danger
不要以為做完關鍵字搜尋,就等同於你看完 C 語言規格書。注意規格書的註腳 (footnote),幾百頁的內容,可讓你一輩子受益,快去讀!
:::
因此補上檢查
```diff
bool q_insert_head(struct list_head *head, char *s)
{
- if(!head)
+ if(!head || !s)
return false;
element_t *new = malloc(sizeof(element_t));
if (!new)
return false;
new->value = strdup(s);
if(!new->value)
{
free(new);
return false;
}
list_add(&new->list, head);
return true;
}
```
### `q_insert_tail`
跟 `q_insert_head` 一樣做法, `list_add` 換成 `list_add_tail`
:::danger
使用 `git diff` 或 `diff` 命令來產生程式碼之間的變更列表,而非憑感覺填入,注意位移量。
:::
### `q_remove_head` 和 `q_remove_tail` 實作
```c
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || list_empty(head))
return NULL;
element_t *head_element = list_first_entry(head, element_t, list);
list_del(&head_element->list);
if (sp && bufsize) {
memcpy(sp, head_element->value, bufsize);
sp[bufsize - 1] = '\0';
}
return head_element;
}
```
參考 [komark06](https://hackmd.io/@komark06/SyCUIEYpj#q_remove_head-%E5%92%8C-q_remove_tail-%E5%AF%A6%E4%BD%9C),先把元素從佇列中移除,再去對元素的值複製到傳入的 `sp` 指標,除了 `memcpy` 也有看到 [willwillhi](https://hackmd.io/@willwillhi/lab0-2023) 用 strncpy 實作
```
strncpy(sp, node->value, bufsize-1);
sp[bufsize-1] = '\0';
```
<s>
了解到 `strcpy` 跟 `strncpy` 之間的[差異與用法](https://dotblogs.com.tw/Ace_Dream/2016/01/05/strcpy_strncpy)
</s>
:::danger
參照第一手材料。
:::
### `q_delete_mid`
這題我一看到就想到 `leetcode` 上面的某一題尋找鏈結佇列的中心點是用快慢指針去實作,差異是在因為 `leetcode` 上的是單向的佇列,而我們實作的是雙向佇列,所以會需要去修改終止條件
```c
struct list_head *slow = head->next, *fast = head->next, *tail = head->prev;
while (fast != tail && fast->next != tail) {
slow = slow->next;
fast = fast->next->next;
}
```
### `q_delete_dup`
這題看到一開始的想法是跑兩個迴圈來去檢查所有的 `Duplicate String` ,但這樣的時間複雜度會到 $O(n^2)$,之後參考 [`wanghanchi`](https://hackmd.io/@wanghanchi/linux2023-lab0#q_delete_dup) 的方法
```c
bool isDup = false;
element_t *node, *safe;
list_for_each_entry_safe (node, safe, head, list) {
if (&safe->list != head && !strcmp(safe->value, node->value)) {
list_del(&node->list);
q_release_element(node);
isDup = true;
} else if (isDup) {
list_del(&node->list);
q_release_element(node);
isDup = false;
}
}
```
先去比對第一個元素的值跟第二個元素的值是否相同,如果相同,移除第一個,同時用一個 `flag` 來記錄第二個的值是否 `Duplicate` ,這樣當第一個 `if` 不通過時,我就可以去通過這個 `flag` 來去確認當前的值是不是前一次的 `Duplicate String`
### `q_swap`
:::danger
列出程式碼之前,要闡述你的想法、考量因素,以及你的做法是否會有副作用 (side effect)。
:::
```c
void q_swap(struct list_head *head)
{
if (!head || !head->next || list_empty(head))
return;
struct list_head *cur = head->next, *next = head->next->next;
element_t *e1, *e2;
while (cur != head && next != head) {
e1 = list_entry(cur, element_t, list);
e2 = list_entry(next, element_t, list);
char *c = e1->value;
e1->value = e2->value;
e2->value = c;
cur = next->next;
if (!next->next) {
break;
}
next = next->next->next;
}
}
```
做法就是每次交換兩個元素的值,結束再往下兩個去找,比較要注意的是這邊
```c
cur = next->next;
if (!next->next) {
break;
}
```
要先去檢查 `next->next` 是不是 `null` ,否則在`next->next->next`就會出錯
### `q_reverse` and `q_reverseK`
:::danger
- [ ] traverse (動詞) 和 traversal (名詞)
根據 Dictionary.com 的[解釋](https://www.dictionary.com/browse/traverse
): (作為及物動詞和不及物動詞都有類似的意思,以下列出作為及物動詞的寓意)
* to pass or move over, along, or through.
* to go to and fro over or along.
其實這意思很好懂,就像我們「走過」/「穿越」校園一般,於是 traverse a linked list 就會是「(用某種手段) 存取多個鏈結串列的節點」,但這裡卻沒有必要「所有」的範圍:英語的 "move over/through" 用於某個區域時,根本沒有這樣的隱含意義。如果將 traverse 翻譯為「遍歷」,就會導致「超譯」,也就是跳脫「直譯」和「意譯」。
當我們回頭看 "traverse" 所在的技術描述內容,例如 "traverse every node",若翻譯為「遍歷每個節點」,那麼既然都「遍」(意即「全面」、「到處」),又何來「每個」節點呢?於是,合理的翻譯應改為「逐一走訪每個節點」 —— 差異在哪?在 "traverse every node" 的應用場景中,可能是我們嘗試在鏈結串列尋找特定的節點內含的資料,一旦找到就停止,或者我們要偵測給定的鏈結串列是否包含環狀 (circular) 結構 ,並沒有真的要「遍」(到處/全面)「歷」(意即「經過」) 每個節點。在我們的用語中,要區分「意圖」(intention) 和「實際作用」(reaction),濫用「遍歷」會使得語意不清,從而難以推測英語原文的訴求。
還有個更重要的原因是,「遍歷」這詞已在理工領域存在,且廣泛使用,即「遍歷理論」(Ergodic theory),後者是研究具有不變測度 (invariant measure) 的動力系統及其相關問題的一個數學分支。 遍歷理論研究遍歷變換,由試圖證明統計物理中的遍歷假設 (Ergodic hypothesis) 演進而來。
在統計學中,若單一個物理系統在不同時間內重複相同的實驗 (如丟擲一枚硬幣),其取樣數據所得的統計結果 (如硬幣出現正面的機率) 和極多個完全相同的物理系統的系集 (如丟極多個相同的硬幣) 在同時作相同的實驗取樣數據的統計結果假設為相同時,此種假設即稱為「遍歷性假設」或「遍歷假設」。基於這個假設,對時間平均的統計方式及結果,便可由對系集的平均及統計方式得到。在一般物理系統中,尤其在統計力學範圖中,均採用此遍歷性假設為基本的假設。在流體力學中對亂流的實驗分析,亦是採用此假設。
遍歷 (Ergodic) 源於以下二個希臘詞:
* ergon (對應英語的 work)
* odos (對應英語的 path 或 way)
最初這是由奧地利物理學家波茲曼 ([Ludwig Boltzmann](https://en.wikipedia.org/wiki/Ludwig_Boltzmann)) 於統計力學領域 (statistical mechanics) 提出的概念,其一廣為人知的結果是:在經過長時間後,時間平均將會趨近空間平均。此事實在動力系統扮演極重要的角色,在隨機分析領域亦然。
因此,若貿然將 traverse 翻譯為「遍歷」,一來語意不清,二來增加和理工多項領域專業人員的溝通成本,實在得不償失。
$\to$ [資訊科技詞彙翻譯](https://hackmd.io/@sysprog/it-vocabulary)
:::
`q_reverse` 比較直觀,就是<s>遍歷</s> 整個佇列,一直把元素往 `head` 搬,也就是去呼叫 `list_move(node, head)`
而 `reverseK` 我是參考 [`var-x`](https://hackmd.io/@vax-r/linux2024-homework1#q_reverseK)
```c
void q_reverseK(struct list_head *head, int k)
{
if (!head)
return;
struct list_head *last = head->next;
int times = q_size(head) / k;
LIST_HEAD(tmp);
LIST_HEAD(result);
for (int i = 0; i < times; i++) {
for (int j = 0; j < k; j++) {
struct list_head *node = last->next;
list_del(last);
list_add(last, &tmp);
last = node;
}
list_splice_tail_init(&tmp, &result);
}
list_splice_init(&result, head);
}
```
主要精神就是我事先得知我有幾組需要被 `reverse` 的元素,再分組把 `reverse` 的結果存到暫存佇列 ,最後再把這個暫存佇列加回原本的佇列
流程如下:
1. 我有7個元素且 $K=3$ ,則組數則為 $7/3 = 2$
```c
int times = q_size(head) / k;
```
2. 接著在我的每個組內,依次把元素加到暫存佇列 `tmp` 的 `head` ,其實就是變相的reverse
```c
for (int i = 0; i < times; i++) {
for (int j = 0; j < k; j++) {
struct list_head *node = last->next;
list_del(last);
list_add(last, &tmp);
last = node;
}
list_splice_tail_init(&tmp, &result);
```
3. 把 `tmp` 加到 `result` 的尾巴, `result就是最終結果的暫存佇列`
```c
list_splice_tail_init(&tmp, &result);
```
4. 把 `result` 加回原始佇列的頭部,原因是我們在 `reverse` 時會有一些數量不夠的情況 ($7%3 = 1$) ,這些多的元素就會被留在原始佇列內,因此我們最終要把 `result` 加回原始佇列的頭部
```c
list_splice_init(&result, head);
```
### `q_ascend` and `q_descend`
```c
element_t *first = list_entry(head->prev, element_t, list);
element_t *second = list_entry(head->prev->prev, element_t, list);
while (&second->list != head) {
if (strcmp(first->value, second->value) > 0) {
first = list_entry(first->list.prev, element_t, list);
second = list_entry(second->list.prev, element_t, list);
} else {
list_del(&second->list);
q_release_element(second);
second = list_entry(first->list.prev, element_t, list);
}
}
```
主要想法就是從佇列尾部開始去做比對,把不符的點刪除,若符合則一起往前到下一組
### `q_sort`
一開始想的是 quick sort,但發現現在這個情況可能會用 `merge sort` 會比較合適,因此在參考 [`willwillhi`](https://hackmd.io/@willwillhi/lab0-2023)的做法,實作了 `merge sort`
:::danger
列出明確的考量因素,不要人云亦云。
:::
```c
struct list_head *_merge_sorted_single_linked_queues(struct list_head *l1,
struct list_head *l2)
{
struct list_head *head = NULL, **ptr = &head;
for (struct list_head **cur = NULL; l1 && l2; *cur = (*cur)->next) {
if (strcmp(container_of(l1, element_t, list)->value,
container_of(l2, element_t, list)->value) >= 0)
cur = &l2;
else
cur = &l1;
*ptr = *cur;
ptr = &(*ptr)->next;
}
*ptr = (struct list_head *) (void *) ((uintptr_t) (void *) l1 |
(uintptr_t) (void *) l2);
return head;
}
struct list_head *_merge_sort(struct list_head *l)
{
if (l == NULL || l->next == NULL)
return l;
struct list_head *fast = l;
struct list_head *slow = l;
while (fast->next != NULL && fast->next->next != NULL) {
fast = fast->next->next;
slow = slow->next;
}
struct list_head *l1 = l;
struct list_head *l2 = slow->next;
slow->next = NULL;
return _merge_sorted_single_linked_queues(_merge_sort(l1), _merge_sort(l2));
}
```
:::danger
工程人員說話要精準,避免說「覺得」和「好像」等詞彙。
:::
但在了解他的程式碼之後,<s>我覺得好像</s> 有改進的空間,原因是在他是將雙向鏈結佇列先拆成單向,再去做 `merge` ,但是這樣就不能套用我們事先定義好的一些函式,於是我嘗試將他保持著雙向鏈結的形式去做 `merge sort` ,但很可惜嘗試了很多一直 `segmentation fault` ,於是目前還是先維持單鏈結做法
### `q_merge`
一開始花了一點時間搞懂 `queue_context_t` 的結構,了解之後發現他也是用 `merge sort` 的精神來實作,只是這次的最小單位就是一個 `queue` ,在參考 [var-x](https://hackmd.io/@vax-r/linux2024-homework1#q_merge)的做法後,實作了如下,
```c
int q_merge(struct list_head *head, bool descend)
{
if (!head || list_empty(head))
return 0;
if (list_is_singular(head))
return q_size(list_entry(head, queue_contex_t, chain)->q);
int size = q_size(head);
int count = (size % 2) ? size / 2 + 1 : size / 2;
int queue_size = 0;
for (int i = 0; i < count; ++i) {
queue_contex_t *first = list_first_entry(head, queue_contex_t, chain);
queue_contex_t *second =
list_entry(first->chain.next, queue_contex_t, chain);
while (!list_empty(first->q) && !list_empty(second->q)) {
queue_size =
_merge_sorted_double_linked_queues(first->q, second->q);
list_move_tail(&second->chain, head);
first = list_entry(first->chain.next, queue_contex_t, chain);
second = list_entry(first->chain.next, queue_contex_t, chain);
}
}
if (descend) {
q_reverse(head);
}
return queue_size;
}
```
:::danger
「主要精神」是什麼?查閱辭典,再來看是否合適。
:::
<s>主要精神</s> 是先得知他會需要做幾輪合併,在每一輪內一對一對去處理,例如一開始有4對,接下來合併到2對,再來一對,最後就剩一個。
這樣的好處是能避免單一佇列在合併的過程會太長,每一輪的每一次都各自去合併兩個佇列,維持每個合併佇列都是合併相同數量的佇列
`_merge_sorted_double_linked_queues` 的作用是把兩個佇列依照 `merge sort ` 的方法,合併到傳入的第一個佇列
:::danger
改進你的漢語表達。
:::