# 2025q1 Homework1 (lab0) ### 實驗環境 ```shell Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 12 On-line CPU(s) list: 0-11 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz CPU family: 6 Model: 165 Thread(s) per core: 2 Core(s) per socket: 6 Socket(s): 1 Stepping: 2 BogoMIPS: 5184.01 ``` :::spoiler Architecture * Architecture: x86_64 * 表示處理器採用 x86_64 架構,也就是 64 位元架構。 * CPU op-mode(s): 32-bit, 64-bit * 指 CPU 同時支持 32 位元與 64 位元的運作模式。 * Address sizes: 39 bits physical, 48 bits virtual * 表示物理地址使用 39 位元,而虛擬地址則使用 48 位元。這關係到系統能尋址的內存範圍。 * Byte Order: Little Endian * 表示系統採用小端序(Little Endian)的位元組順序,即低位元組存放在低位地址。 ::: :::spoiler CPU 核心與線程資訊 CPU(s): 12 表示系統中總共有 12 個邏輯 CPU(或線程)。 On-line CPU(s) list: 0-11 表示編號從 0 到 11 的所有邏輯 CPU 均已啟動且在線。 Vendor ID: GenuineIntel 表示 CPU 的供應商是 Intel。 Model name: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 顯示具體的 CPU 型號和頻率。 CPU family: 6, Model: 165, Stepping: 2 這些是 Intel 內部用來標識 CPU 系列、型號與製程版本的資訊。 Thread(s) per core: 2 表示每個物理核心支援 2 個線程(超執行緒技術,Hyper-Threading)。 Core(s) per socket: 6 表示每個 CPU 插槽中有 6 個物理核心。 Socket(s): 1 表示系統中只有一個 CPU 插槽,也就是只有一顆物理 CPU。 BogoMIPS: 5184.01 這是一個用來粗略衡量 CPU 速度的指標,不過主要用於內部參考,並非精確的性能測量。 ::: contributed by < `BennyWang1007` > ### Reviewed by `yy214123` 在 `queue.h` 中已有: ```c static inline void q_release_element(element_t *e) { test_free(e->value); test_free(e); } ``` 所以不用另外定義: ```c #define free_element(e) \ do { \ free(e->value); \ free(e); \ } while (0) ``` #### `q_new` 可以如下改善: ```diff struct list_head *q_new() { struct list_head *new_head = malloc(sizeof(struct list_head)); if (!new_head) return NULL; - new_head->next = new_head; - new_head->prev = new_head; + INIT_LIST_HEAD(new_head); return new_head; } ``` 如此更貼近 linux 核心風格。 #### `q_delete_mid` 除了使用快慢指標的策略,也能使用另一個策略來實作: ```graphviz digraph G { graph [ nodesep=0.8, ranksep=0.8 ]; { rank=same; head; 1; 2; 3; 4; 5; } { rank=same; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; } head -> 1 -> 2 -> 3 -> 4 -> 5 -> head [dir=both]; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; p1 -> 1 [color=red]; p2 -> 5 [color=red]; } ``` ```graphviz digraph G { graph [ nodesep=0.8, ranksep=0.8 ]; { rank=same; head; 1; 2; 3; 4; 5; } { rank=same; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; } head -> 1 -> 2 -> 3 -> 4 -> 5 -> head [dir=both]; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; p1 -> 2 [color=red]; p2 -> 4 [color=red]; } ``` ```graphviz digraph G { graph [ nodesep=0.8, ranksep=0.8 ]; { rank=same; head; 1; 2; 3; 4; 5; } { rank=same; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; } head -> 1 -> 2 -> 3 -> 4 -> 5 -> head [dir=both]; p1 [label="*p1", shape=box]; p2 [label="*p2", shape=box]; p1 -> 3 [color=red]; p2 -> 3 [color=red]; } ``` 由於使用的資料結構是環狀雙向串列,這個策略可以減少 $\frac {n}{2}$ 次走訪次數。 #### `q_swap` 是 `q_reverseK`(當 k 為 2)的一個例子,可以直接呼叫 `q_reverseK(head, 2)` 來重用程式碼。 ___ {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 作業要求 Request followed [2025 年 Linux 核心設計/實作課程作業 —— lab0](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-a) - [ ] 修改 `queue.[ch]` 以完成 `make test` 自動評分系統的所有項目 - [ ] 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤,搭配 Massif 視覺化 - [ ] 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 lab0-c 專案,比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差 - [ ] 新增命令 `shuffle`, 參考 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) - [ ] 在 `qtest` 中執行 `option entropy 1` 並搭配 `ih RAND 10`,觀察亂數字串分佈 - [ ] 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉,解釋本程式的 “simulation” 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度,需要解釋 [Student’s t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。 :::success ## 貢獻lab0c之commit規範熟悉 由於先前 commit 未符合[CONTRIBUTING.md](https://github.com/sysprog21/lab0-c/blob/master/CONTRIBUTING.md) 的規範與七條原則,我針對相對應 commit 進行 `git rabase -i` 加以修正。 首先,針對原先第一條 commit,我應當將 `q_new`, `q_free` 與 `q_insert_head`, `q_insert_tail` 分別置於兩則不同 commit,以達到[CONTRIBUTING.md](https://github.com/sysprog21/lab0-c/blob/master/CONTRIBUTING.md) 中 Git commit style 所要求之 > Each commit should encapsulate a single, coherent change. 以下是更正過後的 commit: > commit [8e86f02](https://github.com/sysprog21/lab0-c/commit/8e86f021a23810731da423cfa2be18a9e7893c18) > commit [101ef36](https://github.com/sysprog21/lab0-c/commit/101ef36c3fc5d4de40a2034604526212db79288a) 第二則 commit 應當說明實作手法,例如讓 `q_insert_head` 和 `q_insert_tail` 共用程式碼片段。注意 [CONTRIBUTING.md](https://github.com/sysprog21/lab0-c/blob/master/CONTRIBUTING.md) 的規範: > Use the body to explain what and why, not how. 再者,對於本來的第二條 commit,我應避免濫用 finish 一詞: > to complete something or come to the end of an activity: `q_remove_head` 和 `q_remove_tail` 仍有改進空間。在工程領域,不要輕易說 "finish"。 以下是更正後的 commit : > commit [91aca2d](https://github.com/sysprog21/lab0-c/commit/91aca2d8c0fac900b52fcc311d58e80a109250ff) 最後,對於原來的第三則 commit,我改進英語書寫。並遵循 [CONTRIBUTING.md](https://github.com/sysprog21/lab0-c/blob/master/CONTRIBUTING.md) 的規範,針對 "what" 與 "how" 進行著重探討。 以下是更正後的 commit : > commit [3769f55](https://github.com/bevmmf/lab0-c/commit/8e86f02) ::: # queue implementation - 參考資料: 1. [The Linux 核心API - List Management Functions](https://www.kernel.org/doc/html/latest/core-api/kernel-api.html) 2. [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list#Linked-list-%E5%9C%A8-Linux-%E6%A0%B8%E5%BF%83%E5%8E%9F%E5%A7%8B%E7%A8%8B%E5%BC%8F%E7%A2%BC) - 導入作業規範的程式碼縮排風格 clang-format的使用範例 ``` $ clang-format -i queue.c ``` ## q_new Function objective : make a empty queue 作法: 1. `LIST_HEAD`做出一頭節點 2. 使用 `INIT_LIST_HEAD` 宏初始化這個頭節點 3. 返回這個頭節點。 - verification ![image](https://hackmd.io/_uploads/rJEmSG8Fye.png) > [Commit 690d142](https://github.com/sysprog21/lab0-c/commit/690d1426d65bc47b0a8d9215102140cbeffab915) --- ## q_free Function objective : free whole queue 作法 1. guard clause : 如果 `head` 是 NULL,直接返回。 2. 使用 `list_for_each_entry_safe` 遍歷佇列中的每個 element_t 對每個節點: 用 `list_del` 將節點從鏈表中移除。 用 `q_release_element` 釋放節點的記憶體 3. 最後用 free 釋放頭節點 head。 - verification ![image](https://hackmd.io/_uploads/Syg-uE_YJe.png) > [Commit 690d142](https://github.com/sysprog21/lab0-c/commit/690d1426d65bc47b0a8d9215102140cbeffab915) --- ## q_insert_head 作法 : 1. guard clause 2. 動態分配新的 element_t 結構 `new`。 如果分配失敗,返回 false。 3. 初始化 `new` 4. 複製輸入字串 `s` 到 `new`。 如果 `strncpy` 失敗,釋放 `new` 並返回 false。 5. 將 `new` 插入到 `head` 的後面。 返回 true 表示成功。 > [Commit c4c114d](https://github.com/sysprog21/lab0-c/commit/c4c114db2beb5e6280afe4df610a49697d86bf3c) > 增加動態分配記憶體與 `strncpy` 後的memory allocation validation > [Commit faad007](https://github.com/sysprog21/lab0-c/commit/faad0076345c42dd431cbc3e04e7db16ba44f780) --- ## q_insert_tail 作法 : 與 `q_insert_head` 邏輯相同 > [Commit c4c114d](https://github.com/sysprog21/lab0-c/commit/c4c114db2beb5e6280afe4df610a49697d86bf3c) 增加動態分配記憶體與 `strncpy` 後的memory allocation validation > [Commit faad007](https://github.com/sysprog21/lab0-c/commit/faad0076345c42dd431cbc3e04e7db16ba44f780) --- ## q_remove_head 作法: 1. guard clause 2. 找到頭部第一個元素,將其轉換為 element_t 結構。 3. 將該元素從list中移除。 4. 如果 `sp` 不為 NULL,用 `strncpy` 將移除元素的 value 複製到 `sp` 中,並末尾保留位置設置字串終止符 `\0`。 5. 返回被移除的 element_t 指標。 > [Commit a4389d3](https://github.com/sysprog21/lab0-c/commit/a4389d3322e6e65e01f92474a2b4c74c74ab48fc) 字串增加設置以 `\0` 結尾,確保了安全性與一致性,解決會導致未定義行為的可能性 > [Commit faad007](https://github.com/sysprog21/lab0-c/commit/faad0076345c42dd431cbc3e04e7db16ba44f780) --- ## q_remove_tail 作法 : 與 `q_remove_tail` 同 > [Commit a4389d3](https://github.com/sysprog21/lab0-c/commit/a4389d3322e6e65e01f92474a2b4c74c74ab48fc) 字串增加設置以 `\0` 結尾,確保了安全性與一致性,解決會導致未定義行為的可能性 > [Commit faad007](https://github.com/sysprog21/lab0-c/commit/faad0076345c42dd431cbc3e04e7db16ba44f780) --- ## q_size 作法 1. guard clause 2. 定義一個計數器 count,初始值為 0。 3. 遍歷 `head`的每個節點 每次遍歷,count 增加 1。 4. 返回 count。 > [Commit c4c114d](https://github.com/sysprog21/lab0-c/commit/c4c114db2beb5e6280afe4df610a49697d86bf3c) --- ## q_delete_mid 作法 : 1. guard clause 2. 定義 `slow` 和 `fast` 指針,初始都指向第一個元素 3. 進入迴圈,檢查 fast 的位置: `fast` 以 `slow` 兩倍stride前進如果 `fast`快到末尾,則中間節點是 slow->next,移除它。 4. 移除目標節點,並釋放該節點。 5. 返回 true 表示成功。 >[commit f2a7c4f](https://github.com/sysprog21/lab0-c/commit/f2a7c4f5473d29fb860ed4ebdbcbefb66e6b1604) 修改未適當釋放element記憶體的部分 > [Commit b6ee485](https://github.com/sysprog21/lab0-c/commit/b6ee4856894c3ad8a6d0c147239d2ccba7bf8954) --- ## q_delete_dup 根據 [leetcode 82. Remove Duplicates from Sorted List II](https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/) 要求,須將重複的val之node全刪除。若是只刪除重複的node,較為容易,因此轉而思考如何將多的那個node刪除。最後參考 [王漢棋](https://hackmd.io/tqrBvKRgTTm-LZzlde0V5A?both) 同學實作,使用 `flag` 觀念。 - 作法流程 1. edge condition check 2. 用 `list_for_each_safe` 遍歷 3. 若目前node之val與下一個node之val相同 - 刪除目前node - flag設置為true 6. 其他情況若flag為1也刪除當前node,並重設flag為false :::info 若用list_for_each_entry_safe遍歷就不需要釋放記憶體時都要再 `list_entry()` 使用此函式做"向上轉型" ::: > [Commit 585701f](https://github.com/sysprog21/lab0-c/commit/585701f6e14e71102520d2bf959803f5a9e2a47f) --- ## q_swap 作法: 1. guard clause 2. 定義一個輔助函數 `swap2node` ,用來交換兩個相鄰節點: 3. 在主函數中,遍歷鏈表,計算節點數量 `count` 當 `count` == 2 時,調用 `swap2node` 交換當前節點和前一個節點,然後重置 `count` 為 0。 > [commit 889afe7](https://github.com/sysprog21/lab0-c/commit/889afe7eecb98a42d70ee23af656a884f77641e0) 將 `swap2node` 函式移到外面消除nested function warning > [Commit b6ee485](https://github.com/sysprog21/lab0-c/commit/b6ee4856894c3ad8a6d0c147239d2ccba7bf8954) --- ## q_reverse 作法 : 1. guard clause 2. 遍歷list,並交換當前節點 `next` 、 `prev`方向 3. 最後在交換 `head` 的 `next` 、 `prev` 完成reverse > [commit 889afe7](https://github.com/sysprog21/lab0-c/commit/889afe7eecb98a42d70ee23af656a884f77641e0) --- ## q_reverseK 用到之list.h函式講解: 1. **list_cut_position** * 目的 : 把一條list一部分剪取下來 - 傳入的引數 - head_to : 被切下的 `list` 所要接入的點(必須為empty) - head_from : 被切下的 `list` 的開始node - node : 被切下的 `list` 結尾node ```c static inline void list_cut_position(struct list_head *head_to, struct list_head *head_from, struct list_head *node) { struct list_head *head_from_first = head_from->next; if (list_empty(head_from)) return; if (head_from == node) { INIT_LIST_HEAD(head_to); return; } head_from->next = node->next; head_from->next->prev = head_from; head_to->prev = node; node->next = head_to; head_to->next = head_from_first; head_to->next->prev = head_to; } ``` 2. **list_splice** * 目的 : 把 `list` 插入到另外一條中 ```c static inline void list_splice(struct list_head *list, struct list_head *head) { struct list_head *head_first = head->next; struct list_head *list_first = list->next; struct list_head *list_last = list->prev; if (list_empty(list)) return; head->next = list_first; list_first->prev = head; list_last->next = head_first; head_first->prev = list_last; } ``` 3. **list_splice_init** ```c static inline void list_splice_init(struct list_head *list, struct list_head *head) { list_splice(list, head); INIT_LIST_HEAD(list); } ``` * 多加了一 `LIST_HEAD()` 函式對 `list` 做初始化,把next、prev洗掉 :::spoiler 程式碼 ```c /* Reverse the nodes of the list k at a time */ void q_reverseK(struct list_head *head, int k) { //Guard Clause if(!head || list_empty(head) || list_is_singular(head) || k<2) return; // LIST_HEAD(reversek); LIST_HEAD(buffer); int count = 0; // struct list_head *trav, *safe; list_for_each_safe(trav, safe, head) { count++; if (count == k) { list_cut_position(&reversek,head,trav); count = 0; q_reverse(&reversek); list_splice_init(&reversek,(&buffer)->prev); } } list_splice_init(&buffer,head); } ``` ::: --- ## q_sort 我以merge_sort實作`q_sort` Merge Sort 核心原理 : - 圖解說明 <contributed by [wanhanchi](https://hackmd.io/tqrBvKRgTTm-LZzlde0V5A?both)> ```graphviz digraph G { compound=true node[shape=box, height=.1];sorted_1 sorted_2 sorted_3 sorted_4 sorted_5 sorted_6 sorted_7 sorted_8; node[shape=record, height=.1];input result divide_41 divide_42 divide_21 divide_22 divide_23 divide_24 merge_21 merge_22 merge_23 merge_24 merge_41 merge_42; input[label="<f0>2|<f1>5|<f2>4|<f3>6|<f4>8|<f5>1|<f6>7|<f7>3"] result[label="<f0>1|<f1>2|<f2>3|<f3>4|<f4>5|<f5>6|<f6>7|<f7>8"] divide_41[label="<f1>2|<f2>5|<f3>4|<f4>6"] divide_42[label="<f1>8|<f2>1|<f3>7|<f4>3"] divide_21[label="<f0>2|<f1>5"] divide_22[label="<f0>4|<f1>6"] divide_23[label="<f0>8|<f1>1"] divide_24[label="<f0>7|<f1>3"] sorted_1[label="1"] sorted_2[label="2"] sorted_3[label="3"] sorted_4[label="4"] sorted_5[label="5"] sorted_6[label="6"] sorted_7[label="7"] sorted_8[label="8"] merge_21[label="<f0>2|<f1>5"] merge_22[label="<f0>4|<f1>6"] merge_23[label="<f0>1|<f1>8"] merge_24[label="<f0>3|<f1>7"] merge_41[label="<f1>2|<f2>4|<f3>5|<f4>6"] merge_42[label="<f1>1|<f2>3|<f3>7|<f4>8"] subgraph cluster_0 { style=filled; color="#ef6548"; label="divide"; divide_pad[label="----------------------", style=invis] divide_pad -> divide_23[style=invis] input -> divide_41 input -> divide_42 divide_41 -> divide_21 divide_41 -> divide_22 divide_42 -> divide_23 divide_42 -> divide_24 } divide_21:f0 -> sorted_2 divide_21:f1 -> sorted_5 divide_22 -> sorted_4 divide_22 -> sorted_6 divide_23:f0 -> sorted_8 divide_23:f1 -> sorted_1 divide_24:f0 -> sorted_7 divide_24:f1 -> sorted_3 subgraph cluster_1 { style=filled; color="#a6cee3"; label="sorted lists"; sorted_1; sorted_2; sorted_3; sorted_4; sorted_5; sorted_6; sorted_7; sorted_8; } sorted_2 -> merge_21:f0 sorted_5 -> merge_21:f1 sorted_4 -> merge_22:f0 sorted_6 -> merge_22:f1 sorted_8 -> merge_23:f1[constraint=false] sorted_1 -> merge_23:f0 sorted_7 -> merge_24:f1 sorted_3 -> merge_24:f0 subgraph cluster_2 { style=filled; color="#b2df8a"; label="merge"; merge_pad[label="--------------------", style=invis] rank=same; merge_41; merge_pad; merge_42 rank=same; merge_41; merge_42; merge_pad; merge_22 -> merge_pad[style=invis] merge_23 -> merge_pad[style=invis] merge_pad -> result[style=invis] merge_21 -> merge_41 merge_22 -> merge_41 merge_23 -> merge_42 merge_24 -> merge_42 merge_41 -> result merge_42 -> result } } ``` 1. 分割 (Divide): - 將鏈結串列不斷地對半分,直到每個子串列只剩下一個節點(或空串列)。一個節點的串列自然就是已排序的。 - 快慢指標 (Fast and Slow Pointers): 這是找到鏈結串列中點的常用技巧。快指標每次移動兩個節點,慢指標每次移動一個節點。當快指標到達串列尾端時,慢指標正好在串列中間。 2. 解決 (Conquer): - 當子串列只剩下一個節點時,它們已經是排序好的。 3. 合併 (Combine): - 將兩個已排序的子串列合併成一個新的已排序串列。這是 Merge Sort 的關鍵步驟。 合併兩個已排序串列 (Merge Two Sorted Lists): - 使用兩個指標(通常稱為 p1 和 p2),分別指向兩個子串列的開頭。 - 比較 p1 和 p2 指向的節點的值。 - 將較小(或較大,取決於排序方向)的節點添加到結果串列的尾部,並將對應的指標向前移動。 - 重複這個過程,直到其中一個子串列為空。 - 將另一個子串列的剩餘部分直接添加到結果串列的尾部。 - 可以建立一個 dummy head 簡化操作. - 可以利用 Indirect Pointer. 在過程中我遇到兩個問題 1. sort not stable(重複值sort後的相對位置改變) ![image](https://hackmd.io/_uploads/HyX5o-QnJg.png) ```diff struct list_head *merge2sortedlist(struct list_head *left, struct list_head *right, bool descend) { struct list_head *new_head = NULL, **indirect = &new_head, **chosen_list_ptr = NULL; for (; left && right; *chosen_list_ptr = (*chosen_list_ptr)->next) { if (descend) { chosen_list_ptr = strcmp(list_entry(left, element_t, list)->value, list_entry(right, element_t, list)->value) >= 0 ? &left : &right; } else { chosen_list_ptr = strcmp(list_entry(left, element_t, list)->value, - list_entry(right, element_t, list)->value) >= 0 - ? &rigjt - : &left; + list_entry(right, element_t, list)->value) <= 0 + ? &left + : &right; } *indirect = *chosen_list_ptr; indirect = &(*indirect)->next; } *indirect = (struct list_head *) ((__int64_t) left | (__int64_t) right); return new_head; } ``` 修改成set = condition時chosen_list_ptr 先拿left的,即可不更動重複case的相對位置 2. segmentation fault (core dumped) 執行`trace-03-ops`時 sort 總是無法順利進行,輸出如下: ![image](https://hackmd.io/_uploads/SkhPiZQ2Jg.png) 後來發現原因為 : 我 `merge_sort` 時其實僅僅只對 `next` 做正確設置,並沒有對每個 element 的 `prev` 進行設置。我沒注意到此事,因此在 `q_sort` 完我僅僅將頭尾重接,卻並 沒管理到每個element_t的 `prev` 部分。導致若之後對此q_sort完的queue進行處理,會有極大機率會產生segmentation fault ```diff - struct list_head *trav = head; + struct list_head *trav = head, *safe = head->next; - for (; trav->next != NULL; trav = trav->next) - ; + for (; safe != NULL; trav = trav->next, safe = safe->next){ + safe->prev = trav; + } trav->next = head; head->prev = trav; return; ``` 在更改完,在遍歷時利用正確的`next` ,將每個`element` 的`prev`管理設置正確,最後頭尾接回後即測試成功 > [Commit e14947a](https://github.com/sysprog21/lab0-c/commit/e14947a8221b2ac6774edb2a6d8f423ff693ad56) --- ## q_descend 首先根據[leetcode 2487. Remove Nodes From Linked List](https://leetcode.com/problems/remove-nodes-from-linked-list/description/)我參考了這部[影片](https://www.youtube.com/watch?v=IGqowY9tjpw) ![image](https://hackmd.io/_uploads/H1COi0io1l.png) 講者提出總共四種策略 第一種 最直觀方法但效率低下 ( O(n^2) ): 作法: - 對於每個節點 (P): 遍歷 P 之後的所有節點(即 P 的右側)。 - 比較: 如果在 P 的右側找到任何一個值 大於 P 的節點,就移除 P。 第二種 "壞節點" 陣列 ( O(n) 時間、O(n) 空間): 作法 : - 從 右向左 遍歷陣列。 - 維護一個 `suffix_max` 變數,記錄目前為止(從右向左看)的最大值。 - 如果當前元素 小於 `suffix_max` ,就將其標記為「壞的」(需要移除)。 - 最後,再次遍歷原始陣列,移除所有「壞的」節點。 這種方法的時間複雜度是 O(n)(兩次遍歷),空間複雜度是 O(n)(用於儲存 `suffix_max` 或「壞節點」陣列)。 第三種 堆疊Stack ( O(n) 時間、O(n) 空間) 作法 : - 從 左向右 遍歷陣列。 - 將元素推入堆疊。 - 在推入一個元素之前,先從堆疊中彈出所有 小於 當前元素的元素。 - 遍歷結束後,堆疊中剩下的就是應該 保留 的元素(順序是反的)。 第四種 linked list最佳方法 ( O(n) 時間, O(1) 空間 ) : 講者最終提出了適用於 linked-list 的高效方法: 作法 : - 反轉鏈結串列: 反轉鏈結串列。這一步很關鍵,因為它讓我們可以從原來的尾端(現在的頭端)開始遍歷,並且只需一次遍歷就能追蹤「目前最大值」。 - 遍歷並比較: 遍歷 反轉後 的鏈結串列。 - 維護一個 `max_so_far` 變數,初始值設為第一個節點(原來的尾端節點)的值。 - 對於每個節點: - 如果當前節點的值 小於 `max_so_far`,則移除它(使用標準的鏈結串列節點移除方法:調整前後節點的指標)。 - 否則(如果當前節點的值大於或等於 `max_so_far`),則將 `max_so_far` 更新為當前節點的值。 - 再次反轉: 再次反轉鏈結串列,恢復原始順序。 根據第四種方法,就是從尾部逆向遍歷,且維護一個全局最大值 `max_so_far` 小於他則delete,大於他則更新。但linux linked-list結構是doubly,因此不需要像leetcode作法一樣做兩次reverse,即可直接逆向遍歷。 `q_descend` 作法流程 : - 檢查邊界條件是否NULL、是否空、是否單節點 - 逆遍歷,且維護一個全局最大值 `max_so_far` 小於他則delete,大於他則更新 - `max_so_far` 為指向char之指標 >[commit 2105efd](https://github.com/sysprog21/lab0-c/commit/2105efd30ded356fdcfecac0b9ce92c3541f5dee) :::spoiler 程式碼 ```c int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head) || list_is_singular(head)) return 0; struct list_head *trav = head->prev; struct list_head *safe; const char *max_so_far = list_entry(trav, element_t, list)->value; while (trav != head) { safe = trav->prev; if (strcmp(max_so_far, list_entry(trav, element_t, list)->value) > 0) { list_del(trav); q_release_element(list_entry(trav, element_t, list)); } else { max_so_far = list_entry(trav, element_t, list)->value; } trav = safe; } return q_size(head); } ``` ::: :::info 若是將相等與小於分開判斷,能夠避免相等後又跑一遍 `max_so_far = list_entry(trav, element_t, list)->value;` 值帶入,或許效能較好。 ::: --- ## q_ascend 目的是與 `q_descend` 相反,刪除右邊所有比本身node小的所有node。因此邏輯與 `q_descend` 完全相同,將判斷式改變即可。 >[commit 2105efd](https://github.com/sysprog21/lab0-c/commit/2105efd30ded356fdcfecac0b9ce92c3541f5dee) --- ## q_merge 參考[蔡忠翰](https://hackmd.io/@jeremytsai/linux2024-homework1#%E7%A0%94%E8%AE%80%E8%AB%96%E6%96%87%E4%B8%A6%E6%94%B9%E9%80%B2-dudect)的實作 作法 : 1. guard clause : 如果 head 為空指標或串列為空,直接返回 0,表示無需處理 2. 初始化臨時佇列 使用 `LIST_HEAD` 定義一個臨時雙向鏈結串列 temp,並透過 `INIT_LIST_HEAD` 初始化其結構。 3. 合併所有佇列 使用 `list_for_each_entry_safe` 遍歷 `head` 中的每個 queue_contex_t 節點(透過 `chain` 成員連結),並將每個節點的佇列(`trav`->`q`)拼接至 `temp`。 4. 排序臨時佇列 調用 `q_sort`,對 `temp` 中的所有元素進行排序,根據 `descend` 決定升序或降序。 5. 還原至原始結構 將排序後的 `temp` 佇列拼接回 `head` 中第一個 queue_contex_t 節點的佇列並清空 `temp`。 6. 返回大小 調用 `q_size` 返回合併後佇列的總元素數量。 > [Commit 3651919](https://github.com/sysprog21/lab0-c/commit/3651919171b9836d719aacce5778701573dbec5d) ## 使用 Valgrind 排除 qtest 實作的記憶體問題 根據 [N01: lab0 以 Valgrind 分析記憶體問題](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-b) ,memory leak 分為四種 : 1. `definitely lost` : 真的 memory leak 2. `indirectly lost`: 間接的 memory leak,structure 本身發生 memory leak,而內部的 member 如果是 allocate 的出來的,一樣會 memory leak,但是只要修好前面的問題,後面的問題也會跟著修復。 3. `possibly lost: allocate` 一塊記憶體,並且放到指標 ptr,但事後又改變 ptr 指到這塊記憶體的中間 4. `still reachable`: 程式結束時有未釋放的記憶體,不過卻還有指標指著,通常會發生在 global 變數,即便是在所用的函式庫裡頭配置的記憶體,也可偵測到,這也是動態分析手法的優勢。 :::spoiler 當時跑分 ``` bev_m@MSI:~/course/LK/lab0-c$ make test Progress: [########################################] 100% Fingerprint: gywtakew789c051686c0ghmz scripts/driver.py -c --- Trace Points +++ TESTING trace trace-01-ops: # Test of 'q_new', 'q_insert_head', and 'q_remove_head' --- trace-01-ops 5/5 +++ TESTING trace trace-02-ops: # Test of 'q_new', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_remove_tail', and 'q_delete_mid' --- trace-02-ops 6/6 +++ TESTING trace trace-03-ops: # Test of 'q_new', 'q_insert_head', 'q_remove_head', 'q_reverse', 'q_sort', and 'q_merge' --- trace-03-ops 0/6 +++ TESTING trace trace-04-ops: # Test of 'q_new', 'q_insert_tail', 'q_insert_head', 'q_remove_head', 'q_size', 'q_swap', and 'q_sort' --- trace-04-ops 6/6 +++ TESTING trace trace-05-ops: # Test of 'q_new', 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_reverse', 'q_size', 'q_swap', and 'q_sort' ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Removed value bear != expected value meerkat Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-05-ops 0/6 +++ TESTING trace trace-06-ops: # Test of 'q_new', 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_delete_dup', 'q_sort', 'q_descend', and 'q_reverseK' ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Calling delete duplicate on null queue Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-06-ops 0/6 +++ TESTING trace trace-07-string: # Test of truncated strings: 'q_new', 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_reverse', and 'q_sort' --- trace-07-string 6/6 +++ TESTING trace trace-08-robust: # Test operations on empty queue: 'q_new', 'q_remove_head', 'q_reverse', 'q_size', and 'q_sort' --- trace-08-robust 6/6 +++ TESTING trace trace-09-robust: # Test 'q_remove_head' with NULL argument: 'q_new', 'q_insert_head', and 'q_remove_head' --- trace-09-robust 6/6 +++ TESTING trace trace-10-robust: # Test operations on NULL queue: 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_reverse', 'q_size', and 'q_sort' --- trace-10-robust 6/6 +++ TESTING trace trace-11-malloc: # Test of malloc failure on 'q_insert_head': 'q_new', and 'q_insert_head' --- trace-11-malloc 6/6 +++ TESTING trace trace-12-malloc: # Test of malloc failure on 'q_insert_tail': 'q_new', 'q_insert_head', and 'q_insert_tail' --- trace-12-malloc 6/6 +++ TESTING trace trace-13-malloc: # Test of malloc failure on 'q_new': 'q_new' --- trace-13-malloc 6/6 +++ TESTING trace trace-14-perf: # Test performance of 'q_new', 'q_insert_head', 'q_insert_tail', 'q_reverse', and 'q_sort' Warning: Skip checking the stability of the sort because the number of elements 2000000 is too large, exceeds the limit 100000. Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-14-perf 0/6 +++ TESTING trace trace-15-perf: # Test performance of 'q_sort' with random and descending orders: 'q_new', 'q_free', 'q_insert_head', 'q_sort', and 'q_reverse' # 10000: all correct sorting algorithms are expected pass # 50000: sorting algorithms with O(n^2) time complexity are expected failed # 100000: sorting algorithms with O(nlogn) time complexity are expected pass ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Not sorted in ascending order Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-15-perf 0/6 +++ TESTING trace trace-16-perf: # Test performance of 'q_new', 'q_insert_head', 'q_insert_tail', and 'q_reverse' --- trace-16-perf 6/6 +++ TESTING trace trace-17-complexity: # Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant Probably constant time Probably constant time Probably constant time Probably constant time --- trace-17-complexity 5/5 --- TOTAL 70/100 make: *** [Makefile:65: test] Error 1 ``` ::: 用 `valgrind` 跑 trace-03-ops.cmd 測資,此為test中第一個出現fail且非效能測試 ``` valgrind --tool=memcheck ./qtest -f ./traces/trace-03-ops.cmd ``` ::: spoiler valgrind +++ TESTING trace trace-03-ops: Test of 'q_new', 'q_insert_head', 'q_remove_head', 'q_reverse', 'q_sort', and 'q_merge' ==86422== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==86422== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==86422== Can't extend stack to 0x1ffe8010b8 during signal delivery for thread 1: ==86422== no stack segment ==86422== ==86422== Process terminating with default action of signal 11 (SIGSEGV) ==86422== Access not within mapped region at address 0x1FFE8010B8 ==86422== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==86422== at 0x11047E: merge_sort (queue.c:271) ==86422== If you believe this happened as a result of a stack ==86422== overflow in your program's main thread (unlikely but ==86422== possible), you can try to increase the size of the ==86422== main thread stack using the --main-stacksize= flag. ==86422== The main thread stack size used in this run was 8388608. ==86422== 6 bytes in 1 blocks are still reachable in loss record 1 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E3D7: strsave_or_fail (report.c:256) ==86422== by 0x10ECF3: parse_args (console.c:154) ==86422== by 0x10ECF3: interpret_cmd (console.c:233) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 8 bytes in 1 blocks are still reachable in loss record 2 of 45 ==86422== at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E344: calloc_or_fail (report.c:234) ==86422== by 0x10ECC5: parse_args (console.c:151) ==86422== by 0x10ECC5: interpret_cmd (console.c:233) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 32 bytes in 1 blocks are still reachable in loss record 3 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10D126: do_new (qtest.c:151) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 4 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F3C6: init_cmd (console.c:436) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 5 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F3E7: init_cmd (console.c:437) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 6 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F404: init_cmd (console.c:440) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 7 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F428: init_cmd (console.c:441) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 8 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F445: init_cmd (console.c:442) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 9 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F466: init_cmd (console.c:443) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 10 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F487: init_cmd (console.c:444) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 11 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10F4A8: init_cmd (console.c:445) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 12 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10F4C7: init_cmd (console.c:446) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 13 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10F4E6: init_cmd (console.c:447) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 14 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10F505: init_cmd (console.c:448) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 15 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10F524: init_cmd (console.c:449) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 16 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10F543: init_cmd (console.c:450) ==86422== by 0x10D824: main (qtest.c:1420) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 17 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D848: console_init (qtest.c:1061) ==86422== by 0x10D848: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 18 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D865: console_init (qtest.c:1062) ==86422== by 0x10D865: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 19 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D882: console_init (qtest.c:1063) ==86422== by 0x10D882: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 20 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D89F: console_init (qtest.c:1064) ==86422== by 0x10D89F: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 21 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D8C3: console_init (qtest.c:1065) ==86422== by 0x10D8C3: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 22 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D8E0: console_init (qtest.c:1069) ==86422== by 0x10D8E0: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 23 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D904: console_init (qtest.c:1073) ==86422== by 0x10D904: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 24 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D921: console_init (qtest.c:1077) ==86422== by 0x10D921: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 25 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D93E: console_init (qtest.c:1081) ==86422== by 0x10D93E: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 26 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D95B: console_init (qtest.c:1082) ==86422== by 0x10D95B: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 27 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D97C: console_init (qtest.c:1083) ==86422== by 0x10D97C: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 28 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D999: console_init (qtest.c:1084) ==86422== by 0x10D999: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 29 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D9B6: console_init (qtest.c:1085) ==86422== by 0x10D9B6: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 30 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D9D3: console_init (qtest.c:1086) ==86422== by 0x10D9D3: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 31 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10D9F0: console_init (qtest.c:1087) ==86422== by 0x10D9F0: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 32 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10DA0D: console_init (qtest.c:1088) ==86422== by 0x10DA0D: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 33 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10DA2A: console_init (qtest.c:1089) ==86422== by 0x10DA2A: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 34 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10DA4A: console_init (qtest.c:1093) ==86422== by 0x10DA4A: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 35 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F028: add_cmd (console.c:85) ==86422== by 0x10DA6B: console_init (qtest.c:1097) ==86422== by 0x10DA6B: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 36 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10DA8A: console_init (qtest.c:1099) ==86422== by 0x10DA8A: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 37 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10DAA9: console_init (qtest.c:1101) ==86422== by 0x10DAA9: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 38 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10DAC8: console_init (qtest.c:1103) ==86422== by 0x10DAC8: main (qtest.c:1421) ==86422== ==86422== 40 bytes in 1 blocks are still reachable in loss record 39 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10F0AC: add_param (console.c:105) ==86422== by 0x10DAE3: console_init (qtest.c:1105) ==86422== by 0x10DAE3: main (qtest.c:1421) ==86422== ==86422== 64 bytes in 2 blocks are possibly lost in loss record 40 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10D126: do_new (qtest.c:151) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 168 bytes in 3 blocks are still reachable in loss record 41 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10F92D: alloc (harness.c:146) ==86422== by 0x10FA80: test_malloc (harness.c:176) ==86422== by 0x10FDCE: q_new (queue.c:10) ==86422== by 0x10D15F: do_new (qtest.c:155) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 378 bytes in 9 blocks are still reachable in loss record 42 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10F92D: alloc (harness.c:146) ==86422== by 0x10FA80: test_malloc (harness.c:176) ==86422== by 0x10FC41: test_strdup (harness.c:231) ==86422== by 0x10FE8E: q_insert_head (queue.c:40) ==86422== by 0x10CEEB: queue_insert (qtest.c:234) ==86422== by 0x10D0BC: do_ih (qtest.c:282) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 576 bytes in 9 blocks are still reachable in loss record 43 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10F92D: alloc (harness.c:146) ==86422== by 0x10FA80: test_malloc (harness.c:176) ==86422== by 0x10FE6D: q_insert_head (queue.c:36) ==86422== by 0x10CEEB: queue_insert (qtest.c:234) ==86422== by 0x10D0BC: do_ih (qtest.c:282) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== by 0x10F8A1: run_console (console.c:684) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ==86422== 1,024 bytes in 1 blocks are still reachable in loss record 44 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x49D01B4: _IO_file_doallocate (filedoalloc.c:101) ==86422== by 0x49E0523: _IO_doallocbuf (genops.c:347) ==86422== by 0x49DDF8F: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:745) ==86422== by 0x49DEAAE: _IO_new_file_xsputn (fileops.c:1244) ==86422== by 0x49DEAAE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1197) ==86422== by 0x49ABCC8: __printf_buffer_flush_to_file (printf_buffer_to_file.c:59) ==86422== by 0x49ABCC8: __printf_buffer_to_file_done (printf_buffer_to_file.c:120) ==86422== by 0x49B6742: __vfprintf_internal (vfprintf-internal.c:1545) ==86422== by 0x10E1F0: vfprintf (stdio2.h:109) ==86422== by 0x10E1F0: report_noreturn (report.c:142) ==86422== by 0x10E62F: do_comment_cmd (console.c:289) ==86422== by 0x10E97E: interpret_cmda (console.c:214) ==86422== by 0x10ED28: interpret_cmd (console.c:234) ==86422== by 0x10EFC0: cmd_select (console.c:604) ==86422== ==86422== 8,216 bytes in 1 blocks are still reachable in loss record 45 of 45 ==86422== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==86422== by 0x10E2C9: malloc_or_fail (report.c:215) ==86422== by 0x10EB06: push_file (console.c:471) ==86422== by 0x10F77B: run_console (console.c:662) ==86422== by 0x10DB20: main (qtest.c:1441) ==86422== ::: 1. 主要問題:Stack Overflow 報告中提到以下關鍵錯誤: ``` 86422 Stack overflow in thread #1: can't grow stack to 0x1ffe801000 86422 Can't extend stack to 0x1ffe8010b8 during signal delivery for thread 1: 86422 no stack segment 86422 Process terminating with default action of signal 11 (SIGSEGV) 86422 Access not within mapped region at address 0x1FFE8010B8 86422 at 0x11047E: merge_sort (queue.c:271) ``` 具體出現在 merge_sort 函數中(位於 queue.c 文件的第 271 行)。程式試圖使用超出分配stack大小的記憶體空間。在這裡,程式嘗試將stack擴展到地址 `0x1ffe801000`,但失敗了,因為該地址超出了映射的記憶體區域。這導致程序試圖存取無效記憶體地址 `0x1ffe8010b8` ,引發了分段錯誤(SIGSEGV=segmentation fall),最終導致程序終止。 * 可能原因: 從側資輸出發現前面三次sort都正常執行,到最後merge才出現SIGSEGV。是merge_sort 函數可能使用了過深的recursion,每次函數調用都會在stack上分配新的空間,當數據量很大時,stack空間被耗盡,默認的主執行緒stack大小為 8MB(8388608 字節),如報告中提到: ``` The main thread stack size used in this run was 8388608. ``` 如果遞迴深度超過這個限制,就會觸發stack overflow。 2. 有45 個 `still reachable` 的記憶體塊 這些記憶體在程序結束時未被釋放 ``` 86422 6 bytes in 1 blocks are still reachable in loss record 1 of 45 86422 at 0x4846828: malloc (...) 86422 by 0x10E3D7: strsave_or_fail (report.c:256) 86422 8 bytes in 1 blocks are still reachable in loss record 2 of 45 86422 at 0x484D953: calloc (...) 86422 by 0x10E344: calloc_or_fail (report.c:234) 86422 32 bytes in 1 blocks are still reachable in loss record 3 of 45 86422 at 0x4846828: malloc (...) 86422 by 0x10D126: do_new (qtest.c:151) 86422 8,216 bytes in 1 blocks are still reachable in loss record 45 of 45 86422 at 0x4846828: malloc (...) 86422 by 0x10EB06: push_file (console.c:471) ``` 這些記憶體塊是程式分配的,但直到程序終止時仍未釋放。它們被標記為 `still reachable` ,它們仍然有指針指向這些記憶體,不是記憶體洩漏(memory leak)。這些記憶體的大小從 6 字節到 8,216 字節不等,分別在不同函數中分配,例如 `strsave_or_fail`、`calloc_or_fail`、`do_new` 和 `push_file` 等。由於程式因stack overflow異常終止,這些記憶體未被正常釋放是預期之內的。當程序退出時,作業系統會自動回收這些記憶體 首先我懷疑是 `q_merge` 出現問題於是單獨測試功能但都正常。後來才懷疑到是用到的 `q_sort` 有問題,詳情在[q_sort遇到問題二](###q_sort) ## 透過 Massif 視覺化 測資過程中的記憶體使用量 這邊參考 [alanjiang85](https://hackmd.io/@alanjian85/lab0-2023#%E9%80%8F%E9%81%8E-Massif-%E5%88%86%E6%9E%90%E8%A8%98%E6%86%B6%E9%AB%94%E4%BD%BF%E7%94%A8%E9%87%8F) 所表示需挑選適當的測試資料時要避免效能類型的測試,因為以 Valgrind 執行測試程式會比較慢,容易導致出現超過時間限制的狀況。因此最後用 `trace-eg.cmd` 進行分析 ![image](https://hackmd.io/_uploads/SkHk5kE3Jx.png) * 時間單位:`i` Massif 使用 `i` 表示時間單位,通常代表程式執行的指令數量(instructions),記憶體使用情況是根據指令執行次數來記錄的。 * `snapshots`:9 Massif 在程式執行期間拍攝了 9 個記憶體快照,這些快照記錄了不同時間點的記憶體使用情況。 * `peak` : snapshot # 6 after 290678 "i" 程式在執行 290,678 條指令後達到記憶體使用的高峰 * 峰值記憶體使用 在峰值時,程式在heap上分配了 4.5 KiB 的記憶體,額外開銷(heap extra)為 24 字節,而stack記憶體使用量顯示為 0 字節(可能是因為stack使用量極小或未被監控) 分配來源:峰值記憶體主要與文件操作(`_IO_file_doallocate` 和 `fdopen`)相關 出現錯誤訊息像 ``` MESA: error: ZINK: failed to choose pdev glx: failed to create drisw screen) ``` 可能是圖形驅動或環境配置的問題。 因此之後改用 `ms_print` 工具在終端查看 Massif 輸出,生成一個文字版的記憶體使用報告,包含曲線和分配細節 :::spoiler ms_print ``` -------------------------------------------------------------------------------- Command: ./qtest -f traces/trace-17-complexity.cmd Massif arguments: (none) ms_print arguments: ./massif.out.181221 -------------------------------------------------------------------------------- KB 4.484^ ::::::::::# | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | : # | @ # 0 +----------------------------------------------------------------------->ki 0 284.0 Number of snapshots: 9 Detailed snapshots: [2, 6 (peak)] -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 0 0 0 0 0 0 1 248,335 264 256 8 0 2 249,116 264 256 8 0 96.97% (256B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. ->96.97% (256B) 0x4A510F3: __posix_spawn_file_actions_realloc (spawn_faction_init.c:32) ->96.97% (256B) 0x4A50EA7: posix_spawn_file_actions_adddup2 (spawn_faction_adddup2.c:37) ->96.97% (256B) 0x10D3BB: commit_exists (qtest.c:1217) ->96.97% (256B) 0x10D596: sanity_check (qtest.c:1328) ->96.97% (256B) 0x10D596: main (qtest.c:1371) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 3 249,116 0 0 0 0 4 249,254 488 472 16 0 5 249,710 4,592 4,568 24 0 6 290,678 4,592 4,568 24 0 99.48% (4,568B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. ->89.20% (4,096B) 0x49C71B4: _IO_file_doallocate (filedoalloc.c:101) | ->89.20% (4,096B) 0x49D7523: _IO_doallocbuf (genops.c:347) | ->89.20% (4,096B) 0x49D4883: _IO_file_underflow@@GLIBC_2.2.5 (fileops.c:486) | ->89.20% (4,096B) 0x49D75D1: _IO_default_uflow (genops.c:362) | ->89.20% (4,096B) 0x49C8F79: _IO_getline_info (iogetline.c:60) | ->89.20% (4,096B) 0x49C7BD3: fgets (iofgets.c:53) | ->89.20% (4,096B) 0x10D2FF: fgets (stdio2.h:200) | ->89.20% (4,096B) 0x10D2FF: commit_exists (qtest.c:1260) | ->89.20% (4,096B) 0x10D596: sanity_check (qtest.c:1328) | ->89.20% (4,096B) 0x10D596: main (qtest.c:1371) | ->10.28% (472B) 0x49C759D: fdopen@@GLIBC_2.2.5 (iofdopen.c:122) | ->10.28% (472B) 0x10D2DB: commit_exists (qtest.c:1251) | ->10.28% (472B) 0x10D596: sanity_check (qtest.c:1328) | ->10.28% (472B) 0x10D596: main (qtest.c:1371) | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 7 290,678 488 472 16 0 8 290,816 0 0 0 0 ``` ::: ## 研讀linux核心src的 `lib/list_sort.c` 並引入專案 參考 , [chiangkd](https://hackmd.io/@chiangkd/2023spring-lab0-c#%E7%A0%94%E8%AE%80-list_sortc-%E4%B8%A6%E5%BC%95%E5%85%A5%E5%B0%88%E6%A1%88), [Risheng](https://hackmd.io/@Risheng/list_sort), [hanchi](https://hackmd.io/tqrBvKRgTTm-LZzlde0V5A?view#%E4%BD%BF%E7%94%A8-Valgrind-%E6%8E%92%E9%99%A4-qtest-%E5%AF%A6%E4%BD%9C%E7%9A%84%E8%A8%98%E6%86%B6%E9%AB%94%E9%8C%AF%E8%AA%A4) , [jeremy](https://hackmd.io/@jeremytsai/is_my_code_constant_time) 閱讀 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 筆記額外放在這[Lab0 lib/list_sort.c 研讀筆記](https://hackmd.io/bKueUbacRd6S7MOUWy-aKg) ### 將 `list_sort.c` 加入專案 1. 將 [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 及 [list_sort.h](https://github.com/torvalds/linux/blob/master/include/linux/list_sort.h) 加入到 `lab0_C` 專案中 2. 將 `list_sort.c` 中用到的 macro 加入到 `list_sort.h` 中 * 將在 [compiler.h](https://github.com/torvalds/linux/blob/master/include/linux/compiler.h) 中找到的`likely` , `unlikely` 加入 `list_sort.h` header中 ```c #define likely_notrace(x) __builtin_expect(!!(x), 1) #define unlikely_notrace(x) __builtin_expect(!!(x), 0) ``` * 定義 list_sort.c 中的 u8 型別 ```c #include <stdint.h> typedef uint8_t u8; ``` 3. 在 `Makefile` 中的 OBJS 中新增 `list_sort.o` ```diff OBJS := qtest.o report.o console.o harness.o queue.o \ random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \ shannon_entropy.o \ linenoise.o web.o \ + list_sort.o ``` 4. 修改 `qtest.c` 這個檔案 * 在 `qtest.c` 的開頭加入以下行 ```c #include "list_sort.h"` ``` * 定義一個靜態變數來決定是否使用 `list_sort` ```c static int use_list_sort = 0; ``` * 編寫比較函數@cmp * 函數原型為: ```c int cmp(void *priv, const struct list_head *a, const struct list_head *b); ``` * 加入以下比較函數 : ```c static int cmp(void *priv, const struct list_head *a, const struct list_head *b) { return strcmp(list_entry(a, element_t, list)->value, list_entry(b, element_t, list)->value); } ``` * 修改 `do_sort` 函數,根據 `use_list_sort` 選擇排序方法 ```c if (current && exception_setup(true)) use_list_sort ? list_sort(NULL, current->q, cmp): q_sort(current->q); exception_cancel(); ``` * 加入 `list_sort` 選項到help選單: ```c add_param("listsort", &use_list_sort, "Use the sort which is made by linux kernel developers", NULL); ``` 這允許用戶通過命令 option listsort 1 啟用 list_sort,或 option listsort 0 回到 q_sort ```shell Options: descend 0 | Sort and merge queue in ascending/descending order echo 1 | Do/don't echo commands entropy 0 | Show/Hide Shannon entropy error 5 | Number of errors until exit fail 30 | Number of times allow queue operations to return false length 1024 | Maximum length of displayed string listsort 1 | Use the sort which is made by linux kernel developers malloc 0 | Malloc failure probability percent simulation 0 | Start/Stop simulation mode verbose 4 | Verbosity level ``` 5. 編寫一測資 `trace-sort-cp.cmd` 測試實作之`q_sort`與`list_sort`執行時間差異 ```shell # Testing the efficiency of list_sort and sort. # You can modify the option listsort and the RAND to meet your need option list_sort 0 new it RAND 1000000 time sort time option list_sort 1 new it RAND 1000000 time sort time ``` 輸入 ``` $ ./qtest -f traces/trace-sort-cp.cmd ``` 即可得到 `list_sort` 或 `sort` 的處理時間 可看到 `q_sort` 、 `list_sort` 所花時間分別為 1.289s、0.946s ```shell cmd> # Testing the efficiency of list_sort and sort. cmd> # You can modify the option listsort and the RAND to meet your need cmd> cmd> option listsort 0 cmd> new l = [] cmd> it RAND 1000000 l = [fooiz fracl tgfcu zoojvil tjfoxvmua xkrpfuhq chgztyvv xchumlct mfpayg hezyhkf mariuftwc agahvim gdzcbpul fdmaege iwmbuzxlg jafld bkhyf dsorqgjfb txoazw weccnf xeipuh vslap jxpgyh dwjfpxaj mdlzndac uvccdjfs cdynagay kyuxyrvk vlkrwbpqb wvrpvohd ... ] cmd> time Elapsed time = 0.397, Delta time = 0.397 cmd> sort Warning: Skip checking the stability of the sort because the number of elements 1000000 is too large, exceeds the limit 100000. l = [aaaagvv aaaaqtt aaacklux aaaddseqr aaadr aaaedu aaafhr aaagedkxo aaagfy aaaglgs aaagmgnls aaagq aaagwo aaagytybt aaahb aaahjs aaaimib aaaiudwd aaajaebq aaajvkdy aaajxy aaakqvpoq aaakuewes aaalb aaalllc aaamo aaamvo aaanqc aaaoblpli aaaokcfbp ... ] cmd> time Elapsed time = 1.686, Delta time = 1.289 cmd> option listsort 1 cmd> new l = [] cmd> it RAND 1000000 l = [jxnyi amfwt apihaud kiqohbb vrsyxq maumcrq piuutq xdxbub ymwdvhhxl vhccbgyf ssidmpb dhgwdt egysb axntszvle yvskbzyxy adswugyrp dngdlifqz qqgsi wouacg pvmxwdg xpqxaksl hxpimtcz babqugwvn uccmt atnwqthft glekayb ggmkyzaor olbkhcrpx tygofpin dkheyvw ... ] cmd> time Elapsed time = 2.118, Delta time = 0.432 cmd> sort Warning: Skip checking the stability of the sort because the number of elements 1000000 is too large, exceeds the limit 100000. l = [aaabhur aaachllo aaacjl aaacrruse aaacrxmml aaacu aaaczwtq aaada aaadkjbp aaadlqwm aaadlx aaaebe aaael aaaffh aaagcl aaagdpvi aaagsgwz aaagz aaaht aaaikr aaaiu aaaiw aaaiwbqcn aaakrw aaaktoo aaalejdzf aaalk aaamptkq aaamzhxh aaanjnr ... ] cmd> time Elapsed time = 3.064, Delta time = 0.946 Freeing queue ``` ### 用 perf 分析 sort 效能 #### 環境確認 透過以下命令來查看使用 perf 的權限 ```shell $ cat /proc/sys/kernel/perf_event_paranoid ``` 如果想要把權限全部打開的話,可以使用這條指令,且將值設為 -1 ```shell `$ sudo sh -c "echo -1 > /proc/sys/kernel/perf_event_paranoid"` ``` 再來可以輸入以下來確認自己是否有拿到最高權限值 (如果以非 root 用戶運行 perf top,通常會收到類似「Permission denied」或「perf_event_open() failed」的錯誤訊息) ```shell $ perf top ``` #### 開始測試 首先我在trace目錄下家兩個測試命令集用來測試分別用來對 q_sort 與 list_sort 可以使用以下指令來取得 `cache-misses`, `cache-references`, `instructions`, `cycles` 這些項目的資訊 ```shell ``` ## 在 qtest 提供新的命令 `shuffle`