2025q1 Homework5 (assessment)

# 2025q1 Homework5 (assessment) contributed by < `timsong1` > ## 閱讀〈[因為自動飲料機而延畢的那一年](https://www.opasschang.com/docs/the-story-of-auto-beverage-machine-1)〉的啟發相信創業這想法大家應該都有過，我也不意外，甚至有數個點子，但都因種種原因而草率放棄，首先我一開始也跟作者一樣，覺得這自動飲料機沒什麼難的，但越探究一個問題，往往會發現更多問題，這可以套用到很多事情，在這篇文中我看到作者是如何一步一步的走下去，這精神真的值得我們學習，也不應該提早放棄， ## 提問 : [concurrent-programs/coro/coro.c](https://github.com/sysprog21/concurrent-programs/blob/master/coro/coro.c#L31) (2025/5/3) ### 提問 1 : 之前在寫作業 [kxo](https://hackmd.io/@sysprog/linux2025-kxo/%2F%40sysprog%2Flinux2025-kxo-f) 的時候，修改其使用者程式的部分，使用 `setjmp()`和 `longjmp()` 使其任務排程採用 coroutine 方式，經過測試可以順利運行，並且通過 [valgrind](https://valgrind.org/) 測試，沒有**記憶體洩漏**的問題 ```cpp Stopping the kernel space tic-tac-toe game... ==4281== ==4281== HEAP SUMMARY: ==4281== in use at exit: 0 bytes in 0 blocks ==4281== total heap usage: 140 allocs, 140 frees, 14,257 bytes allocated ==4281== ==4281== All heap blocks were freed -- no leaks are possible ==4281== ==4281== For lists of detected and suppressed errors, rerun with: -s ==4281== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) ``` 其中每個 task 的運作流程大概是： ```cpp= static void task_0() { struct task *task = malloc(sizeof(struct task)); . . if (setjmp(task->end) != 0) { task = cur_task; free(task); longjmp(sched, 1); } . . task = cur_task; . . } ``` 然而這支程式沒有通過 cppcheck 的靜態測試，並且報出 `error: Memory leak: task [memleak]` ```cpp $ cppcheck xo-user.c Checking xo-user.c ... xo-user.c:196:9: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:206:5: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:229:9: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:239:5: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:256:9: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:266:5: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:286:9: error: Memory leak: task [memleak] task = cur_task; ^ xo-user.c:296:5: error: Memory leak: task [memleak] task = cur_task; ^ Checking xo-user.c: LIST_POISONING... Checking xo-user.c: __GNUC__... ``` 而因為我是採用 [concurrent-programs/coro.c](https://github.com/sysprog21/concurrent-programs/blob/master/coro/coro.c) 排程方式，所以**之後都會以這程式來討論** 其每個 task 內部的關鍵運作流程： ```cpp= void task0(void *arg) { struct task *task = malloc(sizeof(struct task)); strcpy(task->task_name, ((struct arg *) arg)->task_name); task->n = ((struct arg *) arg)->n; task->i = ((struct arg *) arg)->i; INIT_LIST_HEAD(&task->list); printf("%s: n = %d\n", task->task_name, task->n); if (setjmp(task->env) == 0) { task_add(task); longjmp(sched, 1); } task = cur_task; . . free(task); } ``` 雖然還是有差異，但我想測試看看能不能通過 cppcheck，結果也是不行 ```cpp= $ cppcheck coro.c Checking coro.c ... coro.c:93:5: error: Memory leak: task [memleak] task = cur_task; ^ coro.c:126:5: error: Memory leak: task [memleak] task = cur_task; ^ Checking coro.c: LIST_POISONING... Checking coro.c: __GNUC__... ``` cppcheck 會針對 `task = cur_task` 報記憶體洩漏的錯誤是因為，在配置記憶體給 `task` 指標(行 16 )後，更改 `task` 指標指向的位址(雖然以整體運作流程來說是正確的)，所以判斷有出現記憶體洩漏以下探討 coro.c 的問題是屬於流程設計問題，還是 cppcheck 針對記憶體洩漏的 false positive 問題 ### coro.c 流程設計問題? 此程式排程的函式會先逐一呼叫每個任務，每個任務都會先進行初始化(行3-12)，接著再跳回排程函式(行13) `cur_task` 儲存當前被排程的 `task`，在之後跳回來時重新賦值給 `task` (行16)，就可以確保 `task` 的正確性 [c11 規則書](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf) 7.22.3.3： >`void free(void *ptr);` >The `free` function causes the space pointed to by `ptr` to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to `free` or `realloc`, the behavior is undefined. 在呼叫 `free()` 的參數 `ptr` 如果不是 null 也沒匹配到之前透過記憶體配置回傳的指標的話，屬於 undefined behavior 換另一個角度想，呼叫 `free` 的參數 `ptr` 都必須要正確才行 [cppcheck manual](https://cppcheck.sourceforge.io/manual.pdf) 有提到，靜態分析可以找到三種問題： - Undefined behavior - Using dangerous code patterns - Coding style 以及定義了數個嚴重程度分級，其中最高等級是 `error` >when code is executed there is either undefined behavior or other error, such as a memory leak or resource leak 而`task = cur_task;` 這種更改指標的行為可以正確運作的前提，是建立在整個程式設計、運作正確之下，再加上 cppcheck 針對此行是發出`error: Memory leak: task [memleak]`，因此可以判斷 cppcheck 是覺得這行屬於 **Using dangerous code patterns** 而 cppcheck 是靜態分析，較難分析這種需要動態分析才能確認的程式，因此報出記憶體洩漏的錯誤很合理 ### cppcheck false positive ? 針對 [cppcheck 的開源程式碼](https://github.com/danmar/cppcheck/tree/main)簡述偵測這部分的流程 `lib` 目錄下有定義許多 `checkXXX` 格式的類別，負責定義要檢查的不同功能因為 `task` 指標變數是屬於 **automatic storage duration**，關於會報記憶體洩漏的 cpp 檔是 [lib/checkleakautovar.cpp](https://github.com/danmar/cppcheck/blob/main/lib/checkleakautovar.cpp) `bool CheckLeakAutoVar::checkScope(...)` 就是在檢查某對 `{`和` }`之間的程式碼，如果裡面有 `if`、`else`之類還會有別的 scope 的話就遞迴呼叫即可接下來探討 `coro.c` 程式碼較關鍵的部分 ```cpp= void task0(void *arg) { struct task *task = malloc(sizeof(struct task)); . . if (setjmp(task->env) == 0) { task_add(task); longjmp(sched, 1); } task = cur_task; . . } ``` 我們可以直接只看這個函式的部分在 `CheckLeakAutoVar::checkScope(...)` 裡面，首先會先檢查到行 3 的 `task` 指標，根據` if (const Token* const tokAssignOp = isInit ? varTok : isAssignment(varTok))` 判斷該變數有被賦值接著會偵測到一個剛配置的記憶體位址賦值給該變數，因此設定 `varAlloc.status = VarInfo::ALLOC` 代表該變數已配置記憶體先講行 10 的部分，同剛剛的邏輯，先偵測到該變數要被賦值，接著會判斷此變數是不是屬於 conditional allocate (之前的配置記憶體動作在條件判斷裡面) ```cpp if (conditionalAlloc.find(varTok->varId()) == conditionalAlloc.end()) leakIfAllocated(varTok, varInfo); ``` 由於 `task` 指標不屬於 conditional allocate ，接著呼叫 `leakIfAllocated(varTok, varInfo)` ```cpp void CheckLeakAutoVar::leakIfAllocated(const Token *vartok, const VarInfo &varInfo) { . . const auto& possibleUsage = varInfo.possibleUsage; . . if (var != alloctype.cend() && var->second.status == VarInfo::ALLOC) { const auto use = possibleUsage.find(vartok->varId()); if (use == possibleUsage.end()) { leakError(vartok, vartok->str(), var->second.type); } } . . } ``` 關於變數的資訊都會存在 `VarInfo varInfo` 裡面而此函式會判斷 `task` 指標在之前有沒有**可能**被使用過，就是透過 `possibleUsage` 來查詢，如果沒有的話就會發出記憶體洩漏錯誤關於有沒有被使用過，可以看到行 7 ：`task_add(task);` ，因為有時候很難在靜態分析分辨 `task` 當作某些函式參數這行為究竟會不會變更 `task`，所以只要不是特定函式，基本上都會把 `task` 加入到 `possibleUsage` 所以問題是：明明在偵測到 `task = cur_task` 前已經因為 `task_add(task)` 而把 `task` 加入 `possibleUsage`，為何之後卻在 `possibleUsage` 找不到？關鍵就是在 `if` 的 scope 裡(行 6-9)，之前有提到遇上 if - else 這種分支語句會透過遞迴呼叫 `CheckLeakAutoVar::checkScope(...)` 解決。針對 `if` 分支，會用 `VarInfo varInfo1` 來記錄，而 `else` 分支則是用 `VarInfo varInfo2` 來紀錄，而如果像這程式段落沒有 `else` 的話則是把 `if` 之前的 `varInfo` 當作 `varInfo2` 在遞迴完成後，會把 `varInfo1` 跟 `varInfo2` 合併，其中合併各自的 `possibleUsage` 規則很多，但以目前程式的情況來看，是會直接合併的因此，可以確定在遞迴呼叫檢查 `if` 分支後，沒有因為 `task_add(task)` 而加入 `varInfo1` 其實在檢查到 `task_add(task)` 時是有把 `task` 加入至 `possibleUsage` 的，關鍵是在檢查到之後的 `longjmp(sched, 1)` 後，把 `varInfo1` 清掉的檢查到 `longjmp(sched, 1)` 時，由於其位於該 scope 的最後面，因此會進入： ``` if (allocation.status == VarInfo::NOALLOC && Token::simpleMatch(tok, ") ; }")) { . . } ``` 接著透過呼叫 [lib\library.cpp](https://github.com/danmar/cppcheck/blob/main/lib/library.cpp#L1278) 的 `bool Library::isScopeNoReturn(...)` 來檢查該函式是不是屬於不會返回的，而 `longjmp()` 顯然是不會返回的如果根據 cppcheck 規則，修改 coro.c 的話目前想到兩個方案： ```diff void task0(void *arg) { struct task *task = malloc(sizeof(struct task)); . . + task_add(task); if (setjmp(task->env) == 0) { - task_add(task); longjmp(sched, 1); } task = cur_task; . . } ``` 或是 ```diff void task0(void *arg) { struct task *task = malloc(sizeof(struct task)); . . if (setjmp(task->env) == 0) { task_add(task); longjmp(sched, 1); + // 隨便一行，例如 printf() } task = cur_task; . . } ``` ### 提問 2 ```cpp static long long fib_sequence(long long k) { /* FIXME: use clz/ctz and fast algorithms to speed up */ long long f[k + 2]; f[0] = 0; f[1] = 1; for (int i = 2; i <= k; i++) { f[i] = f[i - 1] + f[i - 2]; } return f[k]; } ``` 該方法的複雜度為 $O(n)$ 使用矩陣： ${\begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix}}^2$ = ${\begin{bmatrix} F(n+1) & F(n) \\ F(n) & F(n-1) \\ \end{bmatrix}}$ ![image](https://hackmd.io/_uploads/S1oRlK7ele.png) 這公式可以用數學歸納法求得，可以使時間複雜度變為$O(log(n))$ 而從這公式我們可以直接改寫：給定$F(k) 、 F(k+1)$ $F(2k)=F(k)[2F(k+1)−F(k)]$ $F(2k+1)=F(k+1)^2+F(k)^2$ 乘法演算法： https://www.nayuki.io/page/karatsuba-multiplication ## 2025/5/3 一對一討論後續紀錄 [連結](https://hackmd.io/@timsong1/BkHWe_ixll) ## 期末專題 ### vwifi : 虛擬無線網路裝置驅動程式 ideas : 支援實體網卡 vwifi 是透過創建虛擬網卡(支援 station、access point、ibss 等模式)，透過實作 `cfg80211_ops` 、 Tx/Rx 來模擬整個網卡以及傳輸行為，但從頭到尾沒有與實體網卡交互。因此我好奇能不能在原本的架構上擴充，支援特定型號的實體網卡以下是需要探討的問題 1. 網卡要 FullMAC 還是 SoftMAC ? vwifi 只有實作 `cfg80211`，而根據 [cfg80211 說明](https://wireless.docs.kernel.org/en/latest/en/developers/documentation/cfg80211.html) >All new Linux wireless drivers should be written targeting either cfg80211 for fullmac devices or mac80211 for softmac devices. 對於 vwifi 來說，是屬於 FullMAC 驅動，如果要先擴充成支援 FullMAC 實體網卡的話，驅動也是透過 `cfg80211` 2. 要使用哪個品牌、型號的網卡? 很多市面上的網卡的驅動都有開源，像是 [realtek rtl8821ce](https://github.com/tomaspinho/rtl8821ce)，是不是需要挑沒有開源的網卡來實作? 4. 支援兩個實體網卡互相通訊，還是跟 vwifi 一樣自己模擬通訊 ?