2023q1 Homework1 (lab0)

# 2023q1 Homework1 (lab0) contributed by < ```terry23304``` > ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 16 On-line CPU(s) list: 0-15 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 167 Model name: 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz Stepping: 1 CPU MHz: 2500.000 CPU max MHz: 5200.0000 CPU min MHz: 800.0000 BogoMIPS: 4992.00 Virtualization: VT-x L1d cache: 384 KiB L1i cache: 256 KiB L2 cache: 4 MiB L3 cache: 16 MiB NUMA node0 CPU(s): 0-15 ``` ## 開發過程 ## queue.[ch] ### q_new 用 `INIT_LIST_HEAD` 初始化 list head ```c struct list_head *q_new() { struct list_head *head = malloc(sizeof(struct list_head)); if (!head) return NULL; INIT_LIST_HEAD(head); return head; } ``` ### q_free ```c void q_free(struct list_head *l) { struct list_head *li, *safe; list_for_each_entry_safe (li, safe, l, list { list_del(&li->list); q_release_element(li); } free(l); } ``` ### q_insert_head pre-commit hook 告知 memory leak ，查了一下才發現```strdup``` 可能會 ```malloc```失敗 ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *first = malloc(sizeof(element_t)); if (!first) return false; first->value = strdup(s); if (!first->value) { free(first); return false; } list_add(&first->list, head); return true; } ``` ### q_insert_tail ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *last = malloc(sizeof(element_t)); if (!last) return false; last->value = strdup(s); if (!last->value) { free(last); return false; } list_add_tail(&last->list, head); return true; } ``` ### q_remove_head remove: 斷開節點之間的鏈結，還存在記憶體中 :::warning 注意用語: 「[節點](https://dict.revised.moe.edu.tw/dictView.jsp?ID=90405)」 :notes: jserv ::: 如果 src 字元數（含'\0'）比 bufsize 少，會把剩下未滿 bufsize 的部分通通補上 '\0' 如果 src 字元數（含'\0'）比 bufsize 多，要自己補 '\0' ```c char *strncpy( char *dest, const char *src, size_t count ); ``` :::warning 想問為甚麼 q_remove_head 要用 ```sp``` 儲存被移除的值是為了讓 q_insert_head 申請 memory 的速度更快嗎，但 q_insert_head 的 parameter 也沒有記憶體位置 > 將問題列在下方開發紀錄，並詳述你的疑慮，避免只說「儲存被移除的值」，應該具體指出相關程式碼，附上你的推測和實驗。 > :notes: jserv ::: >The reason for copying the string value to sp is that the value member of the element_t struct is an internal implementation detail of the linked list and is not intended to be accessed or modified directly by the caller of the q_remove_head function. >By providing a sp buffer as an argument to the function, the caller can obtain the string value of the removed element in a way that is safe and consistent with the function's API. ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *node = list_first_entry(head, element_t, list); list_del_init(&node->list); if(sp){ strncpy(sp, node->value, bufsize-1); sp[bufsize-1] = '\0'; } return node; } ``` ### q_remove_tail ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *node = list_last_entry(head, element_t, list); list_del_init(&node->list); if(sp){ strncpy(sp, node->value, bufsize-1); sp[bufsize-1] = '\0'; } return node; } ``` ### q_delete_mid 忘記是 doublly-linked list ，原本 while 判斷式這樣寫會導致 infinite loop ```c while(fast && fast->next){ fast = fast->next->next; slow = slow->next; } ``` ```c bool q_delete_mid(struct list_head *head) { // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ if (!head || list_empty(head)) return false; struct list_head *fast = head->next; struct list_head *slow = fast; while(fast != head->prev && fast->next != head->prev){ fast = fast->next->next; slow = slow->next; } list_del (slow); q_release_element(container_of(slow, element_t, list)); return true; } ``` ### q_delete_dup 對每個節點檢查是否與下一個節點字串相等，若相等則刪除 :::spoiler ```c list_for_each_entry_safe (cur, next, head, list) { // current value equal to next value if (!strcmp(cur->value, next->value)) { list_del(&cur->list); q_release_element(cur); dup = true; } // delete last duplicate node else if (dup) { list_del(&cur->list); q_release_element(cur); dup = false; } } ``` ::: 原本的寫法在 ```next == head``` 的時候 ```next->value```會出錯，所以改成刪除前一個節點 ```c bool dup = false; // for last duplicate string list_for_each_safe (cur, next, head) { // current value equal to next value element_t *cur_node = list_entry(cur, element_t, list); if (cur->next == head) { if (dup) { list_del(&cur_node->list); q_release_element(cur_node); } break; } element_t *next_node = list_entry(cur->next, element_t, list); if (!strcmp(cur_node->value, next_node->value)) { list_del(&cur_node->list); q_release_element(cur_node); dup = true; } // delete last duplicate node else if (dup) { list_del(&cur_node->list); q_release_element(cur_node); dup = false; } } ``` ### q_swap 每兩點交換一次，利用 ```list_add``` 與 ```list_del``` 來更改節點位置 :::spoiler ```c void q_swap(struct list_head *head) { if (!head) return; struct list_head *cur, *next, *first = NULL; list_for_each_safe (cur, next, head) { if (first) { // add first behind second list_add(first, cur); first = NULL; } else { // remove first one first = cur; list_del(cur); } } // last node if (first) list_add(first, head->prev); } ``` ::: 改成直接用 `list_move` 將目前節點跟下一個節點互換，程式碼更精簡 ```c void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ if (!head || list_empty(head) || list_is_singular(head)) return; struct list_head *cur = head->next; while (cur != head && cur->next != head) { list_move(cur, cur->next); cur = cur->next; } } ``` ### q_reverse 把每個 node 移出再加到 head 後面 ```c void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *cur, *next = NULL; list_for_each_safe (cur, next, head) list_move(cur, head); } ``` ### q_reverseK 用 `count` 來記錄要 revserse 的節點數 ```c void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ if (!head || list_empty(head)) return; int count = 0; struct list_head *cur, *next = NULL; list_for_each_safe (cur, next, head) { count++; if (count == k) { count--; struct list_head *tmp = cur->prev; struct list_head *tmp_prev; while ((count--) > 0) { tmp_prev = tmp->prev; list_del(tmp); list_add(tmp, cur); cur = cur->next; tmp = tmp_prev; } count = 0; } } } ``` ### q_descend 反向迭代串列，紀錄最大的字串，若節點字串比最大字串小則移除原本沒有紀錄 ```prev``` ，會導致 segmentation fault (dereferenced a NULL or invalid pointer) ```c int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head)) return 0; struct list_head *cur = head->prev; struct list_head *prev = cur->prev; char *max = NULL; // iterate list reversly for (; cur != head; cur = prev) { element_t *entry = list_entry(cur, element_t, list); prev = cur->prev; if (!max || strcmp(entry->value, max) > 0) max = entry->value; else { list_del(cur); q_release_element(entry); } } return q_size(head); } ``` ### q_merge 當成單向的串列合併起來做排序後再接回 `prev` 變回雙向串列 ```c int q_merge(struct list_head *head) { if (!head || list_empty(head)) return 0; if (list_is_singular(head)) return list_entry(head, queue_contex_t, chain)->size; queue_contex_t *first = list_entry(head->next, queue_contex_t, chain); queue_contex_t *cur = NULL; list_for_each_entry (cur, head, chain) { if (cur == first) continue; list_splice_init(cur->q, first->q); first->size = first->size + cur->size; cur->size = 0; } q_sort(first->q); return first->size; } ``` ## 開啟 Address Sanitizer，修正 qtest 執行過程中的錯誤 ``` make clean make SANITIZER=1 make test ``` 執行後 make test 分數未改變 ## 運用 Valgrind 排除 qtest 實作的記憶體錯誤 ```c +++ TESTING trace trace-13-malloc: # Test of malloc failure on new ==160726== 32 bytes in 1 blocks are still reachable in loss record 1 of 3 ==160726== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==160726== by 0x10CBCE: do_new (qtest.c:145) ==160726== by 0x10DDFA: interpret_cmda (console.c:181) ==160726== by 0x10E3AF: interpret_cmd (console.c:201) ==160726== by 0x10E7B0: cmd_select (console.c:610) ==160726== by 0x10F09C: run_console (console.c:705) ==160726== by 0x10D1EC: main (qtest.c:1223) ==160726== ==160726== 160 bytes in 5 blocks are possibly lost in loss record 2 of 3 ==160726== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==160726== by 0x10CBCE: do_new (qtest.c:145) ==160726== by 0x10DDFA: interpret_cmda (console.c:181) ==160726== by 0x10E3AF: interpret_cmd (console.c:201) ==160726== by 0x10E7B0: cmd_select (console.c:610) ==160726== by 0x10F09C: run_console (console.c:705) ==160726== by 0x10D1EC: main (qtest.c:1223) ==160726== ==160726== 224 bytes in 4 blocks are still reachable in loss record 3 of 3 ==160726== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==160726== by 0x10F110: test_malloc (harness.c:133) ==160726== by 0x10F515: q_new (queue.c:17) ==160726== by 0x10CC07: do_new (qtest.c:149) ==160726== by 0x10DDFA: interpret_cmda (console.c:181) ==160726== by 0x10E3AF: interpret_cmd (console.c:201) ==160726== by 0x10E7B0: cmd_select (console.c:610) ==160726== by 0x10F09C: run_console (console.c:705) ==160726== by 0x10D1EC: main (qtest.c:1223) ==160726== --- trace-13-malloc 0/6 ``` 發現 `do_new` 會出問題，參考 [yanjiew](https://hackmd.io/@yanjiew/linux2023q1-lab0#Valgrind-%E8%88%87-Address-Sanitizer-%E8%A8%98%E6%86%B6%E9%AB%94%E6%AA%A2%E6%9F%A5) 的解決方式，把 `qtest.c` 中 `q_quit` 第一行的 `return true` 刪除，讓記憶體釋放正常執行 ## 論文閱讀: Dude, is my code constant time? 用兩 class data 做 statistical test ，若兩個 data 的 timing distribution 相等則為 constant time ### Measure execution time perform a leakage detection test on execution time. 再執行時期用兩類 data 測試兩個 timing distribution is statistically different or not. #### Class definition - first clas: fixed to a constant value might be chosen to trigger certain special corner-case(e.g. low weight input in arthmetic operations) ::: info low weight input: 較小的數值通常可以使用較少的位元（或較小的數據類型）來表示。使用較小的數據類型可以節省記憶體使用和運算時間，因此可以提高計算效率。如計算 5 + 1000000，可以將 5 表示為一個 8 位元的整數（如 unsigned char），而將 1000000 表示為一個 32 位元或 64 位元的整數（如 unsigned int 或 unsigned long long）。 ::: - second class: chosen at randomfor each measturement #### cycle counters 現代處理器提供 cycle counter，可以更準確的測量執行時間 #### enviromental conditions 為了最小化環境變化帶來的影響，每項測試隨機選擇 class ### Apply post-processing 再做統計分析前可對各個 measurement 進行一些後處理 - cropping：典型的 timing distribution 呈現 [positive skewed](https://zh.wikipedia.org/wiki/%E5%81%8F%E5%BA%A6)，可能是由於測量工具、主程式被 os 或其他外部活動中斷等原因引起的。在這種情況下可以裁剪或刪除過大或極端的 measurement。 - 高階預處理 ### Apply statistical test 第三步是應用統計檢驗來嘗試否定 NULL Hypothesis “兩個 timing distribution 相等”，統計檢驗方法: - t-test：檢測平均值的等價性，測試失敗很意味著存在一個 first order timing information leakage。 - non-parametric test：優點是這些檢驗通常對潛在分佈做出較少的假設，缺點是它們可能收斂得更慢並且需要更多樣本。 ### lab0 dudect 缺陷參考 [KHLee529](https://hackmd.io/@KHLee529/linux2023-lab0#lab0-%E4%B8%AD%E7%9A%84-dudect-%E5%AF%A6%E4%BD%9C) 的方法在論文中的 post-processing 中有提到可以裁減或刪除過大的 measurement ，而在 dudect/constant.c 中的實作是頭尾少做 DROP_SIZE 次測量 cycle 數，要改成排序執行時間後把前後 DROP_SIZE 個執行時間去除後再比較分布 ```c qsort(exec_times, N_MEASURES, sizeof(int64_t), cmp); memset(exec_times, 0, DROP_SIZE); memset(&exec_times[N_MEASURES - DROP_SIZE], 0, DROP_SIZE); ``` ## 在 qtest 提供新的命令 shuffle 原本只把選出來的 node 移到串列最後，但用 qtest shuffle 了幾次感覺不夠亂 ```c for (int i = size; i > 0; i--){ int r = rand() % i; struct list_head *rand_node = head->next; for (int j = 0; j < r; j++) rand_node = rand_node->next; list_move_tail(rand_node, head); } ``` 把移到最後改成跟 `Fisher–Yates shuffle algo` 一樣交換亂數到的節點與最後的節點 ```c element_t *rand_entry = list_entry(rand_node, element_t, list); element_t *last_entry = list_entry(last, element_t, list);\ char *tmp = rand_entry->value; rand_entry->value = last_entry->value; last_entry->value = tmp; ``` 參考 do_merge 及 do_reservse_K 在 do_descend、do_merge 等有做檢查是否正確執行函式功能的函式才需要 return ok ```c static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } if (!current || !current->q) { report(3, "Warning: Calling shuffle on null queue"); return false; } error_check(); set_noallocate_mode(true); if (exception_setup(true)) q_shuffle(current->q); exception_cancel(); set_noallocate_mode(false); q_show(3); return !error_check(); } ``` ## github workflow make test 達到 100 分後 github action 還是會卡在 Run webfactory/ssh-agent@v0.7.0 ，參考 [commit 紀錄](https://github.com/sysprog21/lab0-c/commit/5fcff8eba8886e52023062ce1fa373915fe06aed)後加上 `continue-on-error: true` 就可以看到皮卡丘