# 2021q1 Homework1 (lab0) contributed by < `iLeafy11` > ###### tags: `linux2021` ## :penguin: 作業描述 ### [作業要求](https://hackmd.io/@sysprog/linux2021-lab0#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82) ## 開發環境設定 :::info 尚未安裝Ubuntu 20.04。目前仍在 Mac Catalina 上寫作業。 :notes: iLeafy11 > 請愛用 [Multipass](https://multipass.run/),我在 macOS Catalina 執行 Ubuntu Linux 20.04 很順暢 > :notes: jserv ::: - [x] 安裝 Multipass on MacOS Catalina - [x] 安裝 開發環境套件 - [x] 設置 .vimrc 及 vim-plugin - [x] Fork [lab0-c](https://github.com/sysprog21/lab0-c) 至自己的 github 儲藏庫中並且 clone 至開發環境中 - [ ] ? ### Multipass **MacOS Catalina** * Install Multipass ``` $ brew install multipass ``` * Launch Ubuntu 20.04 ``` $ multipass launch 20.04 ``` * Check out the running instances ``` $ multipass list Name State IPv4 Image spotless-sculpin Running 192.168.64.2 Ubuntu 20.04 LTS ``` * Connect to a running instance ``` $ multipass shell dancing-chipmunk ``` ### 開發環境套件 **Ubuntu 20.04** ``` $ sudo apt-get update $ sudo apt install gcc $ sudo apt install git $ sudo apt install tig $ sudo apt install build-essential git-core valgrind $ sudo apt install cppcheck clang-format aspell colordiff ``` ### .vimrc 設置 .vimrc 檔 ```.vimrc set ai set background=dark set backspace=2 set cursorline set enc=utf8 set expandtab set hls set hlsearch set history=100 set laststatus=2 set ignorecase set incsearch set mouse=a set nocompatible set nu set ruler set shiftwidth=4 set smartcase set softtabstop=4 set t_Co=256 set tabstop=4 syntax on filetype indent on inoremap ( ()<Esc>i inoremap " ""<Esc>i inoremap ' ''<Esc>i inoremap [ []<Esc>i inoremap {<CR> {<CR>}<Esc>ko inoremap {{ {}<Esc>i hi CursorLineNR cterm=bold ctermbg=none ctermfg=Yellow hi LineNR cterm=bold ctermbg=none ctermfg=Grey hi Search cterm=reverse ctermbg=none ctermfg=none ``` ### vim-plugin **1. Pathogen** * installation ``` mkdir -p ~/.vim/autoload ~/.vim/bundle && \ curl -LSso ~/.vim/autoload/pathogen.vim https://tpo.pe/pathogen.vim ``` **2. SnipMate** * installation * Using Pathogen ``` % cd ~/.vim/bundle % git clone https://github.com/tomtom/tlib_vim.git % git clone https://github.com/MarcWeber/vim-addon-mw-utils.git % git clone https://github.com/garbas/vim-snipmate.git # Optional: % git clone https://github.com/honza/vim-snippets.git ``` **3. Syntastic** * installation * Using Pathogen ``` cd ~/.vim/bundle && \ git clone --depth=1 https://github.com/vim-syntastic/syntastic.git ``` 最後在 .vimrc 檔加入這幾行: ```.vimrc execute pathogen#infect() set statusline+=%#warningmsg# set statusline+=%{SyntasticStatuslineFlag()} set statusline+=%* let g:syntastic_always_populate_loc_list = 1 let g:syntastic_auto_loc_list = 1 let g:syntastic_check_on_open = 1 let g:syntastic_check_on_wq = 0 let g:syntastic_python_checkers = ['pylint'] let g:syntastic_c_checkers = ['gcc', 'make'] let g:snipMate = { 'snippet_version' : 1 } let g:snipMate = get(g:, 'snipMate', {}) let g:snipMate.scope_aliases = {} let g:snipMate.scope_aliases['ruby'] = 'ruby,rails' ``` ### ## queue.h queue.h 中存在兩個資料結構 **1. `list_ele_t` 為組成 singly linked list 的節點結構** * `char *value` 為指向一字元陣列構成的字串的指標 * `struct ELE *next` 為指向下一個節點的指標 ```c /* Linked list element (You shouldn't need to change this) */ typedef struct ELE { /* Pointer to array holding string. * This array needs to be explicitly allocated and freed */ char *value; struct ELE *next; } list_ele_t; ``` **2. `queue_t` 是以 singly linked list 所實現的佇列 (Queue) 資料結構** * `list_ele_t *head` 指向佇列 (Queue) 中 linked list 的頭端 * `list_ele_t *tail` 指向佇列 (Queue) 中 linked list 的尾端 * 考量到 `q_insert_tail` 時間複雜度必須達到 $O(1)$,故在 `queue_t` 中,加入指向 `list_ele_t` 的指標 `list_ele_t *tail`,指向 `queue_t` 中 linked list 最尾端的元素 * `size_t size` 儲存在佇列 (Queue) 中,linked list 所擁有的節點個數 * 考量到 `q_size` 時間複雜度必須達到 $O(1)$,故在 `queue_t` 中,加入變數 `size_t size`。隨著佇列 (Queue) 中 linked list 節點的增減,算出節點個數 `size`,其實作在 `q_insert_head`、`q_insert_tail`、`q_delete_head`中 ```diff /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ + list_ele_t *tail; + size_t size; } queue_t; ``` :::danger 解說呢?思考過程呢?只是張貼程式碼,我無從判斷是否原創,遑論得知你學到什麼。 :notes: jserv ::: :::info 因部分程式碼先完成,先貼上HackMD。解說以及思考過程隨後補上。 造成老師批改困擾,實屬抱歉。 :notes: iLeafy11 ::: ## queue.c ### q_new q_new 的實作考量到兩個部分: **1. Create empty queue** 用 `malloc` 配置一塊 `sizeof(queue_t)` 大小的記憶體給指向 `queue_t` 的指標 `q`,讓記憶體內的變數 `q->head` `q->tail` 指標指向 `NULL`,同時將 `q->size` 的值設為 `0`。 ```c queue_t *q = malloc(sizeof(queue_t)); q->head = q->tail = NULL; q->size = 0; return q; ``` **2. Return NULL if could not allocate space** 考量到有可能 malloc failed,使得 `malloc` 回傳 `NULL` ,所以實作可以用 `if` 條件式來判斷 `malloc` 是否成功配置記憶體,如果 `malloc` 配置記憶體失敗,便回傳 `malloc` 配置記憶體失敗時所回傳的值 `NULL`。 ```c queue_t *q = malloc(sizeof(queue_t)); if (q) { q->head = q->tail = NULL; q->size = 0; } return q; ``` **q_new 實作的完整程式碼** ```c queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (q) { q->head = q->tail = NULL; q->size = 0; } return q; } ``` ### q_free 傳遞參數: `queue_t *q` q_free 的實作考量到以下兩點: **1. No effect if q is NULL** 首先判斷指向 `queue_t` 的指標 `q` 是否為 `NULL`, 若為 `NULL`,則直接 `return`。 ```c if (!q) return; ``` **2. Free ALL storage used by queue** 刪除在 `queue_t` 結構中,由 `list_ele_t` 結構為節點所構成的 singly linked list * `list_ele_t` 結構所構成的節點中,必須 `free` 掉指向 array holding string 的指標 `char *value` 所動態配置 (`malloc`) 的記憶體 * 實作時,需要額外宣告一個指向 `list_ele_t` 的指標 `list_ele_t *tmp`,指向即將被刪除的節點,也就是在刪除整個 singly linked list 的過程中,`q->head` 所走訪的節點的上一個節點,以免在刪除一個節點的過程,直接抹除掉該節點連接至下一個節點的指標,導致存取不到該節點的下一個節點。 ```c list_ele_t *tmp = q->head; while (q->head) { free(tmp->value); free(tmp); q->head = tmp->next; tmp = q->head; } ``` 最後再將指向 `queue_t` 結構的指標所動態配置 (`malloc`) 的記憶體 `free` 掉 ```cpp free(q); ``` * **result:查看 valgrind 並分析錯誤** ``` +++ TESTING trace trace-11-malloc: # Test of malloc failure on insert_head ==27746== Invalid read of size 8 ==27746== at 0x10E2DE: q_free (queue.c:32) ==27746== by 0x10AA06: queue_quit (qtest.c:664) ==27746== by 0x10D133: do_quit_cmd (console.c:289) ==27746== by 0x10DB81: finish_cmd (console.c:604) ==27746== by 0x10C307: main (qtest.c:789) ==27746== Address 0x4b9da08 is 40 bytes inside a block of size 56 free'd ==27746== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==27746== by 0x10E036: test_free (harness.c:209) ==27746== by 0x10E2D9: q_free (queue.c:31) ==27746== by 0x10AA06: queue_quit (qtest.c:664) ==27746== by 0x10D133: do_quit_cmd (console.c:289) ==27746== by 0x10DB81: finish_cmd (console.c:604) ==27746== by 0x10C307: main (qtest.c:789) ==27746== Block was alloc'd at ==27746== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==27746== by 0x10DE7E: test_malloc (harness.c:138) ==27746== by 0x10E31C: q_insert_head (queue.c:50) ==27746== by 0x10BC57: do_insert_head (qtest.c:215) ==27746== by 0x10CE65: interpret_cmda (console.c:221) ==27746== by 0x10D2FC: interpret_cmd (console.c:244) ==27746== by 0x10DB3F: cmd_select (console.c:594) ==27746== by 0x10DE0C: run_console (console.c:667) ==27746== by 0x10C2BF: main (qtest.c:788) ==27746== ==27746== Invalid read of size 8 ==27746== at 0x10E2CA: q_free (queue.c:30) ==27746== by 0x10AA06: queue_quit (qtest.c:664) ==27746== by 0x10D133: do_quit_cmd (console.c:289) ==27746== by 0x10DB81: finish_cmd (console.c:604) ==27746== by 0x10C307: main (qtest.c:789) ==27746== Address 0x5555555555555555 is not stack'd, malloc'd or (recently) free'd ==27746== Segmentation fault occurred. You dereferenced a NULL or invalid pointer ``` :::info 錯誤的結果出在於 **2. Free ALL storage used by queue** 的實作: ```c list_ele_t *tmp = q->head; while (q->head) { free(tmp->value); free(tmp); q->head = tmp->next; tmp = q->head; } ``` 這段程式碼因 `while` 檢查的條件 `q->head` 在迴圈內重新被賦值 ```c q->head = tmp->next; ``` 且 `tmp` 在迴圈的最後才被賦值 ```c tmp = q->head ``` 所以 `tmp` 最終的值有可能為 `NULL`,導致在 `free(tmp->value)` 的過程中 dereferenced a `NULL` pointer,造成 segmentation fault。 **2. Free ALL storage used by queue** 的實作修正為以下: ```diff list_ele_t *tmp = q->head; while (q->head) { - free(tmp->value); + q->head = tmp->next; + free(tmp->value); free(tmp); - q->head = tmp->next; tmp = q->head; } ``` ::: **q_free 實作的完整程式碼** ```c void q_free(queue_t *q) { if (!q) return; list_ele_t *tmp = q->head; while (q->head) { q->head = tmp->next; free(tmp->value); free(tmp); tmp = q->head; } free(q); } ``` :::info `q_free` 可用 `q_remove_head` 實作: ```c void q_free(queue_t *q) { if (!q) return; while (q_remove_head(q, NULL, 0)); free(q); } ``` ::: ### q_insert_head 傳遞參數: `queue_t *q` `char *s` `q_insert_head` 的實作考量到以下幾點: **1. Return false if q is NULL or could not allocate space.** 判斷指向 `queue_t` 的指標 `q` 是否為 `NULL`, 若為 `NULL`,則為 false case,`return false`。 ```c if (!q) return false; ``` 當動態配置記憶體失敗,`malloc` 回傳 `NULL`,為 false case,詳細情形如下: * 用 `malloc` 動態配置 `sizeof(list_ele_t)` 大小的記憶體,以新增節點 `list_ele_t *newh`,若 `malloc` 回傳 `NULL`,則為 false case。 * 計算傳遞參數 `char *s` 的字串長度 `len`,用 `malloc` 動態配置 `(len + 1) * sizeof(char)` 大小的記憶體給節點 `newh` 內的 char pointer `newh->value` ,若 `malloc` 回傳 `NULL`,則為 false case,此 false case 要考量: * 宣告變數`size_t len = strlen(s)` 是為了避免呼叫 `strlen` 兩次。注意到`strlen` 讀到 `'\0'` 便會停下來,因此 `len` 的值不會將 `'\0'` 的長度算在內。 * `len + 1` 是為了將結尾字元留給 null terminator `'\0'`。 * 在此 false case 下,要注意到 `newh` 的生命週期 (life cycle) 只存在此 function 內,所以在 function 結束前 (`return false` 之前),必須將先前分配給 `newh` 的記憶體 `free` 掉,否則會造成 memory leak。 ```c list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; size_t len = strlen(s); newh->value = malloc ((len + 1) * sizeof(char)); if (!newh->value) { free(newh); return false; } ``` **2. The function must explicitly allocate space and copy the string into it.** 不使用不安全的 `strcpy`,使用可以指定字串長度的 `strncpy` `memcpy` 實作 string copy。要注意結尾字串加上 `'\0'`。 ```c strncpy(newh->value, s, len); newh->value[len] = '\0'; ``` :::info **使用 `strncpy()` 取代 `strcpy()`** *... note that strncpy only insecure when you forget to manually add the null terminator ... always sets the final character to '\0' after calling strncpy. – ojrac May 15 '09 at 17:25* from **[Stack Overflow](https://stackoverflow.com/questions/869883/why-is-strncpy-insecure)** ::: ```c memcpy(newh->value, s, len + 1); ``` :::info **使用 `memcpy()` 取代 `strcpy()`** *strcpy stops when it encounters a NUL ('\0') character, memcpy does not. You do not see the effect here, as %s in printf also stops at NUL. – answered May 24 '10 at 16:11 Yann Ramin* from **[Stack Overflow](https://stackoverflow.com/questions/2898364/strcpy-vs-memcpy)** ::: **3. Attempt to insert element at head of queue.** ```c newh->next = q->head; q->head = newh; if (!q->tail) { q->tail = newh; } q->size++; ``` **4. Return true if successful.** 排除上面的 false case 之後,成功 insert element at head of queue,便可以 `return true`。 ```cpp return true; ``` **q_insert_head 實作的完整程式碼** ```c bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; size_t len = strlen(s); newh->value = malloc ((len + 1) * sizeof(char)); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, len); newh->value[len] = '\0'; newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } ``` :::info **縮減程式碼** 可以用 `strdup()` 針對 allocate & copy string 做改進: ```c newh->value = strdup(s); if (!newh->value) { free(newh); return false; } ``` **縮減結果** ```diff bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; - size_t len = strlen(s); - newh->value = malloc ((len + 1) * sizeof(char)); + newh->value = strdup(s); if (!newh->value) { free(newh); return false; } - strncpy(newh->value, s, len); - newh->value[len] = '\0'; newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } ``` ::: ### q_insert_tail 傳遞參數: `queue_t *q` `char *s` `q_insert_tail` 的實作與 `q_insert_head` 差異不大,主要差別於新節點 `newt` 的 `newt->next` 必須指向 `NULL`,以及當佇列 (Queue) 為空的時候,insert 時的處理稍微不同。 **q_insert_tail 實作的完整程式碼** ```c bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) return false; size_t len = strlen(s); newt->value = malloc((len + 1) * sizeof(char)); if (!newt->value) { free(newt); return false; } strncpy(newt->value, s, len); newt->value[len] = '\0'; newt->next = NULL; if (!q->tail) q->head = newt; else q->tail->next = newt; q->tail = newt; q->size++; return true; } ``` :::warning 上方程式碼可縮減,請嘗試改進。 :notes: jserv ::: :::info **縮減程式碼** 實現在 `q_insert_head` 縮減程式碼的方式。 **縮減結果** ```diff bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) return false; - size_t len = strlen(s); - newt->value = malloc((len + 1) * sizeof(char)); + newt->value = strdup(s); if (!newt->value) { free(newt); return false; } - strncpy(newt->value, s, len); - newt->value[len] = '\0'; newt->next = NULL; if (!q->tail) q->head = newt; else q->tail->next = newt; q->tail = newt; q->size++; return true; } ``` :notes: iLeafy11 ::: ### q_remove_head 傳遞參數: `queue *q` `char *sp` `size_t bufsize` `q_remove_head` 的實作考量到以下幾點: **1. Return false if queue is NULL or empty** 首先判斷指向 `queue_t` 的指標 `q` 是否為 `NULL`,或指向 `list_ele_t` 的指標 `q->head` 是否為 `NULL`,確認佇列是否為空,若其條件成立,則為 false case,`return false`。 ```c if (!q || !q->head) return false; ``` **2. If `sp` is non-NULL and an element is removed, copy the removed string to `*sp`** 檢查參數 `sp` 是否已被動態配置記憶體,若 `sp` 並非指向 `NULL`,且參數 `bufsize` 的值不為 `0`,則將佇列 (Queue) 中 linked list 最頭端的元素所存的字串 `q->head->value` 的內容,複製到 `sp` 所指向的那塊記憶體。 * 不使用 `strcpy()`。使用 `strncpy()` 時注意 null terminator `'\0'` 的處理。 ```c if (sp && bufsize) { strncpy(sp, q->head->value, bufsize - 1); sp[bufsize - 1] = '\0'; } ``` :::info 也可使用 `memcpy()` : ```c if (sp && bufsize) memcpy(sp, q->head->value, bufsize); ``` ::: **3. The space used by the list element and the string should be freed** 如同實作 `q_free` 一樣,需要額外宣告 `list_ele_t *tmp`,指向即將被刪除的節點,然後再更改 `q->head` 的值為 `q->head->next`。記得刪除佇列 (Queue) 中 linked list 最頭端的元素後,其元素個數 `q->size` 必須要減一。 ```c list_ele_t *tmp = q->head; q->head = tmp->next; free(tmp->value); free(tmp); q->size--; ``` **4. Return true if successful** ```c return true; ``` **q_remove_head 實作的完整程式碼** ```c bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; if (sp && bufsize) { strncpy(sp, q->head->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_ele_t *tmp = q->head; q->head = tmp->next; free(tmp->value); free(tmp); q->size--; return true; } ``` ### q_size 傳遞參數: `queue_t *q` 因為在 `list_ele_t` 資料結構內增加變數 `size_t size`,並且在佇列 (Queue) 元素的增減的同時修改 `size` 的大小,所以 `q_size`的時間複雜度可以達到 $O(1)$。 實作時需考量到: **Return 0 if q is NULL or empty** `(!q || !q->head)` 此條件判斷即可達成。 ```cpp size_t q_size(queue_t *q) { return (!q || !q->head) ? 0 : q->size; } ``` ### q_reverse 傳遞參數: `queue_t *q` **1. No effect if q is NULL or empty** ```c if (!q || !q->head) return; ``` :::info **Also no effect if q has only one element** ```c if (!q->head->next) return; ``` ::: **2. Reverse elements in queue** 宣告 `list_ele_t *prev, *curr, *tmp`,`curr` 會走訪整個 singly linked list,`prev` 會指向 `curr` 目前走訪到的節點的上一個節點,`tmp`則指向即將要 reverse 的節點。在 `curr` 走訪過一節點之後 `tmp` 會指向該處,然後將 `tmp->next` 的值改為 `prev`,最後在迴圈外修飾一下 `q` 的頭尾,如此便可以 reverse 整個 singly linked list。 ```c llist_ele_t *prev = NULL; list_ele_t *curr = q->head; while (curr) { list_ele_t *tmp; tmp = curr; curr = curr->next; tmp->next = prev; prev = curr; } q->tail = q->head; q->head = prev; ``` * **Result:查看 valgrind 並分析錯誤** ``` +++ TESTING trace trace-03-ops: # Test of insert_head, insert_tail, reverse, and remove_head ==27017== Invalid read of size 8 ==27017== at 0x10B9D3: do_insert_tail (qtest.c:303) ==27017== by 0x10CE65: interpret_cmda (console.c:221) ==27017== by 0x10D2FC: interpret_cmd (console.c:244) ==27017== by 0x10DB3F: cmd_select (console.c:594) ==27017== by 0x10DE0C: run_console (console.c:667) ==27017== by 0x10C2BF: main (qtest.c:788) ==27017== Address 0x0 is not stack'd, malloc'd or (recently) free'd ==27017== Segmentation fault occurred. You dereferenced a NULL or invalid pointer ``` :::info 推測錯誤出在於 `q_reverse`。 錯誤的結果出在於 **2. Reverse elements in queue** 的實作: ```c list_ele_t *curr = q->head; list_ele_t *prev = NULL; while (curr) { list_ele_t *tmp; tmp = curr; curr = curr->next; tmp->next = prev; prev = curr; } q->tail = q->head; q->head = prev; ``` 這段程式碼因 `while` 檢查的條件 `curr` 在迴圈內重新被賦值 ```c curr = curr->next; ``` 且 `prev` 在迴圈的最後才被賦值 ```c prev = curr; ``` 所以最後 `prev` 的值有可能為 `NULL`,導致佇列 (Queue) 在之後的 operation 可能會 dereferenced a `NULL` pointer,造成 segmentation fault。 **2. Reverse elements in queue** 的實作修正為以下: ```c list_ele_t *curr = q->head; list_ele_t *next = NULL; list_ele_t *prev = NULL; while (curr) { next = curr->next; curr->next = prev; prev = curr; curr = next; } q->tail = q->head; q->head = prev; ``` ::: **`q_reverse` 實作的完整程式碼** ```c void q_reverse(queue_t *q) { if (!q || !q->head) return; if (!q->head->next) return; ist_ele_t *curr = q->head; list_ele_t *next = NULL; list_ele_t *prev = NULL; while (curr) { next = curr->next; curr->next = prev; prev = curr; curr = next; } q->tail = q->head; q->head = prev; } ``` ### q_sort 傳遞參數: `queue_t *q` :::info 首先選擇 q_sort 的 sorting algorithm 肯定是從 stable sorting algorithm 裡面挑,尤其這是要 sort 一個 linked list,ramdom access $O(1)$ 的優勢不再,所以一定用一個穩定又快速的 sorting algorithm,~~誰都不想跟自己的分數過不去~~。 為何選擇使用 Merge Sort? 因為 Merge Sort 的表現~~從不令人失望~~很穩定。尤其是在 linked list 上,穩穩妥妥的 $O(nlogn)$。每次 merge sort 在做 `split` 時,被分裂的兩個 list 的長度幾乎會是相同 (因為從中間切開),`merge` 時的表現也十分穩定,同樣長度的 list,做比較大小 (Comparisons) 的次數也相同。 ::: 用 `merge_sort` 實作,參考: [Merge Sort of Linked List from geekforgeeks](https://www.geeksforgeeks.org/merge-sort-for-linked-list/) `merge_list` 用 iterative 的方式實作,參考: [Merge Sort a Linked List from Stack Overflow](https://stackoverflow.com/questions/7685/merge-sort-a-linked-list)。 ```c void split(list_ele_t *src, list_ele_t **l, list_ele_t **r) { list_ele_t *slow = src; list_ele_t *fast = src->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } *l = src; *r = slow->next; slow->next = NULL; } void merge_list(list_ele_t **head, list_ele_t *l, list_ele_t *r) { list_ele_t *list = NULL; list_ele_t **indirect; for (indirect = &list; l && r; indirect = &(*indirect)->next) { if (strcmp(l->value, r->value) < 0) { *indirect = l; l = l->next; } else { *indirect = r; r = r->next; } } *indirect = l ? l : r; *head = list; } void merge_sort(list_ele_t **head) { if (!(*head) || !(*head)->next) return; list_ele_t *left = NULL; list_ele_t *right = NULL; split(*head, &left, &right); merge_sort(&left); merge_sort(&right); merge_list(head, left, right); } void q_sort(queue_t *q) { if (!q || !q->head) return; merge_sort(&q->head); list_ele_t *walk = q->head; while (walk->next) walk = walk->next; q->tail = walk; } ``` merge_sort 結束之後要記得讓 `q->tail` 指向此 singly linked list 的尾端,否則在 `q_sort` 完之後 `q_insert_tail` 會發生問題。 * **Result** ``` $ make test ``` ``` scripts/driver.py -c --- Trace Points ... --- TOTAL 100/100 ``` ## Address Sanitizer 開啟 Address Sanitizer 強化執行時期的記憶體檢測,修正 qtest 執行過程中的錯誤。 ``` $ make clean $ make SANITIZER=1 $ make test ``` 得到以下結果: ``` +++ TESTING trace trace-17-complexity: # Test if q_insert_tail and q_size is constant time complexity ================================================================= ==31964==ERROR: AddressSanitizer: global-buffer-overflow on address 0x557b13c566e0 at pc 0x557b13c3ec79 bp 0x7ffcf2ffade0 sp 0x7ffcf2ffadd0 READ of size 4 at 0x557b13c566e0 thread T0 #0 0x557b13c3ec78 in do_option_cmd /home/ubuntu/linux/lab0-c/console.c:375 #1 0x557b13c3da47 in interpret_cmda /home/ubuntu/linux/lab0-c/console.c:227 #2 0x557b13c3e22c in interpret_cmd /home/ubuntu/linux/lab0-c/console.c:250 #3 0x557b13c3f472 in cmd_select /home/ubuntu/linux/lab0-c/console.c:605 #4 0x557b13c3f9ec in run_console /home/ubuntu/linux/lab0-c/console.c:678 #5 0x557b13c3c531 in main /home/ubuntu/linux/lab0-c/qtest.c:788 #6 0x7f3ba97610b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #7 0x557b13c39b8d in _start (/home/ubuntu/linux/lab0-c/qtest+0x8b8d) 0x557b13c566e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x557b13c566e0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/ubuntu/linux/lab0-c/console.c:375 in do_option_cmd Shadow bytes around the buggy address: 0x0aafe2782c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0aafe2782c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0aafe2782ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0aafe2782cb0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 0x0aafe2782cc0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 =>0x0aafe2782cd0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 0x0aafe2782ce0: f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 0x0aafe2782cf0: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0aafe2782d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0aafe2782d10: 00 f9 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 0x0aafe2782d20: 01 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==31964==ABORTING ``` 同時,若我們直接執行 `qtest` ,再輸入 `help`,會得到以下結果: ``` $ ./qtest cmd> help Commands: # ... | Display comment free | Delete queue hello | Print hello message help | Show documentation ih str [n] | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1) it str [n] | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1) log file | Copy output to file new | Create new queue option [name val] | Display or set options quit | Exit program reverse | Reverse queue rh [str] | Remove from head of queue. Optionally compare to expected value str rhq | Remove from head of queue without reporting value. show | Show queue contents size [n] | Compute queue size n times (default: n == 1) sort | Sort queue in ascending order source file | Read commands from source file time cmd arg ... | Time command execution Options: ================================================================= ==32039==ERROR: AddressSanitizer: global-buffer-overflow on address 0x563909a754c0 at pc 0x563909a5e934 bp 0x7ffd0d9530a0 sp 0x7ffd0d953090 READ of size 4 at 0x563909a754c0 thread T0 #0 0x563909a5e933 in do_help_cmd /home/ubuntu/linux/lab0-c/console.c:313 #1 0x563909a5ea47 in interpret_cmda /home/ubuntu/linux/lab0-c/console.c:227 #2 0x563909a5f22c in interpret_cmd /home/ubuntu/linux/lab0-c/console.c:250 #3 0x563909a60989 in run_console /home/ubuntu/linux/lab0-c/console.c:671 #4 0x563909a5d531 in main /home/ubuntu/linux/lab0-c/qtest.c:788 #5 0x7f2e82d280b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #6 0x563909a5ab8d in _start (/home/ubuntu/linux/lab0-c/qtest+0x8b8d) 0x563909a754c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x563909a754c0) of size 1 SUMMARY: AddressSanitizer: global-buffer-overflow /home/ubuntu/linux/lab0-c/console.c:313 in do_help_cmd Shadow bytes around the buggy address: 0x0ac7a1346a40: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 0x0ac7a1346a50: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 0x0ac7a1346a60: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x0ac7a1346a70: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0ac7a1346a80: 00 00 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 =>0x0ac7a1346a90: 01 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9 0x0ac7a1346aa0: 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 0x0ac7a1346ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac7a1346ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac7a1346ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac7a1346ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==32039==ABORTING ``` ### 找出問題 查看上面兩個錯誤 case 的訊息 ``` 0x557b13c566e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x557b13c566e0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/ubuntu/linux/lab0-c/console.c:375 in do_option_cmd ``` ``` 0x563909a754c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x563909a754c0) of size 1 SUMMARY: AddressSanitizer: global-buffer-overflow /home/ubuntu/linux/lab0-c/console.c:313 in do_help_cmd ``` 於 `console.[ch]` 找出 global variable `simulation`、`echo` ```c static bool simulation = false; static bool echo = 0; ``` 找出與此有關的函式 ```c void init_cmd() { ... add_param("simulation", (int *) &simulation, "Start/Stop simulation mode", NULL); ... ... add_param("echo", (int *) &echo, "Do/don't echo commands", NULL); ... } ``` ```c void add_param(char *name, int *valp, char *documentation, setter_function setter); ``` 其中,我們可以發現 `bool` 形態的 `echo` 被強制轉型 `(int *) &echo`,其 address value 才得以傳入 `add_param`。但是 `bool` type 所佔的空間大小小於 `int` type,所以在這情況下 `echo` 傳入 `add_param` 會被認定是 `int` type,而有可能會在讀取其值時越界存取記憶體,此行為為 Undefined Behaviour,於是執行時 Address Sanitizer 報錯。 :::info from **C99 Standard** — An object declared as type _Bool is large enough to store the values 0 and 1. — The rank of _Bool shall be less than the rank of all other standard integer types. 測試: ```c #include <stdio.h> #include <stdbool.h> int main () { printf("%zu\n%zu\n", sizeof(int), sizeof(bool)); return 0; } ``` gcc: ``` gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` 結果: ``` $ ./a.out 4 1 ``` 可以看出,在此環境下, `int` type 的變數佔據 4 個 byte 的空間, `bool` 則為 1 個 byte。 ::: 查看 `add_param` 內容 ```c void add_param(char *name, int *valp, char *documentation, setter_function setter) { param_ptr next_param = param_list; param_ptr *last_loc = &param_list; while (next_param && strcmp(name, next_param->name) > 0) { last_loc = &next_param->next; next_param = next_param->next; } param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param"); ele->name = name; ele->valp = valp; ele->documentation = documentation; ele->setter = setter; ele->next = next_param; *last_loc = ele; } ``` 找出對應 structure 於 `console.h`: ```c typedef struct PELE param_ele, *param_ptr; struct PELE { char *name; int *valp; char *documentation; /* Function that gets called whenever parameter changes */ setter_function setter; param_ptr next; }; ``` `add_param` 對於我們感興趣的值 `valp` 會複製其 address 至 `param_ptr ele` 的成員變數,一個指向 `int` 的指標 `ele->valp`。 ### 解決問題 #### 方法 1:將 `echo` `simulation` 型態宣告為 `int`。 `console.h`: ```c extern int simulation; ``` `console.c`: ```c static int simulation = 0; static int echo = 0; ``` **Result** ``` $ make clean $ make SANITIZER=1 $ make test ... --- TOTAL 100/100 ``` ``` $ ./qtest cmd> help Commands: ... Options: echo 1 Do/don't echo commands error 5 Number of errors until exit fail 30 Number of times allow queue operations to return false length 1024 Maximum length of displayed string malloc 0 Malloc failure probability percent simulation 0 Start/Stop simulation mode verbose 4 Verbosity level ``` #### 方法 2:創造出一個新的資料結構解決此問題 `bool` 型態值域應當只有 [0, 1],所以方法 1 沒有從根本解決問題,所以可能的解法是創造出一個 `struct` 或 `union` 去決定該變數型態的大小以及資料結構,然後定義出一個函式去對其大小以及型態做判斷,然而此方法衍生出的問題是,需要大幅更動原本程式碼,造成維護上的困難,有可能「拆東牆,補西牆」。 ## Valgrind & Massif ### 用 Valgrind 追蹤記憶體使用情形。 #### 查看 valgrind 訊息 ``` $ make valgrind ``` ``` +++ TESTING trace trace-01-ops: # Test of insert_head and remove_head ==1617== 11 bytes in 1 blocks are still reachable in loss record 1 of 3 ==1617== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==1617== by 0x4A4650E: strdup (strdup.c:42) ==1617== by 0x110106: linenoiseHistoryAdd (linenoise.c:1236) ==1617== by 0x110C99: linenoiseHistoryLoad (linenoise.c:1325) ==1617== by 0x10C287: main (qtest.c:777) ==1617== ==1617== 114 bytes in 18 blocks are still reachable in loss record 2 of 3 ==1617== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==1617== by 0x4A4650E: strdup (strdup.c:42) ==1617== by 0x11007A: linenoiseHistoryAdd (linenoise.c:1236) ==1617== by 0x110C99: linenoiseHistoryLoad (linenoise.c:1325) ==1617== by 0x10C287: main (qtest.c:777) ==1617== ==1617== 160 bytes in 1 blocks are still reachable in loss record 3 of 3 ==1617== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==1617== by 0x1100C6: linenoiseHistoryAdd (linenoise.c:1224) ==1617== by 0x110C99: linenoiseHistoryLoad (linenoise.c:1325) ==1617== by 0x10C287: main (qtest.c:777) ==1617== --- trace-01-ops 6/6 ... ``` Valgrind 指出有 memory leak 可能發生在 `linenoiseHistoryAdd` `linenoiseHistoryLoad`。 #### 用 Massif 再檢查一次 ``` $ valgrind --tool=massif ./qtest < traces/trace-01-ops.cmd $ ms_print massif.out.1697 ``` * **result** ``` -------------------------------------------------------------------------------- Command: ./qtest Massif arguments: (none) ms_print arguments: massif.out.1697 -------------------------------------------------------------------------------- KB 19.79^ :: | ::@:: ::#: :: :: @ : | ::@: ::: @: : #: ::: ::: : @ : | : @ : : @: : #: : : : : : @ : | : @ : : @: : #: : : : : ::@ : | : @ : : @: :: #: @ : : : :::@::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | :: @ :: : @: :: #: @ : : : :::@:::: | ::: @ :: : @: :: #: @ : : : :::@:::: | @:: @ :: : @: :: #: @ : : : :::@:::: | @:: @ :: : @: :: #: @ : : : :::@:::: | @:: @ :: : @: :: #: @ : : : :::@:::: | @:: @ :: : @: :: #: @ : : : :::@:::: | :@:: @ :: : @: :: #: @ : : : :::@:::: 0 +----------------------------------------------------------------------->ki 0 411.7 Number of snapshots: 67 Detailed snapshots: [7, 19, 20, 31, 39 (peak), 44, 58] -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 0 0 0 0 0 0 1 204,118 40 32 8 0 2 205,227 320 256 64 0 3 206,012 584 448 136 0 4 207,179 744 576 168 0 5 208,760 864 672 192 0 6 210,585 1,600 1,328 272 0 7 211,702 5,896 5,595 301 0 94.89% (5,595B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. ... ->02.24% (445B) in 8 places, all below massif's threshold (1.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 59 402,979 15,776 15,109 667 0 60 403,737 15,312 14,638 674 0 61 404,753 19,880 19,201 679 0 62 415,340 15,776 15,105 671 0 63 416,107 15,088 14,473 615 0 64 416,875 14,472 13,985 487 0 65 418,060 5,832 5,465 367 0 66 421,612 1,728 1,369 359 0 ``` 用 ms_print 查看 `Massif` 結果,顯示的確有 memory leak。 ### 檢查程式碼 `linenoise.c` 先查看可能發生問題的相關函式及相關變數。 ```c static int history_max_len = LINENOISE_DEFAULT_HISTORY_MAX_LEN; static int history_len = 0; static char **history = NULL; ``` ```c int linenoiseHistoryAdd(const char *line) { char *linecopy; if (history_max_len == 0) return 0; if (history == NULL) { history = malloc(sizeof(char *) * history_max_len); if (history == NULL) return 0; memset(history, 0, (sizeof(char *) * history_max_len)); } if (history_len && !strcmp(history[history_len - 1], line)) return 0; linecopy = strdup(line); if (!linecopy) return 0; if (history_len == history_max_len) { free(history[0]); memmove(history, history + 1, sizeof(char *) * (history_max_len - 1)); history_len--; } history[history_len] = linecopy; history_len++; return 1; } ``` #### 查看變數 `linecopy` `history` 前者為 local variable,後者為 global variable。 global variable 的 case 要檢查使用到 `linenoiseHistoryAdd` 的函式 `linenoiseHistoryLoad`,查看 `history` 所指向的空間是否有做相對應的釋放 ```c int linenoiseHistoryLoad(const char *filename) { FILE *fp = fopen(filename, "r"); char buf[LINENOISE_MAX_LINE]; if (fp == NULL) return -1; while (fgets(buf, LINENOISE_MAX_LINE, fp) != NULL) { char *p; p = strchr(buf, '\r'); if (!p) p = strchr(buf, '\n'); if (p) *p = '\0'; linenoiseHistoryAdd(buf); } fclose(fp); return 0; } ``` 我們可以觀察到 `history` 被動態配置的記憶體是大小為 `sizeof(char *) * history_max_len` ,是指向 `char *` 的指標,而這些指標會指向 `linecopy` 所 allocate 的 memory。 看到了 `while` 迴圈,推測 `history` 不會希望在 `linenoiseAdd` 就直接釋放掉記憶體,因為 `history` 再讀一筆資料之後,會想要保存那筆資料直到 `history_len == history_max_len`。所以我們處理 `history` 的方式會是在跳出迴圈之後,再一次 `free` 掉。而在 free 掉 `history` 所指向的空間時,也要將藉由 `linecopy` 所 allocate 的 memory 釋放掉。 ```diff int linenoiseHistoryLoad(const char *filename) { FILE *fp = fopen(filename, "r"); char buf[LINENOISE_MAX_LINE]; if (fp == NULL) return -1; while (fgets(buf, LINENOISE_MAX_LINE, fp) != NULL) { char *p; p = strchr(buf, '\r'); if (!p) p = strchr(buf, '\n'); if (p) *p = '\0'; linenoiseHistoryAdd(buf); } + for (int i = 0; i < history_max_len; i++) + free(history[i]); + free(history); fclose(fp); return 0; } ``` #### Result: Valgrind & Massif 測試 ``` $ make valgrind ``` 發現錯誤出現在 trace-15-ops.cmd ``` $ valgrind ./qtest < traces/trace-15-perf.cmd ``` ``` ==2759== Invalid read of size 8 ==2759== at 0x110062: linenoiseHistoryAdd (linenoise.c:1231) ==2759== by 0x10DDE8: run_console (console.c:668) ==2759== by 0x10C2B7: main (qtest.c:788) ==2759== Address 0x4b9ad08 is 152 bytes inside a block of size 160 free'd ==2759== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==2759== by 0x110D0A: linenoiseHistoryLoad (linenoise.c:1330) ==2759== by 0x10C287: main (qtest.c:777) ==2759== Block was alloc'd at ==2759== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==2759== by 0x1100C6: linenoiseHistoryAdd (linenoise.c:1224) ==2759== by 0x110C9F: linenoiseHistoryLoad (linenoise.c:1325) ==2759== by 0x10C287: main (qtest.c:777) ... ==2759== Invalid read of size 8 ==2759== at 0x10B098: do_sort (qtest.c:564) ==2759== by 0x10CE7F: interpret_cmda (console.c:223) ==2759== by 0x10D316: interpret_cmd (console.c:246) ==2759== by 0x10DDE0: run_console (console.c:667) ==2759== by 0x10C2B7: main (qtest.c:788) ==2759== Address 0x0 is not stack'd, malloc'd or (recently) free'd ==2759== Segmentation fault occurred. You dereferenced a NULL or invalid pointer ``` 所以很顯然,不應該是在那裡就釋放掉 history 的記憶體。所以我們重新檢視 `linenoise.c` `console.c` 的程式碼,重新完成兩者的實作。 ### 完成 `linenoise.c`,整合 `linenoise.c` `console.c` 首先觀察 `console.c` 的函式 `run_console()` ```c ``` * 觀察參數 `quit_flag` 以及其值對於其他函式的影響 * 觀察`linenoise()` 函式 * `linenoiseAtExit()` 之謎:沒有被呼叫到 * ### qtest TODO & FIXME TODO: ```diff /* * TODO: Add a buf_size check of if the buf_size may be less * than MIN_RANDSTR_LEN. */ static void fill_rand_string(char *buf, size_t buf_size) { size_t len = 0; while (len < MIN_RANDSTR_LEN) len = rand() % buf_size; for (size_t n = 0; n < len; n++) { buf[n] = charset[rand() % (sizeof charset - 1)]; } buf[len] = '\0'; } ``` FIXME: ```diff bool do_sort(int argc, char *argv[]) { ... /* Ensure each element in ascending order */ /* FIXME: add an option to specify sorting order */ - if (strcasecmp(e->value, e->next->value) > 0) { + if (strcmp(e->value, e->next->value) > 0) { report(1, "ERROR: Not sorted in ascending order"); ok = false; break; } ... } ```