2018q3 Homework2 (lab0)

# 2018q3 Homework2 (lab0) contributed by `<0xff07>` # 環境 `uname -i`: ``` Linux 4.15.0-34-generic x86_64 GNU/Linux ``` `lscpu`: ```jsx Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 70 Model name: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz Stepping: 1 CPU MHz: 2289.664 CPU max MHz: 3700,0000 CPU min MHz: 800,0000 BogoMIPS: 4988.61 Virtualization: VT-x L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 6144K L4 cache: 131072K NUMA node0 CPU(s): 0-7 ``` `cat /etc/lsb-release`: ```jsx DISTRIB_ID=Ubuntu DISTRIB_RELEASE=18.04 DISTRIB_CODENAME=bionic DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS" ``` # 實作 ## `queue_t` 因為要求尾端插入與長度查詢的複雜度是 $O(1)$ ，因此在資料結構中加入 `tail` 與 `size` 分別紀錄尾端與長度: ```c= typedef struct { list_ele_t *head; list_ele_t *tail; long long size; } queue_t; ``` ## `q_free` ```C= void q_free(queue_t *q) { if (!q) return; list_ele_t *tmp = q->head; while (q->head) { tmp = q->head; q->head = q->head->next; if (tmp->value) free(tmp->value); if (tmp) free(tmp); } if (q) free(q); } ``` 若 `head` 為空，則顯然記憶體已經釋放。若 `head` 非空，則「釋放 `head`」可遞迴地視為「釋放串列第一個元素(迴圈中暫存於 `tmp`) ，並釋放剩下所有元素形成的串列(`q -> head -> next`)」。而再釋放串列中的元素之後，再釋放 `q` 即可。若在 `free` 中傳入空指標，在測試程式中將出現 `Attempt to free NULL` 的錯誤訊息，並且扣分(儘管 `man` 中寫到傳入空指標是合法的)。因此這邊所有的 `free` 都事先檢測是否為空指標。 ## `q_insert_head` ```C= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh = NULL; char *_s = NULL; do { if (!q) break; newh = malloc(sizeof(list_ele_t)); if (!newh) break; _s = malloc(strlen(s) + 1); if (!_s) break; strcpy(_s, s); newh->value = _s; newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } while (0); if (newh) free(newh); if (_s) free(_s); return false; } ``` 使用 `do{...} while(0)` 配合 `break` ，達到 `goto` 的效果，並進行各種例外處理。並在插入後，將~~屁股~~尾端維護好。正在思考例外處理與判斷的邏輯是否能夠更精減（立刻回去翻上週作業嗚嗚）。另外發現若將 14. 15 行改成 cpumpound-literal 的寫法： ```C= *new = (list_ele_t){.value=_s,.next=q->head}; ``` 在 `git commit` 時似乎會被檢查出 memory leak，跑出： ```= [queue.c:90]: (error) Memory leak: _s Fail to pass static analysis. ``` 但該作法可以順利通過 `qtest` 。因此不是很確定這種用法是否將造成 memory leak，或是有什麼潛在的不良影響。相關資料待查。 ## `q_insert_tail` ```C= bool q_insert_tail(queue_t *q, char *s) { list_ele_t *newh = NULL; char *_s = NULL; do { if (!q) break; newh = malloc(sizeof(list_ele_t)); if (!newh) break; _s = malloc(strlen(s) + 1); if (!_s) break; strcpy(_s, s); newh->value = _s; newh->next = NULL; if (q->head) q->tail->next = newh; else q->head = newh; q->tail = newh; q->size++; return true; } while (0); if (newh) free(newh); if (_s) free(_s); return false; } ``` 大致上與前一個函數相同，紀錄尾端並在插入後維護好 `tail` 與 `head` 的指標。「新增 `list_ele_t` 」似乎可以寫成一個函式，但發現使用 `strdup` 時，即使 `man` 中提到： >`The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).` 即 `strdup` 中的記憶體由 `malloc` 配置，但使用 `strdup` 時測試程式，仍會判定沒有正確配置記憶體（見「其他」），因此尚未測試。 ## `q_remove_head` 依照要求，除了釋放記憶體，必要時得將節點中存的字串進行複製。因此照題目要求寫出來： ```C= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; list_ele_t *tmp = NULL; tmp = q->head; q->head = q->head->next; if (sp && tmp->value) { strncpy(sp, tmp->value, bufsize); sp[bufsize - 1] = 0; } q->size--; if (tmp->value) free(tmp->value); if (tmp) free(tmp); return true; } ``` ## `q_size` 直接回傳存好的值大小： ```C= int q_size(queue_t *q) { return q ? q->size : 0; } ``` ## `q_reverse` 假定 $L = [1,\dots,k-1,k\dots,n]$ ，則： 1. 若串列為空，或是串列僅有一個元素。則自己就是自己的反轉 2. 若已經有前 $k$ 個元素反轉的串列 $L' = [k-1 \dots 0]$: $$ [k-1 \dots 0]\cup[k,k+1 \dots n] $$ 則將第 $k$ 個元素加入反轉串列的頭部： $$ [k,k-1 \dots 0]\cup[k+1 \dots n] $$ 就能構造出前 $k + 1$ 個元素反轉的串列。以此遞推下去。 ```C= void q_reverse(queue_t *q) { if (!q || !q->head) return; if (q->head == q->tail) return; list_ele_t *buf[3] = {q->head, q->head->next, q->head->next->next}; q->head->next = NULL; q->tail = q->head; while (buf[2]) { buf[1]->next = buf[0]; buf[0] = buf[1]; buf[1] = buf[2]; buf[2] = buf[2]->next; } buf[1]->next = buf[0]; q->head = buf[1]; } ``` 前 2 個 `if` 敘述描述的是第 1. 中的狀況; 而其後的程式用於處理 2. 中的狀況。我覺得這樣的實作看起來並不一目瞭然。似乎有待改進。 # 測試程式討論 ## 記憶體管理在 `harness.[ch]` 當中，可以看到記憶體管理相關的內容。該記憶體管理機制相關說明如下： 1. 將原先的 `malloc` 與 `free` 使用 `test_malloc` 與 `test_free` 取代。該部份可見 `harness.h` 最後面： ```C= /* Tested program use our versions of malloc and free */ #define malloc test_malloc #define free test_free ``` 所以實際上，在 `queue.c` 中呼叫 `malloc` 與 `free` 時，其實式分別呼叫 `test_malloc` 與 `test_free` 兩個函式。以下都用 `test_malloc` 與 `test_free` 稱呼該二函式（而不是 `malloc` 與 `free`）。 2. 所有因呼叫 `test_malloc` 得來的記憶體，實際上是紀錄在一個名為 `allocated` 的一個 doubly-linked list 中。該 linked list 結點的結構為 `block_ele_t`，定義在 `harness.c` 中： ```C= typedef struct BELE { struct BELE *next; struct BELE *prev; size_t payload_size; size_t magic_header; /* Marker to see if block seems legitimate */ unsigned char payload[0]; /* Also place magic number at tail of every block */ } block_ele_t; ``` 在這個結構體中，最後一個成員 `payload[0]` 是實際給呼叫者使用的記憶體開頭。該成員使用到的 GCC 中 "Array of Length Zero" 。該功能簡便處在於：若結構體中的最後一個成員是大小為 0 的陣列，那麼可以透過 `malloc(sizeof(block_ele_t) + 希望 payload 後面接著多少記憶體) ` 來達成「最後一個成員大小可以任意指定」的效果（另外一個好處是 `malloc` 跟 `free` 一次就可以完成，不用替 `payload` 開頭的記憶體再 `malloc` 一次）。`malloc_test` 中可以發現以下程式： ```C= void *test_malloc(size_t size) { if (noallocate_mode) { report_event(MSG_FATAL, "Calls to malloc disallowed"); return NULL; } if (fail_allocation()) { report_event(MSG_WARN, "Malloc returning NULL"); return NULL; } block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t)); ... ``` 這個技巧表現在程式的 `malloc(size + sizeof(block_ele_t) + sizeof(size_t))` 當中。但除了 `sizeof(block_ele_t)` 與 `size` 之外，還多了一個 `sizeof(size_t)` 的大小。這是因為使用者實際可使用的記憶體，前後被兩個 `size_t` 整數包著，裡面各自紀錄著兩個 Magic Number，作為驗證這塊記憶體是否是由 `test_malloc` 分配的依據，以及作為記憶體是否有產生越界等不佳使用的依據。 3. 位於記憶體之前的 magic number，實際上就是結構體中 `magic_header` 這個成員，其值會在 `test_malloc` 中被指定為 `MAGICHEADER`，也就是 `0xdeadbeef` ; 在記憶體尾端的 `size_t` 整數，數值必定為 `MAGICFOOTER`，也就是 `0xbeefdead`。 4. 因為要回傳給使用者的指標實際上是 `payload` 的位置，因在程式中另外有 `find_footer` 跟 `find_header` 這兩個函式作為輔助。前這傳入 `block_ele_t*` ，回傳位於 `payload` 尾端 magic number 的位址。如 `test_malloc` 中有一段指定尾端 magic number 的程式： ```C= *find_footer(new_block) = MAGICFOOTER; ``` 而後者則是傳入使用者呼叫後得到的記憶體開頭，反推該記憶體所屬的 `block_ele_t` 的開頭位置。這兩個函式除了用在 `test_malloc` 當中，也會用再 `test_free` 當中。在呼叫 `test_free` 時，使用者傳入的，實際上是 `payload` 的位置，但釋放記憶體時，除了該記憶體之外，該記憶體所屬的結構體，也要一併釋放。因此需要有一個尋找該記憶體所屬開頭的方法。該二函數的原理不難理解。給定 `block_ele_t`，尾端的 magic number，由 `payload_size` 進行推算即可; 對於反推記憶體所屬的結構體，則是利用 `sizeof(block_ele_t)` 反推，並且尋找反推過後的記憶體是否在 `allocated` 的清單中。若沒有，則推論為錯誤的配置。而在 `find_header` 中有檢測傳入指標是否為空的設定。 > 我懷疑這是一個不能使用 `calloc` 跟 `strdup` 的理由。但目前還沒確認該二函數中的 `malloc` 是否有被影響到。 > 5. `test_free` 中用的原理類似：首先用 `find_header` 找出使用者準備釋放的記憶體，其所屬的結構體有沒有再 `allocated` 裡面。若傳入的指標為空指標，或是該記憶體不屬於任何一個節點，`find_header` 內部會分別傳出「試圖釋放空指標」或「記憶體損壞」等警告。 6. 若順利找到該記憶體所屬的節點，接著檢驗尾端的 magic number 是否仍為 `MAGICFOOTER`，作為記憶體有沒有被越界存取等狀況的判斷依據。若比對結果發現不合，也會發出該區段記憶體損壞的警告。 7. 若確認結果無誤，則將記憶體區段前後的 magic number 都設成 `MAGICFREE`。並且將記憶體內容都用 `FHILLCHAR` 填充。最後移出 `allocated` 這個 doubly-linked list。 > 不確定為什麼要先把 magic number 都改掉之後再釋放。我猜測是同樣大小的記憶體如果又被 `malloc` ，但裡面仍有 magic number 時，會造成誤判，以為不合法的記憶體其實是合法的。 8. 在分配記憶體時，每呼叫一次 `test_malloc` ，`allocate_count` 就會 +1; 而每呼叫一次 `test_free` ，該數值就會 -1。因此程式即將結束時，判斷該數值，即可知道是否有妥當地釋放記憶體。 ## 例外處理在 `harness.c` 中，有一些使用 `sigsetjmp`, `siglongjmp` 的函式作為例外處理。該函式並未在 `harness.c` 中被其他函式呼叫。因此推測該例外處理用在其他地方。 ## shell 實作 # 思考 1. 雖然過了 `qtest`，但應該要更有「品味」一點。決定再把上週的東西仔細看一遍，並嘗試影片裡面的技巧。 3. 程式要怎麼做到「發現記憶體有沒有正確釋放」或「檢查有沒有在不該使用 `malloc` 的時候使用 `malloc`」是一件很神奇的事，決定好好研究一下。 # 其他 ## 無法使用 `calloc` 及 `strdup` 使用 `strdup` 或 `calloc` 的話，測試程式會頻繁地跑出如： >CMU 助教為了測試使用 macro 將 `malloc` 和 `free` 替換成 `test_malloc` 和 `test_free` , 請看 `harness.h` 和`harness.c`, 所以你用其他配置記憶體的函數測試程式沒辦法正確測試 >[name=amikai][color=red] ``` ... +++ TESTING trace trace-01-ops: # Test of insert_head and remove_head ERROR: Attempted to free unallocated block. Address = 0x55772899ccf0 ERROR: Attempted to free unallocated or co：rrupted block. Address = 0x55772899ccf0 ERROR: Corruption detected in block with address 0x55772899ccf0 when attempting to free it ERROR: Segmentation fault occurred. You dereferenced a NULL or invalid pointer ERROR: Removal from queue failed (1 failures total) ERROR: Attempted to free unallocated block. Address = 0x55772899cc90 ERROR: Attempted to free unallocated or corrupted block. Address = 0x55772899cc90 ERROR: Corruption detected in block with address 0x55772899cc90 when attempting to free it ERROR: Segmentation fault occurred. You dereferenced a NULL or invalid pointer ERROR: Removal from queue failed (2 failures total) free(): invalid next size (fast) --- trace-01-ops 0/6 ... ``` 以及非常低的分數。後來在 `qtest.c` 中才發現： ``` /* Our program needs to use regular malloc/free */ ``` 雖然說明中有提及： " The function must explicitly allocate space and copy the string into it."，但不是很確定要求「明顯」到什麼程度。以至於花費不少時間懷疑人生。 ## `free` 依照 `man` 中的內容： >`The free() function frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc(), or realloc(). Otherwise, or if free(ptr) has already been called before, undefined behavior occurs. If ptr is NULL, no operation is performed. ` 因此若 `ptr` 為 `NULL`，則 `free(ptr)` 應該要合法。但以下實作: ```C= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh = NULL; char *_s = NULL; do { if (!q) break; newh = malloc (sizeof(list_ele_t)); if (!newh) break; int len = strlen(s); _s = malloc(len + 1); if (!_s) break; ... } while (0); free (newh); free (_s); return false; } ``` 在測試時將會出現如: ```shell +++ TESTING trace trace-07-robust: # Test operations on NULL queue Attempt to free NULL Attempt to free NULL --- trace-07-robust 0/7 ``` 的錯誤訊息。