2020q3 Homework1 (lab0)

# 2020q3 Homework1 (lab0) contributed by <`fdsa654hg`> ###### tags: `sysprog2020` ## 測試環境 ```shell= $ cat /etc/os-release NAME="Ubuntu" VERSION="20.04.1 LTS (Focal Fossa)" $ cat /proc/version Linux version 5.4.0-47-generic (buildd@lcy01-amd64-014) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 ``` ## :penguin: 作業要求 * 在 GitHub 上 fork [lab0-c](https://github.com/sysprog21/lab0-c) * 參閱 [Git 教學和 GitHub 設定指引](https://hackmd.io/@sysprog/git-with-github) ^附教學影片^ * ==詳細閱讀 [C Programming Lab](https://www.cs.cmu.edu/afs/cs/academic/class/15213-s20/www/labs/cprogramminglab.pdf)== ，依據指示著手修改 `queue.[ch]` 和連帶的檔案，測試後用 Git 管理各項修改，記得也該實作 `q_sort` 函式。 * 在提交程式變更前，務必詳閱 [如何寫好 Git Commit Message](https://blog.louie.lu/2017/03/21/%E5%A6%82%E4%BD%95%E5%AF%AB%E4%B8%80%E5%80%8B-git-commit-message/) * 不用理會 [Autolab](http://www.autolabproject.com/) 和檔案下載的描述，這兩者都是 CMU 專屬 * 除了修改程式，也要編輯「[作業區](https://hackmd.io/@sysprog/2020-homework1)」，增添開發紀錄和 GitHub 連結，除了提及你如何逐步達到自動評分程式的要求外，共筆也要涵蓋以下: * 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤，並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量，需要設計對應的實驗 * 研讀論文 [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)，解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。注意：現有實作存在若干致命缺陷，請討論並提出解決方案; ## `queue.h` & `queue.c` 實作 ### `queue.h` - 這個檔案有修改的部份很少，主要是應題目的要求 `q_instert_tail()` 和 `q_size()` 兩個函式要在 $O(1)$ 時間內完成，由於作業給予的 Linked List 為單向的結構，所以如果從 `head` 開始走直到找到 `tail` 時間複雜度是 $O(n)$ 無法控制在常數時間內完成，因此須在 `queue` 結構裡新建一個跟 `head` 相似的指標-- `tail` ，`q_size()` 也是同樣的道理，只要新建一個型別為 int 的變數 `size` 即可。 ```cpp= typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; /* new */ int size; /* new */ } queue_t; ``` * 另外我額外宣告了 merge sort 函式 ```cpp= void merge_sort(list_ele_t **q); ``` ### `queue.c` - 函式的實作幾乎都在這個檔案 ##### `q_new()` - 很簡單的函式，功能是建立一個新的空佇列，由於 `malloc` 未必能成功分配記憶體，故先檢查是否成功，否就回傳 `NULL` ，成功的話就初始化所有的變數，並回傳指標 `q` ```cpp= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ##### `q_free()` - 為了 `free` 整個佇列，先用 `cur` 存當前的 `q->head` ，再讓 `head` 變成它所指向的下一個位置，並使用 `cur` 指標釋放其所儲存的記憶體，重複做以下動作便能走訪整個佇列並清空它 ```cpp= void q_free(queue_t *q) { if (!q) return; list_ele_t *cur = q->head; while (q->head) { q->head = q->head->next; cur->next = NULL; free(cur->value); free(cur); cur = q->head; } free(q); } ``` ##### `q_insert_head()` - 函式的開頭就是分配記憶體，再檢查有沒有 `malloc` 成功，之後我原本想用 `strcpy` 將 `s` 的值傳給 `new_head->value` ，並使用 `memset` 加上'\0'，但根據 `cern` 指出這樣有 `buffer overflow` 的風險，所以我改用 `snprintf` ，而且它會自己補'\0'所以就不用使用 `memset` 了 ```cpp= bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *new_head; new_head = malloc(sizeof(list_ele_t)); if (!new_head) return false; new_head->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!new_head->value) { free(new_head); return false; } q->size += 1; snprintf(new_head->value, (size_t) strlen(s) + 1, "%s", s); new_head->next = q->head; q->head = new_head; if (!q->tail) { list_ele_t **tmp = &q->head; while ((*tmp)->next) tmp = &(*tmp)->next; q->tail = *tmp; q->tail->next = NULL; } return true; } ``` ##### `q_insert_tail()` - 這個函式基本上跟 `q_insert_head` 是一樣的道理，只是它改變的是 `tail` ```cpp= bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *new_tail; new_tail = malloc(sizeof(list_ele_t)); if (!new_tail) return false; new_tail->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!new_tail->value) { free(new_tail); return false; } q->size += 1; snprintf(new_tail->value, (size_t) strlen(s) + 1, "%s", s); if (!q->tail) { q->tail = q->head = new_tail; q->tail->next = q->head->next = NULL; return true; } q->tail->next = new_tail; q->tail = new_tail; q->tail->next = NULL; return true; } ``` ##### `q_remove_head()` - 這個函式目的是將 `head` 刪除，並將其賦值給 `sp` ，開頭一樣先檢查指標是否不為空指標，一樣用 `snprintf` 將 `q->head->value` 傳給 `sp`，只是這裡的參數有 `bufsize` 所以不用自己算長度了，之後就是將 `head` 改為 `head->next` 並將原本的 `head` 及其內容物 `free` 掉 ```cpp= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; if (sp) snprintf(sp, bufsize, "%s", q->head->value); q->size -= 1; list_ele_t *tmp = q->head; q->head = q->head->next; tmp->next = NULL; free(tmp->value); free(tmp); return true; } ``` ##### `q_size()` - 由於 `queue_t` 的 struct member 有 `size` ，故在確認指標是否正常及是否為空佇列後，直接回傳 `q_size` 即可 ```cpp= int q_size(queue_t *q) { if (!q || !q->head) return 0; return q->size; } ``` ##### `q_reverse()` - `head` 最後會變成 `tail` 所以先將其紀錄起來，之後可以直接賦值給新的 `tail`，由於 `head->next` 已經被 `cur` 存起來，所以可以將其指向 `NULL`，之後便利用 `prev`,`cur`,`next` 將整個佇列反轉，最後再重新設定 `head` 跟 `tail` 的值 ```cpp= void q_reverse(queue_t *q) { if (!q || !q->head) return; list_ele_t *old_head = q->head; list_ele_t *prev = q->head, *next, *cur = q->head->next; q->head->next = NULL; while (cur) { next = cur->next; cur->next = prev; prev = cur; cur = next; } q->head = prev; q->tail = old_head; } ``` ##### `q_sort()` - 參考 [Linked List sort](https://npes87184.github.io/2015-09-12-linkedListSort/) 中的 Merge sort 演算法做出來的 - 為了將 Linked List 分成兩部份，我原本想直接用 q->size/2+1 直接找出 `l2` 但演算法使用了 `fast` 跟 `slow` 兩個走訪 Linked List 速度不同的兩個變數來找 `l2` ，我完全沒想過這個方法，於是我學了起來 - 整個流程基本上跟網站上的演算法差不多，唯一的差別是我用了 `strncmp` 來防止 `buffer overflow` ，還有把 `merge` 的部份從遞迴改成迭代了，因為我 `trace16` 一直過不了，我看了一下裡面的指令要排序好幾萬個 string ，似乎是 `stack` 爆了，不過我無法確定，因為即使我把 `valgrind` 的參數 `--stack` 設成 `yes` 但還是它沒有顯示出 `stack` 的狀態 ```cpp= void q_sort(queue_t *q) { if (!q || !q->head) return; merge_sort(&q->head); while ((q->tail->next)) q->tail = q->tail->next; } void merge_sort(list_ele_t **head) { if (!(*head) || !((*head)->next)) return; list_ele_t *l1 = *head; list_ele_t *l2 = (*head)->next; while (l2 && l2->next) { l1 = l1->next; l2 = l2->next->next; } l2 = l1->next; l1->next = NULL; l1 = *head; merge_sort(&l1); merge_sort(&l2); list_ele_t **tmp = head; while (l1 && l2) { if (strncmp(l1->value, l2->value, strlen(l2->value)) < 0) { *tmp = l1; l1 = l1->next; } else { *tmp = l2; l2 = l2->next; } tmp = &((*tmp)->next); } if (l1) *tmp = l1; else *tmp = l2; } ``` ### Git 使用 - Commit 成功 ```shell= (base) jiahao@jiahao-GP63-Leopard-8RE:~/Desktop/code/lab0-c$ git commit -a Following files need to be cleaned up: queue.c queue.h [master 110414a] Implement all function correctly 2 files changed, 151 insertions(+), 18 deletions(-) ``` ### Clang-format 使用 - 透過 ```clang-format -i queue``` 直接改 coding-style ```cpp= (base) jiahao@jiahao-GP63-Leopard-8RE:~/Desktop/code/lab0-c$ git commit -a --- modified queue.c +++ expected coding style @@ -1,9 +1,9 @@ +#include "queue.h" +#include <assert.h> #include <stdio.h> #include <stdlib.h> #include <string.h> -#include <assert.h> #include "harness.h" -#include "queue.h" * Create empty queue. @@ -12,12 +12,13 @@ queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); - if(!q)return NULL; + if (!q) + return NULL; q->head = NULL; q->tail = NULL; q->size = 0; - + return q; } /***其他程式碼，請容許我省略***/ /***其他程式碼，請容許我省略***/ /***其他程式碼，請容許我省略***/ [!] queue.c does not follow the consistent coding style. Make sure you indent as the following: clang-format -i queue.c Make sure you indent as the following: clang-format -i queue.h Following files need to be cleaned up: queue.c queue.h ``` ### Valgrind 使用 - 我快把程式寫完的時候才發現有這個東西可以用 - 不過這也幫助我找到幾個 `bug` ，使用 ```valgrind ./qtest``` 會進入一個很像 `python` 直譯器的模式，裡面輸入指令會一行一行執行，我用這個工具找到問題 - 如：在 `trace16` 裡 ```python= # Test performance of sort with random and descending orders # 10000: all correct sorting algorithms are expected pass # 50000: sorting algorithms with O(n^2) time complexity are expected failed # 100000: sorting algorithms with O(nlogn) time complexity are expected pass option fail 0 option malloc 1 new ih RAND 10000 sort reverse sort free new ih RAND 50000 sort reverse sort free new ih RAND 100000 sort reverse sort free ``` - 我發現每次只要先 `sort` 後再 `revese` 就會爆炸，仔細研究之後，知道是我在 `sort` 的時候 `head` 不小心改到它的值 - massif 圖例(執行 trace17 時記憶體使用情形） ![](https://i.imgur.com/ZERXcnk.png) - 以下是我不小心造成了無窮迴圈的情形，可注意到在圖表的最後記憶體並沒有歸零( trace05 ) ![](https://i.imgur.com/HiA6qR8.png) - 自動測試軟體也發出了以下錯誤訊息 ```shell= ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ``` ### 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉 #### `dudect` 的運作原理： - 經由多次測量程式執行的時間取得大量樣本，並使用統計學的方法，從而推得執行時間是否為常數時間 - 論文中提供了兩個方法： - t-test: - Welch’s t-test ( `dudect` 使用了此方法) - non-parametric test: - Kolmogorov-Smirnov - 剩下的部份來不及做完了

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.