# 2020q1 Homework1 (lab0)
contributed by < `MetalheadKen` >
:::info
:mega: 題目出處
* [H01: lab0](https://hackmd.io/@sysprog/linux2020-lab0)
* [C Programming Lab: Assessing Your C Programming Skills](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf)
:::
## 實驗環境
```shell
$ uname -a
Linux kendai-XPS-13-9360 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version | head -n 1
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
```
## 題目摘要
* Some of the skills tested are :
* Explicit memory management, as required in C
* Creating and manipulating pointer-based data structures
* Working with strings
* Enhancing the proformance of key operations by storing redundant information in data structures
* Implementing robust code that operates correctly with invalid arguments, including NULL pointers
* Implementing a queue
* Supporting both last-in, first-out (LIFO) and first-in-first-out (FIFO)
* The underlying data structure is a singly-linked list
* Resources
* C programming
* Kernighan and Ritchie, *The C Programming Language, second edition*
* [Wikibooks](https://en.wikibooks.org/wiki/C_Programming)
* [Linked lists](http://www.cs.cmu.edu/~iliano/courses/18S-CMU-CS122/handouts/10-linkedlist.pdf)
* [Asympotic (big-Oh) notation](http://www.cs.cmu.edu/~iliano/courses/18S-CMU-CS122/handouts/05-sort.pdf)
* Linux Man pages
* The task is to modify the code in `queue.h` and `queue.c` to fully implement the following functions
* `q_new`: Create a new, empty queue
* `q_free`: Free all storage used by a queue
* `q_insert_head`: Attempt to insert a new element at the head of the queue (LIFO discipline)
* `q_insert_tail`: Attempt to insert a new element at the tail of the queue (FIFO discipline)
* `q_remove_head`: Attempt to remove the element at the head of the queue
* `q_size`: Compute the number of elements in the queue
* `q_reverse`: Reorder the list so that the queue elements are reversed in order.
* `q_sort`: Sort elements of queue in ascending order.
* ==Precautions==
* A *NULL* queue is one for which the pointer has value `NULL`. An *empty* queue is one pointing to a valid structure, but the `head` field has value `NULL`
* For functions that provide strings as arguments, you must create and store a copy of the string by calling `malloc` to allocate space and then copying from the source to the newly allocated space
* You cannot assume any fixed upper bound on the length of a string ─ you must allocate space for each string based on its length
* `q_insert_tail` and `q_size` will require that your implementations operate in time $O(1)$
* `q_reverse` should not allocate or free any list elements. Instead, it should rearrange the existing elements
## 實作流程
### `queue.h`
依據題意,為了讓函式 `q_insert_tail` 和 `q_size` 的時間複雜度為 $O(1)$,在 struct `queue_t` 裡新增成員 `tail` 和 `size`,使得在 runtime 的時候可以動態紀錄 queue 的尾端和其大小
```cpp
/* Queue structure */
typedef struct {
list_ele_t *head;
list_ele_t **tail;
size_t size;
} queue_t;
```
### `q_new`
需注意若 `malloc` 失敗時需回傳 NULL
```cpp
queue_t *q_new()
{
queue_t *q = (queue_t *) malloc(sizeof(queue_t));
if (!q)
return NULL;
q->head = NULL;
q->tail = &(q->head);
q->size = 0;
return q;
}
```
### `q_free`
釋放 queue 的空間,注意記得一併釋放節點和字串所佔的記憶體空間
```cpp
void q_free(queue_t *q)
{
if (!q)
return;
while (q->head != NULL) {
list_ele_t *tmp = q->head;
q->head = q->head->next;
free(tmp->value);
free(tmp);
}
free(q);
}
```
### `q_insert_head`
* 分別建立節點和字串的空間,之後進行字串複製和節點的串接
* 需注意若 `malloc` 失敗時,在回傳 NULL 前需釋放之前已經配置好的記憶體空間
* 在這邊使用 `goto` 來作錯誤處理,可參考 [goto in Linux kernel coding style]( https://www.kernel.org/doc/html/v4.18/process/coding-style.html#centralized-exiting-of-functions) 章節
* 紀錄節點數量
```cpp
bool q_insert_head(queue_t *q, char *s)
{
if (!q)
goto ERR;
list_ele_t *newh;
newh = (list_ele_t *) malloc(sizeof(list_ele_t));
if (!newh)
goto ERR;
int len = strlen(s) + 1;
char *str = (char *) malloc(len);
if (!str)
goto ERR_FREE_LIST;
memcpy(str, s, len);
newh->value = str;
newh->next = q->head;
if (!q->head) {
q->tail = &(newh->next);
}
q->head = newh;
q->size++;
return true;
ERR_FREE_LIST:
free(newh);
ERR:
return false;
}
```
### `q_insert_tail`
* 大致上與先前的 `q_insert_head` 的實作方向相同
* 因題意規定其時間複雜度需為 $O(1)$,這邊使用 `tail` 來串接新節點
* 撰寫「有品味」的程式碼
* 讓 `tail` 的 data type 為 `list_ele_t**`,使得可以減少 branch penalty,可參考
* [Linus Torvalds 在 TED 2016 的訪談](https://www.ted.com/talks/linus_torvalds_the_mind_behind_linux?language=zh-tw)
* [Applying the Linus Torvalds “Good Taste” Coding Requirement](https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a)
* [Naetw 的開發紀錄](https://hackmd.io/s/BysQssYHN#q_insert_tail)
* 紀錄節點數量
```cpp
bool q_insert_tail(queue_t *q, char *s)
{
if (!q)
goto ERR;
list_ele_t *newt = (list_ele_t *) malloc(sizeof(list_ele_t));
if (!newt)
goto ERR;
int len = strlen(s) + 1;
char *str = (char *) malloc(len);
if (!str)
goto ERR_FREE_LIST;
memcpy(str, s, len);
newt->value = str;
newt->next = NULL;
*(q->tail) = newt;
q->tail = &(newt->next);
q->size++;
return true;
ERR_FREE_LIST:
free(newt);
ERR:
return false;
}
```
### `q_remove_head`
* 若 `sp` 不為 NULL 時代表需把移除的字串複製到 `sp` 中
* 注意有可能要移除的字串的長度超過 `sp` 的大小,因而最多只需複製長度為 `bufsize - 1`,並注意要在結尾上加上 terminating null byte
* 調整節點數量
```cpp
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
if (!q || !q->head)
return false;
if (sp != NULL) {
strncpy(sp, q->head->value, bufsize - 1);
sp[bufsize - 1] = '\0';
}
list_ele_t *tmp = q->head;
q->head = q->head->next;
q->tail = (--q->size == 0) ? &(q->head) : q->tail;
free(tmp->value);
free(tmp);
return true;
}
```
### `q_size`
* 題意規定時間複雜度為 $O(1)$,因此藉由 `size` 動態紀錄 queue 的大小,並在此函式直接回傳 `size` 的內容
* 需注意有可能在沒有使用 `new` command 的情況下直接使用 `size` command,所以需判斷若 queue 為 NULL queue 或 empty queue 則直接回傳數值 0
```cpp
int q_size(queue_t *q)
{
if (!q || !q->head) {
return 0;
} else {
return q->size;
}
}
```
### `q_reverse`
* 若 queue 為 NULL queue 或 empty queue 或長度為 1 的 queue 則無需 reverse
* 使用 `prev`, `curr`, `next` 來紀錄反轉的過程
* `prev` 紀錄 queue 前一個的節點
* `curr` 紀錄 queue 當前的節點,在反轉過程中,`curr` 的下一個節點應指派為 `curr` 的前一個節點
* `next` 紀錄 queue 下一個的節點
* 注意最後 `head` 和 `tail` 需重設方向
```cpp
void q_reverse(queue_t *q)
{
if (!q || !q->head || !q->head->next)
return;
list_ele_t *prev, *curr, *next;
prev = q->head;
curr = q->head->next;
q->head->next = NULL;
q->tail = &(q->head->next);
while (curr != NULL) {
next = curr->next;
curr->next = prev;
prev = curr;
curr = next;
}
q->head = prev;
}
```
### `q_sort`
* 在這邊使用了 merge sort 來排序 queue 裡面的元素
* 可以注意到 `merge_sort` 裡面透過 `fast` 和 `slow` pointer 來找到該 list 的中間點為何
* 剛好 [LeetCode](https://leetcode.com/problems/sort-list/) 上有相關題目可以練練手,順便可以用獨立環境來跑一下 unit test
```cpp
void q_sort(queue_t *q)
{
if (!q || !q->head || !q->head->next)
return;
q->head = merge_sort(q->head);
q->tail = &(q->head);
while (*(q->tail)) {
q->tail = &((*q->tail)->next);
}
}
static list_ele_t *merge_sort(list_ele_t *head)
{
if (!head || !head->next)
return head;
list_ele_t *slow = head;
list_ele_t *fast = head->next;
while (fast && fast->next) {
slow = slow->next;
fast = fast->next->next;
}
list_ele_t *mid = slow->next;
slow->next = NULL;
return merge(merge_sort(head), merge_sort(mid));
}
/* Merge two list in ascending order */
static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
if (!left)
return right;
if (!right)
return left;
list_ele_t *dummy = NULL;
list_ele_t *curr = NULL;
if (strcmp(left->value, right->value) > 0) {
dummy = right;
right = right->next;
} else {
dummy = left;
left = left->next;
}
curr = dummy;
while (left && right) {
if (strcmp(left->value, right->value) > 0) {
curr->next = right;
right = right->next;
} else {
curr->next = left;
left = left->next;
}
curr = curr->next;
}
if (left)
curr->next = left;
if (right)
curr->next = right;
return dummy;
}
```
### Implement Natural Sort
這邊採用 [sourcefrog/natsort](https://github.com/sourcefrog/natsort) 所實作的 natural sort 的 compare function - `strnatcmp`。從原專案中取得 `strnatcmp.[ch]` 後放入 `lab0-c` 的專案中。
:::info
這邊為了明確的表示是使用別人的專案,原本是想採用 git submodule 的方式建構的,但是考慮到若要對 natsort 專案做改動,勢必要 push 回去到原專案中,或者在當初 submodule init 時只能指定 fork 後的專案,但都相對的要麻煩許多。
不知道有什麼方式可以除了把有 reference 到的專案之相關訊息寫進 commit message 以外達到明確的使用到以及隨時跟新進度在別人的專案上。
> [name=MetalheadKen]
:::
:::warning
這是練習撰寫清晰的 git commit messages 的好機會
:notes: jserv
:::
接著把在排序中採用到的 `strcmp` 一律改成 `strnatcmp`
* `qtest.c`
```cpp
bool do_sort(int argc, char *argv[])
{
...
for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
/* Ensure each element in ascending order */
/* FIXME: add an option to specify sorting order */
if (strnatcmp(e->value, e->next->value) > 0) {
report(1, "ERROR: Not sorted in ascending order");
ok = false;
break;
...
}
```
* `queue.c`
```cpp
static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
...
if (strnatcmp(left->value, right->value) > 0) {
dummy = right;
right = right->next;
} else {
...
while (left && right) {
if (strnatcmp(left->value, right->value) > 0) {
curr->next = right;
right = right->next;
} else {
...
}
```
更改 `Makefile`,使其可以編譯 `strnatcmp.c`
* `Makefile`
```cpp
...
OBJS := qtest.o report.o console.o harness.o queue.o \
random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
natsort/strnatcmp.o
...
```
為了對現有的 natsort 做測試,增加新的 trace file
* `traces/trace-18-natsort.cmd`
```cpp
# Test operation of natural sort
option fail 0
option malloc 0
new
ih fred
ih pic2
ih pic100a
ih pic120
ih pic121
ih jane
ih tom
ih pic02a
ih pic3
ih pic4
ih 1-20
ih pic100
ih pic02000
sort
```
* `script/driver.py`
```diff
--- a/scripts/driver.py
+++ b/scripts/driver.py
@@ -35,7 +35,8 @@ class Tracer:
14: "trace-14-perf",
15: "trace-15-perf",
16: "trace-16-perf",
- 17: "trace-17-complexity"
+ 17: "trace-17-complexity",
+ 18: "trace-18-natsort"
}
traceProbs = {
@@ -55,10 +56,11 @@ class Tracer:
14: "Trace-14",
15: "Trace-15",
16: "Trace-16",
- 17: "Trace-17"
+ 17: "Trace-17",
+ 18: "Trace-18"
}
- maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5]
+ maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6]
RED = '\033[91m'
GREEN = '\033[92m'
```
對 `trace-16-perf.cmd` 採用到的 `RAND` 提供英文與數字的亂數排序,其實只要新增數字到 `charset` 就可以了
* `qtest.c`
```diff
--- a/qtest.c
+++ b/qtest.c
@@ -60,7 +60,7 @@ static int string_length = MAXSTRING;
#define MIN_RANDSTR_LEN 5
#define MAX_RANDSTR_LEN 10
- static const char charset[] = "abcdefghijklmnopqrstuvwxyz";
+ static const char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789";
```
執行 `make test` 後出現以下錯誤,依錯誤訊息可得知為超時
```SHELL
...
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
...
```
這是因為在 natsort 中所使用的 `strnatcmp` 比 glibc 所提供的 `strcmp` 的執行時間還要長的緣故。為了解決這個問題,這邊採用的方法為新增一個新的 command option `time`,藉由 `option time <limit=1>` 可以指定在這次測試中所會參考的 time limit。
* `qtest.c`
```cpp
static void console_init()
{
...
add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
"Number of times allow queue operations to return false", NULL);
add_param("time", (void *) &time_limit, sizeof(time_limit),
"Maximum number of time limit", NULL);
}
```
為了要讓 `time_limit` 可以動態更改,必須把 `time_limit` 改為 global variable
* `harness.c`
```diff
--- a/harness.c
+++ b/harness.c
@@ -52,7 +52,7 @@ static bool noallocate_mode = false;
static bool error_occurred = false;
static char *error_message = "";
- static int time_limit = 1;
+ int time_limit = 1;
/*
* Data for managing exceptions
```
* `harness.h`
```diff
--- a/harness.c
+++ b/harness.c;
@@ -25,6 +25,9 @@ size_t allocation_check();
/* Probability of malloc failing, expressed as percent */
extern int fail_probability;
+ /* Amount of time limit */
+ extern int time_limit;
+
/*
* Set/unset cautious mode.
* In this mode, makes extra sure any block to be freed is currently allocated.;
```
加入 `option time` 到會因為 natsort 的緣故而造成 Time limit exceeded 的 trace 中
* `traces/trace-15-perf.cmd`
```diff
--- a/traces/trace-15-perf.cmd
+++ b/traces/trace-15-perf.cmd
@@ -1,6 +1,8 @@
# Test performance of insert_tail, size, reverse, and sort
option fail 0
option malloc 0
+ # For test natural sort, we extend the amount of time limit
+ option time 3
new
ih dolphin 1000000
it gerbil 1000000
...
```
## Address Sanitizer
執行 `make SANITIZER=1 test` 後出現下列錯誤
```shell
...
# Test if q_insert_tail and q_size is constant time complexity
=================================================================
==26442==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55ba43b1e9c0 at pc 0x55ba4390f86e bp 0x7ffe34a65e40 sp 0x7ffe34a65e30
READ of size 4 at 0x55ba43b1e9c0 thread T0
#0 0x55ba4390f86d in do_option_cmd /home/kendai/git/lab0-c/console.c:368
#1 0x55ba4390e48a in interpret_cmda /home/kendai/git/lab0-c/console.c:220
#2 0x55ba4390ee8e in interpret_cmd /home/kendai/git/lab0-c/console.c:243
#3 0x55ba4390fb77 in cmd_select /home/kendai/git/lab0-c/console.c:569
#4 0x55ba4390ff94 in run_console /home/kendai/git/lab0-c/console.c:628
#5 0x55ba4390d0ad in main /home/kendai/git/lab0-c/qtest.c:772
#6 0x7f0b512e6b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
#7 0x55ba4390a809 in _start (/home/kendai/git/lab0-c/qtest+0x6809)
0x55ba43b1e9c1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x55ba43b1e9c0) of size 1
'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow /home/kendai/git/lab0-c/console.c:368 in do_option_cmd
...
+++ TESTING trace trace-17-complexity:
--- trace-17-complexity 0/5
--- TOTAL 95/100
```
先用 GDB 定位問題
```SHELL
$ gdb -q --args ./qtest -f traces/trace-17-complexity.cmd
...
(gdb) b console.c:368
(gdb) r
...
Breakpoint 1, do_option_cmd (argc=3, argv=0x555555763ce0) at console.c:368
368 int oldval = *plist->valp;
(gdb) p *plist
$1 = {name = 0x55555555bd47 "simulation", valp = 0x55555575e520 <simulation>, documentation = 0x55555555bd2c "Start/Stop simulation mode", setter = 0x0,
next = 0x5555557613e0}
(gdb) whatis (*plist).valp
type = int *
(gdb) whatis simulation
type = _Bool
(gdb) p sizeof(_Bool)
$2 = 1
```
從這邊可以知道由於 `simulation` 從 `bool` 轉成 `int *` 並存入 `valp`,之後當直接 dereference 的時候就會因為要取出 4 bytes 的資料而存取到不該存取的地方故造成 buffer overflow
為了要解決 global-buffer-overflow,我們需要把轉型成 `int *` 的地方改為 `void *`,根據 C99 規格,pointer to void 做為一個 generic pointer 可以把原本的 object type 轉成 pointer to void 再轉回來。藉由於此,我們可以透過 pointer to void 來做到不同的 data type 的存取
> C99 Spec 6.3.2.3:1 [Pointers]
>
> A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
除了 `void *` 來儲存各種的 data type 以外,還需要一個 variable 來儲存原本的 object 所佔的大小。最後,根據上面所述,我們需要修改以下動作來解決所需的問題
* 修改 struct `PELE` 來儲存 `void *` 和原始的 object size
* 修改對應的 function argument 來完成儲存 `void *` 和原始的 object size 的動作
* 新增一個新的 function 來讓 `void *` 可以透過原始的 object size 來轉型回原來的 data type
修改如下
* `console.h`
```diff
--- a/console.h
+++ b/console.h
@@ -23,13 +23,14 @@ struct CELE {
};
/* Optionally supply function that gets invoked when parameter changes */
- typedef void (*setter_function)(int oldval);
+ typedef void (*setter_function)(void *oldval, int oldsize);
/* Integer-valued parameters */
typedef struct PELE param_ele, *param_ptr;
struct PELE {
char *name;
- int *valp;
+ void *valp;
+ int valsize;
char *documentation;
/* Function that gets called whenever parameter changes */
setter_function setter;
@@ -44,7 +45,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation);
/* Add a new parameter */
void add_param(char *name,
- int *valp,
+ void *valp,
+ int valsize,
char *doccumentation,
setter_function setter);
```
* `console.c`
```diff
--- a/console.c
+++ b/console.c
@@ -82,6 +82,8 @@ static void pop_file();
static bool interpret_cmda(int argc, char *argv[]);
+ static uint32_t get_plist_valp(param_ptr p);
+
/* Initialize interpreter */
void init_cmd()
{
@@ -99,11 +101,14 @@ void init_cmd()
add_cmd("log", do_log_cmd, " file | Copy output to file");
add_cmd("time", do_time_cmd, " cmd arg ... | Time command execution");
add_cmd("#", do_comment_cmd, " ... | Display comment");
- add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
+ add_param("simulation", (void *) &simulation, sizeof(simulation),
+ "Start/Stop simulation mode", NULL);
+ add_param("verbose", (void *) &verblevel, sizeof(verblevel),
+ "Verbosity level", NULL);
+ add_param("error", (void *) &err_limit, sizeof(err_limit),
+ "Number of errors until exit", NULL);
+ add_param("echo", (void *) &echo, sizeof(echo), "Do/don't echo commands",
NULL);
- add_param("verbose", &verblevel, "Verbosity level", NULL);
- add_param("error", &err_limit, "Number of errors until exit", NULL);
- add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
init_in();
init_time(&last_time);
@@ -130,7 +135,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation)
/* Add a new parameter */
void add_param(char *name,
- int *valp,
+ void *valp,
+ int valsize,
char *documentation,
setter_function setter)
{
@@ -144,6 +150,7 @@ void add_param(char *name,
param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param");
ele->name = name;
ele->valp = valp;
+ ele->valsize = valsize;
ele->documentation = documentation;
ele->setter = setter;
ele->next = next_param;
@@ -248,6 +255,20 @@ static bool interpret_cmd(char *cmdline)
return ok;
}
+ static uint32_t get_plist_valp(param_ptr p)
+ {
+ switch (p->valsize) {
+ case 1:
+ return (*(uint8_t *) p->valp);
+ case 2:
+ return (*(uint16_t *) p->valp);
+ default:
+ return (*(uint32_t *) p->valp);
+ }
+ }
+
/* Set function to be executed as part of program exit */
void add_quit_helper(cmd_function qf)
{
@@ -303,7 +324,7 @@ static bool do_help_cmd(int argc, char *argv[])
param_ptr plist = param_list;
report(1, "Options:");
while (plist) {
- report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+ report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
plist->documentation);
plist = plist->next;
}
@@ -342,7 +363,7 @@ static bool do_option_cmd(int argc, char *argv[])
param_ptr plist = param_list;
report(1, "Options:");
while (plist) {
- report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+ report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
plist->documentation);
plist = plist->next;
}
@@ -365,10 +386,10 @@ static bool do_option_cmd(int argc, char *argv[])
param_ptr plist = param_list;
while (!found && plist) {
if (strcmp(plist->name, name) == 0) {
- int oldval = *plist->valp;
- *plist->valp = value;
+ void *oldvalp = plist->valp;
+ memcpy(oldvalp, &value, plist->valsize);
if (plist->setter)
- plist->setter(oldval);
+ plist->setter(oldvalp, plist->valsize);
found = true;
} else
plist = plist->next;
```
* `qtest.c`
```diff
--- a/qtest.c
+++ b/qtest.c
@@ -98,11 +98,11 @@ static void console_init()
add_cmd("size", do_size,
" [n] | Compute queue size n times (default: n == 1)");
add_cmd("show", do_show, " | Show queue contents");
- add_param("length", &string_length, "Maximum length of displayed string",
- NULL);
- add_param("malloc", &fail_probability, "Malloc failure probability percent",
- NULL);
- add_param("fail", &fail_limit,
+ add_param("length", (void *) &string_length, sizeof(string_length),
+ "Maximum length of displayed string", NULL);
+ add_param("malloc", (void *) &fail_probability, sizeof(fail_probability),
+ "Malloc failure probability percent", NULL);
+ add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
"Number of times allow queue operations to return false", NULL);
}
```
## Valgrind + Massif Heap Profiler
### 安裝 massif-visualizer
`$ sudo apt install massif-visualizer`
### 使用 valgrind + massif 進行測試
```SHELL
$ valgrind -q --tool=massif --stack=yes --time-unit=i --massif-out-file=massif.out ./qtest
...
cmd> option fail 0
cmd> option malloc 0
cmd> new
cmd> ih RAND 10
cmd> reverse
cmd> sort
cmd> rh
... 總共執行 rh 10 次
cmd> quit
```
### 使用 massif-visualizer 視覺化
`$ massif-visualizer massif.out`

## [Dudect](https://github.com/oreparaz/dudect) 相關研究
## select 系統呼叫
## 自動評分系統相關補充
### 如何呼叫使用者輸入命令對應的函式
一開始使用者執行 `qtest` 後,輸入命令時,會先執行 `console.c` 中的 `interpret_cmd` 並呼叫 `parse_args` 來將輸入的字串以空白為區隔,轉換為 `argc` 和 `argv`,如下
```cpp
bool interpret_cmd(char *cmdline)
{
int argc;
...
char **argv = parse_args(cmdline, &argc);
bool ok = interpret_cmda(argc, argv);
...
return ok;
}
```
之後在函式 `interpret_cmda` 中,利用迴圈走訪 singly linked list,並利用 `strcmp` 找出函式相對應的字串。可以發現在 `qtest` 中使用函式指標的方式可以很簡單方便的維護使用者可輸入的命令
```cpp
static bool interpret_cmda(int argc, char *argv[])
{
if (argc == 0)
return true;
/* Try to find matching command */
cmd_ptr next_cmd = cmd_list;
bool ok = true;
while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
next_cmd = next_cmd->next;
if (next_cmd) {
ok = next_cmd->operation(argc, argv);
if (!ok)
record_error();
} else {
report(1, "Unknown command '%s'", argv[0]);
record_error();
ok = false;
}
return ok;
}
```
### 在寫好 trace file 後程式如何運作
在 `qtest` 中有提供參數 `-f <filename>` 來讀取 trace file。當使用者輸入命令後,在 `qtest.c` 中的 `main` 函式會呼叫 `getopt` 來解析使用者先前輸入命令的後面所添加的參數,節錄如下
```cpp
int main(int argc, char *argv[])
{
...
while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
switch (c) {
case 'h':
usage(argv[0]);
break;
case 'f':
strncpy(buf, optarg, BUFSIZE);
buf[BUFSIZE - 1] = '\0';
infile_name = buf;
break;
case 'v':
level = atoi(optarg);
break;
case 'l':
strncpy(lbuf, optarg, BUFSIZE);
buf[BUFSIZE - 1] = '\0';
logfile_name = lbuf;
break;
default:
printf("Unknown option '%c'\n", c);
usage(argv[0]);
break;
}
}
...
}
```
:::info
`getopt` 是一個可以很方便的去 parse 從 command line 傳過來的參數的函式,該函式會一直讀取參數直到沒有更多的參數可以讀時才回傳 -1
* function prototype:
`int getopt(int argc, char *argv[], const char *optstring)`
* `int argc`: the argument count passed to the `main()` function
* `char *argv[]`: the argument array passed to the `main()` function
* `const char *optstring`: a string containing the legitimate option characters
* 一個字元︰ 一個可用的 option
* 一個字元搭配冒號︰該 option 後有額外帶參數
* 一個字元當配兩個冒號︰該 option 後帶的參數是可選的 **(此為 GNU extension)**
* `getopt` 回傳數值後
* 成功回傳 option 字元,無更多 option 可讀則回傳 -1,在 `optstring` 找不到 option 字元則回傳 `?`,在 `optstring` 設定需帶參數的 option 沒有帶參數的話則回傳 `:`
* `optarg` 變數儲存該 option 後面所帶的參數
* `optind` 變數儲存下次呼叫 `getopt` 時要處理的 argv 的 index
* `optopt` 變數儲存當 `getopt` 找不到 option 字元或是缺少某些參數時的該 option
* ==注意事項==
* `getopt` 是 POSIX standard 但是並不是 C standard 並在 glibc 實作中的某些行為是 GNU extensions,這點需要特別注意
* 參考資料
* [man 3 getopt](https:// "http://man7.org/linux/man-pages/man3/getopt.3.html")
* [Oracle Docs](https://docs.oracle.com/cd/E19253-01/816-5168/getopt-3c/index.html)
* [POSIX-1.200x Draft](http://www.open-std.org/jtc1/sc22/open/n4217.pdf#page=1052)
* Michael Kerrisk, *The Linux Programming Interface: Appendix B*
:::
在 `console.c` 中定義以下 structure 來儲存每個要使用的檔案的 file descritptor
```cpp
#define RIO_BUFSIZE 8192
typedef struct RIO_ELE rio_t, *rio_ptr;
struct RIO_ELE {
int fd; /* File descriptor */
int cnt; /* Unread bytes in internal buffer */
char *bufptr; /* Next unread byte in internal buffer */
char buf[RIO_BUFSIZE]; /* Internal buffer */
rio_ptr prev; /* Next element in stack */
};
```
> RIO 在 CS:APP 第 10.5 章提及,全名為 Robust I/O
> Ref: [System-Level I/O](https://csapp.cs.cmu.edu/2e/ch10-preview.pdf#page=6) (preview version)
並在 `push_file` 函式中把開啟檔案的 file descriptor 放進 `struct RIO_ELE` 中的 `fd`,若無指定開啟檔案的路徑則選擇為標準輸入 (也就是 `stdin`)
```cpp
static bool push_file(char *fname)
{
int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
if (fd < 0)
return false;
if (fd > fd_max)
fd_max = fd;
rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
rnew->fd = fd;
rnew->cnt = 0;
rnew->bufptr = rnew->buf;
rnew->prev = buf_stack;
buf_stack = rnew;
return true;
}
```
### 如何偵測記憶體洩漏
`harness.[ch]` 中藉由替換掉 `malloc` 和 `free` 的實作方式使得可以檢查到 allocation error
從 `harness.h` 中可以見到 `qtest` 使用 macro 來把 `malloc` 和 `free` 替換成 `test_malloc` 和 `test_free`
```cpp
/* Tested program use our versions of malloc and free */
#define malloc test_malloc
#define free test_free
```
而從 `test_malloc` 所配置出來的記憶體都會被紀錄在 `struct BELE` 中,structure 在 `harness.c` 中定義如下並以 doubly-linked list 來串接節點
```cpp
typedef struct BELE {
struct BELE *next;
struct BELE *prev;
size_t payload_size;
size_t magic_header; /* Marker to see if block seems legitimate */
unsigned char payload[0];
/* Also place magic number at tail of every block */
} block_ele_t;
```
在 structure 中 `payload[0]` 是一個 struct hack,在 [GCC manual](https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html) 中稱呼為 Arry of Length Zero,藉由這種方式可以做到讓 structure 中的成員可以動態的配置記憶體又不用因為多次呼叫 `malloc` 造成記憶體碎片化
用法如下
```c
block_ele_t *new_block = malloc(array_size + sizeof(block_ele_t));
```
這樣的手法其實就是利用 GNU C compiler 預設不會去檢查 Array index out of bounds 的問題來做到的,但是使用這個技巧需要思考以下的問題
* Flexible array members 需為 incomplete type,並且 `sizeof` 不能用在 incomplete type 上
* Flexible array members 需宣告在最後一個 non-empty structure 成員上
* Flexible array members 不能單獨出現,至少需一個成員在
* Flexible array members 本身不可以宣告或包含在 union 中
* [Clang 支援該 GNU extension,但不支援 static initialization](http://clang.llvm.org/docs/UsersManual.html#intentionally-unsupported-gcc-extensions)
> Note that clang does support flexible array members (arrays with zero or unspecified size at the end of a structure).
>
> clang does not support static initialization of flexible array members. This appears to be a rarely used extension, but could be implemented pending user demand.
另外在 ISO C90 可以使用 Array of Length One 來做到同樣的事情,而從 ISO C99 開始支援 Flexible Array Members,請參照
* [Wikipedia](https://en.wikipedia.org/wiki/Flexible_array_member)
* [C99 §6.7.2.1, item 16](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf#page=115)
> As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
### 如何配置記憶體
`test_malloc` 把經由 `malloc` 配置出來的新的記憶體空間,將其串連到 doubly linked list `allocated` 中,其中配置出記憶體空間的寫法如下
```c
block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t));
```
可以看到除了 `size` 和 `sizeof(block_ele_t)` 以外,還多了 `sizeof(size_t)`,這是因為在 `qtest` 中,藉由在尾端多配置出一個空間並填入 magic number 來查看若當該數值有被更動過的話,即表示出現 memory corruption (array access out of bounds),以及偵測是否是由 `test_malloc` 配置出來的記憶體空間,而前一個成員 `magic_header` 也是基於此用途的。`harness.c` 中的 `test_malloc` 節錄如下
```cpp
void *test_malloc(size_t size)
{
if (fail_allocation()) {
report_event(MSG_WARN, "Malloc returning NULL");
return NULL;
}
block_ele_t *new_block =
malloc(size + sizeof(block_ele_t) + sizeof(size_t));
...
new_block->magic_header = MAGICHEADER;
new_block->payload_size = size;
*find_footer(new_block) = MAGICFOOTER;
void *p = (void *) &new_block->payload;
memset(p, FILLCHAR, size);
new_block->next = allocated;
new_block->prev = NULL;
if (allocated)
allocated->prev = new_block;
allocated = new_block;
allocated_count++;
return p;
}
```
:::info
在 `fail_allocation` 中,實作了當亂數產生出來的數小於 0.01 乘上預先設定好的失敗機率 (`fail_probability`) 的話,即 malloc failure,為何需要如此實作?
> [name=MetalheadKen]
:::
:::warning
回去複習機率統計,思考上述做法是否合理,若有疑慮就提出論證和實驗
:notes: jserv
:::
其中為了方便取得到 `payload` 的尾端來指派 magic number (`MAGICFOOTER`),`qtest` 實作了 `find_footer` 函式,如下
```cpp
static size_t *find_footer(block_ele_t *b)
{
size_t *p =
(size_t *) ((size_t) b + b->payload_size + sizeof(block_ele_t));
return p;
}
```
藉由傳進來的 `block_ele_t *` 的開頭,在加上 payload size 和該 structure 本身的大小來取得到 `payload` 的尾端