2020q1 Homework1 (lab0)

contributed by < MetalheadKen >

:mega: 題目出處

實驗環境

$ uname -a
Linux kendai-XPS-13-9360 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version | head -n 1
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

題目摘要

Some of the skills tested are :
- Explicit memory management, as required in C
- Creating and manipulating pointer-based data structures
- Working with strings
- Enhancing the proformance of key operations by storing redundant information in data structures
- Implementing robust code that operates correctly with invalid arguments, including NULL pointers
Implementing a queue
- Supporting both last-in, first-out (LIFO) and first-in-first-out (FIFO)
- The underlying data structure is a singly-linked list
Resources
- C programming
  - Kernighan and Ritchie, The C Programming Language, second edition
  - Wikibooks
- Linked lists
- Asympotic (big-Oh) notation
- Linux Man pages
The task is to modify the code in queue.h and queue.c to fully implement the following functions
- q_new: Create a new, empty queue
- q_free: Free all storage used by a queue
- q_insert_head: Attempt to insert a new element at the head of the queue (LIFO discipline)
- q_insert_tail: Attempt to insert a new element at the tail of the queue (FIFO discipline)
- q_remove_head: Attempt to remove the element at the head of the queue
- q_size: Compute the number of elements in the queue
- q_reverse: Reorder the list so that the queue elements are reversed in order.
- q_sort: Sort elements of queue in ascending order.
Precautions
- A NULL queue is one for which the pointer has value NULL. An empty queue is one pointing to a valid structure, but the head field has value NULL
- For functions that provide strings as arguments, you must create and store a copy of the string by calling malloc to allocate space and then copying from the source to the newly allocated space
- You cannot assume any fixed upper bound on the length of a string ─ you must allocate space for each string based on its length
- q_insert_tail and q_size will require that your implementations operate in time
  $O (1)$
- q_reverse should not allocate or free any list elements. Instead, it should rearrange the existing elements

實作流程

`queue.h`

依據題意，為了讓函式 q_insert_tail 和 q_size 的時間複雜度為

O (1)

，在 struct queue_t 裡新增成員 tail 和 size，使得在 runtime 的時候可以動態紀錄 queue 的尾端和其大小

/* Queue structure */
typedef struct {
    list_ele_t *head;
    list_ele_t **tail;
    size_t size;
} queue_t;

`q_new`

需注意若 malloc 失敗時需回傳 NULL

queue_t *q_new()
{
    queue_t *q = (queue_t *) malloc(sizeof(queue_t));
    if (!q)
        return NULL;

    q->head = NULL;
    q->tail = &(q->head);
    q->size = 0;
    return q;
}

`q_free`

釋放 queue 的空間，注意記得一併釋放節點和字串所佔的記憶體空間

void q_free(queue_t *q)
{
    if (!q)
        return;

    while (q->head != NULL) {
        list_ele_t *tmp = q->head;
        q->head = q->head->next;

        free(tmp->value);
        free(tmp);
    }

    free(q);
}

`q_insert_head`

分別建立節點和字串的空間，之後進行字串複製和節點的串接
需注意若 malloc 失敗時，在回傳 NULL 前需釋放之前已經配置好的記憶體空間
- 在這邊使用 goto 來作錯誤處理，可參考 goto in Linux kernel coding style 章節
紀錄節點數量

bool q_insert_head(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newh;
    newh = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newh)
        goto ERR;

    int len = strlen(s) + 1;
    char *str = (char *) malloc(len);
    if (!str)
        goto ERR_FREE_LIST;

    memcpy(str, s, len);

    newh->value = str;
    newh->next = q->head;

    if (!q->head) {
        q->tail = &(newh->next);
    }
    q->head = newh;
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newh);
ERR:
    return false;
}

`q_insert_tail`

大致上與先前的 q_insert_head 的實作方向相同
因題意規定其時間複雜度需為
$O (1)$ ，這邊使用 tail 來串接新節點
撰寫「有品味」的程式碼
- 讓 tail 的 data type 為 list_ele_t**，使得可以減少 branch penalty，可參考
紀錄節點數量

bool q_insert_tail(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newt = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newt)
        goto ERR;

    int len = strlen(s) + 1;
    char *str = (char *) malloc(len);
    if (!str)
        goto ERR_FREE_LIST;

    memcpy(str, s, len);

    newt->value = str;
    newt->next = NULL;

    *(q->tail) = newt;
    q->tail = &(newt->next);
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newt);
ERR:
    return false;
}

`q_remove_head`

若 sp 不為 NULL 時代表需把移除的字串複製到 sp 中
注意有可能要移除的字串的長度超過 sp 的大小，因而最多只需複製長度為 bufsize - 1，並注意要在結尾上加上 terminating null byte
調整節點數量

bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q->head)
        return false;

    if (sp != NULL) {
        strncpy(sp, q->head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

    list_ele_t *tmp = q->head;
    q->head = q->head->next;
    q->tail = (--q->size == 0) ? &(q->head) : q->tail;

    free(tmp->value);
    free(tmp);
    return true;
}

`q_size`

題意規定時間複雜度為
$O (1)$ ，因此藉由 size 動態紀錄 queue 的大小，並在此函式直接回傳 size 的內容
需注意有可能在沒有使用 new command 的情況下直接使用 size command，所以需判斷若 queue 為 NULL queue 或 empty queue 則直接回傳數值 0

int q_size(queue_t *q)
{
    if (!q || !q->head) {
        return 0;
    } else {
        return q->size;
    }
}

`q_reverse`

若 queue 為 NULL queue 或 empty queue 或長度為 1 的 queue 則無需 reverse
使用 prev, curr, next 來紀錄反轉的過程
- prev 紀錄 queue 前一個的節點
- curr 紀錄 queue 當前的節點，在反轉過程中，curr 的下一個節點應指派為 curr 的前一個節點
- next 紀錄 queue 下一個的節點
注意最後 head 和 tail 需重設方向

void q_reverse(queue_t *q)
{
    if (!q || !q->head || !q->head->next)
        return;

    list_ele_t *prev, *curr, *next;

    prev = q->head;
    curr = q->head->next;
    q->head->next = NULL;
    q->tail = &(q->head->next);

    while (curr != NULL) {
        next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
    q->head = prev;
}

`q_sort`

在這邊使用了 merge sort 來排序 queue 裡面的元素
可以注意到 merge_sort 裡面透過 fast 和 slow pointer 來找到該 list 的中間點為何
剛好 LeetCode 上有相關題目可以練練手，順便可以用獨立環境來跑一下 unit test

void q_sort(queue_t *q)
{
    if (!q || !q->head || !q->head->next)
        return;

    q->head = merge_sort(q->head);

    q->tail = &(q->head);
    while (*(q->tail)) {
        q->tail = &((*q->tail)->next);
    }
}

static list_ele_t *merge_sort(list_ele_t *head)
{
    if (!head || !head->next)
        return head;

    list_ele_t *slow = head;
    list_ele_t *fast = head->next;

    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }

    list_ele_t *mid = slow->next;
    slow->next = NULL;

    return merge(merge_sort(head), merge_sort(mid));
}

/* Merge two list in ascending order */
static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
    if (!left)
        return right;
    if (!right)
        return left;

    list_ele_t *dummy = NULL;
    list_ele_t *curr = NULL;

    if (strcmp(left->value, right->value) > 0) {
        dummy = right;
        right = right->next;
    } else {
        dummy = left;
        left = left->next;
    }
    curr = dummy;

    while (left && right) {
        if (strcmp(left->value, right->value) > 0) {
            curr->next = right;
            right = right->next;
        } else {
            curr->next = left;
            left = left->next;
        }
        curr = curr->next;
    }

    if (left)
        curr->next = left;
    if (right)
        curr->next = right;

    return dummy;
}

Implement Natural Sort

這邊採用 sourcefrog/natsort 所實作的 natural sort 的 compare function - strnatcmp。從原專案中取得 strnatcmp.[ch] 後放入 lab0-c 的專案中。

這邊為了明確的表示是使用別人的專案，原本是想採用 git submodule 的方式建構的，但是考慮到若要對 natsort 專案做改動，勢必要 push 回去到原專案中，或者在當初 submodule init 時只能指定 fork 後的專案，但都相對的要麻煩許多。

不知道有什麼方式可以除了把有 reference 到的專案之相關訊息寫進 commit message 以外達到明確的使用到以及隨時跟新進度在別人的專案上。

MetalheadKen

這是練習撰寫清晰的 git commit messages 的好機會
:notes: jserv

接著把在排序中採用到的 strcmp 一律改成 strnatcmp

qtest.c

bool do_sort(int argc, char *argv[])
{
...
    for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
        /* Ensure each element in ascending order */
        /* FIXME: add an option to specify sorting order */
        if (strnatcmp(e->value, e->next->value) > 0) {
            report(1, "ERROR: Not sorted in ascending order");
            ok = false;
            break;
...
}

queue.c

static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
...
    if (strnatcmp(left->value, right->value) > 0) {
        dummy = right;
        right = right->next;
    } else {
...
    while (left && right) {
        if (strnatcmp(left->value, right->value) > 0) {
            curr->next = right;
            right = right->next;
        } else {
...
}

更改 Makefile，使其可以編譯 strnatcmp.c

Makefile

...
OBJS := qtest.o report.o console.o harness.o queue.o \
        random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
        natsort/strnatcmp.o
...

為了對現有的 natsort 做測試，增加新的 trace file

traces/trace-18-natsort.cmd

# Test operation of natural sort
option fail 0
option malloc 0
new
ih fred
ih pic2
ih pic100a
ih pic120
ih pic121
ih jane
ih tom
ih pic02a
ih pic3
ih pic4
ih 1-20
ih pic100
ih pic02000
sort

script/driver.py

--- a/scripts/driver.py
+++ b/scripts/driver.py
@@ -35,7 +35,8 @@ class Tracer:
         14: "trace-14-perf",
         15: "trace-15-perf",
         16: "trace-16-perf",
-        17: "trace-17-complexity"
+        17: "trace-17-complexity",
+        18: "trace-18-natsort"
     }
 
     traceProbs = {
@@ -55,10 +56,11 @@ class Tracer:
         14: "Trace-14",
         15: "Trace-15",
         16: "Trace-16",
-        17: "Trace-17"
+        17: "Trace-17",
+        18: "Trace-18"
     }
 
-    maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5]
+    maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6]
 
     RED = '\033[91m'
     GREEN = '\033[92m'

對 trace-16-perf.cmd 採用到的 RAND 提供英文與數字的亂數排序，其實只要新增數字到 charset 就可以了

qtest.c

--- a/qtest.c
+++ b/qtest.c
@@ -60,7 +60,7 @@ static int string_length = MAXSTRING;
 
  #define MIN_RANDSTR_LEN 5
  #define MAX_RANDSTR_LEN 10
- static const char charset[] = "abcdefghijklmnopqrstuvwxyz";
+ static const char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789";

執行 make test 後出現以下錯誤，依錯誤訊息可得知為超時

...
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
...

這是因為在 natsort 中所使用的 strnatcmp 比 glibc 所提供的 strcmp 的執行時間還要長的緣故。為了解決這個問題，這邊採用的方法為新增一個新的 command option time，藉由 option time <limit=1> 可以指定在這次測試中所會參考的 time limit。

qtest.c

static void console_init()
{
...
    add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
              "Number of times allow queue operations to return false", NULL);
    add_param("time", (void *) &time_limit, sizeof(time_limit),
              "Maximum number of time limit", NULL);
}

為了要讓 time_limit 可以動態更改，必須把 time_limit 改為 global variable

harness.c

--- a/harness.c
+++ b/harness.c
@@ -52,7 +52,7 @@ static bool noallocate_mode = false;
  static bool error_occurred = false;
  static char *error_message = "";

- static int time_limit = 1;
+ int time_limit = 1;

  /*
   * Data for managing exceptions

harness.h

--- a/harness.c
+++ b/harness.c;
@@ -25,6 +25,9 @@ size_t allocation_check();
  /* Probability of malloc failing, expressed as percent */
  extern int fail_probability;

+ /* Amount of time limit */
+ extern int time_limit;
+ 
  /*
   * Set/unset cautious mode.
   * In this mode, makes extra sure any block to be freed is currently allocated.;

加入 option time 到會因為 natsort 的緣故而造成 Time limit exceeded 的 trace 中

traces/trace-15-perf.cmd

--- a/traces/trace-15-perf.cmd
+++ b/traces/trace-15-perf.cmd
@@ -1,6 +1,8 @@
  # Test performance of insert_tail, size, reverse, and sort
  option fail 0
  option malloc 0
+ # For test natural sort, we extend the amount of time limit
+ option time 3
  new
  ih dolphin 1000000
  it gerbil 1000000
...

Address Sanitizer

執行 make SANITIZER=1 test 後出現下列錯誤

...
# Test if q_insert_tail and q_size is constant time complexity   
=================================================================
==26442==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55ba43b1e9c0 at pc 0x55ba4390f86e bp 0x7ffe34a65e40 sp 0x7ffe34a65e30
READ of size 4 at 0x55ba43b1e9c0 thread T0                       
    #0 0x55ba4390f86d in do_option_cmd /home/kendai/git/lab0-c/console.c:368
    #1 0x55ba4390e48a in interpret_cmda /home/kendai/git/lab0-c/console.c:220
    #2 0x55ba4390ee8e in interpret_cmd /home/kendai/git/lab0-c/console.c:243
    #3 0x55ba4390fb77 in cmd_select /home/kendai/git/lab0-c/console.c:569
    #4 0x55ba4390ff94 in run_console /home/kendai/git/lab0-c/console.c:628
    #5 0x55ba4390d0ad in main /home/kendai/git/lab0-c/qtest.c:772
    #6 0x7f0b512e6b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #7 0x55ba4390a809 in _start (/home/kendai/git/lab0-c/qtest+0x6809)
                                
0x55ba43b1e9c1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x55ba43b1e9c0) of size 1
  'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow /home/kendai/git/lab0-c/console.c:368 in do_option_cmd
...
+++ TESTING trace trace-17-complexity:
---     trace-17-complexity     0/5
---     TOTAL           95/100

先用 GDB 定位問題

$ gdb -q --args ./qtest -f traces/trace-17-complexity.cmd
...
(gdb) b console.c:368
(gdb) r
...
Breakpoint 1, do_option_cmd (argc=3, argv=0x555555763ce0) at console.c:368
368                     int oldval = *plist->valp;
(gdb) p *plist
$1 = {name = 0x55555555bd47 "simulation", valp = 0x55555575e520 <simulation>, documentation = 0x55555555bd2c "Start/Stop simulation mode", setter = 0x0, 
  next = 0x5555557613e0}
(gdb) whatis (*plist).valp
type = int *
(gdb) whatis simulation
type = _Bool
(gdb) p sizeof(_Bool)
$2 = 1

從這邊可以知道由於 simulation 從 bool 轉成 int * 並存入 valp，之後當直接 dereference 的時候就會因為要取出 4 bytes 的資料而存取到不該存取的地方故造成 buffer overflow

為了要解決 global-buffer-overflow，我們需要把轉型成 int * 的地方改為 void *，根據 C99 規格，pointer to void 做為一個 generic pointer 可以把原本的 object type 轉成 pointer to void 再轉回來。藉由於此，我們可以透過 pointer to void 來做到不同的 data type 的存取

C99 Spec 6.3.2.3:1 [Pointers]

A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

除了 void * 來儲存各種的 data type 以外，還需要一個 variable 來儲存原本的 object 所佔的大小。最後，根據上面所述，我們需要修改以下動作來解決所需的問題

修改 struct PELE 來儲存 void * 和原始的 object size
修改對應的 function argument 來完成儲存 void * 和原始的 object size 的動作
新增一個新的 function 來讓 void * 可以透過原始的 object size 來轉型回原來的 data type

修改如下

console.h

--- a/console.h
+++ b/console.h
@@ -23,13 +23,14 @@ struct CELE {
 };
 
  /* Optionally supply function that gets invoked when parameter changes */
- typedef void (*setter_function)(int oldval);
+ typedef void (*setter_function)(void *oldval, int oldsize);
 
  /* Integer-valued parameters */
  typedef struct PELE param_ele, *param_ptr;
  struct PELE {
      char *name;
-     int *valp;
+     void *valp;
+     int valsize;
      char *documentation;
      /* Function that gets called whenever parameter changes */
      setter_function setter;
@@ -44,7 +45,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation);
 
  /* Add a new parameter */
  void add_param(char *name,
-                int *valp,
+                void *valp,
+                int valsize,
                 char *doccumentation,
                 setter_function setter);

console.c

--- a/console.c
+++ b/console.c
@@ -82,6 +82,8 @@ static void pop_file();
 
  static bool interpret_cmda(int argc, char *argv[]);
 
+ static uint32_t get_plist_valp(param_ptr p);
+ 
  /* Initialize interpreter */
  void init_cmd()
  {
@@ -99,11 +101,14 @@ void init_cmd()
      add_cmd("log", do_log_cmd, " file           | Copy output to file");
      add_cmd("time", do_time_cmd, " cmd arg ...    | Time command execution");
      add_cmd("#", do_comment_cmd, " ...            | Display comment");
-     add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
+     add_param("simulation", (void *) &simulation, sizeof(simulation),
+               "Start/Stop simulation mode", NULL);
+     add_param("verbose", (void *) &verblevel, sizeof(verblevel),
+               "Verbosity level", NULL);
+     add_param("error", (void *) &err_limit, sizeof(err_limit),
+               "Number of errors until exit", NULL);
+     add_param("echo", (void *) &echo, sizeof(echo), "Do/don't echo commands",
               NULL);
-     add_param("verbose", &verblevel, "Verbosity level", NULL);
-     add_param("error", &err_limit, "Number of errors until exit", NULL);
-     add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
 
      init_in();
      init_time(&last_time);
@@ -130,7 +135,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation)
 
  /* Add a new parameter */
  void add_param(char *name,
-                int *valp,
+                void *valp,
+                int valsize,
                 char *documentation,
                 setter_function setter)
  {
@@ -144,6 +150,7 @@ void add_param(char *name,
      param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param");
      ele->name = name;
      ele->valp = valp;
+     ele->valsize = valsize;
      ele->documentation = documentation;
      ele->setter = setter;
      ele->next = next_param;
@@ -248,6 +255,20 @@ static bool interpret_cmd(char *cmdline)
      return ok;
  }
 
+ static uint32_t get_plist_valp(param_ptr p)
+ {
+     switch (p->valsize) {
+     case 1:
+         return (*(uint8_t *) p->valp);
+     case 2:
+         return (*(uint16_t *) p->valp);
+     default:
+         return (*(uint32_t *) p->valp);
+     }
+ }
+ 
  /* Set function to be executed as part of program exit */
  void add_quit_helper(cmd_function qf)
  {
@@ -303,7 +324,7 @@ static bool do_help_cmd(int argc, char *argv[])
      param_ptr plist = param_list;
      report(1, "Options:");
      while (plist) {
-         report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+         report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
                 plist->documentation);
          plist = plist->next;
      }
@@ -342,7 +363,7 @@ static bool do_option_cmd(int argc, char *argv[])
          param_ptr plist = param_list;
          report(1, "Options:");
          while (plist) {
-             report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+             report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
                     plist->documentation);
              plist = plist->next;
          }
@@ -365,10 +386,10 @@ static bool do_option_cmd(int argc, char *argv[])
          param_ptr plist = param_list;
          while (!found && plist) {
              if (strcmp(plist->name, name) == 0) {
-                 int oldval = *plist->valp;
-                 *plist->valp = value;
+                 void *oldvalp = plist->valp;
+                 memcpy(oldvalp, &value, plist->valsize);
                  if (plist->setter)
-                     plist->setter(oldval);
+                     plist->setter(oldvalp, plist->valsize);
                  found = true;
              } else
                  plist = plist->next;

qtest.c

--- a/qtest.c
+++ b/qtest.c
@@ -98,11 +98,11 @@ static void console_init()
      add_cmd("size", do_size,
              " [n]            | Compute queue size n times (default: n == 1)");
      add_cmd("show", do_show, "                | Show queue contents");
-     add_param("length", &string_length, "Maximum length of displayed string",
-               NULL);
-     add_param("malloc", &fail_probability, "Malloc failure probability percent",
-               NULL);
-     add_param("fail", &fail_limit,
+     add_param("length", (void *) &string_length, sizeof(string_length),
+               "Maximum length of displayed string", NULL);
+     add_param("malloc", (void *) &fail_probability, sizeof(fail_probability),
+               "Malloc failure probability percent", NULL);
+     add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
               "Number of times allow queue operations to return false",  NULL);
  }

Valgrind + Massif Heap Profiler

安裝 massif-visualizer

$ sudo apt install massif-visualizer

使用 valgrind + massif 進行測試

$ valgrind -q --tool=massif --stack=yes --time-unit=i --massif-out-file=massif.out ./qtest
...
cmd> option fail 0
cmd> option malloc 0
cmd> new
cmd> ih RAND 10
cmd> reverse
cmd> sort
cmd> rh
... 總共執行 rh 10 次
cmd> quit

使用 massif-visualizer 視覺化

$ massif-visualizer massif.out

Dudect 相關研究

select 系統呼叫

自動評分系統相關補充

如何呼叫使用者輸入命令對應的函式

一開始使用者執行 qtest 後，輸入命令時，會先執行 console.c 中的 interpret_cmd 並呼叫 parse_args 來將輸入的字串以空白為區隔，轉換為 argc 和 argv，如下

bool interpret_cmd(char *cmdline)
{
    int argc;
    ...
    char **argv = parse_args(cmdline, &argc);
    bool ok = interpret_cmda(argc, argv);
    ...
    return ok;
}

之後在函式 interpret_cmda 中，利用迴圈走訪 singly linked list，並利用 strcmp 找出函式相對應的字串。可以發現在 qtest 中使用函式指標的方式可以很簡單方便的維護使用者可輸入的命令

static bool interpret_cmda(int argc, char *argv[])
{
    if (argc == 0)
        return true;
    /* Try to find matching command */
    cmd_ptr next_cmd = cmd_list;
    bool ok = true;
    while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
        next_cmd = next_cmd->next;
    if (next_cmd) {
        ok = next_cmd->operation(argc, argv);
        if (!ok)
            record_error();
    } else {
        report(1, "Unknown command '%s'", argv[0]);
        record_error();
        ok = false;
    }
    return ok;
}

在寫好 trace file 後程式如何運作

在 qtest 中有提供參數 -f <filename> 來讀取 trace file。當使用者輸入命令後，在 qtest.c 中的 main 函式會呼叫 getopt 來解析使用者先前輸入命令的後面所添加的參數，節錄如下

int main(int argc, char *argv[])
{
    ...
    while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
        switch (c) {
        case 'h':
            usage(argv[0]);
            break;
        case 'f':
            strncpy(buf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            infile_name = buf;
            break;
        case 'v':
            level = atoi(optarg);
            break;
        case 'l':
            strncpy(lbuf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            logfile_name = lbuf;
            break;
        default:
            printf("Unknown option '%c'\n", c);
            usage(argv[0]);
            break;
        }
    }
    ...
}

getopt 是一個可以很方便的去 parse 從 command line 傳過來的參數的函式，該函式會一直讀取參數直到沒有更多的參數可以讀時才回傳 -1

function prototype:
int getopt(int argc, char *argv[], const char *optstring)
int argc: the argument count passed to the main() function
char *argv[]: the argument array passed to the main() function
const char *optstring: a string containing the legitimate option characters
- 一個字元︰一個可用的 option
- 一個字元搭配冒號︰該 option 後有額外帶參數
- 一個字元當配兩個冒號︰該 option 後帶的參數是可選的 (此為 GNU extension)
getopt 回傳數值後
- 成功回傳 option 字元，無更多 option 可讀則回傳 -1，在 optstring 找不到 option 字元則回傳 ?，在 optstring 設定需帶參數的 option 沒有帶參數的話則回傳 :
- optarg 變數儲存該 option 後面所帶的參數
- optind 變數儲存下次呼叫 getopt 時要處理的 argv 的 index
- optopt 變數儲存當 getopt 找不到 option 字元或是缺少某些參數時的該 option
注意事項
- getopt 是 POSIX standard 但是並不是 C standard 並在 glibc 實作中的某些行為是 GNU extensions，這點需要特別注意
參考資料
- man 3 getopt
- Oracle Docs
- POSIX-1.200x Draft
- Michael Kerrisk, The Linux Programming Interface: Appendix B

在 console.c 中定義以下 structure 來儲存每個要使用的檔案的 file descritptor

#define RIO_BUFSIZE 8192
typedef struct RIO_ELE rio_t, *rio_ptr;

struct RIO_ELE {
    int fd;                /* File descriptor */
    int cnt;               /* Unread bytes in internal buffer */
    char *bufptr;          /* Next unread byte in internal buffer */
    char buf[RIO_BUFSIZE]; /* Internal buffer */
    rio_ptr prev;          /* Next element in stack */
};

RIO 在 CS:APP 第 10.5 章提及，全名為 Robust I/O
Ref: System-Level I/O (preview version)

並在 push_file 函式中把開啟檔案的 file descriptor 放進 struct RIO_ELE 中的 fd，若無指定開啟檔案的路徑則選擇為標準輸入 (也就是 stdin)

static bool push_file(char *fname)
{
    int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
    if (fd < 0)
        return false;
    if (fd > fd_max)
        fd_max = fd;
    rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
    rnew->fd = fd;
    rnew->cnt = 0;
    rnew->bufptr = rnew->buf;
    rnew->prev = buf_stack;
    buf_stack = rnew;
    return true;
}

如何偵測記憶體洩漏

harness.[ch] 中藉由替換掉 malloc 和 free 的實作方式使得可以檢查到 allocation error

從 harness.h 中可以見到 qtest 使用 macro 來把 malloc 和 free 替換成 test_malloc 和 test_free

/* Tested program use our versions of malloc and free */
#define malloc test_malloc
#define free test_free

而從 test_malloc 所配置出來的記憶體都會被紀錄在 struct BELE 中，structure 在 harness.c 中定義如下並以 doubly-linked list 來串接節點

typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

在 structure 中 payload[0] 是一個 struct hack，在 GCC manual 中稱呼為 Arry of Length Zero，藉由這種方式可以做到讓 structure 中的成員可以動態的配置記憶體又不用因為多次呼叫 malloc 造成記憶體碎片化

用法如下

block_ele_t *new_block = malloc(array_size + sizeof(block_ele_t));

這樣的手法其實就是利用 GNU C compiler 預設不會去檢查 Array index out of bounds 的問題來做到的，但是使用這個技巧需要思考以下的問題

Flexible array members 需為 incomplete type，並且 sizeof 不能用在 incomplete type 上
Flexible array members 需宣告在最後一個 non-empty structure 成員上
Flexible array members 不能單獨出現，至少需一個成員在
Flexible array members 本身不可以宣告或包含在 union 中
Clang 支援該 GNU extension，但不支援 static initialization

Note that clang does support flexible array members (arrays with zero or unspecified size at the end of a structure).

clang does not support static initialization of flexible array members. This appears to be a rarely used extension, but could be implemented pending user demand.

另外在 ISO C90 可以使用 Array of Length One 來做到同樣的事情，而從 ISO C99 開始支援 Flexible Array Members，請參照

Wikipedia
C99 §6.7.2.1, item 16

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

如何配置記憶體

test_malloc 把經由 malloc 配置出來的新的記憶體空間，將其串連到 doubly linked list allocated 中，其中配置出記憶體空間的寫法如下

block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t));

可以看到除了 size 和 sizeof(block_ele_t) 以外，還多了 sizeof(size_t)，這是因為在 qtest 中，藉由在尾端多配置出一個空間並填入 magic number 來查看若當該數值有被更動過的話，即表示出現 memory corruption (array access out of bounds)，以及偵測是否是由 test_malloc 配置出來的記憶體空間，而前一個成員 magic_header 也是基於此用途的。harness.c 中的 test_malloc 節錄如下

void *test_malloc(size_t size)
{
    if (fail_allocation()) {
        report_event(MSG_WARN, "Malloc returning NULL");
        return NULL;
    }
    block_ele_t *new_block =
        malloc(size + sizeof(block_ele_t) + sizeof(size_t));
    ...
    new_block->magic_header = MAGICHEADER;
    new_block->payload_size = size;
    *find_footer(new_block) = MAGICFOOTER;
    void *p = (void *) &new_block->payload;
    memset(p, FILLCHAR, size);
    new_block->next = allocated;
    new_block->prev = NULL;
    if (allocated)
        allocated->prev = new_block;
    allocated = new_block;
    allocated_count++;
    return p;
}

在 fail_allocation 中，實作了當亂數產生出來的數小於 0.01 乘上預先設定好的失敗機率 (fail_probability) 的話，即 malloc failure，為何需要如此實作？

MetalheadKen

回去複習機率統計，思考上述做法是否合理，若有疑慮就提出論證和實驗
:notes: jserv

其中為了方便取得到 payload 的尾端來指派 magic number (MAGICFOOTER)，qtest 實作了 find_footer 函式，如下

static size_t *find_footer(block_ele_t *b)
{
    size_t *p = 
        (size_t *) ((size_t) b + b->payload_size + sizeof(block_ele_t));
    return p;
}

藉由傳進來的 block_ele_t * 的開頭，在加上 payload size 和該 structure 本身的大小來取得到 payload 的尾端

2020q1 Homework1 (lab0)

實驗環境

題目摘要

實作流程

queue.h

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_size

q_reverse

q_sort