Try   HackMD

2020q1 Homework1 (lab0)

contributed by < MetalheadKen >

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
題目出處

實驗環境

$ uname -a
Linux kendai-XPS-13-9360 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version | head -n 1
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

題目摘要

  • Some of the skills tested are :

    • Explicit memory management, as required in C
    • Creating and manipulating pointer-based data structures
    • Working with strings
    • Enhancing the proformance of key operations by storing redundant information in data structures
    • Implementing robust code that operates correctly with invalid arguments, including NULL pointers
  • Implementing a queue

    • Supporting both last-in, first-out (LIFO) and first-in-first-out (FIFO)
    • The underlying data structure is a singly-linked list
  • Resources

  • The task is to modify the code in queue.h and queue.c to fully implement the following functions

    • q_new: Create a new, empty queue
    • q_free: Free all storage used by a queue
    • q_insert_head: Attempt to insert a new element at the head of the queue (LIFO discipline)
    • q_insert_tail: Attempt to insert a new element at the tail of the queue (FIFO discipline)
    • q_remove_head: Attempt to remove the element at the head of the queue
    • q_size: Compute the number of elements in the queue
    • q_reverse: Reorder the list so that the queue elements are reversed in order.
    • q_sort: Sort elements of queue in ascending order.
  • Precautions

    • A NULL queue is one for which the pointer has value NULL. An empty queue is one pointing to a valid structure, but the head field has value NULL
    • For functions that provide strings as arguments, you must create and store a copy of the string by calling malloc to allocate space and then copying from the source to the newly allocated space
    • You cannot assume any fixed upper bound on the length of a string ─ you must allocate space for each string based on its length
    • q_insert_tail and q_size will require that your implementations operate in time
      O(1)
    • q_reverse should not allocate or free any list elements. Instead, it should rearrange the existing elements

實作流程

queue.h

依據題意,為了讓函式 q_insert_tailq_size 的時間複雜度為

O(1),在 struct queue_t 裡新增成員 tailsize,使得在 runtime 的時候可以動態紀錄 queue 的尾端和其大小

/* Queue structure */
typedef struct {
    list_ele_t *head;
    list_ele_t **tail;
    size_t size;
} queue_t;

q_new

需注意若 malloc 失敗時需回傳 NULL

queue_t *q_new()
{
    queue_t *q = (queue_t *) malloc(sizeof(queue_t));
    if (!q)
        return NULL;

    q->head = NULL;
    q->tail = &(q->head);
    q->size = 0;
    return q;
}

q_free

釋放 queue 的空間,注意記得一併釋放節點和字串所佔的記憶體空間

void q_free(queue_t *q)
{
    if (!q)
        return;

    while (q->head != NULL) {
        list_ele_t *tmp = q->head;
        q->head = q->head->next;

        free(tmp->value);
        free(tmp);
    }

    free(q);
}

q_insert_head

  • 分別建立節點和字串的空間,之後進行字串複製和節點的串接
  • 需注意若 malloc 失敗時,在回傳 NULL 前需釋放之前已經配置好的記憶體空間
  • 紀錄節點數量
bool q_insert_head(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newh;
    newh = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newh)
        goto ERR;

    int len = strlen(s) + 1;
    char *str = (char *) malloc(len);
    if (!str)
        goto ERR_FREE_LIST;

    memcpy(str, s, len);

    newh->value = str;
    newh->next = q->head;

    if (!q->head) {
        q->tail = &(newh->next);
    }
    q->head = newh;
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newh);
ERR:
    return false;
}

q_insert_tail

bool q_insert_tail(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newt = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newt)
        goto ERR;

    int len = strlen(s) + 1;
    char *str = (char *) malloc(len);
    if (!str)
        goto ERR_FREE_LIST;

    memcpy(str, s, len);

    newt->value = str;
    newt->next = NULL;

    *(q->tail) = newt;
    q->tail = &(newt->next);
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newt);
ERR:
    return false;
}

q_remove_head

  • sp 不為 NULL 時代表需把移除的字串複製到 sp
  • 注意有可能要移除的字串的長度超過 sp 的大小,因而最多只需複製長度為 bufsize - 1,並注意要在結尾上加上 terminating null byte
  • 調整節點數量
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q->head)
        return false;

    if (sp != NULL) {
        strncpy(sp, q->head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

    list_ele_t *tmp = q->head;
    q->head = q->head->next;
    q->tail = (--q->size == 0) ? &(q->head) : q->tail;

    free(tmp->value);
    free(tmp);
    return true;
}

q_size

  • 題意規定時間複雜度為
    O(1)
    ,因此藉由 size 動態紀錄 queue 的大小,並在此函式直接回傳 size 的內容
  • 需注意有可能在沒有使用 new command 的情況下直接使用 size command,所以需判斷若 queue 為 NULL queue 或 empty queue 則直接回傳數值 0
int q_size(queue_t *q)
{
    if (!q || !q->head) {
        return 0;
    } else {
        return q->size;
    }
}

q_reverse

  • 若 queue 為 NULL queue 或 empty queue 或長度為 1 的 queue 則無需 reverse
  • 使用 prev, curr, next 來紀錄反轉的過程
    • prev 紀錄 queue 前一個的節點
    • curr 紀錄 queue 當前的節點,在反轉過程中,curr 的下一個節點應指派為 curr 的前一個節點
    • next 紀錄 queue 下一個的節點
  • 注意最後 headtail 需重設方向
void q_reverse(queue_t *q)
{
    if (!q || !q->head || !q->head->next)
        return;

    list_ele_t *prev, *curr, *next;

    prev = q->head;
    curr = q->head->next;
    q->head->next = NULL;
    q->tail = &(q->head->next);

    while (curr != NULL) {
        next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
    q->head = prev;
}

q_sort

  • 在這邊使用了 merge sort 來排序 queue 裡面的元素
  • 可以注意到 merge_sort 裡面透過 fastslow pointer 來找到該 list 的中間點為何
  • 剛好 LeetCode 上有相關題目可以練練手,順便可以用獨立環境來跑一下 unit test
void q_sort(queue_t *q)
{
    if (!q || !q->head || !q->head->next)
        return;

    q->head = merge_sort(q->head);

    q->tail = &(q->head);
    while (*(q->tail)) {
        q->tail = &((*q->tail)->next);
    }
}

static list_ele_t *merge_sort(list_ele_t *head)
{
    if (!head || !head->next)
        return head;

    list_ele_t *slow = head;
    list_ele_t *fast = head->next;

    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }

    list_ele_t *mid = slow->next;
    slow->next = NULL;

    return merge(merge_sort(head), merge_sort(mid));
}

/* Merge two list in ascending order */
static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
    if (!left)
        return right;
    if (!right)
        return left;

    list_ele_t *dummy = NULL;
    list_ele_t *curr = NULL;

    if (strcmp(left->value, right->value) > 0) {
        dummy = right;
        right = right->next;
    } else {
        dummy = left;
        left = left->next;
    }
    curr = dummy;

    while (left && right) {
        if (strcmp(left->value, right->value) > 0) {
            curr->next = right;
            right = right->next;
        } else {
            curr->next = left;
            left = left->next;
        }
        curr = curr->next;
    }

    if (left)
        curr->next = left;
    if (right)
        curr->next = right;

    return dummy;
}

Implement Natural Sort

這邊採用 sourcefrog/natsort 所實作的 natural sort 的 compare function - strnatcmp。從原專案中取得 strnatcmp.[ch] 後放入 lab0-c 的專案中。

這邊為了明確的表示是使用別人的專案,原本是想採用 git submodule 的方式建構的,但是考慮到若要對 natsort 專案做改動,勢必要 push 回去到原專案中,或者在當初 submodule init 時只能指定 fork 後的專案,但都相對的要麻煩許多。

不知道有什麼方式可以除了把有 reference 到的專案之相關訊息寫進 commit message 以外達到明確的使用到以及隨時跟新進度在別人的專案上。

MetalheadKen

這是練習撰寫清晰的 git commit messages 的好機會

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

接著把在排序中採用到的 strcmp 一律改成 strnatcmp

  • qtest.c
bool do_sort(int argc, char *argv[])
{
...
    for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
        /* Ensure each element in ascending order */
        /* FIXME: add an option to specify sorting order */
        if (strnatcmp(e->value, e->next->value) > 0) {
            report(1, "ERROR: Not sorted in ascending order");
            ok = false;
            break;
...
}
  • queue.c
static list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
...
    if (strnatcmp(left->value, right->value) > 0) {
        dummy = right;
        right = right->next;
    } else {
...
    while (left && right) {
        if (strnatcmp(left->value, right->value) > 0) {
            curr->next = right;
            right = right->next;
        } else {
...
}

更改 Makefile,使其可以編譯 strnatcmp.c

  • Makefile
...
OBJS := qtest.o report.o console.o harness.o queue.o \
        random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
        natsort/strnatcmp.o
...

為了對現有的 natsort 做測試,增加新的 trace file

  • traces/trace-18-natsort.cmd
# Test operation of natural sort
option fail 0
option malloc 0
new
ih fred
ih pic2
ih pic100a
ih pic120
ih pic121
ih jane
ih tom
ih pic02a
ih pic3
ih pic4
ih 1-20
ih pic100
ih pic02000
sort
  • script/driver.py
--- a/scripts/driver.py
+++ b/scripts/driver.py
@@ -35,7 +35,8 @@ class Tracer:
         14: "trace-14-perf",
         15: "trace-15-perf",
         16: "trace-16-perf",
-        17: "trace-17-complexity"
+        17: "trace-17-complexity",
+        18: "trace-18-natsort"
     }
 
     traceProbs = {
@@ -55,10 +56,11 @@ class Tracer:
         14: "Trace-14",
         15: "Trace-15",
         16: "Trace-16",
-        17: "Trace-17"
+        17: "Trace-17",
+        18: "Trace-18"
     }
 
-    maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5]
+    maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6]
 
     RED = '\033[91m'
     GREEN = '\033[92m'

trace-16-perf.cmd 採用到的 RAND 提供英文與數字的亂數排序,其實只要新增數字到 charset 就可以了

  • qtest.c
--- a/qtest.c
+++ b/qtest.c
@@ -60,7 +60,7 @@ static int string_length = MAXSTRING;
 
  #define MIN_RANDSTR_LEN 5
  #define MAX_RANDSTR_LEN 10
- static const char charset[] = "abcdefghijklmnopqrstuvwxyz";
+ static const char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789";

執行 make test 後出現以下錯誤,依錯誤訊息可得知為超時

...
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
...

這是因為在 natsort 中所使用的 strnatcmp 比 glibc 所提供的 strcmp 的執行時間還要長的緣故。為了解決這個問題,這邊採用的方法為新增一個新的 command option time,藉由 option time <limit=1> 可以指定在這次測試中所會參考的 time limit。

  • qtest.c
static void console_init()
{
...
    add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
              "Number of times allow queue operations to return false", NULL);
    add_param("time", (void *) &time_limit, sizeof(time_limit),
              "Maximum number of time limit", NULL);
}

為了要讓 time_limit 可以動態更改,必須把 time_limit 改為 global variable

  • harness.c
--- a/harness.c
+++ b/harness.c
@@ -52,7 +52,7 @@ static bool noallocate_mode = false;
  static bool error_occurred = false;
  static char *error_message = "";

- static int time_limit = 1;
+ int time_limit = 1;

  /*
   * Data for managing exceptions
  • harness.h
--- a/harness.c
+++ b/harness.c;
@@ -25,6 +25,9 @@ size_t allocation_check();
  /* Probability of malloc failing, expressed as percent */
  extern int fail_probability;

+ /* Amount of time limit */
+ extern int time_limit;
+ 
  /*
   * Set/unset cautious mode.
   * In this mode, makes extra sure any block to be freed is currently allocated.;

加入 option time 到會因為 natsort 的緣故而造成 Time limit exceeded 的 trace 中

  • traces/trace-15-perf.cmd
--- a/traces/trace-15-perf.cmd
+++ b/traces/trace-15-perf.cmd
@@ -1,6 +1,8 @@
  # Test performance of insert_tail, size, reverse, and sort
  option fail 0
  option malloc 0
+ # For test natural sort, we extend the amount of time limit
+ option time 3
  new
  ih dolphin 1000000
  it gerbil 1000000
...

Address Sanitizer

執行 make SANITIZER=1 test 後出現下列錯誤

...
# Test if q_insert_tail and q_size is constant time complexity   
=================================================================
==26442==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55ba43b1e9c0 at pc 0x55ba4390f86e bp 0x7ffe34a65e40 sp 0x7ffe34a65e30
READ of size 4 at 0x55ba43b1e9c0 thread T0                       
    #0 0x55ba4390f86d in do_option_cmd /home/kendai/git/lab0-c/console.c:368
    #1 0x55ba4390e48a in interpret_cmda /home/kendai/git/lab0-c/console.c:220
    #2 0x55ba4390ee8e in interpret_cmd /home/kendai/git/lab0-c/console.c:243
    #3 0x55ba4390fb77 in cmd_select /home/kendai/git/lab0-c/console.c:569
    #4 0x55ba4390ff94 in run_console /home/kendai/git/lab0-c/console.c:628
    #5 0x55ba4390d0ad in main /home/kendai/git/lab0-c/qtest.c:772
    #6 0x7f0b512e6b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #7 0x55ba4390a809 in _start (/home/kendai/git/lab0-c/qtest+0x6809)
                                
0x55ba43b1e9c1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x55ba43b1e9c0) of size 1
  'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow /home/kendai/git/lab0-c/console.c:368 in do_option_cmd
...
+++ TESTING trace trace-17-complexity:
---     trace-17-complexity     0/5
---     TOTAL           95/100

先用 GDB 定位問題

$ gdb -q --args ./qtest -f traces/trace-17-complexity.cmd
...
(gdb) b console.c:368
(gdb) r
...
Breakpoint 1, do_option_cmd (argc=3, argv=0x555555763ce0) at console.c:368
368                     int oldval = *plist->valp;
(gdb) p *plist
$1 = {name = 0x55555555bd47 "simulation", valp = 0x55555575e520 <simulation>, documentation = 0x55555555bd2c "Start/Stop simulation mode", setter = 0x0, 
  next = 0x5555557613e0}
(gdb) whatis (*plist).valp
type = int *
(gdb) whatis simulation
type = _Bool
(gdb) p sizeof(_Bool)
$2 = 1

從這邊可以知道由於 simulationbool 轉成 int * 並存入 valp,之後當直接 dereference 的時候就會因為要取出 4 bytes 的資料而存取到不該存取的地方故造成 buffer overflow

為了要解決 global-buffer-overflow,我們需要把轉型成 int * 的地方改為 void *,根據 C99 規格,pointer to void 做為一個 generic pointer 可以把原本的 object type 轉成 pointer to void 再轉回來。藉由於此,我們可以透過 pointer to void 來做到不同的 data type 的存取

C99 Spec 6.3.2.3:1 [Pointers]

A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

除了 void * 來儲存各種的 data type 以外,還需要一個 variable 來儲存原本的 object 所佔的大小。最後,根據上面所述,我們需要修改以下動作來解決所需的問題

  • 修改 struct PELE 來儲存 void * 和原始的 object size
  • 修改對應的 function argument 來完成儲存 void * 和原始的 object size 的動作
  • 新增一個新的 function 來讓 void * 可以透過原始的 object size 來轉型回原來的 data type

修改如下

  • console.h
--- a/console.h
+++ b/console.h
@@ -23,13 +23,14 @@ struct CELE {
 };
 
  /* Optionally supply function that gets invoked when parameter changes */
- typedef void (*setter_function)(int oldval);
+ typedef void (*setter_function)(void *oldval, int oldsize);
 
  /* Integer-valued parameters */
  typedef struct PELE param_ele, *param_ptr;
  struct PELE {
      char *name;
-     int *valp;
+     void *valp;
+     int valsize;
      char *documentation;
      /* Function that gets called whenever parameter changes */
      setter_function setter;
@@ -44,7 +45,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation);
 
  /* Add a new parameter */
  void add_param(char *name,
-                int *valp,
+                void *valp,
+                int valsize,
                 char *doccumentation,
                 setter_function setter);
  • console.c
--- a/console.c
+++ b/console.c
@@ -82,6 +82,8 @@ static void pop_file();
 
  static bool interpret_cmda(int argc, char *argv[]);
 
+ static uint32_t get_plist_valp(param_ptr p);
+ 
  /* Initialize interpreter */
  void init_cmd()
  {
@@ -99,11 +101,14 @@ void init_cmd()
      add_cmd("log", do_log_cmd, " file           | Copy output to file");
      add_cmd("time", do_time_cmd, " cmd arg ...    | Time command execution");
      add_cmd("#", do_comment_cmd, " ...            | Display comment");
-     add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
+     add_param("simulation", (void *) &simulation, sizeof(simulation),
+               "Start/Stop simulation mode", NULL);
+     add_param("verbose", (void *) &verblevel, sizeof(verblevel),
+               "Verbosity level", NULL);
+     add_param("error", (void *) &err_limit, sizeof(err_limit),
+               "Number of errors until exit", NULL);
+     add_param("echo", (void *) &echo, sizeof(echo), "Do/don't echo commands",
               NULL);
-     add_param("verbose", &verblevel, "Verbosity level", NULL);
-     add_param("error", &err_limit, "Number of errors until exit", NULL);
-     add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
 
      init_in();
      init_time(&last_time);
@@ -130,7 +135,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation)
 
  /* Add a new parameter */
  void add_param(char *name,
-                int *valp,
+                void *valp,
+                int valsize,
                 char *documentation,
                 setter_function setter)
  {
@@ -144,6 +150,7 @@ void add_param(char *name,
      param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param");
      ele->name = name;
      ele->valp = valp;
+     ele->valsize = valsize;
      ele->documentation = documentation;
      ele->setter = setter;
      ele->next = next_param;
@@ -248,6 +255,20 @@ static bool interpret_cmd(char *cmdline)
      return ok;
  }
 
+ static uint32_t get_plist_valp(param_ptr p)
+ {
+     switch (p->valsize) {
+     case 1:
+         return (*(uint8_t *) p->valp);
+     case 2:
+         return (*(uint16_t *) p->valp);
+     default:
+         return (*(uint32_t *) p->valp);
+     }
+ }
+ 
  /* Set function to be executed as part of program exit */
  void add_quit_helper(cmd_function qf)
  {
@@ -303,7 +324,7 @@ static bool do_help_cmd(int argc, char *argv[])
      param_ptr plist = param_list;
      report(1, "Options:");
      while (plist) {
-         report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+         report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
                 plist->documentation);
          plist = plist->next;
      }
@@ -342,7 +363,7 @@ static bool do_option_cmd(int argc, char *argv[])
          param_ptr plist = param_list;
          report(1, "Options:");
          while (plist) {
-             report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
+             report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist),
                     plist->documentation);
              plist = plist->next;
          }
@@ -365,10 +386,10 @@ static bool do_option_cmd(int argc, char *argv[])
          param_ptr plist = param_list;
          while (!found && plist) {
              if (strcmp(plist->name, name) == 0) {
-                 int oldval = *plist->valp;
-                 *plist->valp = value;
+                 void *oldvalp = plist->valp;
+                 memcpy(oldvalp, &value, plist->valsize);
                  if (plist->setter)
-                     plist->setter(oldval);
+                     plist->setter(oldvalp, plist->valsize);
                  found = true;
              } else
                  plist = plist->next;
  • qtest.c
--- a/qtest.c
+++ b/qtest.c
@@ -98,11 +98,11 @@ static void console_init()
      add_cmd("size", do_size,
              " [n]            | Compute queue size n times (default: n == 1)");
      add_cmd("show", do_show, "                | Show queue contents");
-     add_param("length", &string_length, "Maximum length of displayed string",
-               NULL);
-     add_param("malloc", &fail_probability, "Malloc failure probability percent",
-               NULL);
-     add_param("fail", &fail_limit,
+     add_param("length", (void *) &string_length, sizeof(string_length),
+               "Maximum length of displayed string", NULL);
+     add_param("malloc", (void *) &fail_probability, sizeof(fail_probability),
+               "Malloc failure probability percent", NULL);
+     add_param("fail", (void *) &fail_limit, sizeof(fail_limit),
               "Number of times allow queue operations to return false",  NULL);
  }

Valgrind + Massif Heap Profiler

安裝 massif-visualizer

$ sudo apt install massif-visualizer

使用 valgrind + massif 進行測試

$ valgrind -q --tool=massif --stack=yes --time-unit=i --massif-out-file=massif.out ./qtest
...
cmd> option fail 0
cmd> option malloc 0
cmd> new
cmd> ih RAND 10
cmd> reverse
cmd> sort
cmd> rh
... 總共執行 rh 10 次
cmd> quit

使用 massif-visualizer 視覺化

$ massif-visualizer massif.out

Dudect 相關研究

select 系統呼叫

自動評分系統相關補充

如何呼叫使用者輸入命令對應的函式

一開始使用者執行 qtest 後,輸入命令時,會先執行 console.c 中的 interpret_cmd 並呼叫 parse_args 來將輸入的字串以空白為區隔,轉換為 argcargv,如下

bool interpret_cmd(char *cmdline)
{
    int argc;
    ...
    char **argv = parse_args(cmdline, &argc);
    bool ok = interpret_cmda(argc, argv);
    ...
    return ok;
}

之後在函式 interpret_cmda 中,利用迴圈走訪 singly linked list,並利用 strcmp 找出函式相對應的字串。可以發現在 qtest 中使用函式指標的方式可以很簡單方便的維護使用者可輸入的命令

static bool interpret_cmda(int argc, char *argv[])
{
    if (argc == 0)
        return true;
    /* Try to find matching command */
    cmd_ptr next_cmd = cmd_list;
    bool ok = true;
    while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
        next_cmd = next_cmd->next;
    if (next_cmd) {
        ok = next_cmd->operation(argc, argv);
        if (!ok)
            record_error();
    } else {
        report(1, "Unknown command '%s'", argv[0]);
        record_error();
        ok = false;
    }
    return ok;
}

在寫好 trace file 後程式如何運作

qtest 中有提供參數 -f <filename> 來讀取 trace file。當使用者輸入命令後,在 qtest.c 中的 main 函式會呼叫 getopt 來解析使用者先前輸入命令的後面所添加的參數,節錄如下

int main(int argc, char *argv[])
{
    ...
    while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
        switch (c) {
        case 'h':
            usage(argv[0]);
            break;
        case 'f':
            strncpy(buf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            infile_name = buf;
            break;
        case 'v':
            level = atoi(optarg);
            break;
        case 'l':
            strncpy(lbuf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            logfile_name = lbuf;
            break;
        default:
            printf("Unknown option '%c'\n", c);
            usage(argv[0]);
            break;
        }
    }
    ...
}

getopt 是一個可以很方便的去 parse 從 command line 傳過來的參數的函式,該函式會一直讀取參數直到沒有更多的參數可以讀時才回傳 -1

  • function prototype:
    int getopt(int argc, char *argv[], const char *optstring)
  • int argc: the argument count passed to the main() function
  • char *argv[]: the argument array passed to the main() function
  • const char *optstring: a string containing the legitimate option characters
    • 一個字元︰ 一個可用的 option
    • 一個字元搭配冒號︰該 option 後有額外帶參數
    • 一個字元當配兩個冒號︰該 option 後帶的參數是可選的 (此為 GNU extension)
  • getopt 回傳數值後
    • 成功回傳 option 字元,無更多 option 可讀則回傳 -1,在 optstring 找不到 option 字元則回傳 ?,在 optstring 設定需帶參數的 option 沒有帶參數的話則回傳 :
    • optarg 變數儲存該 option 後面所帶的參數
    • optind 變數儲存下次呼叫 getopt 時要處理的 argv 的 index
    • optopt 變數儲存當 getopt 找不到 option 字元或是缺少某些參數時的該 option
  • 注意事項
    • getopt 是 POSIX standard 但是並不是 C standard 並在 glibc 實作中的某些行為是 GNU extensions,這點需要特別注意
  • 參考資料

console.c 中定義以下 structure 來儲存每個要使用的檔案的 file descritptor

#define RIO_BUFSIZE 8192
typedef struct RIO_ELE rio_t, *rio_ptr;

struct RIO_ELE {
    int fd;                /* File descriptor */
    int cnt;               /* Unread bytes in internal buffer */
    char *bufptr;          /* Next unread byte in internal buffer */
    char buf[RIO_BUFSIZE]; /* Internal buffer */
    rio_ptr prev;          /* Next element in stack */
};

RIO 在 CS:APP 第 10.5 章提及,全名為 Robust I/O
Ref: System-Level I/O (preview version)

並在 push_file 函式中把開啟檔案的 file descriptor 放進 struct RIO_ELE 中的 fd,若無指定開啟檔案的路徑則選擇為標準輸入 (也就是 stdin)

static bool push_file(char *fname)
{
    int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
    if (fd < 0)
        return false;
    if (fd > fd_max)
        fd_max = fd;
    rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
    rnew->fd = fd;
    rnew->cnt = 0;
    rnew->bufptr = rnew->buf;
    rnew->prev = buf_stack;
    buf_stack = rnew;
    return true;
}

如何偵測記憶體洩漏

harness.[ch] 中藉由替換掉 mallocfree 的實作方式使得可以檢查到 allocation error

harness.h 中可以見到 qtest 使用 macro 來把 mallocfree 替換成 test_malloctest_free

/* Tested program use our versions of malloc and free */
#define malloc test_malloc
#define free test_free

而從 test_malloc 所配置出來的記憶體都會被紀錄在 struct BELE 中,structure 在 harness.c 中定義如下並以 doubly-linked list 來串接節點

typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

在 structure 中 payload[0] 是一個 struct hack,在 GCC manual 中稱呼為 Arry of Length Zero,藉由這種方式可以做到讓 structure 中的成員可以動態的配置記憶體又不用因為多次呼叫 malloc 造成記憶體碎片化

用法如下

block_ele_t *new_block = malloc(array_size + sizeof(block_ele_t));

這樣的手法其實就是利用 GNU C compiler 預設不會去檢查 Array index out of bounds 的問題來做到的,但是使用這個技巧需要思考以下的問題

  • Flexible array members 需為 incomplete type,並且 sizeof 不能用在 incomplete type 上
  • Flexible array members 需宣告在最後一個 non-empty structure 成員上
  • Flexible array members 不能單獨出現,至少需一個成員在
  • Flexible array members 本身不可以宣告或包含在 union 中
  • Clang 支援該 GNU extension,但不支援 static initialization

    Note that clang does support flexible array members (arrays with zero or unspecified size at the end of a structure).

    clang does not support static initialization of flexible array members. This appears to be a rarely used extension, but could be implemented pending user demand.

另外在 ISO C90 可以使用 Array of Length One 來做到同樣的事情,而從 ISO C99 開始支援 Flexible Array Members,請參照

  • Wikipedia
  • C99 §6.7.2.1, item 16

    As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

如何配置記憶體

test_malloc 把經由 malloc 配置出來的新的記憶體空間,將其串連到 doubly linked list allocated 中,其中配置出記憶體空間的寫法如下

block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t));

可以看到除了 sizesizeof(block_ele_t) 以外,還多了 sizeof(size_t),這是因為在 qtest 中,藉由在尾端多配置出一個空間並填入 magic number 來查看若當該數值有被更動過的話,即表示出現 memory corruption (array access out of bounds),以及偵測是否是由 test_malloc 配置出來的記憶體空間,而前一個成員 magic_header 也是基於此用途的。harness.c 中的 test_malloc 節錄如下

void *test_malloc(size_t size)
{
    if (fail_allocation()) {
        report_event(MSG_WARN, "Malloc returning NULL");
        return NULL;
    }
    block_ele_t *new_block =
        malloc(size + sizeof(block_ele_t) + sizeof(size_t));
    ...
    new_block->magic_header = MAGICHEADER;
    new_block->payload_size = size;
    *find_footer(new_block) = MAGICFOOTER;
    void *p = (void *) &new_block->payload;
    memset(p, FILLCHAR, size);
    new_block->next = allocated;
    new_block->prev = NULL;
    if (allocated)
        allocated->prev = new_block;
    allocated = new_block;
    allocated_count++;
    return p;
}

fail_allocation 中,實作了當亂數產生出來的數小於 0.01 乘上預先設定好的失敗機率 (fail_probability) 的話,即 malloc failure,為何需要如此實作?

MetalheadKen

回去複習機率統計,思考上述做法是否合理,若有疑慮就提出論證和實驗

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

其中為了方便取得到 payload 的尾端來指派 magic number (MAGICFOOTER),qtest 實作了 find_footer 函式,如下

static size_t *find_footer(block_ele_t *b)
{
    size_t *p = 
        (size_t *) ((size_t) b + b->payload_size + sizeof(block_ele_t));
    return p;
}

藉由傳進來的 block_ele_t * 的開頭,在加上 payload size 和該 structure 本身的大小來取得到 payload 的尾端