2019q1 Homework1 (lab0)

contributed by <grant7163>

tags: `sysprog2019_q1`

實驗目標

queue data structure as below










/* Linked list element (You shouldn't need to change this) */
typedef struct ELE {
  char *value; /* Pointer to array holding string.*/
  struct ELE *next;
} list_ele_t;

/* Queue structure */
typedef struct {
  list_ele_t *head; /* Linked list of elements */
} queue_t

queue_t* 內的 head 指向一個 queue 的開頭，queue 分爲兩種類型
- queue_t *X = NULL; 則爲 Null queue
- queue_t *X = variable;
  X->head = NULL; 則爲 Empty queue
實作如下function並加入一些 queue = NULL 的情況處理。
- q new:
- q free:
- q insert head:
- q remove head:
- q insert tail: need to O(1) time
- q size: need to O(1) time
- q reverse:

實驗環境

$ lscpu
架構：               x86_64
CPU 作業模式：       32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
每核心執行緒數：     2
每通訊端核心數：     4
Socket(s):           1
NUMA 節點：          1
供應商識別號：       GenuineIntel
CPU 家族：           6
型號：               158
Model name:          Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
製程：               9
CPU MHz：            900.044
CPU max MHz:         3800.0000
CPU min MHz:         800.0000
BogoMIPS:            5616.00
虛擬：               VT-x
L1d 快取：           32K
L1i 快取：           32K
L2 快取：            256K
L3 快取：            6144K
NUMA node0 CPU(s)：  0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d

$ cat /proc/version
Linux version 4.15.0-45-generic (buildd@lgw01-amd64-031) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

Makefile

先了解 Makefile 語法和示範

前置測試

從老師的 github 上 fork 完 lab0-c 之後，馬上先來測試看看。

到目標資料夾下，輸入如下指令。
$make
$./qtest

結果一推錯誤訊息，不過發現一個有趣的地方，原來老師的操作範例有先加入字串處理的部份。

cmd>new
cmd>new
q = []
cmd>ih 2
cmd>ih 2
q = [ERROR: Segmentation fault occurred.  You dereferenced a NULL or invalid pointer
 ... ]
cmd>size
cmd>size
ERROR:  Computed queue size as 0, but correct value is 1
q = [ERROR: Segmentation fault occurred.  You dereferenced a NULL or invalid pointer
 ... ]
cmd>rh
cmd>rh
ERROR: Failed to store removed value
q = []
cmd>rhq
cmd>rhq
Warning: Calling remove head on empty queue
ERROR: Segmentation fault occurred.  You dereferenced a NULL or invalid pointer
Removal failed
q = []
cmd>time
cmd>time
Elapsed time = 60.328, Delta time = 60.328
cmd>free
cmd>free
q = NULL
ERROR: Freed queue, but 1 blocks are still allocated
cmd>
cmd>
cmd>quit
cmd>quit
Freeing queue
Attempt to free NULL
ERROR: Freed queue, but 1 blocks are still allocated

開發紀錄

queue_t

作業要求 insert queue tail 與 queue size 搜尋的複雜度是 O(1)。
目前的想法是新增 tail 與 size 分別作紀錄，不過在新增或刪除 node 時，須額外對tail 作移動與 size 作加減的動作:






/* Queue structure */
typedef struct {
    list_ele_t *head;  /* pointer to Linked list head of elements */
    list_ele_t *tail;  /* pointer to Linked list tail of elements */
    unsigned int size; /* queue len */
} queue_t;

q_new()

建立 queue 時，加入 malloc 失敗時回傳 NULL 的處理與增加 tail and size 的 init。














queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));

    /* What if malloc returned NULL? */
    if(q == NULL)
       return NULL;

    q->head = NULL;
    q->tail = NULL;
    q->size = 0;

    return q;
}

q_free()

在 qtest 跳出程式時出現會 Attempt to free NULL 的狀況，後來去 trace qtest.c 中發現 queue_quit() 裡會執行到 free 的動作，需在 free 中加入先判斷 queue 是否為 NULL，否則會造成 double free 的狀況。
想法是從 queue 的 head 一邊依序搜尋一邊釋放 head 裡的 value 與 head 的記憶體空間，最後在釋放整個 queue 的記憶體空間。



















void q_free(queue_t *q)
{
    /* How about freeing the list elements and the strings? */
    if(q == NULL)
       return;

    list_ele_t *temp = q->head;

    while(q->head != NULL) {
       /* Free queue structure */
       free(q->head->value);
       temp = q->head->next;
       free(q->head);
       q->head = temp;
    }
        
    /* Free queue structure */
    free(q);
}

q_insert_head()

判斷 queue 是否為 NULL。
確認是否有 malloc 失敗，有失敗的話則需把前面有 alloc 的記憶體空間釋放。
使用 strcpy 將 dest string 插入到 queue 並一併更新 queue size。




































bool q_insert_head(queue_t *q, char *s)
{
    unsigned int len = 0;
    /* What should you do if the q is NULL? */
    if(q == NULL || s == NULL) 
       return false;

    list_ele_t *newh = malloc(sizeof(list_ele_t));
            
    /* What if malloc returned NULL? */
    if(newh == NULL)
       return false;

    len = strlen(s) + 1;

    /* Don't forget to allocate space for the string and copy it */
    /* What if either call to malloc returns NULL? */
    newh->value = malloc(sizeof(char) * len);

    if(newh->value != NULL) {
       strncpy(newh->value, s, len);

       newh->next = q->head;
       q->head = newh;

       if(q->size == 0)
          q->tail = newh;

       q->size += 1;
       return true;
    }

    free(newh);
           
    return false;
}

q_insert_tail()

同 q_insert_head 類似作法。





































bool q_insert_tail(queue_t *q, char *s)
{
    unsigned int len = 0;
    /* You need to write the complete code for this function */
    /* Remember: It should operate in O(1) time */
    if(q == NULL || s == NULL) 
       return false;

    list_ele_t *newh = malloc(sizeof(list_ele_t));

    if(newh == NULL)
       return false;

    len = strlen(s) + 1;

    newh->value = malloc(sizeof(char) * len);

    if(newh->value != NULL) {
       strncpy(newh->value, s, len);

       if(q->size) { 
          newh->next = NULL;
          q->tail->next = newh;
       } else {
            newh->next = q->head;
            q->head = newh;
       }

       q->tail = newh;
       q->size += 1;
       return true;
    }

    free(newh);
        
    return false;
}

q_remove_head()

判斷 queue 是否為 NULL。
使用 strncpy 將 dest string 移除並複製到 sp，在更新 queue size。

依據 linux man

使用 strcpy '\0' 也會一起複製到 dest。
使用 strncpy 不保證 '\0' 也會一起複製到 dest。

$ man strcpy

The strcpy() function copies the string pointed to by src, including the terminating null byte ('\0'), to the buffer pointed to by dest.
The strncpy if there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.























bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    unsigned int len = 0;
    /* You need to fix up this code. */
    if(q == NULL || q->head == NULL || sp == NULL)
       return false;

    len = strlen(q->head->value) + 1;

    if(len > bufsize)
        len = bufsize;

    strncpy(sp, q->head->value, len);
    sp[bufsize - 1] = '\0';

    list_ele_t *temp = q->head;
    q->head = q->head->next;
    free(temp->value);
    free(temp);
    q->size--;

    return true;
}

q_size()

直接 return queue 的長度。






int q_size(queue_t *q)
{
    /* You need to write the code for this function */
    /* Remember: It should operate in O(1) time */
    return q ? q->size : 0;
}

q_reverse()

主要為三個 pointer 分別紀錄前，中，後 node 的位置。



















void q_reverse(queue_t *q)
{
    /* You need to write the code for this function */
    if(q == NULL || q->head == NULL)
       return;

    list_ele_t *current = q->head;
    list_ele_t *previous = NULL;
    q->tail = q->head;

    while(current != NULL) {
       current = q->head->next;
       q->head->next = previous;
       previous = q->head;
       q->head = current;
    }

    q->head = previous;
}

q_sort()










void q_sort(queue_t *q)
{
    if(!q || !q->head)
        return;

    q->head = merge_sort(q->head);

    while(q->tail->next)
        q->tail = q->tail->next;
}

實作結果

使用 Naetw 同學提供的指令，額外測試 memory leak。

$ make valgrind
cp qtest /tmp/qtest.Os1C87
chmod u+x /tmp/qtest.Os1C87
sed -i "s/alarm/isnan/g" /tmp/qtest.Os1C87
scripts/driver.py -p /tmp/qtest.Os1C87 --valgrind
---	Trace		Points
.
.
.
# Test of malloc failure on insert_tail
---	trace-12-malloc	7/7
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---	trace-13-perf	7/7
+++ TESTING trace trace-14-perf:
# Test performance of size
---	trace-14-perf	7/7
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, and reverse
---	trace-15-perf	7/7
---	TOTAL		100/100

Address Sanitizer

開啟 Address Sanitizer 分析工具，幫助釐清記憶體操作問題。

$ make SANITIZER=1
$ make test

發現在 trace17 有記憶體操作錯誤的問題，從如下訊息指出 console.c 中的 simulation 變數發生 global-buffer-overflow 。

==37297==ERROR: AddressSanitizer: global-buffer-overflow on address 0xaaaaad2c8660 at pc 0xaaaaad2a3de4 bp 0xffffc8e488f0 sp 0xffffc8e48900
READ of size 4 at 0xaaaaad2c8660 thread T0

...

0xaaaaad2c8661 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0xaaaaad2c8660) of size 1

...

接著從 console.c 中收尋 simulation 變數在哪裡被用到，發現在 init_cmd() 函試中會呼叫 add_param() 函式且該函式會將 simulation 變數強制轉型為 int* 型別帶入，由於當初 simulation 是以 bool 型別來做宣告 (1 byte)，但在這裡是以 int 型別去讀取 simulation 的記憶體位址 (4 bytes)，造成在讀取的時候超出可讀取的範圍。

















void init_cmd()
{
    cmd_list = NULL;
    param_list = NULL;
    err_cnt = 0;
    quit_flag = false;

    ...
    
    add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
              NULL);
    add_param("verbose", &verblevel, "Verbosity level", NULL);
    add_param("error", &err_limit, "Number of errors until exit", NULL);
    add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);

    ...
}

qtest 的行為和裡頭的技巧

在剛開始 main 中使用 getopt 解析執行程式中的 commandline arguement，在做出相對應的 function。

Each option character may be followed by a colon ( : ), indicating that this option expects an argument.
If an option was found, the option character is returned as the function result. If the end of the option list was reached, getopt() returns –1. If an option has an argument, getopt() sets the global variable optarg to point to that argument.
























    while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
        switch (c) {
        case 'h':
            usage(argv[0]);
            break;
        case 'f':
            strncpy(buf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            infile_name = buf;
            break;
        case 'v':
            level = atoi(optarg);   // char transfer to int
            break;
        case 'l':
            strncpy(lbuf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            logfile_name = lbuf;
            break;
        default:
            printf("Unknown option '%c'\n", c);
            usage(argv[0]);
            break;
        }
    }

接下來是初始化的部份，在 qtest.c main 中主要初始化的函式有 queue_init， init_cmd， console_init。

queue_init

初始化一個測試用的 link list q，紀錄錯誤次數的 fail_count，並註冊二個 signal handler。

主要目的為當觸發如下條件時，可以在 signal handler 裡作想要觀察的訊息或提醒使用者

記憶體存取錯誤
設定的計時器到期

$ man 7 signal

Signal Value Action Comment
SIGALRM 14 Term Timer signal from alarm(2)
SIGSEGV 11 Core Invalid memory reference







static void queue_init()
{
    fail_count = 0;
    q = NULL;
    signal(SIGSEGV, sigsegvhandler);
    signal(SIGALRM, sigalrmhandler);
}

簡單作個小實驗

註冊一個 signal handler，當有 segfault 發生時印出訊息。
宣告一個 pointer，並 dereference 為0。














void signaltest_handler(int signal_num)
{
    printf("memory error signal %d \n", signal_num);
}

int main()
{ 
    int *p = NULL;
    signal(SIGSEGV, signaltest_handler);
  
    *p = 0;

    return 0;
}

確實當有 segfault 發生時有印出訊息來，不過有趣的是會進入一個無窮迴圈的狀態。
後來去查閱 linux man，看到一段說明從 SIGSEGV signal handler 返回是未定義行為。
$ man 2 signal

According to POSIX, the behavior of a process is undefined after it ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or raise(3).

在這篇 stackoverflow 中看到一段說明，在 x86 平台下， cpu 會重新執行這道錯誤指令的關係。

memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11 
memory error signal 11

在 signaltest_handler 中加入 exit 讓程式直接跳出結束。





void signaltest_handler(int signal_num)
{
    printf("memory error signal %d \n", signal_num);
    exit(1);
}

確實就沒有在無窮迴圈的狀態了。

memory error signal 11

在 qtest.c 的 signal handler 中最後會調用到 siglongjmp(env, 1) 這個函式。
查閱 linux man 了解相關使用說明

sigsetjmp() and siglongjmp() perform nonlocal gotos
sigsetjmp(env, 1) 中會把當下的 stack pointer, the instruction pointer, possibly the values of other registers and the signal mask 存在 env
siglongjmp() 會依據存在 env 中的資料，來去執行呼叫 sigsetjmp() 的地方
直接呼叫 sigsetjmp() 會返回 0, 若是從 siglongjmp(sigjmp_buf env, int val) 返回的話則會返回指定的 val
















void sigsegvhandler(int sig)
{
    trigger_exception(
        "Segmentation fault occurred.  You dereferenced a NULL or invalid "
        "pointer");
}

void trigger_exception(char *msg)
{
    error_occurred = true;
    error_message = msg;
    if (jmp_ready)
        siglongjmp(env, 1);
    else
        exit(1);
}

do_XXX 系列的函式，以 do_new() 為例 :
exception_setup() 主要作三件事

呼叫 sigsetjmp() 來儲存之後 siglongjmp() 要跳回來的地方
設定四個變數來開啟/清除計時器 (jmp_ready, limit_time, time_limit, time_limited)
- jmp_ready : 鎖定在處理 exception handling 前後
- time_limit ：設定計時器時間
- limit_time, time_limited : 控制計時器開關
錯誤訊息處理

exception_cancel() 主要作二件事

清除計時器
清除 exception handler 相關變數

當執行 exception_setup() 之後，若在 do_new() 中發生記憶體存取錯誤或超時的話，就會觸發如下流程：
signal handler() -> siglongjmp() -> sigsetjmp() -> report_event() -> exception_cancel()
使用 sigsetjmp() and siglongjmp() 這二個函式也可以防止觸發 SIGSEGV signal handler 條件之後會進入無窮迴圈的狀態。

if(exception_setup(true))
  q = q_new();
exception_cancel();

使用 qtest.c 的 exception 機制來作實驗觀察一些全域/區域/暫存器/ volatile 變數的變化。










































int globanum = 11;
static jmp_buf env;
static volatile sig_atomic_t jmp_ready = false;
static _Bool time_limited = false;

_Bool exception_setup(_Bool limit_time);
void exception_cancel();

void signaltest_handler(int signal_num)
{
    printf("memory error signal %d \n", signal_num);
    if (jmp_ready)
        siglongjmp(env, 1);
    else
        exit(1);
}

int main(int argc, char *argv[])
{
    int *p = NULL;
    int loacnum = 22;
    register int reg = 33;
    volatile int volnum = 44;

    signal(SIGSEGV, signaltest_handler);

    printf("globanum %d, loacnum %d, reg %d, volnum %d \n", globanum, loacnum, reg, volnum);
    if(exception_setup(true))
    {
       globanum = 66; 
       loacnum = 66;
       reg = 66; 
       volnum = 66;
       *p = 0;
    }   

    printf("globanum %d, loacnum %d, reg %d, volnum %d \n", globanum, loacnum, reg, volnum);

    exception_cancel();

    return 0;
}

印出來的結果發現只有 register 裡面的資料會恢復到先前的資料。

globanum 11, loacnum 22, reg 33, volnum 44 
memory error signal 11 
globanum 66, loacnum 66, reg 33, volnum 66

使用 GDB 觀察 sigsetjmp() 存放的訊息。
設 break point 在 exception_setup 函式進入點，將指令停在進入 __sigsetjmp 之前

frame pointer 0x7fffffffde40
stack pointer 0x7fffffffde30
__sigsetjmp 返回後的 pc 0x0000555555554788

(gdb) b *0x55555555476a
(gdb) r

(gdb) info frame
Stack level 0, frame at 0x7fffffffde50:
 rip = 0x555555554788 in exception_setup (shell.c:48); saved rip = 0x5555555548b9
 called by frame at 0x7fffffffde90
 source language c.
 Arglist at 0x7fffffffde40, args: limit_time=true
 Locals at 0x7fffffffde40, Previous frame's sp is 0x7fffffffde50
 Saved registers:
  rbp at 0x7fffffffde40, rip at 0x7fffffffde48

(gdb) disas
   0x000055555555476a <+0>:	push   %rbp
   0x000055555555476b <+1>:	mov    %rsp,%rbp
   0x000055555555476e <+4>:	sub    $0x10,%rsp
   0x0000555555554772 <+8>:	mov    %edi,%eax
   0x0000555555554774 <+10>:	mov    %al,-0x4(%rbp)
   0x0000555555554777 <+13>:	mov    $0x1,%esi
   0x000055555555477c <+18>:	lea    0x2008bd(%rip),%rdi        # 0x555555755040 <env>
=> 0x0000555555554783 <+25>:	callq  0x555555554640 <__sigsetjmp@plt>
   0x0000555555554788 <+30>:	test   %eax,%eax
   0x000055555555478a <+32>:	je     0x5555555547af <exception_setup+69>
   0x000055555555478c <+34>:	movl   $0x0,0x200972(%rip)        # 0x555555755108 <jmp_ready>
...
...
End of assembler dump.

(gdb) x /gx $rbp
0x7fffffffde40:	0x00007fffffffde80
(gdb) x /gx $rsp
0x7fffffffde30:	0x0000000000000000

接著到 __sigsetjmp 進入點
觀察此時 sp 往下位移 8 byte 並保存 __sigsetjmp 返回後下一道指令的位址

(gdb) disas
Dump of assembler code for function __sigsetjmp@plt:
=> 0x0000555555554640 <+0>:	jmpq   *0x20098a(%rip)        # 0x555555754fd0
   0x0000555555554646 <+6>:	pushq  $0x4
   0x000055555555464b <+11>:	jmpq   0x5555555545f0
End of assembler dump.

(gdb) x /gx $rbp
0x7fffffffde40:	0x00007fffffffde80
(gdb) x /gx $rsp
0x7fffffffde28:	0x0000555555554788

主要 __sigsetjmp 將資料複製到 env 中 __jmp_buf[8] 的部份：

3行：將 register data 複製到 __jmp_buf[0]
4~7行：將 frame pointer 加密並複製到 __jmp_buf[1]
12~15行：將 stack pointer 加密並複製到 __jmp_buf[6]
16~20行：將 pc 加密並複製到 __jmp_buf[7]

利用 GDB 觀察 sigsetjmp() 確實保存當下在 exception_setup() 的 stack frame 與 sigsetjmp() 返回後的下一道指令。
沒想到複製 env 中 __jmp_buf[8] 的部份資料會先經過加密的步驟！


























(gdb) disas
Dump of assembler code for function __sigsetjmp:
   0x00007ffff7a22b70 <+0>:	mov    %rbx,(%rdi)
   0x00007ffff7a22b73 <+3>:	mov    %rbp,%rax
   0x00007ffff7a22b76 <+6>:	xor    %fs:0x30,%rax
   0x00007ffff7a22b7f <+15>:	rol    $0x11,%rax
   0x00007ffff7a22b83 <+19>:	mov    %rax,0x8(%rdi)
   0x00007ffff7a22b87 <+23>:	mov    %r12,0x10(%rdi)
   0x00007ffff7a22b8b <+27>:	mov    %r13,0x18(%rdi)
   0x00007ffff7a22b8f <+31>:	mov    %r14,0x20(%rdi)
   0x00007ffff7a22b93 <+35>:	mov    %r15,0x28(%rdi)
   0x00007ffff7a22b97 <+39>:	lea    0x8(%rsp),%rdx
   0x00007ffff7a22b9c <+44>:	xor    %fs:0x30,%rdx
   0x00007ffff7a22ba5 <+53>:	rol    $0x11,%rdx
   0x00007ffff7a22ba9 <+57>:	mov    %rdx,0x30(%rdi)
   0x00007ffff7a22bad <+61>:	mov    (%rsp),%rax
   0x00007ffff7a22bb1 <+65>:	nop
   0x00007ffff7a22bb2 <+66>:	xor    %fs:0x30,%rax
   0x00007ffff7a22bbb <+75>:	rol    $0x11,%rax
=> 0x00007ffff7a22bbf <+79>:	mov    %rax,0x38(%rdi)
   0x00007ffff7a22bc3 <+83>:	jmpq   0x7ffff7a22bd0 <__sigjmp_save>
End of assembler dump.

(gdb) p env
$19 = {{__jmpbuf = {33, -2702089176337870233, 93824992233056, 140737488346976, 0, 0, -2702089176352550297, 
      -8082530450169130393}, __mask_was_saved = 0, __saved_mask = {__val = {0 <repeats 16 times>}}}}

init_cmd

init_cmd 函式中主要是增加 command 與 parameter。

add_cmd("help", do_help_cmd, "| Show documentation");
add_cmd("option", do_option_cmd, "[name val] | Display or set options");
add_cmd("quit", do_quit_cmd, "| Exit program");
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
add_param("echo", &echo, "Do/don't echo commands", NULL);

並在 consloe.h 中使用到 function pointer 的技巧，將函式的返回值與代入參數的資料類型一致就可以方便的新增新的 command。
用一個全域變數 cmd_list 紀錄已經加入的 command。

static cmd_ptr cmd_list = NULL;
typedef bool (*cmd_function)(int argc, char *argv[]);
typedef struct CELE cmd_ele, *cmd_ptr;
struct CELE {
    char *name;
    cmd_function operation;
    char *documentation;
    cmd_ptr next;
};

在 add_cmd 會依照 name string 由小到大的排序，並藉由 last_loc (pointer to pointer) 間接的把新的 command 加入到 cmd_list。















void add_cmd(char *name, cmd_function operation, char *documentation)
{
    cmd_ptr next_cmd = cmd_list;
    cmd_ptr *last_loc = &cmd_list;
    while (next_cmd && strcmp(name, next_cmd->name) > 0) {
        last_loc = &next_cmd->next;
        next_cmd = next_cmd->next;
    }
    cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd");
    ele->name = name;
    ele->operation = operation;
    ele->documentation = documentation;
    ele->next = next_cmd;
    *last_loc = ele;
}

自動評分系統運作的原理

依據輸入的命令 make test ，去 makefile 中找尋如何執行自動化測試程式
在 makefile 中發現是以 scripts/driver.py 調用 qtest 進行測試

test: qtest scripts/driver.py
        scripts/driver.py

在 driver.py 中定義一個 class Tracer 並初始化

traceDirectory : 存放測試用的腳本

class Tracer:
    
    traceDirectory = "./traces"
    qtest = "./qtest"
    command = qtest
    verbLevel = 0
    autograde = False
    useValgrind = False

    traceDict = {
        1 : "trace-01-ops",
        2 : "trace-02-ops",
        3 : "trace-03-ops",
        4 : "trace-04-ops",
        5 : "trace-05-ops",
        6 : "trace-06-string",
        7 : "trace-07-robust",
        8 : "trace-08-robust",
        9 : "trace-09-robust",
        10 : "trace-10-malloc",
        11 : "trace-11-malloc",
        12 : "trace-12-malloc",
        13 : "trace-13-perf",
        14 : "trace-14-perf",
        15 : "trace-15-perf"
        }
        ...

在 run 中主要執行測試程式的部份

maxScores : 存放每一個測試項目能獲的最大分數
第 7 行 : self.runTrace() 會藉由 subprocess.call() 來執行 qtest
第 8 ~ 13 行 : 計算分數















    def run(self, tid = 0):
        ...
        for t in tidList:
            tname = self.traceDict[t]
            if self.verbLevel > 0:
                print("+++ TESTING trace %s:" % tname)
            ok = self.runTrace(t)
            maxval = self.maxScores[t]
            tval = maxval if ok else 0
            print("---\t%s\t%d/%d" % (tname, tval, maxval))
            score += tval
            maxscore += maxval
            scoreDict[t] = tval
        print("---\tTOTAL\t\t%d/%d" % (score, maxscore))
        ...

subprocess.call() 實際上會執行如下命令

$ ./qtest -v 3 -f ./traces/trace-01-ops.cmd

cmd># Test of insert_head and remove_head
cmd>option fail 0
cmd>option malloc 0
cmd>new
q = []
cmd>ih gerbil
q = [gerbil]
...
cmd>rh gerbil
Removed gerbil from queue
q = []
cmd>
Freeing queue

增加自己的腳本

在 traceDict 中加入名稱 : 16 : "trace-16-user"
在 traceProbs 中加入名稱 : 16 : "Trace-16"
在 maxScores 中加入分數 : 100
在 teaces 目錄下新增要測試的檔案名稱為 trace-16-user.cmd

成功加入並執行了

$ make test
...
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, and reverse
---	trace-15-perf	7/7
+++ TESTING trace trace-16-user:
# User test
---	trace-16-user	100/100
---	TOTAL		200/200

Select 系統呼叫


int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

依據 linux man page 說明

$ man select

select() 能夠監控多個 file descriptor 且會被 blocking(除了 timeout = 0)，直到有 file descriptor 可以進行操作或被 interrupt 打斷或者超時才會返回。

nfds: 三種 file descriptor 中的最高值加 1
readfds: 將想要知道的 file descriptor 就緒可讀的狀態(有資料可以接收)，則可以存在 readfds 參數
writefds: 將想要知道的 file descriptor 就緒可寫的狀態(有資料可以傳送)，則可以存在 writefds 參數
exceptfds: 將想要知道的 file descriptor 例外錯誤發生的狀態，則可以存在 exceptfds 參數
timeout: 用來告訴 select() 要 block 多久的時間
- timeout = 0: 會立刻返回
- timeout = NULL: 當 file descriptor 已經準備好才返回
- timeout = number: 如下條件才返回
  - file descriptor 已經準備好
  - system call 被 signal handler 中斷(interrupt)
  - timeout 時間到期

select() 的返回值

成功時，回傳 3 種 set 中的 descriptors 總數(sets 會被改過，表示哪幾個 fd 是已經就緒的)
發生 timeout 回傳 0
錯誤時回傳 -1（並設定相對應的 errno）

fd_set 的相關 MACRO

FD_SET(int fd, fd_set *set): 將 fd 新增至 set
FD_CLR(int fd, fd_set *set): 從 set 中移除 fd
FD_ISSET(int fd, fd_set *set): 若 fd 在 set 中則傳回 true
FD_ZERO(fd_set *set): 將 set 清為零

當 select() 的返回時 3 種 set 會被修改(保留就緒的 fd)，因此在呼叫 select() 之前必須對 set 重新初始化。

console.c 的實作

在 qtest.c 中會呼叫 run_console()，接著 push_file() 檢查是否有檔案輸入，若無則為鍵盤輸入(STDIN)， while 迴圈判斷命令是否執行完成並呼叫 cmd_select()。

在 cmd_select() 中將目標 fd 存在 infd 變數並加到 readfds 中。











if (!readfds)
    readfds = &local_readset;

/* Add input fd to readset for select */
infd = buf_stack->fd;
FD_SET(infd, readfds);

...

if (infd >= nfds)
    nfds = infd + 1;

接著透過 select() 判斷 readfds 中的 file 是否可讀，若檔案可讀時會從 blocking 中解除並回傳 >0 的結果。讀出 file 中的 commmand 且解析對應操作，並將該 fd 從 readfds 移除。













int result = select(nfds, readfds, writefds, exceptfds, timeout);
if (result <= 0)
    return result;

infd = buf_stack->fd;
if (readfds && FD_ISSET(infd, readfds)) {
    /* Commandline input available */
    FD_CLR(infd, readfds);
    result--;
    cmdline = readline();
    if (cmdline)
        interpret_cmd(cmdline);
}

RIO 套件(Robust I/O)的原理

RIO 套件是 CSAPP 中提供的一種 system I/O 套件，基於一些情境下於原本的 read 與 write 系統呼叫上再做一層包裝(參考 csapp.c 與 csapp.h)


ssize_t read(int fd, void *buf, size_t count);

依據 linux man page 說明，回傳值會有以下幾種狀況

$ man read

回傳大於 0:
- 回傳值等於 count 數: 成功讀取
- 回傳值小於 count 數:
  - 可讀取的大小不足 count(達到EOF)
  - 讀取過程中被信號中斷
回傳等於 0 發生 end of file(EOF)
回傳等於 -1（並設定相對應的 errno）錯誤

RIO 套件設計的考量點

在一些情境下(network socket or terminal)，對於 read 的操作可能一次只需讀少量的 byte 數，如此一來會因為頻繁的呼叫 read(system call) 導致程式會頻繁在 user mode and kernel mode 之間作切換而造成效能低落。

對於 read 回傳值也做相對應的處理，如遇到訊號中斷則會自動在重讀一次

記憶體管理

在 harness.h 可以看到本次作業中使用到的 malloc, free 其實都被改成呼叫 test_malloc, test_free

#define malloc test_malloc
#define free test_free

在 test_malloc() 函式中看到是用一個全域 allocated 變數( doubly-linked list 結構)來管理配置記憶體的一些資訊

typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

static block_ele_t *allocated = NULL;

從 block_ele_t 結構中看到一個特別的成員 unsigned char payload[0];，經 google 搜尋了解原來是用到 GCC 中一個叫 Arrays of Length Zero 的技巧

結構中的最後一個成員可以動態的配置記憶體大小
使用一次性配置記憶體空間比多次呼叫 malloc 的方式不易造成記憶體碎片化

語法跟 C11 有以下幾點的不同

需為 incomplete type
在結構中不能單獨出現，至少需要有一個成員
需為一個 non-empty structure 的最後成員
不可以包含在 union 中

根據 C11 規格書 6.7.2.1

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

In most situations, the flexible array member is ignored.

In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply.

If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

    block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t));

Valgrind 的運作原理和針對本程式的使用

valgrind 架構為一個主要運算核心，在搭配一些週邊的工具，核心會模擬 CPU 的環境，並提供服務給週邊工具，週邊工具則類似於一個模組，利用核心提供的服務來完成各種特定任務，也可以增加自己的工具進去測試。

Memcheck : 可以用來檢查 memory 的一些問題
Cachegrind : 可以用來檢查 cache 的一些問題
Helgrind : 可以用來檢查 mulitithread 中資料競爭的問題

依據 valgrind introduction

在本次作業主要使用 Memcheck 來偵測記憶體漏洞

先單純使用 ih 命令並搭配 valgrind 中的 massif 觀察記憶體使用狀況

$ valgrind --tool=massif ./qtest
cmd> new
q = []
cmd> ih RAND 10000
q = [jrgngwony ncngk lvyen qzfaxrn uylqvlre lrppzami jwknm euxyzjsyk aptbciixj cjsxba zisbmkk mfial pgswcd mbxadqhe jdbtam oywgaf crdhwomc gnjakyw fgatwablx vkkznafgq wpwtjwx oxwvabp vfbptdd soohrxyhh alvdo kwjoifxr rhjxnmj ezgwtiba cgsekuh ftgdb ... ]
cmd> free
q = NULL
cmd> quit
Freeing queue

使用 massif-visualizer 圖形化界面，觀察使用 ih 命令增加10000個亂數所產生出的字串，記憶體消耗量最高達到將近 1 MB，在藉由 free cmd 將所配置的記憶體釋放。

$ massif-visualizer massif.out.xxxx

...

peak cost: "1.0 MiB" heap "297.4 KiB" heap extra "0 B" stacks

嘗試使用 lab0-c 中所用到 Arrays of Length Zero 的技巧，改寫 q_insert_head 函式
首先修改 list_ele_t 中的 value 為 incomplete array type 並為此結構的最後一位成員。








/* Linked list element */
typedef struct ELE {
    /* Pointer to array holding string.
     * This array needs to be explicitly allocated and freed
     */
    struct ELE *next;
	char value[0];
} list_ele_t;

原始要為 newh 與 newh->value 呼叫2次 malloc() 來配置記憶體空間，使用 Arrays of Length Zero 的技巧改為1次就一起將 newh 與 newh->value 配置記憶體空間。



























bool q_insert_head(queue_t *q, char *s)
{
	unsigned int len = 0;

	if(!q || !s)
		return false;

    len = strlen(s) + 1;

    list_ele_t *newh = malloc(sizeof(list_ele_t) + sizeof(char) * len);

	if(!newh)
		return false;

	strncpy(newh->value, s, len);
 
	newh->next = q->head;

    q->head = newh;

	if(!q->size)
		q->tail = newh;

	q->size++;

    return true;
}

如上相同情境下( ih 命令增加10000個亂數所產生出的字串) 觀察記憶體使用狀況，從下圖觀察記憶體消耗量最高達到 547 KB，發現整體程式執行時間有縮短，不過記憶體消耗有很明顯差別，將近比原始的版本少了快一半的記憶體使用量。

peak cost: "547.0 KiB" heap "156.9 KiB" heap extra "0 B" stacks

這次直接使用 trace-16 的測項來觀察記憶體使用狀況

$ valgrind --tool=massif ./qtest -f traces/trace-16-perf.cmd

peak cost: "9.9 MiB" heap "2.9 MiB" heap extra "0 B" stacks

原始版本

使用 Arrays of Length Zero 的版本

peak cost: "5.3 MiB" heap "1.5 MiB" heap extra "0 B" stacks

Reference

應用 Valgrind 發現 Linux 程序的內存問題

持續更新中