Try   HackMD

2021q1 Homework1 (lab0)

tags: linux2021

contributed by < >

開發環境

$ uname -a
Linux notebook 5.8.0-55-generic #62~20.04.1-Ubuntu SMP Wed Jun 2 08:55:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ gcc -v
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

作業要求

  • Implement queue.[ch]
  • 修正執行 Address Sanitizer 時會發生的錯誤
  • 運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗
  • 在 qtest 中實作 coroutine,並提供新的命令 web,提供 web 伺服器功能,注意: web 伺服器運作過程中,qtest 仍可接受其他命令
    • 可嘗試整合 tiny-web-server,將 FORK_COUNT 變更為 0,並以 coroutine 取代原本的 fork 系統呼叫
  • 解釋 select 系統呼叫在本程式的使用方式,並分析 console.c 的實作,說明其中運用 CS:APP RIO 套件 的原理和考量點。可對照參考 CS:APP 第 10 章重點提示 :arrow_forward: 為避免「舉燭」,請比照 qtest 實作,撰寫獨立的終端機命令解譯器 (可建立新的 GitHub repository)
  • 說明 antirez/linenoise 的運作原理,注意到 termios 的運用
  • 研讀論文 Dude, is my code constant time?,解釋本程式的 “simulation” 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度,需要解釋 Student’s t-distribution 及程式實作的原理。注意:現有實作存在若干致命缺陷,請討論並提出解決方案;
  • 指出現有程式的缺陷 (提示: 和 RIO 套件 有關),嘗試強化並提交 pull request

Implement queue.[ch]

Address Sanitizer

編譯程式:

開啟 Address Sanitizer

$ make clean
$ make SANITIZER=1

執行:

$ ./qtest 
cmd> help

錯誤訊息:

Commands:
...
Options:
=================================================================
==4212==ERROR: AddressSanitizer: global-buffer-overflow on address 0x562e8c8e5400 at pc 0x562e8c8ce90f bp 0x7ffd9df41200 sp 0x7ffd9df411f0
READ of size 4 at 0x562e8c8e5400 thread T0
    #0 0x562e8c8ce90e in do_help_cmd /home/henson/linux2021/lab0-c/console.c:308
    #1 0x562e8c8cea22 in interpret_cmda /home/henson/linux2021/lab0-c/console.c:222
    #2 0x562e8c8cf207 in interpret_cmd /home/henson/linux2021/lab0-c/console.c:245
    #3 0x562e8c8d094a in run_console /home/henson/linux2021/lab0-c/console.c:661
    #4 0x562e8c8cd531 in main /home/henson/linux2021/lab0-c/qtest.c:788
    #5 0x7f1b35d240b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #6 0x562e8c8cab8d in _start (/home/henson/linux2021/lab0-c/qtest+0x8b8d)

0x562e8c8e5401 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:60:13' (0x562e8c8e5400) of size 1
SUMMARY: AddressSanitizer: global-buffer-overflow /home/henson/linux2021/lab0-c/console.c:308 in do_help_cmd
Shadow bytes around the buggy address:
  0x0ac651914a30: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
  0x0ac651914a40: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
  0x0ac651914a50: f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9
  0x0ac651914a60: 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9
  0x0ac651914a70: 01 f9 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9
=>0x0ac651914a80:[01]f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x0ac651914a90: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0ac651914aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ac651914ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ac651914ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ac651914ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==4212==ABORTING

錯誤處理:

根據錯誤訊息可得知

  1. 程式 crush 發生在 console.c:308do_help_cmd 函式
  2. 讀取全域變數 echo 發生 global-buffer-overflow

追蹤 do_help_cmd 函式:

static bool do_help_cmd(int argc, char *argv[]) { cmd_ptr clist = cmd_list; report(1, "Commands:", argv[0]); while (clist) { report(1, "\t%s\t%s", clist->name, clist->documentation); clist = clist->next; } param_ptr plist = param_list; report(1, "Options:"); while (plist) { report(1, "\t%s\t%d\t%s", plist->name, *plist->valp, plist->documentation); plist = plist->next; } return true; }

看到 307 呼叫了 report 這個 function 後發生了問題,但同樣的 301 卻沒有問題,所以推估是 plist 所指向變數發生錯誤。
plistparam_list 有關,所以在 console.c 尋找 param_list。可以得知 param_list 是 global varible 以及找到主要兩個對 param_list 作用的 function (init_cmd,add_parameter)

static param_ptr param_list = NULL
void init_cmd()
{
    cmd_list = NULL;
    param_list = NULL;
    err_cnt = 0;
    quit_flag = false;
    ...
    add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
              NULL);
    add_param("verbose", &verblevel, "Verbosity level", NULL);
    add_param("error", &err_limit, "Number of errors until exit", NULL);
    add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
    
    init_in();
    init_time(&last_time);
    first_time = last_time;
}
/* Add a new parameter */
void add_param(char *name,
               int *valp,
               char *documentation,
               setter_function setter)
{
    param_ptr next_param = param_list;
    param_ptr *last_loc = &param_list;
    while (next_param && strcmp(name, next_param->name) > 0) {
        last_loc = &next_param->next;
        next_param = next_param->next;
    }

    param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param");
    ele->name = name;
    ele->valp = valp;
    ele->documentation = documentation;
    ele->setter = setter;
    ele->next = next_param;
    *last_loc = ele;
}

可以看到 init_cmd 中的add_param 有使用 echo 並轉形成int * 當作輸入,而原本的 echostatic bool

static bool echo = 0;

因為原本 boolint 的大小不同,所以在 console.c:308 中對 plist->valp 做 dereference 發生錯誤 global-buffer-overflow

report(1, "\t%s\t%d\t%s", plist->name, *plist->valp, plist->documentation);

可以透過將 echo 原本的型別改成 static int

static bool echo = 0; -> static int echo = 0;

simulation 也有相似問題,所以也將在 console.[ch]simulation 改為 int ,即可消除錯誤。

成功執行如下:

henson@notebook:~/linux2021/lab0-c$ ./qtest 
cmd> help
Commands:
	#	 ...            | Display comment
	free	                | Delete queue
	help	                | Show documentation
	ih	 str [n]        | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
	it	 str [n]        | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
	log	 file           | Copy output to file
	new	                | Create new queue
	option	 [name val]     | Display or set options
	quit	                | Exit program
	reverse	                | Reverse queue
	rh	 [str]          | Remove from head of queue.  Optionally compare to expected value str
	rhq	                | Remove from head of queue without reporting value.
	show	                | Show queue contents
	size	 [n]            | Compute queue size n times (default: n == 1)
	sort	                | Sort queue in ascending order
	source	 file           | Read commands from source file
	time	 cmd arg ...    | Time command execution
Options:
	echo	1	Do/don't echo commands
	error	5	Number of errors until exit
	fail	30	Number of times allow queue operations to return false
	length	1024	Maximum length of displayed string
	malloc	0	Malloc failure probability percent
	simulation	0	Start/Stop simulation mode
	verbose	4	Verbosity level

Valgrind 分析記憶體問題

執行程式:

$make valgrind

會執行 Makefile 裡面valgrind 的部份

valgrind: valgrind_existence # Explicitly disable sanitizer(s) $(MAKE) clean SANITIZER=0 qtest $(eval patched_file := $(shell mktemp /tmp/qtest.XXXXXX)) cp qtest $(patched_file) chmod u+x $(patched_file) sed -i "s/alarm/isnan/g" $(patched_file) scripts/driver.py -p $(patched_file) --valgrind $(TCASE) @echo @echo "Test with specific case by running command:" @echo "scripts/driver.py -p $(patched_file) --valgrind -t <tid>"

會產生以下相關錯誤訊息

# Explicitly disable sanitizer(s)
make clean SANITIZER=0 qtest
make[1]: Entering directory '/home/henson/linux2022/lab0-c'
rm -f qtest.o report.o console.o harness.o queue.o random.o dudect/constant.o dudect/fixture.o dudect/ttest.o linenoise.o .qtest.o.d .report.o.d .console.o.d .harness.o.d .queue.o.d .random.o.d .dudect/constant.o.d .dudect/fixture.o.d .dudect/ttest.o.d .linenoise.o.d *~ qtest /tmp/qtest.*
rm -rf .dudect
rm -rf *.dSYM
(cd traces; rm -f *~)
  CC	qtest.o
  CC	report.o
  CC	console.o
  CC	harness.o
  CC	queue.o
  CC	random.o
  CC	dudect/constant.o
  CC	dudect/fixture.o
  CC	dudect/ttest.o
  CC	linenoise.o
  LD	qtest
make[1]: Leaving directory '/home/henson/linux2021/lab0-c'
cp qtest /tmp/qtest.pK4SCR
chmod u+x /tmp/qtest.pK4SCR
sed -i "s/alarm/isnan/g" /tmp/qtest.pK4SCR
scripts/driver.py -p /tmp/qtest.pK4SCR --valgrind 
---	Trace		Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
==8511== 7 bytes in 1 blocks are still reachable in loss record 1 of 3
==8511==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8511==    by 0x4A4D50E: strdup (strdup.c:42)
==8511==    by 0x11011F: linenoiseHistoryAdd (linenoise.c:1236)
==8511==    by 0x110CB2: linenoiseHistoryLoad (linenoise.c:1325)
==8511==    by 0x10C287: main (qtest.c:777)
==8511== 
==8511== 96 bytes in 19 blocks are still reachable in loss record 2 of 3
==8511==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8511==    by 0x4A4D50E: strdup (strdup.c:42)
==8511==    by 0x110093: linenoiseHistoryAdd (linenoise.c:1236)
==8511==    by 0x110CB2: linenoiseHistoryLoad (linenoise.c:1325)
==8511==    by 0x10C287: main (qtest.c:777)
==8511== 
==8511== 160 bytes in 1 blocks are still reachable in loss record 3 of 3
==8511==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8511==    by 0x1100DF: linenoiseHistoryAdd (linenoise.c:1224)
==8511==    by 0x110CB2: linenoiseHistoryLoad (linenoise.c:1325)
==8511==    by 0x10C287: main (qtest.c:777)
==8511== 
---	trace-01-ops	6/6
...

Test with specific case by running command:
scripts/driver.py -p /tmp/qtest.pK4SCR --valgrind -t <tid>

錯誤處理:

錯誤訊息告訴我們記憶體 still reachable ,代表程式結束時仍有記憶體未釋放。

那麼依照錯誤訊息查看相關程式碼

  • main.c
linenoiseHistorySetMaxLen(HISTORY_LEN); linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */
  • linenoise.c
int linenoiseHistoryLoad(const char *filename) { FILE *fp = fopen(filename, "r"); char buf[LINENOISE_MAX_LINE]; if (fp == NULL) return -1; while (fgets(buf, LINENOISE_MAX_LINE, fp) != NULL) { char *p; p = strchr(buf, '\r'); if (!p) p = strchr(buf, '\n'); if (p) *p = '\0'; linenoiseHistoryAdd(buf); } fclose(fp); return 0; }
int linenoiseHistoryAdd(const char *line) { char *linecopy; if (history_max_len == 0) return 0; /* Initialization on first call. */ if (history == NULL) { history = malloc(sizeof(char *) * history_max_len); if (history == NULL) return 0; memset(history, 0, (sizeof(char *) * history_max_len)); } /* Don't add duplicated lines. */ if (history_len && !strcmp(history[history_len - 1], line)) return 0; /* Add an heap allocated copy of the line in the history. * If we reached the max length, remove the older line. */ linecopy = strdup(line); if (!linecopy) return 0; if (history_len == history_max_len) { free(history[0]); memmove(history, history + 1, sizeof(char *) * (history_max_len - 1)); history_len--; } history[history_len] = linecopy; history_len++; return 1; }

它做的事情是將 HISTORY_FILE = .cmd_history 的資料載入到在 linenoise.c:132static char **history

註: HISTORY_FILE 可以用 grep -r HISTORY_FILE 找到在 console.c 定義

可以發現無相關釋放記憶體的程式碼在其中。所以嘗試在程式碼中搜尋 free 關鍵字可以找到 freeHistory 。以 freeHistory 為關鍵字搜尋可以找到在 linenoiseAtExit 被使用,而 linenoiseAtExitenableRawMode 被使用

最後可以發現這樣的關係, linenosie() -> linenoiseRaw() -> enableRawMode() -> atexit(linenoiseAtExit())-> freeHistory(), linenosie() 則是在 run_console() 被呼叫。

  • console.c
bool run_console(char *infile_name) { if (!push_file(infile_name)) { report(1, "ERROR: Could not open source file '%s'", infile_name); return false; } if (!has_infile) { char *cmdline; while ((cmdline = linenoise(prompt)) != NULL) { interpret_cmd(cmdline); linenoiseHistoryAdd(cmdline); /* Add to the history. */ linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */ linenoiseFree(cmdline); } } else { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } return err_cnt == 0; }

而 run_console() 則是在 qtest.c 使用到

bool ok = true; ok = ok && run_console(infile_name); ok = ok && finish_cmd()

查看 run_consoleconsole.c:650 的程式碼,發現到只有在 has_infile = false 的時候才會呼叫到 linenoise,而 has_infile 這個變數則會在 push_file() 中被修改 console.c:443 。如果有 infile 則為 ture,反之亦然。

所以在有 infile 的情況下,不會正常釋放 history 的記憶體。因此,可將qtest.c:777 改成如下,必有在有 infile 的情況下載入 history

rm -f qtest.o report.o console.o harness.o queue.o random.o dudect/constant.o dudect/fixture.o dudect/ttest.o linenoise.o .qtest.o.d .report.o.d .console.o.d .harness.o.d .queue.o.d .random.o.d .dudect/constant.o.d .dudect/fixture.o.d .dudect/ttest.o.d .linenoise.o.d *~ qtest /tmp/qtest.*
rm -rf .dudect
rm -rf *.dSYM
(cd traces; rm -f *~)
  CC	qtest.o
  CC	report.o
  CC	console.o
  CC	harness.o
  CC	queue.o
  CC	random.o
  CC	dudect/constant.o
  CC	dudect/fixture.o
  CC	dudect/ttest.o
  CC	linenoise.o
  LD	qtest
make[1]: Leaving directory '/home/henson/linux2021/lab0-c'
cp qtest /tmp/qtest.fL0Wom
chmod u+x /tmp/qtest.fL0Wom
sed -i "s/alarm/isnan/g" /tmp/qtest.fL0Wom
scripts/driver.py -p /tmp/qtest.fL0Wom --valgrind 
---	Trace		Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---	trace-01-ops	6/6
...
---	trace-16-perf	6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
---	TOTAL		100/100

Massif:

可以查找 Valgrind manual #massif 搜尋相關資訊

執行 Valgrind tool=massif,會產生 visualize 檔,需要透過 sudo apt-get install 安裝 massif-visualizer。接著使用 massif-visualizer 開啟

$ valgrind --tool=massif ./qtest -f testfile $ massif-visualizer massif.out.XXX

testfile 則是使用如下

option simulation 1
it
option simulation 0

用 massif-visualizer 開啟輸出檔之後畫面如下

註:default max-snapshots = 100

可以看到程式在運行時所使用到的 heap memory 的使用量。

$ valgrind --tool=massif --max-snapshots=1000 ./qtest -f testfile

將 max-snapshots 改為 1000 觀察更細緻的 heap memory 變化

觀察圖表之後可以發現到 heap memory 的使用變化量相當大,推估是在使用 simulation 來測量程式的 malloc 記憶體量變化很大。

dudect/constant.c 找到相對應程式碼如下

void measure(int64_t *before_ticks, int64_t *after_ticks, uint8_t *input_data, int mode) { assert(mode == test_insert_tail || mode == test_size); if (mode == test_insert_tail) { for (size_t i = drop_size; i < number_measurements - drop_size; i++) { char *s = get_random_string(); dut_new(); dut_insert_head( get_random_string(), *(uint16_t *) (input_data + i * chunk_size) % 10000); before_ticks[i] = cpucycles(); dut_insert_tail(s, 1); after_ticks[i] = cpucycles(); dut_free(); } } else { for (size_t i = drop_size; i < number_measurements - drop_size; i++) { dut_new(); dut_insert_head( get_random_string(), *(uint16_t *) (input_data + i * chunk_size) % 10000); before_ticks[i] = cpucycles(); dut_size(1); after_ticks[i] = cpucycles(); dut_free(); } } }

在 65~67 行中,插入 string 在 0~9999次之間。將次數固定為 10 來做觀察。可以發現到 heap memory 的使用量和變化量都變小了

dut_insert_head(get_random_string(),10);

將次數固定為 10000,在 heap memory 的使用量上仍有極大的變化,但與原本相比則有比較固定的 pattern, 推測造成變化量的原因為 get_random_string()

在 qtest 中實作 coroutine,並提供新的命令 web,提供 web 伺服器功能

Reference
from Polai's note

  1. 增加 web 功能至 qtest直譯器 , 即執行 qtest 進入 cmd> 命令提示列 , 輸入 web , 直譯器接受到命令並解譯後 , 執行 echo server 功能 , 當使用者從瀏覽器輸入命令後會 echo 回來並顯示於瀏覽器上.
  2. 更改 echo server 功能, 改為由瀏覽器發出命令後 , server 接受命令執行對應的 queue operation , 之後把對應的結果 response 給瀏覽器.
  3. 因為執行 web 命令後,會 block 住直譯器再接收下一個命令 , 故須整合 select function 使得執行完 web後可以同時接受兩個 I/O event.

自動評分程式探討

nothing

參考連結

jasonmatobo-lab0 note
XDEv11-lab0 note #VALGRIND
jhan1998-lab0 #Massif
如何寫好 git commit message
C 語言 setjmp 與 longjmp 函數用法教學
coroutine:link0, link1, link2