# 2021q1 Homework1 (lab0)
contributed by < [`eliangcs`](https://github.com/eliangcs) >
> [List of code changes](https://github.com/eliangcs/lab0-c/compare/81079ded..HEAD)
## Progress
* [x] C Programming Lab
    - [x] `q_new`
    - [x] `q_insert_head`
    - [x] `q_insert_tail`
    - [x] `q_remove_head`
    - [x] `q_size`
    - [x] `q_reverse`
    - [x] `q_sort`
    - [x] `q_free`
* [x] Debug with Address Sanitizer
* [ ] Use Valgrind to eliminate memory errors
* [ ] Implement coroutines and add a new `web` command
* [ ] Explain how `select` is used
* [ ] Describe how `linenoise` works
* [ ] Read the paper [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)
* [ ] Identify the defects in the existing code related to RIO
## Debugging a core dump with LLDB
Excerpt from [core(5)](https://man7.org/linux/man-pages/man5/core.5.html):
> The default action of certain signals is to cause a process to terminate and produce a core dump file, a file containing an image of the process's memory at the time of termination. This image can be used in a debugger (e.g., gdb(1)) to inspect the state of the program at the time that it terminated.
Before debugging with a core dump file, first lift the limit on core file size (a limit of 0 disables core dumps entirely):
```shell
ulimit -c unlimited
```
Make sure the `/cores` directory, where macOS writes core files by default, is writable:
```shell
sudo chmod a+w /cores
```
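After a process terminates abnormally, the following command opens its state at the moment of the crash, where `PID` is the process ID of the crashed process: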
```shell
lldb qtest -c /cores/core.PID
```
## Implementing Merge Sort
WIP
## Debugging with Address Sanitizer
After building with `make SANITIZER=1` and running `./qtest`, entering the `help` command produces the following error message:
```=
$ ./qtest
cmd> help
Commands:
# ... | Display comment
free | Delete queue
help | Show documentation
ih str [n] | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
it str [n] | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
log file | Copy output to file
new | Create new queue
option [name val] | Display or set options
quit | Exit program
reverse | Reverse queue
rh [str] | Remove from head of queue. Optionally compare to expected value str
rhq | Remove from head of queue without reporting value.
show | Show queue contents
size [n] | Compute queue size n times (default: n == 1)
sort | Sort queue in ascending order
source file | Read commands from source file
time cmd arg ... | Time command execution
Options:
=================================================================
==59852==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0001066e62a0 at pc 0x0001066cdf5f bp 0x7ffee9537220 sp 0x7ffee9537218
READ of size 4 at 0x0001066e62a0 thread T0
#0 0x1066cdf5e in do_help_cmd console.c:307
#1 0x1066cfa81 in interpret_cmda console.c:221
#2 0x1066cf1a6 in interpret_cmd console.c:244
#3 0x1066cf7b7 in run_console console.c:660
#4 0x1066ca8a6 in main qtest.c:788
#5 0x7fff67aa5cc8 in start+0x0 (libdyld.dylib:x86_64+0x1acc8)
0x0001066e62a1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x1066e62a0) of size 1
SUMMARY: AddressSanitizer: global-buffer-overflow console.c:307 in do_help_cmd
...(remainder omitted)...
```
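Observations:
- Line 23 reports `global-buffer-overflow`, which suggests that a global variable was accessed beyond the bounds of its storage.
- Line 25 tells us the error occurred at `console.c:307`.
- Line 32 (the `is located ...` line) names the global variable `echo`.

The global variable `echo` is a boolean that `qtest` users can change through the `option` command (`option echo <val>`). Let's try removing the `echo` parameter: if the error message disappears, we can be sure the problem lies with `echo`: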
```diff=
--- a/console.c
+++ b/console.c
@@ -105,7 +105,6 @@ void init_cmd()
NULL);
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
- add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
init_in();
init_time(&last_time);
```
After re-running `qtest`, the error message is now about the `simulation` parameter (line 36):
```=
./qtest
cmd> help
Commands:
# ... | Display comment
free | Delete queue
help | Show documentation
ih str [n] | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
it str [n] | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
log file | Copy output to file
new | Create new queue
option [name val] | Display or set options
quit | Exit program
reverse | Reverse queue
rh [str] | Remove from head of queue. Optionally compare to expected value str
rhq | Remove from head of queue without reporting value.
show | Show queue contents
size [n] | Compute queue size n times (default: n == 1)
sort | Sort queue in ascending order
source file | Read commands from source file
time cmd arg ... | Time command execution
Options:
error 5 Number of errors until exit
fail 30 Number of times allow queue operations to return false
length 1024 Maximum length of displayed string
malloc 0 Malloc failure probability percent
=================================================================
==60691==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000105fc2820 at pc 0x000105fa6fdf bp 0x7ffee9c5e220 sp 0x7ffee9c5e218
READ of size 4 at 0x000105fc2820 thread T0
#0 0x105fa6fde in do_help_cmd console.c:306
#1 0x105fa8b01 in interpret_cmda console.c:220
#2 0x105fa8226 in interpret_cmd console.c:243
#3 0x105fa8837 in run_console console.c:659
#4 0x105fa3946 in main qtest.c:788
#5 0x7fff67aa5cc8 in start+0x0 (libdyld.dylib:x86_64+0x1acc8)
0x000105fc2821 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x105fc2820) of size 1
'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow console.c:306 in do_help_cmd
...(remainder omitted)...
```
What if we remove both `simulation` and `echo`?
```diff=
--- a/console.c
+++ b/console.c
@@ -101,11 +101,8 @@ void init_cmd()
add_cmd("log", do_log_cmd, " file | Copy output to file");
add_cmd("time", do_time_cmd, " cmd arg ... | Time command execution");
add_cmd("#", do_comment_cmd, " ... | Display comment");
- add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
- NULL);
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
- add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
init_in();
init_time(&last_time);
```
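**With both `simulation` and `echo` removed, the `help` command no longer reports an error.**

The global variables `simulation` and `echo` are both of type `bool`, yet they are cast to `(int *)` when passed to `add_param`. Could it be that `bool` and `int` occupy different amounts of memory in C?

A small experiment: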
```c=
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t, which printf formats with %zu */
    printf("%zu\n", sizeof(bool));
    printf("%zu\n", sizeof(int));
    return 0;
}
```
Output:
```
1
4
```
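This confirms that on this platform `bool` occupies 1 byte and `int` occupies 4 bytes. So when a `bool` pointer is cast to an `int` pointer, any code that dereferences the `int` pointer reads 4 bytes and goes past the single byte allocated for the `bool` variable. According to the stack trace, that read happens in `do_help_cmd`, presumably through the pointer that `add_param` stored. This matches the `READ of size 4` that Address Sanitizer reports against a variable `of size 1`.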
The fix is to change the type of these two global variables from `bool` to `int`:
```diff=
diff --git a/console.c b/console.c
index 5a2b9bb..7bb9358 100644
--- a/console.c
+++ b/console.c
@@ -18,7 +18,7 @@
#include "report.h"
/* Some global values */
-bool simulation = false;
+int simulation = 0;
static cmd_ptr cmd_list = NULL;
static param_ptr param_list = NULL;
static bool block_flag = false;
@@ -56,7 +56,7 @@ static int fd_max = 0;
/* Parameters */
static int err_limit = 5;
static int err_cnt = 0;
-static bool echo = 0;
+static int echo = 0;
static bool quit_flag = false;
static char *prompt = "cmd> ";
diff --git a/console.h b/console.h
index a36cd83..51e42f9 100644
--- a/console.h
+++ b/console.h
@@ -8,7 +8,7 @@
/* Implementation of simple command-line interface */
/* Simulation flag of console option */
-extern bool simulation;
+extern int simulation;
/* Each command defined in terms of a function */
typedef bool (*cmd_function)(int argc, char *argv[]);
```
## Open questions
- [ ] How do Valgrind and Address Sanitizer differ?
- [ ] TBA