# 2019q1 Homework1 (lab0)
###### tags: `sysprog` `2019` `2019spring`
contributed by < `czvkcui` >
:::danger
認真看作業要求,除了提及如何逐步達到要求,還需要進行:
1. 改善程式效能
2. 解釋自動評分系統運作的原理
3. 提及 qtest 的行為和裡頭的技巧
還未達成符合預期目標,請繼續付出!
:notes: jserv
:::
# queue implement
`q_new`,`q_free`,`q_insert_head`,`q_insert_tail`,`q_remove_head`,`q_size`,`q_reverse` implement
## q_new
```clike=
typedef struct
{
list_ele_t *head, *tail; /* Linked list of elements */
/*
You will need to add more fields to this structure
to efficiently implement q_size and q_insert_tail
*/
size_t size;
} queue_t;
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (q)
{
q->head = NULL;
q->tail = NULL;
q->size = 0;
}
return q;
}
```
size保存linked list 的長度
q_new 初始化`queue_t`結構
## q_free
```clike=
void q_free(queue_t *q)
{
/* Free queue structure */
if (!q) return;
list_ele_t* t;
while (q->head)
{
t = q->head->next;
free(q->head->value);
free(q->head);
q->head = t;
}
free(q);
}
```
## q_insert_head
```clike=
bool q_insert_head(queue_t *q, char *s)
{
list_ele_t *newh = malloc(sizeof(list_ele_t));
int len = strlen(s);
if (!q) goto error;
if (!newh) goto error;
newh->value = malloc((len + 1) * sizeof(char));
if (!newh->value) goto error;
strncpy(newh->value, s, len);
newh->value[len] = '\0';
newh->next = q->head;
if (!q->head)
{
q->tail = newh;
}
q->head = newh;
q->size++;
return true;
error:
if (newh) free(newh);
return false;
}
```
利用`goto`來統一處理例外狀況
## q_insert_tail
```clike=
bool q_insert_tail(queue_t *q, char *s)
{
list_ele_t *newh = malloc(sizeof(list_ele_t));
int len = strlen(s);
if (!q) goto error;
if (!newh) goto error;
newh->value = malloc((len + 1) * sizeof(char));
if (!newh->value) goto error;
strncpy(newh->value, s, len);
newh->value[len] = '\0';
newh->next = NULL;
if (!q->tail)
{
q->head = newh;
q->tail = newh;
}
q->tail->next = newh;
q->tail = newh;
q->size++;
return true;
error:
if (newh) free(newh);
return false;
}
```
## q_remove_head
```clike=
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
if (!q || !q->head) goto error;
char* str = q->head->value;
if (sp)
{
strncpy(sp, str, bufsize - 1);
sp[bufsize - 1] = '\0';
}
list_ele_t* t = q->head;
q->head = q->head->next;
free(t->value);
free(t);
q->size--;
if (q->size == 0) q->tail = NULL;
return true;
error:
return false;
}
```
## q_size
```clike=
int q_size(queue_t *q)
{
if (!q || !q->head) return 0;
return q->size;
}
```
## q_reverse
```clike=
void q_reverse(queue_t *q)
{
if (!q || !q->head || !q->head->next) return;
if (q->head == q->tail) return;
q->tail = q->head;
q->head = q->head->next;
list_ele_t* back = q->tail, *curr = q->head->next;
while (curr != NULL)
{
q->head->next = back;
back = q->head;
q->head = curr;
curr = curr->next;
}
q->head->next = back;
q->tail->next = NULL;
}
```
# 解釋自動評分系統運作的原理
## scripts
### git hook
we have 2 hooks inside our scripts:
1. pre-commit.hook
2. commit-msg.hook
### pre-commit.hook
* trigger on `git commit` command
* check format is suitable with clang format
* use cppcheck checking
### commit-msg.hook
1. check commit message with regular expression
## qtest command interpret
### 初始化階段:
```c
queue_init();
init_cmd();
console_init();
set_verblevel(level);
```
init_cmd和console_init會向console註冊所使用的command
### command parse
`main`->`run_console`->`cmd_select`->`interpret_cmd`->`interpret_cmda`
### interpret_cmda執行
```c
cmd_ptr next_cmd = cmd_list;
bool ok = true;
while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
next_cmd = next_cmd->next;
if (next_cmd) {
ok = next_cmd->operation(argc, argv);
if (!ok)
record_error();
} else {
... // error handling
}
```
`cmd_list`是一個link list,它保存了`queue_init`和`init_cmd`所註冊的command。透過循序搜尋找到符合的cmd,然後執行對應的函式(next_cmd->operation)。
## report.[hc]
```c
static void check_exceed(size_t new_bytes)
{
size_t limit_bytes = (size_t) mblimit << 20;
size_t request_bytes = new_bytes + current_bytes;
if (mblimit > 0 && request_bytes > limit_bytes) {
report_event(MSG_FATAL,
"Exceeded memory limit of %u megabytes with %lu bytes",
mblimit, request_bytes);
}
}
```
Q:Why limit_bytes = mblimit<<20 ?
A:mblimit 作為判斷malloc有無超出上限,`mblimit<<20`代表最高可以使用的大小(KB),在本例中`0`<<20還是為0(`mblimit`預設為`0`表示無上限)
```c
void init_time(double *timep)
{
(void) delta_time(timep);
}
```
Q:Why we need `(void)` before delta_time(timep) ?
A:某些編譯器可能會警告使用者沒有處理返回值,加上`(void)`來處理返回值(轉形成void),但是像是`printf`等常用的function可以在宣告時加上`__attribute__(noreturn)`來避免。
> 在gcc中 `-Wall` 預設會開啟`Wunused-result`,不過在我的平台上(`gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0`),刪除`(void)`和加上`-Wunused-result`都沒有效果。此外也可以在function宣告加上`__attribute__((warn_unused_result))`強迫警告使用者處理返回值。
>
### getrusage:
獲取資源的使用情形
```c
/* Number of bytes resident in physical memory */
size_t resident_bytes()
{
struct rusage r;
size_t mem = 0;
int code = getrusage(RUSAGE_SELF, &r);
if (code < 0) {
report_event(MSG_ERROR, "Call to getrusage failed");
} else {
mem = r.ru_maxrss * 1024;
}
return mem;
}
```
declaration:
`int getrusage(int who, struct rusage *usage);`
usage接受一個參數負責返回結果,而who則有以下參數
* RUSAGE_SELF:
只統計本身process的使用資訊(包含thread)
* RUSAGE_CHILDREN:
會統計所有children和calling的process的資訊
* RUSAGE_THREAD(linux 2.6.26以後):
統計calling thread的資訊
rusage structure:
```c
struct rusage {
struct timeval ru_utime; /* user CPU time used */
struct timeval ru_stime; /* system CPU time used */
long ru_maxrss; /* maximum resident set size */
long ru_ixrss; /* integral shared memory size */
long ru_idrss; /* integral unshared data size */
long ru_isrss; /* integral unshared stack size */
long ru_minflt; /* page reclaims (soft page faults) */
long ru_majflt; /* page faults (hard page faults) */
long ru_nswap; /* swaps */
long ru_inblock; /* block input operations */
long ru_oublock; /* block output operations */
long ru_msgsnd; /* IPC messages sent */
long ru_msgrcv; /* IPC messages received */
long ru_nsignals; /* signals received */
long ru_nvcsw; /* voluntary context switches */
long ru_nivcsw; /* involuntary context switches */
};
```
## console.[hc]
### cmd_select ,run_console:
```c
bool run_console(char *infile_name)
{
if (!push_file(infile_name)) {
report(1, "ERROR: Could not open source file '%s'", infile_name);
return false;
}
while (!cmd_done()) {
cmd_select(0, NULL, NULL, NULL, NULL);
}
return err_cnt == 0;
}
```
## harness.[hc]
### test_malloc and test_free
`queue.c`因為有include `harness.h` 因此呼叫的`malloc`和`free`為`test_malloc`和`test_free`。
struct of block_ele_t
```c
typedef struct BELE {
struct BELE *next;
struct BELE *prev;
size_t payload_size;
size_t magic_header; /* Marker to see if block seems legitimate */
unsigned char payload[0];
/* Also place magic number at tail of every block */
} block_ele_t;
```
payload[0]是C99後引入的語法,可以允許使用者在runtime時才決定該array大小。
```c
block_ele_t *new_block =
malloc(size + sizeof(block_ele_t) + sizeof(size_t));
```
size是是真正要給使用者使用的空間大小(bytes),加上block_ele_t的大小,sizeof(size_t)則是儲存footer的空間。在`test_malloc`中,block_ele_t會依序填上`MAGICHEADER`、`size`、 `FOOTER`、`payload`,接著將next指向link list的開頭(allcated則指向新的new block,舊的prev指標指向新的block),再由`allocated_count++`後完成記憶體分配。`test_free`會檢查要被釋放的block的`MAGICHEADER`和`MAGICFOOTER`。當payload存取的資料超過payload的大小時會覆蓋自己的`MAGICFOOTER`,`MAGICHEADER`則可以檢查自己的block是否被別人所覆蓋[^1]。最後`test_free`將block移出double link list,`allocated_count--`釋放block。
### trigger_exception,exception_setup,exception_cancel
`jmp_read`的type是`volatile`,在參考[from C 語言 setjmp 與 longjmp 函數用法教學](https://blog.gtwang.org/programming/c-setjmp-longjmp-function-tutorial/)後得知:
* 未開啟最佳化options(`-O`)只有 `register` 宣告的變數會改變(not change:`volatile`.`global`,`static`,`local`)
* with `-O` option: `register`,`local`宣告的變數值均會改變
* `sigsetjmp`,`siglongjmp`:
* 和`setjmp`和`longjmp`一樣都是nonlocal jmp
* 但是`setjmp` 和`longjmp`並不會在jump儲存當前的signal mask
```c
void trigger_exception(char *msg)
{
error_occurred = true;
error_message = msg;
if (jmp_ready)
siglongjmp(env, 1);
else
exit(1);
}
```
當exception觸發(SIGSEGV,SIGALRM),`jmp_rady`會被trigger成true,而進行`setlongjmp`至上一個`exception_setup`位置(`if (sigsetjmp(env, 1))`),印出錯誤訊息後返回。
`exception_cancel`:清除'jmp_ready'和error_message,若是超時則會觸發alarm。
## qtest.c
```c
bool do_new(int argc, char *argv[])
{
if (argc != 1) {
report(1, "%s takes no arguments", argv[0]);
return false;
}
bool ok = true;
if (q != NULL) {
report(3, "Freeing old queue");
ok = do_free(argc, argv);
}
error_check();
if (exception_setup(true))
q = q_new();
exception_cancel();
qcnt = 0;
show_queue(3);
return ok && !error_check();
}
```
根據`exception_setup`和`exception_cancel`可以得知
```c
static void queue_init()
{
fail_count = 0;
q = NULL;
signal(SIGSEGV, sigsegvhandler);
signal(SIGALRM, sigalrmhandler);
}
```
`queue_init`會使qtest在初始化時註冊`SIGSEGV`和`SIGALRM`,當中斷發生時OS會通知signal handler做對應的處理,在本例中`SIGSEGV`發生時會交由`sigsegvhandler`,而`SIGALARM`由`sigalrmhandler`處理,兩者都是印出錯誤訊息後,呼叫`trigger_exception`。
# 效能改善
## Test in parallel
修改`driver.py` 以 process pool來執行:
Performance counter stats for 'python scripts/driver.py'(with process pool):
1276.391265 task-clock (msec) # 1.327 CPUs utilized
842 context-switches # 0.660 K/sec
58 cpu-migrations # 0.045 K/sec
135,961 page-faults # 0.107 M/sec
3,941,655,638 cycles # 3.088 GHz
6,034,081,563 instructions # 1.53 insn per cycle
1,167,962,182 branches # 915.050 M/sec
2,832,393 branch-misses # 0.24% of all branches
0.961677493 seconds time elapsed
Performance counter stats for 'python scripts/driver.py':
1007.687420 task-clock (msec) # 0.797 CPUs utilized
803 context-switches # 0.797 K/sec
12 cpu-migrations # 0.012 K/sec
131,010 page-faults # 0.130 M/sec
3,107,377,907 cycles # 3.084 GHz
5,928,785,192 instructions # 1.91 insn per cycle
1,146,581,106 branches # 1137.834 M/sec
2,129,417 branch-misses # 0.19% of all branches
1.264532373 seconds time elapsed
In [2]: (1.264532373-0.961677493)/1.264532373 * 100
Out[2]: 23.949950706402355
## compiler opt
在用perf record分析時,發現有些部份可能可以交由編譯器最佳化,例如每次malloc都會被呼叫但是回傳值都是一樣的`fail_allocation`
* orig
```
21,424,519 cache-misses # 91.576 % of all cache refs ( +- 0.05% )
23,395,281 cache-references ( +- 0.17% )
1,580,167,387 instructions # 1.51 insn per cycle ( +- 0.05% )
1,049,546,666 cycles ( +- 1.31% )
246,893 branch-misses ( +- 2.07% )
0.343512614 seconds time elapsed ( +- 1.37% )
```
* gcc -O2 flag
```
21,198,677 cache-misses # 92.116 % of all cache refs ( +- 0.06% )
23,013,085 cache-references ( +- 0.14% )
1,242,986,343 instructions # 1.39 insn per cycle ( +- 0.05% )
893,054,339 cycles ( +- 1.19% )
232,820 branch-misses ( +- 1.79% )
0.291908910 seconds time elapsed
```
* gcc -O3 flag
```
21,257,036 cache-misses # 92.071 % of all cache refs ( +- 0.06% )
23,087,689 cache-references ( +- 0.15% )
1,245,395,797 instructions # 1.36 insn per cycle ( +- 0.06% )
916,791,967 cycles ( +- 1.35% )
243,474 branch-misses ( +- 1.96% )
0.299784791 seconds time elapsed ( +- 1.45% )
```
* clang -O2 flag
```
21,258,614 cache-misses # 91.996 % of all cache refs ( +- 0.04% )
23,108,203 cache-references ( +- 0.16% )
1,242,551,526 instructions # 1.34 insn per cycle ( +- 0.06% )
924,695,143 cycles ( +- 1.43% )
261,872 branch-misses ( +- 8.10% )
0.302822045 seconds time elapsed ( +- 1.53% )
```
* clang -O3 flag
```
21,243,947 cache-misses # 92.033 % of all cache refs ( +- 0.03% )
23,082,922 cache-references ( +- 0.11% )
1,241,602,442 instructions # 1.35 insn per cycle ( +- 0.05% )
917,623,562 cycles ( +- 1.30% )
249,315 branch-misses ( +- 5.23% )
0.299945914 seconds time elapsed ( +- 1.28% )
```
# 驗證常數時間複雜度[^3]
Reference
---
[^1]:[0xff07筆記](https://hackmd.io/s/SJnc7dtSN?fbclid=IwAR0oJhQQ2GTM4NNXGHi8HHw2D0tRHAtIYxFE60PgceKMUXlqOS4Bl6c2EHg)
[^2]:[cjwind筆記](https://hackmd.io/s/r1vWC9tS4?fbclid=IwAR1ijirhutApLtwOivWqGMh3LBNzYCeXkwdkHYpIPcgUkUYo5Jw-87ZImjs#)
[^3]:[afcidk筆記](https://hackmd.io/s/ry4VZS9SN?fbclid=IwAR0cG3zIIgovml4IUqqPqgOGtdpdJfGAOkzFPXE8-xyJk4n4qm4KoY8_8TU#%E9%A9%97%E8%AD%89%E5%B8%B8%E6%95%B8%E6%99%82%E9%96%93%E8%A4%87%E9%9B%9C%E5%BA%A6)