# 2020q1 Homework1 (lab0)
contributed by < `frankchang0125` >
:::info
題目出處:
- [H01: lab0](https://hackmd.io/@sysprog/linux2020-lab0)
- [C Programming Lab: Assessing Your C Programming Skills](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf)
:::
## 開發環境
採用 MacBook Pro + VirtualBox (Ubuntu 18.04),環境都是在 Ubuntu 上做開發。
```shell
$ uname -a
Linux frankchang-ubuntu 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 69
Model name: Intel(R) Core(TM) i5-4288U CPU @ 2.60GHz
Stepping: 1
CPU MHz: 2600.000
BogoMIPS: 5200.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0,1
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase avx2 invpcid md_clear flush_l1d
```
## 作業要求
依據指示著手修改 queue.[ch] 和連帶的檔案:
- q_new(): Create empty queue.
- q_free(): Free all storage used by queue.
- q_insert_head(): Attempt to insert element at head of queue.
- q_insert_tail(): Attempt to insert element at tail of queue.
- q_remove_head(): Attempt to insert element at tail of queue.
- q_size(): Return number of elements in queue.
- q_reverse(): Reverse elements in queue. This function should not allocate or free any list elements. It should rearrange the existing ones.
- q_sort(): Sort elements of queue in ascending order.
## 實作
依照 [C Programming Lab: Assessing Your C Programming Skills](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf) 的說明,`q_insert_tail` 及 `q_size` 的時間複雜度必須為 $O(1)$,因此在 `queue_t` 中新增 `list_ele_t *tail` 及 `int size` 來指向目前 queue 中最後一個節點及目前的 queue 的大小:
```=c=26
typedef struct {
list_ele_t *head; /* Head node of linked list */
list_ele_t *tail; /* Tail node of linked list */
int size; /* Size of linked list */
} queue_t;
```
### q_new()
```c=9
/*
* Create empty queue.
* Return NULL if could not allocate space.
*/
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (!q) {
return NULL;
}
q->head = NULL;
q->tail = NULL;
q->size = 0;
return q;
}
```
1. 分配 queue 記憶體空間。
2. 對 `q->head`、`q->tail` 及 `q->size` 初始化。
### q_free()
```c=28
/* Free all storage used by queue */
void q_free(queue_t *q)
{
if (!q) {
return;
}
list_ele_t *cur = q->head;
list_ele_t *node;
while (cur) {
node = cur;
cur = cur->next;
free(node->value);
free(node);
}
/* Free queue structure */
free(q);
}
```
1. 判斷 `q` 是否為 `NULL`。
2. 在 `q_free()` 中,透過 `while` 迴圈,一個一個造訪 queue 中的每個 node 並將其所佔用的記憶體空間給釋放。在釋放 node 本身的記憶體空間前,要先釋放其 `value` 字串所佔的空間。當所有 nodes 所佔用的記憶體空間都釋放完後,最後再釋放 queue 本身所佔的記憶體空間。
### q_allocate_node()
由於分配 node 記憶體空間除了在 `q_insert_head()` 中會用到,也會在 `q_insert_tail()` 中用到。因此我特別多定義了一個 `q_allocate_node()` 來統一處理:
```c=210
/*
* Allocate node space for given string.
* Return NULL if s is NULL, empty string
* or could not allocate space.
* Otherwise, return the address of the allocated node.
*/
list_ele_t *q_allocate_node(char *s)
{
if (!s) {
return NULL;
}
size_t s_len = strlen(s) + 1;
if (s_len == 1) {
return NULL;
}
list_ele_t *node = malloc(sizeof(list_ele_t));
if (!node) {
return NULL;
}
node->next = NULL;
node->value = malloc(s_len);
if (!node->value) {
free(node);
return NULL;
}
memcpy(node->value, s, s_len);
return node;
}
```
1. 檢查所傳入的 `char *s` 是否為 `NULL` 或是`空字串`,若為真,則直接 `return NULL`。
2. 分配 `node` 及其 `value` 字串的記憶體空間。特別要注意的是如果在分配 `value` 字串記憶體空間失敗的時候,需要在 `return NULL` 之前,把已經分配給 node 的記憶體空間給釋放掉,否則會有 memory leak 的情況。
### q_insert_head()
```c=48
/*
* Attempt to insert element at head of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_head(queue_t *q, char *s)
{
if (!q) {
return false;
}
list_ele_t *node = q_allocate_node(s);
if (!node) {
return false;
}
node->next = q->head;
q->head = node;
if (!q->tail) {
q->tail = node;
}
q->size += 1;
return true;
}
```
1. 檢查 `q` 是否為 `NULL`。
2. 呼叫 `q_allocate_node()` 來分配 node 及其 `value` 字串的記憶體空間。
3. 將 `node->next` 指向目前的 `q->head`。
4. 將 `q->head` 指向新增的 node。
5. 如果 `q->tail` 為 `NULL`,代表原先 queue 中並沒有任何的 nodes,因此 `q->tail` 也應指向本次所新增的 node。
6. 將 `q->size` 加 1。
### q_insert_tail()
```c=79
/*
* Attempt to insert element at tail of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_tail(queue_t *q, char *s)
{
if (!q) {
return false;
}
list_ele_t *node = q_allocate_node(s);
if (!node) {
return false;
}
if (!q->head) {
q->head = node;
}
if (q->tail) {
q->tail->next = node;
}
q->tail = node;
q->size += 1;
return true;
}
```
1. 檢查 `q` 是否為 `NULL`。
2. 呼叫 `q_allocate_node()` 來分配 node 及其 `value` 字串的記憶體空間。
3. 如果 `q->head` 為 `NULL`,代表原先 queue 中並沒有任何的 nodes,因此 `q->head` 也應指向本次所新增的 node。
4. 如果 `q->tail` 不為 `NULL`,則將目前的 `q->tail` node 的 `next pointer` 指向本次所新增的 node。
5. 將 `q->tail` 更新為本次所新增的 node。
6. 將 `q->size` 加 1。
### q_remove_head()
```c=112
/*
* attempt to remove element from head of queue.
* Return true if successful.
* Return false if queue is NULL or empty.
* If sp is non-NULL and an element is removed, copy the removed string to *sp
* (up to a maximum of bufsize-1 characters, plus a null terminator.)
* The space used by the list element and the string should be freed.
*/
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
if (!q || !q->head) {
return false;
}
list_ele_t *head = q->head;
q->head = q->head->next;
if (q->tail == head) {
q->tail = NULL;
}
if (sp) {
strncpy(sp, head->value, bufsize - 1);
sp[bufsize - 1] = '\0';
}
free(head->value);
free(head);
q->size -= 1;
return true;
}
```
1. 檢查 `q` 是否為 `NULL`,及是否有 head node 可以被移除。
2. 宣告一新的指標指向目前的 `q->head`,以便在最後釋放其記憶體空間;更新 `q->head` 為下一個 node (or `NULL`)。
3. 如果 `q->tail == q->head`,代表目前 queue 終止有一個 node,因此必須將 `q->tail` 也設為 `NULL`。
4. 複製欲被移除的 head node 的 `value` 字串至 `*sp`,最多 `bufsize - 1` 個字元 + `NULL terminator`。
5. 釋放 head node 及其 `value` 字串所佔的記憶體空間。
### q_size()
```c=145
/*
* Return number of elements in queue.
* Return 0 if q is NULL or empty
*/
int q_size(queue_t *q)
{
if (!q) {
return 0;
}
return q->size;
}
```
1. 檢查 `q` 是否為 `NULL`,若為 `NULL` (代表 `new` 指令沒有被執行) 則直接回傳 0。
2. 若 `q` 不為 `NULL`,則直接回傳 `q->size` ($O(1)$ 時間複雜度)。
### q_reverse()
```c=157
/*
* Reverse elements in queue
* No effect if q is NULL or empty
* This function should not allocate or free any list elements
* (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head).
* It should rearrange the existing ones.
*/
void q_reverse(queue_t *q)
{
if (!q || !q->head) {
return;
}
list_ele_t *prev, *cur, *next;
prev = NULL;
cur = q->head;
q->tail = q->head;
while (cur) {
next = cur->next;
cur->next = prev;
prev = cur;
cur = next;
}
q->head = prev;
}
```
1. 檢查 `q` 是否為 `NULL` 或是 empty queue。
2. 透過 `prev` 記錄前一個 node 及 `next` 記錄下一個 node,並依序將 `cur` 的 `next pointer` 指向 `prev`。最後的 `prev` 即為原 queue 最後的 node,因此最後將 `q->head` 指向 `prev` 即完成 queue 的反轉。
### q_sort()
```c=186
/*
* Sort elements of queue in ascending order
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
void q_sort(queue_t *q)
{
if (!q || !q->head) {
return;
}
if (q->head == q->tail) {
return;
}
q->head = merge_sort(q->head);
// 更新 q->tail
q->tail = q->head;
while (q->tail && q->tail->next) {
q->tail = q->tail->next;
}
}
```
#### merge_sort()
```c=247
list_ele_t *merge_sort(list_ele_t *head)
{
if (!head || !head->next) {
return head;
}
/* Use fast/slow pointers to split list
* into left and right parts.
*/
list_ele_t *fast = head->next;
list_ele_t *slow = head;
while (fast && fast->next) {
fast = fast->next->next;
slow = slow->next;
}
list_ele_t *left = head;
list_ele_t *right = slow->next;
slow->next = NULL;
/* Perform merge sort on left part and right part lists separately. */
left = merge_sort(left);
right = merge_sort(right);
/* Merge the sorted left part and right part lists into one list. */
return merge(left, right);
}
```
#### merge()
```c=280
list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
if (!left) {
return right;
}
if (!right) {
return left;
}
list_ele_t *result = NULL;
list_ele_t *cur = NULL;
// Initialize, compare left and right list's first node's value
// string length, point result to the node with shorter string length
// determined by strcasecmp().
if (strcasecmp(left->value, right->value) < 1) {
result = cur = left;
left = left->next;
} else {
result = cur = right;
right = right->next;
}
while (left && right) {
if (strcasecmp(left->value, right->value) < 1) {
cur->next = left;
left = left->next;
} else {
cur->next = right;
right = right->next;
}
cur = cur->next;
}
// Left list has nodes left.
if (left) {
cur->next = left;
}
// Right list has nodes left.
if (right) {
cur->next = right;
}
return result;
}
```
- `q_sort()` 使用 merge sort 來排序 queue 中的 nodes。一開始透過快慢 pointer 的方式找出 linked list 中的 middle node,將其切分為左半部及右半部的 lists 分別做 merge sort (i.e. Divide and conquer)。
- Merge sort 的最好/平均/最差的時間複雜度都是 $O(nlogn)$。一般 array 透過 merge sort 排序的時候都需要額外分配同樣長度的陣列當作暫存,因此空間複雜度為 $O(n)$,而這也是為何一般的 `sort()` 底層都不會是採用 merge sort (實務上,會根據資料特性,e.g. 資料長度,混和多個不同的 sort algorithms)。但由於我們這邊排序的是 linked list,只需要改變 `next pointer` 所指向的 node 即可,因此空間複雜度僅為 $O(1)$。
### 執行結果
```
$ make test
scripts/driver.py
--- Trace Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
--- trace-01-ops 6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
--- trace-02-ops 6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
--- trace-03-ops 6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
--- trace-04-ops 6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
--- trace-05-ops 5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
--- trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
--- trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
--- trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
--- trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
--- trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
--- trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
--- trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
--- trace-13-perf 6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
--- trace-14-perf 6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
--- trace-15-perf 6/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
--- trace-16-perf 6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
--- TOTAL 100/100
```
:::warning
TODO:
看到<s>別人</s> [MetalheadKen 的作業](https://hackmd.io/@Ken-Dai/linux2020-hw1-lab0#q_insert_tail) 有提到 Linus Torvalds 所分享的 [**Good taste of codes**](https://www.ted.com/talks/linus_torvalds_the_mind_behind_linux)。改寫本作業 linked list 的實作使其符合 **Good taste of codes** 規範。
把人名或代號標注出來,不要說「別人」
:notes: jserv
:::
> 已更正
[name=Frank Chang][time=Mar 01 2020][color=orange]
## Natsort
:::warning
思考 nature sort order 在什麼場合用得到?提示: 思考 Linux 核心版本號碼
:notes: jserv
:::
採用 [natsort](https://github.com/sourcefrog/natsort) 來比對字串。如同 [natsort](https://github.com/sourcefrog/natsort) 說明所述,在排序以下字串:`rfc1.txt`、`rfc2086.txt`、`rfc822.txt` 時,[natsort](https://github.com/sourcefrog/natsort) 的排序結果將會是:`rfc1.txt` -> `rfc822.txt` -> `rfc2086.txt`,也就是我們 "人類" 比較習慣的排序方式。而一般的 `strcmp()` 結果則會是:`rfc1.txt` ->`rfc2086.txt` -> `rfc822.txt`。
1. 將 `strnatcmp.h` 及 `strnatcmp.c` 加入專案中。
2. 將 `strnatcmp.o` 加入 Makefile compile objects (`OBJS`) 中:
```shell
OBJS := qtest.o report.o console.o harness.o queue.o \
random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
strnatcmp.o
```
3. 在 `queue.c` 中 include `strnatcmp.h`:
```cpp=6
#include "strnatcmp.h"
```
4. 將 merge sort 中 `merge()` 的 `strcasecmp()` 替換為 `strnatcasecmp()`:
```cpp=301
if (strnatcasecmp(left->value, right->value) < 1) {
result = cur = left;
left = left->next;
} else {
result = cur = right;
right = right->next;
}
```
```cpp=309
while (left && right) {
if (strnatcasecmp(left->value, right->value) < 1) {
cur->next = left;
left = left->next;
} else {
cur->next = right;
right = right->next;
}
cur = cur->next;
}
```
5. 新增 traces 測試案例。在 [natsort](https://github.com/sourcefrog/natsort) repo 中已經有其所使用的 test cases 了。將其複製至 `traces/` 下,並加上指令,也就是將:
```=1
2000-1-10
2000-1-2
1999-12-25
2000-3-23
1999-3-3
```
修改為:
```=1
new
ih 2000-1-10
ih 2000-1-2
ih 1999-12-25
ih 2000-3-23
ih 1999-3-3
sort
free
```
6. 修改 `scripts/driver.py`,新增測試案例:
a. 修改 `traceDict`:
```python=21
traceDict = {
1: "trace-01-ops",
2: "trace-02-ops",
3: "trace-03-ops",
4: "trace-04-ops",
...
18: "trace-18-dates",
19: "trace-19-debs",
20: "trace-20-debvers",
21: "trace-21-fractions",
22: "trace-22-versions"
}
```
b. 修改 `traceProbs`:
```python=41
traceProbs = {
1: "Trace-01",
2: "Trace-02",
3: "Trace-03",
4: "Trace-04",
...
18: "Trace-18",
19: "Trace-19",
20: "Trace-20",
21: "Trace-21",
22: "Trace-22"
}
```
c. 修改 `maxScores`
```python=61
maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5,
6, 6, 6, 6, 6]
```
新增五組 tests 的分數 (各為 6 分)。
7. 再次執行 `make test`:
```
$ make test
scripts/driver.py
--- Trace Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
--- trace-01-ops 6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
--- trace-02-ops 6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
--- trace-03-ops 6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
--- trace-04-ops 6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
--- trace-05-ops 5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
--- trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
--- trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
--- trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
--- trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
--- trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
--- trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
--- trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
--- trace-13-perf 6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
--- trace-14-perf 6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
--- trace-15-perf 0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
--- trace-16-perf 6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
+++ TESTING trace trace-18-dates:
ERROR: Not sorted in ascending order
--- trace-18-dates 0/6
+++ TESTING trace trace-19-debs:
ERROR: Not sorted in ascending order
--- trace-19-debs 0/6
+++ TESTING trace trace-20-debvers:
ERROR: Not sorted in ascending order
--- trace-20-debvers 0/6
+++ TESTING trace trace-21-fractions:
--- trace-21-fractions 6/6
+++ TESTING trace trace-22-versions:
ERROR: Not sorted in ascending order
--- trace-22-versions 0/6
--- TOTAL 100/130
```
可以觀察到下列幾點:
1. 由 [natsort](https://github.com/sourcefrog/natsort) 來比對字串。所新增的 test cases (18 ~ 22),測試結果皆失敗 (`ERROR: Not sorted in ascending order`),這是因為 `qtest.c` 中的 `q_sort()` 所使用的字串比對函式為 `strcasecmp()`:
```c=559
for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
/* Ensure each element in ascending order */
/* FIXME: add an option to specify sorting order */
if (strcasecmp(e->value, e->next->value) > 0) {
```
將其換為 [natsort](https://github.com/sourcefrog/natsort) 的 `strnatcasecmp()` 即可通過 test cases 18 ~ 22:
```
$ make test
scripts/driver.py
--- Trace Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
--- trace-01-ops 6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
--- trace-02-ops 6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
--- trace-03-ops 6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
--- trace-04-ops 6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
--- trace-05-ops 5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
--- trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
--- trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
--- trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
--- trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
--- trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
--- trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
--- trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
--- trace-13-perf 6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
--- trace-14-perf 6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
--- trace-15-perf 0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
--- trace-16-perf 6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
+++ TESTING trace trace-18-dates:
--- trace-18-dates 6/6
+++ TESTING trace trace-19-debs:
--- trace-19-debs 6/6
+++ TESTING trace trace-20-debvers:
--- trace-20-debvers 6/6
+++ TESTING trace trace-21-fractions:
--- trace-21-fractions 6/6
+++ TESTING trace trace-22-versions:
--- trace-22-versions 6/6
--- TOTAL 124/130
```
2. `trace-15-perf` 超時 (`ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient`)。在交叉比對後,發現是在引入 [natsort](https://github.com/sourcefrog/natsort) 後所造成。需將 `harness.c` 中的 `time_limit` 值增加至 2 以上才能通過。
:::warning
TODO:
研究 [natsort](https://github.com/sourcefrog/natsort) 的 `strnatcmp0()` 是否能提升其演算法效率。
:::
## 使用 Valgrind 排除 `qtest` 實作的記憶體錯誤
使用 `make valgrind` 指令可以使用 Valgrind 分析我們的程式是否有 memory leak 的問題。
### Makefile 分析
```makefile
valgrind_existence:
@which valgrind 2>&1 > /dev/null || (echo "FATAL: valgrind not found"; exit 1)
valgrind: valgrind_existence
# Explicitly disable sanitizer(s)
$(MAKE) clean SANITIZER=0 qtest
$(eval patched_file := $(shell mktemp /tmp/qtest.XXXXXX))
cp qtest $(patched_file)
chmod u+x $(patched_file)
sed -i "s/alarm/isnan/g" $(patched_file)
scripts/driver.py -p $(patched_file) --valgrind
@echo
@echo "Test with specific case by running command:"
@echo "scripts/driver.py -p $(patched_file) --valgrind -t <tid>"
```
1. 檢查 `valgrind` 指令是否存在。
2. 關閉 `SANITIZER`,並重新編譯 `qtest`。
3. 因為在跑 Valgrind 的時候我們必須將 `alarm()` 給取消,因此複製所編譯出來的 `qtest`,至 `/tmp` 資料夾,並透過 `sed` 將 `qtest` binary 中的 `alarm` 函式呼叫全部替換為 `isnan`。
4. 執行 `driver.py`,並加上 `--valgrind` 參數,在 `driver.py` 中就會使用 Valgrind 來執行 `qtest`。
### .valgrindrc 分析
可以閱讀 [Valgrind manual 2.6.7](https://valgrind.org/docs/manual/manual-core.html#manual-core.defopts) 對於 `.valgrindrc` 的說明:
> Note that Valgrind also reads options from three places:
> 1. The file ~/.valgrindrc
> 2. The environment variable $VALGRIND_OPTS
> 3. The file ./.valgrindrc
>
> These are processed in the given order, before the command-line options. Options processed later override those processed earlier; for example, options in ./.valgrindrc will take precedence over those in ~/.valgrindrc.
> Any tool-specific options put in $VALGRIND_OPTS or the .valgrindrc files should be prefixed with the tool name and a colon. For example, if you want Memcheck to always do leak checking, you can put the following entry in ~/.valgrindrc:
```
--memcheck:leak-check=yes
```
> This will be ignored if any tool other than Memcheck is run. Without the memcheck: part, this will cause problems if you select other tools that don't understand --leak-check=yes.
而 `qtest` 的 `.valgrindrc` 內容如下:
```
--quiet
--memcheck:leak-check=full
--show-leak-kinds=all
--error-exitcode=1
```
- **-q --quiet**: Run silently, and only print error messages. Useful if you are running regression tests or have some other automated test machinery.
- **leak-check=<no|summary|yes|full> [default: summary]**: When enabled, search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set to full or yes, each individual leak will be shown in detail and/or counted as an error, as specified by the options **--show-leak-kinds** and **--errors-for-leak-kinds**.
- **--show-leak-kinds=<set> [default: definite,possible]**: Specifies the leak kinds to show in a full leak search, in one of the following ways:
- a comma separated list of one or more of **definite indirect possible reachable.**
- **all** to specify the complete set (all leak kinds). It is equivalent to **--show-leak-kinds=definite,indirect,possible,reachable.**
- **none** for the empty set.
- **--error-exitcode=<number> [default: 0]**: Specifies an alternative exit code to return if Valgrind reported any errors in the run. When set to the default value (zero), the return value from Valgrind will always be the return value of the process being simulated. When set to a nonzero value, that value is returned instead, if Valgrind detects any errors. This is useful for using Valgrind as part of an automated test suite, since it makes it easy to detect test cases for which Valgrind has reported errors, just by inspecting return codes.
### 分析 `qtest`
在原先版本的 `queue.c` 的 `q_remove_head()` 中,原先複製字串至 `*sp` 我是使用 `memcpy()` 的:
```c=133
if (sp) {
memcpy(sp, head->value, bufsize - 1);
sp[bufsize - 1] = '\0';
}
```
不過在使用 Valgrind 的時候,會出現 `Invalid read of size 8` 的錯誤:
```
--- Trace Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
==8991== Invalid read of size 8
==8991== at 0x4C368AC: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8991== by 0x10CBBD: memcpy (string_fortified.h:34)
==8991== by 0x10CBBD: q_remove_head (queue.c:135)
==8991== by 0x10A129: do_remove_head (qtest.c:365)
==8991== by 0x10B949: interpret_cmda (console.c:220)
==8991== by 0x10BEBD: interpret_cmd (console.c:243)
==8991== by 0x10C48B: cmd_select (console.c:571)
==8991== by 0x10C6D3: run_console (console.c:630)
==8991== by 0x10ADF8: main (qtest.c:771)
==8991== Address 0x55ced98 is 8 bytes before a block of size 2,049 alloc'd
==8991== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8991== by 0x109F87: do_remove_head (qtest.c:331)
==8991== by 0x10B949: interpret_cmda (console.c:220)
==8991== by 0x10BEBD: interpret_cmd (console.c:243)
==8991== by 0x10C48B: cmd_select (console.c:571)
==8991== by 0x10C6D3: run_console (console.c:630)
==8991== by 0x10ADF8: main (qtest.c:771)
==8991==
...
```
但在將 `memcpy()` 換成 `strncpy()` 後,錯誤就消失了。
原本一直想不透到底是什麼問題,直到看見 [Naetw 作業的說明](https://hackmd.io/@bHCinvrjSmaS33lxDnYjog/BysQssYHN#Invalid-read):
> memcpy 會完全複製 bufsize - 1 bytes 到 sp 裡,但是 ptr->value 的空間長度可能遠小於 bufsize - 1,先前誤解了註解意思,以為要直接複製 bufsize - 1 bytes,實際上它只是最大值而已。將 memcpy 替換成 strncpy 這樣一來在遇到 NULL byte 時複製行為便會直接停止。
才理解原來是 `memcpy()` 和 `strncpy()` 針對 `NULL terminator` 的處理不同所致。
查看 [`memcpy`](https://www.gnu.org/software/libc/manual/html_node/Copying-Strings-and-Arrays.html#index-memcpy) GNU C Library 說明:
```
Function: void * memcpy (void *restrict to, const void *restrict from, size_t size)
Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.
The memcpy function copies size bytes from the object beginning at from into the object beginning at to. The behavior of this function is undefined if the two arrays to and from overlap; use memmove instead if overlapping is possible.
The value returned by memcpy is the value of to.
Here is an example of how you might use memcpy to copy the contents of an array:
struct foo *oldarray, *newarray;
int arraysize;
…
memcpy (new, old, arraysize * sizeof (struct foo));
```
[`strcpy`](https://www.gnu.org/software/libc/manual/html_node/Copying-Strings-and-Arrays.html#index-strcpy)、[`strncpy`](https://www.gnu.org/software/libc/manual/html_node/Truncating-Strings.html#index-strncpy) GNU C Library 的說明:
```
Function: char * strcpy (char *restrict to, const char *restrict from)
Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.
This copies bytes from the string from (up to and including the terminating null byte) into the string to. Like memcpy, this function has undefined results if the strings overlap. The return value is the value of to.
```
```
Function: char * strncpy (char *restrict to, const char *restrict from, size_t size)
Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.
This function is similar to strcpy but always copies exactly size bytes into to.
If from does not contain a null byte in its first size bytes, strncpy copies just the first size bytes. In this case no null terminator is written into to.
Otherwise from must be a string with length less than size. In this case strncpy copies all of from, followed by enough null bytes to add up to size bytes in all.
The behavior of strncpy is undefined if the strings overlap.
This function was designed for now-rarely-used arrays consisting of non-null bytes followed by zero or more null bytes. It needs to set all size bytes of the destination, even when size is much greater than the length of from. As noted below, this function is generally a poor choice for processing text.
```
由 [`strcpy`](https://www.gnu.org/software/libc/manual/html_node/Copying-Strings-and-Arrays.html#index-strcpy) 的說明我們可以看見:
> This copies bytes from the string from (**up to and including the terminating null byte**) into the string to.
所以將 `memcpy()` 改為 `strncpy()` 才不會複製到多餘的 bytes。透過 Valgrind 我們的確可以找到原先所未設想到的記憶體溢漏 (or 錯誤存取) 情況。
## Valgrind massif 分析
Valgrind massif 的說明可以參考 [massif 的官方說明文件](https://valgrind.org/docs/manual/ms-manual.html)。
massif 做為 heap profiler,可以分析並記錄我們程式的 heap 使用量。
使用方式只需指定 `--tool=massif` 即可,e.g.:
```
valgrind --tool=massif prog
```
在程式執行完成後,會在該程式目錄下新增一個:`massif.out.<你的程式 PID>` 的檔案。我們可以再透過 `ms_print` 指令讀取該檔案,顯示 massif 分析的結果,e.g.:
```
ms_print massif.out.<你的程式 PID>
```
以 `trace-13-perf.cmd` 做為範例:
:::warning
原先的 `.valgrindrc` 的 `show-leak-kinds` 前面並沒有加上 `memcheck:`,代表該指令只會在 memcheck 的 tool 下才會使用,因此指定 --tool 為 massif 時會出現:**`valgrind: Unknown option: show-leak-kinds=all`** 的錯誤訊息。將 `.valgrindrc` 中的 `show-leak-kinds` 改為:`--memecheck:show-leak-kinds=all` 即可避免此錯誤。
:::
```
--------------------------------------------------------------------------------
Command: /tmp/qtest.22gTV0 -f traces/trace-13-perf.cmd
Massif arguments: (none)
ms_print arguments: massif.out.26686
--------------------------------------------------------------------------------
MB
122.3^ :
| ::::###::::::::
| :::: # : : ::@
| :::::: # : : ::@:
| :@:::::: # : : ::@::
| :::@:::::: # : : ::@:::
| :::::@:::::: # : : ::@::::
| ::::::@:::::: # : : ::@::::::
| ::::::::@:::::: # : : ::@:::::::
| ::::::::::@:::::: # : : ::@:::::::@
| :: ::::::::@:::::: # : : ::@:::::::@:
| ::::: ::::::::@:::::: # : : ::@:::::::@::
| :::: :: ::::::::@:::::: # : : ::@:::::::@:::
| :@:::: :: ::::::::@:::::: # : : ::@:::::::@::::
| :::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@
| ::::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@:
| :::::::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@::
| ::::::::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@:::
| :@::::::::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@::::
| :@:@::::::::@:::: :: ::::::::@:::::: # : : ::@:::::::@:::::@::::@
0 +----------------------------------------------------------------------->Mi
0 919.3
Number of snapshots: 80
Detailed snapshots: [4, 6, 16, 33, 41 (peak), 46, 57, 67, 77]
--------------------------------------------------------------------------------
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
0 0 0 0 0 0
1 11,754,735 2,922,096 2,375,811 546,285 0
2 19,830,459 4,965,504 4,036,067 929,437 0
3 33,996,363 8,549,904 6,948,379 1,601,525 0
4 44,114,472 11,110,104 9,028,531 2,081,573 0
81.26% (9,028,531B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->81.17% (9,018,424B) 0x10D57E: test_malloc (harness.c:137)
| ->43.71% (4,856,040B) 0x10DDBD: q_allocate_node (queue.c:229)
| | ->43.71% (4,856,040B) 0x10DA7D: q_insert_head (queue.c:62)
| | ->43.71% (4,856,040B) 0x109B08: do_insert_head (qtest.c:213)
| | ->43.71% (4,856,040B) 0x10C3AD: interpret_cmda (console.c:220)
| | ->43.71% (4,856,040B) 0x10C450: interpret_cmd (console.c:243)
| | ->43.71% (4,856,040B) 0x10D03F: cmd_select (console.c:571)
| | ->43.71% (4,856,040B) 0x10D395: run_console (console.c:630)
| | ->43.71% (4,856,040B) 0x10B184: main (qtest.c:771)
| |
| ->37.46% (4,162,320B) 0x10DDE7: q_allocate_node (queue.c:236)
| | ->37.46% (4,162,320B) 0x10DA7D: q_insert_head (queue.c:62)
| | ->37.46% (4,162,320B) 0x109B08: do_insert_head (qtest.c:213)
| | ->37.46% (4,162,320B) 0x10C3AD: interpret_cmda (console.c:220)
| | ->37.46% (4,162,320B) 0x10C450: interpret_cmd (console.c:243)
| | ->37.46% (4,162,320B) 0x10D03F: cmd_select (console.c:571)
| | ->37.46% (4,162,320B) 0x10D395: run_console (console.c:630)
| | ->37.46% (4,162,320B) 0x10B184: main (qtest.c:771)
| |
| ->00.00% (64B) in 1+ places, all below ms_print's threshold (01.00%)
|
->00.09% (10,107B) in 1+ places, all below ms_print's threshold (01.00%)
... (以下略)
```
Valgrind massif 會在程式執行期間製作多個 snapshots,顯示程式 heap memory 執行時期的使用量。其中:
- **:** 代表 Normal snapshot.
- **@** 代表 Detailed snapshot.
- **#** 代表 Peak snapshot.
Detailed snapshot 和 Peak snapshot 都會顯示當下 heap 使用量的詳細資訊,包含分配記憶體的 stack traces。
我們可以看到因為我們的程式有正確的釋放所佔用的記憶體空間,因此最終的記憶體使用量會回到 0。
若我們修改我們的 `q_free()`:
```c=28
/* Free all storage used by queue */
void q_free(queue_t *q)
{
return;
}
```
再次執行 Valgrind massif:
```
--------------------------------------------------------------------------------
Command: /tmp/qtest.60n4QC -f traces/trace-13-perf.cmd
Massif arguments: (none)
ms_print arguments: massif.out.23011
--------------------------------------------------------------------------------
MB
122.3^ :
| :::::::###::::::::::
| ::::: # : : :
| :::: ::: # : : :
| ::@:::: ::: # : : :
| ::::@:::: ::: # : : :
| :::::::@:::: ::: # : : :
| :::: :::::@:::: ::: # : : :
| ::::::: :::::@:::: ::: # : : :
| ::: ::::: :::::@:::: ::: # : : :
| @::::: ::::: :::::@:::: ::: # : : :
| :@:@::::: ::::: :::::@:::: ::: # : : :
| :::@:@::::: ::::: :::::@:::: ::: # : : :
| :::::::@:@::::: ::::: :::::@:::: ::: # : : :
| :::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
| @:@:::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
| :::@:@:::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
| ::: :@:@:::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
| :::::: :@:@:::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
| :::::::::: :@:@:::: ::::@:@::::: ::::: :::::@:::: ::: # : : :
0 +----------------------------------------------------------------------->Mi
0 628.8
Number of snapshots: 58
Detailed snapshots: [1, 15, 17, 26, 28, 44, 54 (peak)]
--------------------------------------------------------------------------------
```
可以看到,此時的 Valgrind massif 顯示記憶體用量直到程式結束都沒有降低,代表我們有 memory leak 的發生。
透過 Valgrind massif,我們可以分析我們的程式 heap 的用量是否合理,並透過 stack trace,找出 heap 用量的 hot spot,並試著優化我們的程式。
:::info
預設 Valigrind massif 是採用 instructions executed 為橫軸單位,但如果程式執行的時間很短,會導致記憶體分配幾乎集中在圖形最後的部分 (因為 `main()` 在執行前還有很多 "前置作業",也會一併計算在 instructions executed 中)。因此可以在 `.valgrindrc` 中指定 massif tool 使用 `--time-unit=B` 參數 (i.e. `--massif:time-unit=B`),將橫軸改為以 bytes allocated/deallocated 為單位。
:::
## Select 系統呼叫與 RIO 套件分析
```c
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
```
`select()` + `read()` 為 I/O multiplexing 的一種。關於 Blocking / Non-blocking I/O / I/O multiplexing 及 Asynchronous I/O,可以參考下列文章的說明:
- [淺談I/O Model](https://medium.com/@clu1022/%E6%B7%BA%E8%AB%87i-o-model-32da09c619e6)
- [Linux IO模式及 select、poll、epoll詳解](https://segmentfault.com/a/1190000003063859)
- [在使用Multiplexed I/O的情况下,还有必要使用Non Blocking I/O么 ?](https://www.zhihu.com/question/33072351)

- Blocking / Non-blocking I/O:在等待 I/O 資料**準備就緒** (e.g. **有資料可以讀取**) 的過程,User space process 是否會被 **blocked**。
- Blocking I/O:在等待 I/O 資料準備就緒的過程,User space process 是被 blocked 的。
- e.g. `read()` + 未設定 **O_NONBLOCK** flag、`select()`、`epoll()`。
- Non-blocking I/O:在向 Kernel 詢問是否可以進行 I/O operation 的過程,不論 I/O 資料是否準備就緒,User space process 都不會被 blocked。若 I/O 資料尚未準備就緒 (e.g. 尚無資料可以讀取),則會直接返回,並設定 `errno` 指明錯誤 (e.g. EAGAIN、`EWOULDBLOCK`)。User space 必須重複詢問 (e.g. 在 `while` 迴圈中重複呼叫 `read()`) 直到 I/O 資料準備就緒才可進行 I/O operation。
- e.g. `read()` + 設定 **O_NONBLOCK** flag、`aio_read()`、`aio_write()`
- Synchronous / Asynchronous I/O:在從/向 Kernel space 讀取/寫入資料 (i.e. **實際在做 I/O operation**) 的過程,User space 是否會被 **blocked**。
- Synchronous I/O:從/向 Kernel space 讀取/寫入資料 的過程,User space process 是被 blocked 的
- e.g. `read()`、`recvfrom()`。
- Asynchronous I/O:從/向 Kernel space 讀取/寫入的過程是交由 Kernel space 來處理的。Kernel 複製完資料後會再通知 User space process。這中間 User space process 不會被 blocked。
- e.g. `aio_read()`、`aio_write()`。
所以 I/O multiplexing 為 Blocking I/O 的一種。但 `select()` + `read()` 比單一 `read()` 的好處在於 `select()` 可以同時等待多個 `file descriptors` (`fd`)。
在呼叫 `select()` 時所傳入的是多個 `fd set`,而非只是一個 `fd`。只要 `fd set` 中任一個以上的 `fd` 的 I/O 資料準備就緒,`select()` 就會返回。呼叫 `select()` 的 User space process 必須 iterate `fd set` 來檢查是哪幾個 `fd` 可以進行 I/O operation,而後再呼叫 `read()`、`recvfrom()` 讀取資料。此時 I/O 資料應準備就緒,可以直接讀取。但也有例外,e.g. `select()` man page 所述:
> Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
`select()` 後的 `read()`、`recvfrom()` 是 Blocking I/O 或是 Non-blocking I/O,還是依是否設了 **O_NONBLOCK** flag 而定。將 **read()** 或 **recvfrom()** 設為 **NON_BLOCKING** 才可以避免 User process 再次被 blocked。
P.S. 由於 `select()` 最多只可同時 "監控" (watch) 1024 個 `fd`,且當返回時,User space process 還必須 iterate 整個 `fd set` 才可以知道是哪些 `fd` 的資料已經準備好了,效率不佳。因此才有後續的 `epoll()`。
回到 `qtest`:
1. `main()` 呼叫 `run_console()` 並將 trace cmd 檔路徑傳入。
2. `run_console()` 呼叫 `push_file()`,分配一個 **`RIO_ELE`**,並將 trace cmd 檔的 `fd` 填入。每個 **`RIO_ELE`** 中都有一個 internal buffer:`char buf[RIO_BUFSIZE]`,也就是 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 所指的 RIO buffer,另外多個 **`RIO_ELE`** 也會透過 `prev pointer` 串接成一個 linked list。`push_file()` 完成後,會再呼叫 `cmd_select()`,準備解析 cmd 檔:
```c=620
bool run_console(char *infile_name)
{
if (!push_file(infile_name)) {
report(1, "ERROR: Could not open source file '%s'", infile_name);
return false;
}
while (!cmd_done())
cmd_select(0, NULL, NULL, NULL, NULL);
return err_cnt == 0;
}
```
```c=435
static bool push_file(char *fname)
{
int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
if (fd < 0)
return false;
if (fd > fd_max)
fd_max = fd;
rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
rnew->fd = fd;
rnew->cnt = 0;
rnew->bufptr = rnew->buf;
rnew->prev = buf_stack;
buf_stack = rnew;
return true;
}
```
3. `cmd_select()` 中第一次執行時,`read_ready()` 會回傳 `false`,代表目前尚未有未執行的指令。再來會將 `readfds` 設為 `local_readset`,做為接下來 `select()` 的 `readset` 參數。其中會將 trace cmd 檔的 `fd` (也就是 `buf_stack->fd`) 透過 `FD_SET()` 將 `readfds` 中對應的 bit 設為 `1`,代表我們要 "監控" (watch) 此 `fd` 是否有資料可以讀取:
```c=576
if (!block_flag) {
/* Process any commands in input buffer */
if (!readfds)
readfds = &local_readset;
/* Add input fd to readset for select */
infd = buf_stack->fd;
FD_SET(infd, readfds);
if (infd == STDIN_FILENO && prompt_flag) {
printf("%s", prompt);
fflush(stdout);
prompt_flag = true;
}
if (infd >= nfds)
nfds = infd + 1;
}
```
P.S. `select()` 的 `nfds` 參數要設比我們最大想 "監控" (watch) 的 `fd` 值 + 1,因此 `ndfs = infd + 1`。
4. 由於是讀取本地檔案,因此 `select()` 會立即返回,其返回值 (`result`) 代表目前有多少個 `fd` 資料準備就緒,可以存取。接著透過 `FD_ISSET()` 檢查 trace cmd 檔的 `fd` 是否準備就緒。若為 `true`,則再透過 `readline()` 讀取 trace cmd 檔的指令 (也是在此運用了 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理),並透過 `FD_CLR()` 從 `readfds` 中清除對應的 `fd` bit:
```c=596
int result = select(nfds, readfds, writefds, exceptfds, timeout);
if (result <= 0)
return result;
infd = buf_stack->fd;
if (readfds && FD_ISSET(infd, readfds)) {
/* Commandline input available */
FD_CLR(infd, readfds);
result--;
cmdline = readline();
if (cmdline)
interpret_cmd(cmdline);
}
return result;
```
:::info
:question: 疑問:
由於同時間只會 "監控" (watch) 一個 trace cmd 檔的 `fd`,是否真的有需要使用 `select()`? 或是其實直接呼叫 `readline()` 讀取 trace cmd 檔即可? 畢竟多呼叫了 `select()` 還是多了一個 system call,但卻沒有享受到 `select()` 所帶來的好處。
:::
5. 根據 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) 說明,`read()` 在以下情況**會**有 **short count** 的問題:
- Encountering (end-of-file) EOF.
- Reading text lines from a **terminal**.
- Reading and writing network sockets.
而 `read()` 在以下情況**不會**有 **short count** 的問題:
- Reading from disk file (except for EOF).
- Writing to disk files.
:::info
經過自己測試,`read()` 在讀取本地檔案時,碰到 `\n` 並不會返回。
:::
因此,CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 為了避免 **short count** 的情形,定義了 `rio_t` 這個 struct,其中包含了 internal buffer:`rio_buf`、`rio_cnt` 記錄 internal buffer 中剩餘未讀取的 bytes 數,及 `rio_bufptr` 指向 internal buffer 中剩餘未讀取資料的起始 offset。在呼叫 `read()` 時,所傳入的 `*buf` 為 `rio_buf` internal buffer。
`readline()` 運用了同樣的原理:
```c=475
/* Read command from input file.
* When hit EOF, close that file and return NULL
*/
static char *readline()
{
int cnt;
char c;
char *lptr = linebuf;
if (!buf_stack)
return NULL;
for (cnt = 0; cnt < RIO_BUFSIZE - 2; cnt++) {
if (buf_stack->cnt <= 0) {
/* Need to read from input file */
buf_stack->cnt = read(buf_stack->fd, buf_stack->buf, RIO_BUFSIZE);
buf_stack->bufptr = buf_stack->buf;
if (buf_stack->cnt <= 0) {
/* Encountered EOF */
pop_file();
if (cnt > 0) {
/* Last line of file did not terminate with newline. */
/* Terminate line & return it */
*lptr++ = '\n';
*lptr++ = '\0';
if (echo) {
report_noreturn(1, prompt);
report_noreturn(1, linebuf);
}
return linebuf;
}
return NULL;
}
}
/* Have text in buffer */
c = *buf_stack->bufptr++;
*lptr++ = c;
buf_stack->cnt--;
if (c == '\n')
break;
}
if (c != '\n') {
/* Hit buffer limit. Artificially terminate line */
*lptr++ = '\n';
}
*lptr++ = '\0';
if (echo) {
report_noreturn(1, prompt);
report_noreturn(1, linebuf);
}
return linebuf;
}
```
- 當 `buf_stack->cnt <= 0` 時 (代表目前沒有尚未解析的指令),會呼叫 `read()` 將讀取的資料存入 `buf_stack->buf`,並重置 `buf_stack->bufptr` 及更新 `buf_stack->cnt`。而後開始解析 `buf_stack->buf` 中所讀入的資料,直到碰到 '\n',代表完成了一個指令解析 (解析完的指令存在 `linebuf` 中)。
- 當 `buf_stack->cnt = 0` 時,有下面兩種情況:
- 本次讀取的 `text size > RIO_BUFSIZE`,且 text 中都沒包含 `\n` (因此 `buf_stack->cnt` 被減為 0 了),此時由於 RIO 的 internal buffer,因此可以繼續讀取後續的檔案內容。
- 碰到 EOF,此時再呼叫 `pop_file()` 清除此 trace cmd 檔對應的 **`RIO_ELE`** 所佔用的記憶體空間。
6. 回到 `cmd_select()`,執行 `interpret_cmd()` 執行對應的指令。
7. 離開 `cmd_select()` 回到 `run_console()`。但由於 `cmd_done()` 仍為 `false`,因此又會再次進到 `cmd_select()`。但此時若仍有尚未解析完成的指令,則 `read_ready()` 為 `true`,會再次呼叫 `readline()`。不過在 `readline()` 中,由於 `buf_stack->cnt` 仍 `> 0`,因此不會再次呼叫 `read()`,而只會更新 `linebuf`,透過 `while` 迴圈,我們就可以將後續未解析完成的指令一一解析完成並執行,直到所有 `buf_stack` linked list 的 **`RIO_ELE`** 都被處理完成為止 (`cmd_done()` 回傳值為 `true`)。:
```c=567
while (!block_flag && read_ready()) {
cmdline = readline();
interpret_cmd(cmdline);
prompt_flag = true;
}
```