# 2020q1 Homework4 (khttpd)
contributed by <_`ire33164`_>
###### tags: `Linux Kernel`
[Assignment description](https://hackmd.io/@sysprog/linux2020-khttpd)
## Development environment
```
$ uname -a
Linux chia-GL72-6QF 4.15.0-96-generic #97-Ubuntu SMP Wed Apr 1 03:25:46 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
```
```
$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```
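## Self-check list
### Passing `port=1999` to the kernel at load time as a module initialization parameter
The initial value is passed in through `module_param`. In `main.c`, `port` is not the only such parameter: `backlog` can also be set to a given value when the module is loaded.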
```c
static ushort port = DEFAULT_PORT;
module_param(port, ushort, S_IRUGO);
static ushort backlog = DEFAULT_BACKLOG;
module_param(backlog, ushort, S_IRUGO);
```
Next, how `module_param` works.
The following is excerpted from [include/linux/moduleparam.h](https://elixir.bootlin.com/linux/v4.18/source/include/linux/moduleparam.h#L128):
```c
/**
* module_param - typesafe helper for a module/cmdline parameter
* @value: the variable to alter, and exposed parameter name.
* @type: the type of the parameter
* @perm: visibility in sysfs.
*
* @value becomes the module parameter, or (prefixed by KBUILD_MODNAME and a
* ".") the kernel commandline parameter. Note that - is changed to _, so
* the user can use "foo-bar=1" even for variable "foo_bar".
*
* @perm is 0 if the the variable is not to appear in sysfs, or 0444
* for world-readable, 0644 for root-writable, etc. Note that if it
* is writable, you may need to use kernel_param_lock() around
* accesses (esp. charp, which can be kfreed when it changes).
*
* The @type is simply pasted to refer to a param_ops_##type and a
* param_check_##type: for convenience many standard types are provided but
* you can create your own by defining those variables.
*
* Standard types are:
* byte, short, ushort, int, uint, long, ulong
* charp: a character pointer
* bool: a bool, values 0/1, y/n, Y/N.
* invbool: the above, only sense-reversed (N = true).
*/
#define module_param(name, type, perm) \
    module_param_named(name, name, type, perm)
```
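The comment above documents the three arguments of `module_param`:
* `value` (called `name` in the macro definition): the variable that becomes the module parameter, which can also be set from the command line. In this example, `$ sudo insmod khttpd.ko port=1999` initializes the parameter `port` to 1999.
* `type`: the data type of the variable. The standard types are byte, short, ushort, int, uint, long, ulong, charp, bool, and invbool.
* `perm`: the access permission of the parameter's sysfs entry. With `0` the parameter does not appear in sysfs; `0444` makes it world-readable. The example sets `perm` to `S_IRUGO`, which is defined in [linux/include/linux/stat.h ](http://lxr.linux.no/linux+v2.6.35/include/linux/stat.h#L52):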
```c
#define S_IRUSR 00400
#define S_IRGRP 00040
#define S_IROTH 00004
#define S_IRUGO (S_IRUSR|S_IRGRP|S_IROTH)
```
So `port` in this example is world-readable and has type `ushort`.
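As a quick check of the permission arithmetic, here is a minimal userspace sketch. Userspace headers do not provide `S_IRUGO`, so it is rebuilt from the POSIX bits; the helper `is_read_only_for_all` is a hypothetical name used only for illustration.

```c
#include <sys/stat.h>

/* The kernel defines S_IRUGO; userspace does not, so rebuild it here
 * from the standard POSIX permission bits. */
#ifndef S_IRUGO
#define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#endif

/* Hypothetical helper: returns 1 when user, group, and other may all
 * read and nobody may write or execute (that is, exactly 0444). */
static int is_read_only_for_all(unsigned int perm)
{
    return (perm & 0777) == S_IRUGO;
}
```

The three bits `0400`, `0040`, and `0004` OR together to `0444`, which is the world-readable value mentioned in the kernel comment.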
Expanding `module_param(port, ushort, S_IRUGO)` one step gives `module_param_named(port, port, ushort, S_IRUGO)`.
The following is excerpted from [/include/linux/moduleparam.h](https://elixir.bootlin.com/linux/v4.18/source/include/linux/moduleparam.h#L148):
```c
/**
* module_param_named - typesafe helper for a renamed module/cmdline parameter
* @name: a valid C identifier which is the parameter name.
* @value: the actual lvalue to alter.
* @type: the type of the parameter
* @perm: visibility in sysfs.
*
* Usually it's a good idea to have variable names and user-exposed names the
* same, but that's harder if the variable must be non-static or is inside a
* structure. This allows exposure under a different name.
*/
#define module_param_named(name, value, type, perm) \
    param_check_##type(name, &(value)); \
    module_param_cb(name, &param_ops_##type, &value, perm); \
    __MODULE_PARM_TYPE(name, #type)
```
`module_param_named` exposes a module/command-line parameter under a name that may differ from the variable name.
Expanding `module_param_cb`:
```c
/**
* module_param_cb - general callback for a module/cmdline parameter
* @name: a valid C identifier which is the parameter name.
* @ops: the set & get operations for this parameter.
* @perm: visibility in sysfs.
*
* The ops can have NULL set or get functions.
*/
#define module_param_cb(name, ops, arg, perm) \
    __module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0)
```
Expanding `__module_param_call` in turn:
```c
/* This is the fundamental function for registering boot/module
parameters. */
#define __module_param_call(prefix, name, ops, arg, perm, level, flags) \
    /* Default value instead of permissions? */ \
    static const char __param_str_##name[] = prefix #name; \
    static struct kernel_param __moduleparam_const __param_##name \
    __used \
    __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
    = { __param_str_##name, THIS_MODULE, ops, \
        VERIFY_OCTAL_PERMISSIONS(perm), level, flags, { arg } }
```
This macro fills in each field of a `struct kernel_param`:
```c
struct kernel_param {
    const char *name;
    struct module *mod;
    const struct kernel_param_ops *ops;
    const u16 perm;
    s8 level;
    u8 flags;
    union {
        void *arg;
        const struct kparam_string *str;
        const struct kparam_array *arr;
    };
};
```
`struct kernel_param` is thus the data structure that describes a module parameter.
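To make the macro chain easier to follow, here is a toy userspace model of the same idea. It is only a sketch: every `toy_*` name is hypothetical and much simpler than the kernel's real structures, and the default port 8081 is an assumption taken from the test runs later in this note. It shows how `##type` selects a per-type set function and how `#name` turns the variable name into the key matched against `port=1999`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for kernel_param_ops / kernel_param. */
struct toy_param_ops {
    int (*set)(const char *val, void *arg);
};

struct toy_param {
    const char *name;                /* key exposed to the user */
    const struct toy_param_ops *ops; /* per-type set operation */
    unsigned int perm;               /* kept only to mirror the kernel layout */
    void *arg;                       /* address of the variable to alter */
};

/* Plays the role of param_ops_ushort: parse and range-check a ushort. */
static int toy_set_ushort(const char *val, void *arg)
{
    char *end;
    unsigned long v = strtoul(val, &end, 10);

    if (*val == '\0' || *end != '\0' || v > 65535)
        return -1;
    *(unsigned short *) arg = (unsigned short) v;
    return 0;
}
static const struct toy_param_ops toy_param_ops_ushort = {toy_set_ushort};

/* Like module_param(): ##type picks the ops, #name becomes the key. */
#define toy_module_param(name, type, perm) \
    static const struct toy_param toy_param_##name = { \
        #name, &toy_param_ops_##type, perm, &name}

static unsigned short port = 8081; /* assumed default, see lead-in */
toy_module_param(port, ushort, 0444);

/* Apply one "key=value" argument to a registered parameter. */
static int toy_parse_arg(const struct toy_param *p, const char *arg)
{
    size_t n = strlen(p->name);

    if (strncmp(arg, p->name, n) != 0 || arg[n] != '=')
        return -1;
    return p->ops->set(arg + n + 1, p->arg);
}
```

Calling `toy_parse_arg(&toy_param_port, "port=1999")` sets `port` to 1999, while an unknown key or an out-of-range value is rejected and leaves `port` untouched.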
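### What the epoll system calls do and how the HTTP benchmarking tool works
#### epoll
According to [epoll(7)](http://man7.org/linux/man-pages/man7/epoll.7.html), epoll monitors multiple file descriptors to see whether I/O is possible on any of them. Everything goes through an `epoll instance`, an in-kernel data structure that holds an `interest list` and a `ready list`. The `interest list` contains every registered file descriptor, while the `ready list` contains the registered file descriptors that are ready for I/O, so the latter is a subset of the former.
Three system calls are provided:
* `epoll_create`: creates an `epoll instance` and returns an fd that refers to it. The `size` argument was originally a hint for how many fds the instance would monitor; since Linux 2.6.8 it is ignored but must be greater than zero.
* `epoll_ctl`: adds, modifies, or removes an fd in the interest list of `epfd`.
* `epoll_wait`: waits for events. It returns the number of ready fds, or 0 if the timeout expires before any event occurs.
#### How the HTTP benchmarking tool works
All of the code below is taken from [htstress.c](https://github.com/sysprog21/khttpd/blob/master/htstress.c):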
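The three calls can be tried out with a pipe instead of a socket. The following is a minimal userspace sketch, not code from khttpd or htstress; the function name `epoll_pipe_demo` is hypothetical. It registers the read end of a pipe, confirms that nothing is ready yet, writes one byte, and then checks that `epoll_wait` reports exactly that fd.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Returns the number of ready fds seen after one byte is written
 * (1 when everything works), or -1 on any failure. */
static int epoll_pipe_demo(void)
{
    int pipefd[2];
    struct epoll_event ev = {0}, ready[4];
    int epfd, n = -1;

    if (pipe(pipefd) < 0)
        return -1;
    epfd = epoll_create(1); /* size is ignored but must be > 0 */
    if (epfd < 0)
        goto out_pipe;

    /* Put the read end of the pipe on the interest list. */
    ev.events = EPOLLIN;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
        goto out_epoll;

    /* Registered but not readable yet: the ready list is empty. */
    if (epoll_wait(epfd, ready, 4, 0) != 0)
        goto out_epoll;

    /* Make the read end readable, which moves it onto the ready list. */
    if (write(pipefd[1], "x", 1) != 1)
        goto out_epoll;

    n = epoll_wait(epfd, ready, 4, 1000);
    if (n == 1 && ready[0].data.fd != pipefd[0])
        n = -1;
out_epoll:
    close(epfd);
out_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}
```

The zero-timeout call before the write illustrates the difference between the two lists: the fd is on the interest list the whole time, but only appears on the ready list once data is available.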
```c
start_time();
```
This records the time before the following code runs.
```c
for (int n = 0; n < num_threads - 1; ++n)
    pthread_create(&useless_thread, 0, &worker, 0);
worker(0);
```
The created threads plus the main thread add up to exactly `num_threads`, and each of them runs `worker`. Here is what `worker` does:
```c
...
int efd = epoll_create(concurrency);
...
for (int n = 0; n < concurrency; ++n)
    init_conn(efd, ecs + n);
...
for (;;) {
    do {
        nevts = epoll_wait(efd, evts, sizeof(evts) / sizeof(evts[0]), -1);
    } while (!exit_i && nevts < 0 && errno == EINTR);
    ...
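`worker` first creates one epoll instance, passing `concurrency` as the size hint. `init_conn` then opens `concurrency` socket connections and registers their fds on that instance, which amounts to issuing `concurrency` requests at once. After that, `worker` waits for I/O events on those fds and, depending on the event reported for each ready fd (`EPOLLOUT` or `EPOLLIN`), calls `send` or `recv`, that is, it sends data to or receives data from the server.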
```c
double delta =
    tve.tv_sec - tv.tv_sec + ((double) (tve.tv_usec - tv.tv_usec)) / 1e6;
```
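This computes the time the code above took.
In short, [htstress.c](https://github.com/sysprog21/khttpd/blob/master/htstress.c) uses threads to keep `num_threads` * `concurrency` requests in flight against the server. Once `max_requests` requests have been issued and all of their responses received, it computes the time elapsed between the start of sending and the arrival of the server's responses.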
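The timing arithmetic can be isolated into two small helpers. This is a sketch under my own naming (`elapsed_seconds` and `requests_per_second` do not exist in htstress.c); it only mirrors the `delta` expression quoted above and the division that produces a `requests/sec` figure.

```c
#include <sys/time.h>

/* Seconds between two gettimeofday() samples. The microsecond
 * difference may be negative; adding it to the second difference
 * still gives the right result. */
static double elapsed_seconds(const struct timeval *start,
                              const struct timeval *end)
{
    return (double) (end->tv_sec - start->tv_sec) +
           (double) (end->tv_usec - start->tv_usec) / 1e6;
}

/* Throughput for a finished run; 0 when no time has elapsed. */
static double requests_per_second(unsigned long requests, double seconds)
{
    return seconds > 0 ? (double) requests / seconds : 0.0;
}
```

For example, a start sample of 1 s + 900000 µs and an end sample of 3 s + 100000 µs give 2 − 0.8 = 1.2 seconds.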
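## Integrating the fibdrv results into kHTTPd
To serve client requests of the form `/fib/N`, the first step is to find the code that handles `http_request` and `http_response`. In `http_server.c`, `http_server_response` is the function that replies to the client. The default response macros look like this: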
```c
#define HTTP_RESPONSE_200_DUMMY \
"" \
"HTTP/1.1 200 OK" CRLF "Server: " KBUILD_MODNAME CRLF \
"Content-Type: text/plain" CRLF "Content-Length: 12" CRLF \
"Connection: Close" CRLF CRLF "Hello World!" CRLF
#define HTTP_RESPONSE_200_KEEPALIVE_DUMMY \
"" \
"HTTP/1.1 200 OK" CRLF "Server: " KBUILD_MODNAME CRLF \
"Content-Type: text/plain" CRLF "Content-Length: 12" CRLF \
"Connection: Keep-Alive" CRLF CRLF "Hello World!" CRLF
```
Both macros hard-code the body "Hello World!" and the length "12", so I turned those two parts into format specifiers:
```c
#define HTTP_RESPONSE_200_DUMMY \
"" \
"HTTP/1.1 200 OK" CRLF "Server: " KBUILD_MODNAME CRLF \
"Content-Type: text/plain" CRLF "Content-Length: %d" CRLF \
"Connection: Close" CRLF CRLF "%s" CRLF
#define HTTP_RESPONSE_200_KEEPALIVE_DUMMY \
"" \
"HTTP/1.1 200 OK" CRLF "Server: " KBUILD_MODNAME CRLF \
"Content-Type: text/plain" CRLF "Content-Length: %d" CRLF \
"Connection: Keep-Alive" CRLF CRLF "%s" CRLF
```
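As a userspace illustration of filling in the two specifiers, the sketch below formats a response whose `Content-Length` is derived from the body. It is not the kernel code: `RESPONSE_200_FMT` and `build_response` are hypothetical names, `"demo"` stands in for `KBUILD_MODNAME`, and `malloc` replaces `kmalloc`. It also uses `snprintf(NULL, 0, ...)` to obtain the exact buffer size rather than estimating it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CRLF "\r\n"
/* Hypothetical userspace copy of the format string. */
#define RESPONSE_200_FMT \
    "HTTP/1.1 200 OK" CRLF "Server: demo" CRLF \
    "Content-Type: text/plain" CRLF "Content-Length: %d" CRLF \
    "Connection: Close" CRLF CRLF "%s" CRLF

/* Returns a malloc()ed response for body, or NULL on failure.
 * The caller is responsible for free()ing the result. */
static char *build_response(const char *body)
{
    int body_len = (int) strlen(body);
    /* With a NULL buffer and size 0, snprintf only reports the length. */
    int need = snprintf(NULL, 0, RESPONSE_200_FMT, body_len, body);
    char *buf;

    if (need < 0)
        return NULL;
    buf = malloc((size_t) need + 1);
    if (!buf)
        return NULL;
    snprintf(buf, (size_t) need + 1, RESPONSE_200_FMT, body_len, body);
    return buf;
}
```

With the body `"55"`, the header carries `Content-Length: 2` and the message ends with a blank line followed by `55` and a final CRLF.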
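What remains is to substitute the Fibonacci result into the two variable parts, so the next step is the Fibonacci computation itself. The assignment requires big-number arithmetic, so I imported my earlier [bignum_operation.[ch]](https://github.com/ire33164/khttpd/blob/master/bignum_operation.h) and rewrote `fib_eval()` so that `http_server.c` can call it.
Next, `http_server_response` is rewritten to handle `/fib/N` requests. The request URL is parsed in `http_get_parse_url`: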
```c
static char *http_get_parse_url(struct http_request *request, int keep_alive)
{
    const char *delim = "/";
    char *token, *cur = request->request_url;
    const char *response = keep_alive ? HTTP_RESPONSE_200_KEEPALIVE_DUMMY
                                      : HTTP_RESPONSE_200_DUMMY;
    /* format minus "%d" and "%s" (4 bytes), plus the body and a NUL;
     * assumes MAX_DIGIT also covers "Hello World!" and the length digits */
    int len = MAX_DIGIT + strlen(response) - 4 + 1;
    char *kbuf = kmalloc(len, GFP_KERNEL);

    if (!kbuf)
        return NULL;
    cur++; /* skip the leading '/' */
    token = strsep(&cur, delim);
    if (token && strcmp(token, "fib") == 0) {
        /* evaluate fib(N); strsep() returns NULL when no N follows */
        long N;
        char fib_result[MAX_DIGIT];
        char *num = strsep(&cur, delim);

        if (num && kstrtol(num, 10, &N) == 0) {
            bignum fib_val = fib_eval(N);

            bignum2str(&fib_val, fib_result);
            snprintf(kbuf, len, response, (int) strlen(fib_result),
                     fib_result);
            return kbuf;
        }
    }
    /* fall back to the default body for any other URL */
    snprintf(kbuf, len, response, 12, "Hello World!");
    return kbuf;
}
```
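`strsep` splits the request URL on `/`. If the component after the first `/` is `fib`, the component after the next `/` should be the number N, so the URL is split once more and `kstrtol` converts N from a string to a `long`. `fib_eval(N)` then computes fibonacci(N), and `snprintf` substitutes the result into the HTTP response format. Any request URL that does not start with `/fib` gets "Hello World!" in reply.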
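The splitting logic can be exercised on its own in userspace. The sketch below is an assumption-laden stand-in, not the module code: `parse_fib_url` is a hypothetical name, `strtol` replaces `kstrtol`, and negative indices are rejected as a design choice of this example. It checks each step so that inputs such as `/fib` (no N) or `/fib/abc` are rejected instead of dereferencing a NULL token.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* strsep() is a BSD/glibc extension */
#endif
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Parses "/fib/N" in place. Returns 0 and stores N on success,
 * -1 otherwise. Like strsep(), this modifies the url buffer. */
static int parse_fib_url(char *url, long *n)
{
    char *cur = url, *token, *num, *end;

    if (!url || url[0] != '/')
        return -1;
    cur++;                     /* skip the leading '/' */
    token = strsep(&cur, "/"); /* first path component */
    if (!token || strcmp(token, "fib") != 0)
        return -1;
    num = strsep(&cur, "/");   /* NULL when nothing follows "fib" */
    if (!num || *num == '\0')
        return -1;
    errno = 0;
    *n = strtol(num, &end, 10); /* userspace stand-in for kstrtol() */
    if (errno != 0 || *end != '\0' || *n < 0)
        return -1;
    return 0;
}
```

The buffer passed in must be writable, because `strsep` overwrites each delimiter with a NUL byte.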
Finally, `http_server_response` is rewritten:
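```c
static int http_server_response(struct http_request *request, int keep_alive)
{
    char *response;

    pr_info("requested_url = %s\n", request->request_url);
    if (request->method != HTTP_GET) {
        /* string literal, must not be freed */
        response = keep_alive ? HTTP_RESPONSE_501_KEEPALIVE : HTTP_RESPONSE_501;
        http_server_send(request->socket, response, strlen(response));
        return 0;
    }
    response = http_get_parse_url(request, keep_alive);
    if (!response)
        return -ENOMEM;
    http_server_send(request->socket, response, strlen(response));
    kfree(response); /* buffer was kmalloc()ed by http_get_parse_url() */
    return 0;
}
```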
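### The recv error: -104 problem
Unlike requests sent to the server from a browser, requests sent with `wget` make the kernel log show `recv error: -104`. Error 104 is `ECONNRESET` (connection reset by peer).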
:::danger
Unfinished
:::
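While this part is still open, decoding the number may help narrow things down. Kernel functions report failures as negative errno values, so the sign has to be flipped before looking the code up. A minimal userspace sketch, where `describe_kernel_error` is a hypothetical helper and the numeric value 104 is the one used on x86-64 Linux:

```c
#include <errno.h>
#include <string.h>

/* Kernel-style return values are negative errno codes; flip the sign
 * and let strerror() produce the human-readable description. */
static const char *describe_kernel_error(int ret)
{
    return ret < 0 ? strerror(-ret) : "no error";
}
```

On glibc, `describe_kernel_error(-104)` yields the `ECONNRESET` description, which means the peer closed the connection abruptly.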
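## Verifying the Fibonacci computation
Here I use the Python implementation provided by [黃鈺盛](https://www.facebook.com/wilson.d.huang?comment_id=Y29tbWVudDozNDUxMjQ3Mjk3NTUwNjZfMzQ1NTcwNjE2Mzc3MTQ0) in the [Facebook discussion thread](https://www.facebook.com/groups/system.software2020/permalink/345124729755066/). It fetches the value of fibonacci(N) from `http://www.protocol5.com/Fibonacci/{number}.htm`, runs the fibonacci executable, compares the two results, and records the computation time.
I modified the verification program: the executable path is replaced with `http://localhost:PORT/fib/N`, running the executable is replaced with sending a request to that URL, and the recorded time spans from sending the HTTP request to receiving the khttpd response.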
```python
FIB_URL = "http://localhost:" + args.port + "/fib/" + args.index
```
```python
...
res = requests.get(url)
if res.status_code == requests.codes.ok:
    duration = time.time() - start_time
    print(f"fibonacci({index}) = {res.text}")
    print(f"time cost {duration} (s)")
    expect = fetch_fib_number(args.index)[base]
    if res.text == expect:
        print(Fore.GREEN + "-----------------------Pass----------------------------")
    else:
        print(Fore.RED + "------------------------No pass--------------------------")
    exit(0)
else:
    print(Fore.RED + "-----------------------Fail-----------------------")
    exit(1)
...
```
Then add a `verify` target to the Makefile:
```
verify:
	python3 ./fib_verify.py ${PORT} ${N}
```
After that, verification only takes:
```
$ make verify
python3 ./fib_verify.py 8081 10
fibonacci(10) = 55
time cost 0.0032465457916259766 (s)
-----------------------Pass----------------------------
```
## Concurrency Managed Workqueue
After introducing cmwq, following the approach used in [kecho](https://github.com/sysprog21/kecho):
```
$ ./htstress -n 100000 -c 4 -t 4 http://localhost:8081/
...
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socker errors: 0 [0%]
seconds: 2.082
requests/sec: 48038.931
```
Compared with the previous version:
```
$ ./htstress -n 100000 -c 4 -t 4 http://localhost:8081/
....
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socker errors: 0 [0%]
seconds: 5.186
requests/sec: 19282.472
```
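`requests/sec` rose from about 19282 to about 48039, roughly 2.5 times the previous value (an increase of about 1.5 times).
* Requesting Fibonacci values instead
Old version: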
```
$ ./htstress -n 100000 -c 4 -t 4 http://localhost:8081/fib/136
...
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socker errors: 0 [0%]
seconds: 6.231
requests/sec: 16047.714
```
New version:
```
$ ./htstress -n 100000 -c 4 -t 4 http://localhost:8081/fib/13
...
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socker errors: 0 [0%]
seconds: 2.282
requests/sec: 43827.125
```
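The gap is even larger here: `requests/sec` rose from about 16048 to about 43827, roughly 2.7 times. Note that the two runs above request different indices (`/fib/136` and `/fib/13`), so the comparison is not strictly like-for-like.
## References
1. [Facebook discussion thread](https://www.facebook.com/groups/system.software2020/permalink/345124729755066/)
2. [fibdrv assignment description](https://hackmd.io/@sysprog/linux2020-fibdrv)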
3. [epoll(7)](http://man7.org/linux/man-pages/man7/epoll.7.html)
4. [ip(7)](http://man7.org/linux/man-pages/man7/ip.7.htm)
5. [EPOLLIN和EPOLLOUT究竟什麼時候觸發?](https://blog.csdn.net/hintonic/article/details/16882989)
6. [ERRNO(3)](http://man7.org/linux/man-pages/man3/errno.3.html)
7. [The method to epoll’s madness](https://medium.com/@copyconstruct/the-method-to-epolls-madness-d9d2d6378642)
8. [Concurrency Managed Workqueue之(二):CMWQ概述](http://www.wowotech.net/irq_subsystem/cmwq-intro.html)