
# 2020q1 Homework4 (khttpd)
contributed by < `Yu-Wei-Chang` >

> [2020q1 homework: khttpd](https://hackmd.io/@sysprog/linux2020-khttpd)

## Experiment environment
```shell
$ uname -a
Linux ywc-ThinkPad-X220 5.3.0-46-generic #38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
```

## How does `insmod` pass command-line parameters to the kernel?
### Adding parameters to a kernel module
* According to [LINUX KERNEL DEVELOPMENT – KERNEL MODULE PARAMETERS](https://devarea.com/linux-kernel-development-kernel-module-parameters/#.Xvw6K3UzZ8c), a kernel module can declare a global variable and register it with the `module_param()` macro, so that its value can be set when the module is loaded, for example:
  ```shell
  $ sudo insmod khttpd.ko port=8082
  ```
* In `main.c` of the `khttpd` source, the global variable `port` defaults to 8081. The `module_param()` macro registers it as a module parameter of type `ushort` with read-only permission, so `port` can be given a value when the module is loaded.
  ```cpp
  static ushort port = DEFAULT_PORT;
  module_param(port, ushort, S_IRUGO);
  ```
* The parameters of `khttpd` appear under `sysfs`, each represented as a file. Reading `port` returns 8082, the value we specified at load time, and the file permissions are indeed read-only.
  ```shell
  /sys/module/khttpd/parameters$ ls -l
  total 0
  -r--r--r-- 1 root root 4096 Jul  1 17:28 backlog
  -r--r--r-- 1 root root 4096 Jul  1 17:28 port
  /sys/module/khttpd/parameters$ cat port
  8082
  ```
* The `module_param()` macro can specify a parameter's type, ==but not its valid range. To validate the input value, use the `module_param_cb()` macro together with `struct kernel_param_ops` to define custom set/get functions for the module parameter.==
  ```cpp
  /**
   * module_param_cb - general callback for a module/cmdline parameter
   * @name: a valid C identifier which is the parameter name.
   * @ops: the set & get operations for this parameter.
   * @arg: args for @ops
   * @perm: visibility in sysfs.
   *
   * The ops can have NULL set or get functions.
   */
  #define module_param_cb(name, ops, arg, perm) \
      __module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0)
  ...
  struct kernel_param_ops {
      /* How the ops should behave */
      unsigned int flags;
      /* Returns 0, or -errno. arg is in kp->arg. */
      int (*set)(const char *val, const struct kernel_param *kp);
      /* Returns length written or -errno. Buffer is 4k (ie. be short!)
       */
      int (*get)(char *buffer, const struct kernel_param *kp);
      /* Optional function to free kp->arg when module unloaded. */
      void (*free)(void *arg);
  };
  ```

### How are module parameters passed to the kernel?
* `strace` shows that when the module is loaded, the second argument of the `finit_module()` call is the parameter string we specified.
  ```shell
  $ sudo strace insmod khttpd.ko port=8082
  stat("/home/ywc/training_data/linux_kernel/Week6_homework/khttpd/khttpd.ko", {st_mode=S_IFREG|0644, st_size=54432, ...}) = 0
  openat(AT_FDCWD, "/home/ywc/training_data/linux_kernel/Week6_homework/khttpd/khttpd.ko", O_RDONLY|O_CLOEXEC) = 3
  fstat(3, {st_mode=S_IFREG|0644, st_size=54432, ...}) = 0
  mmap(NULL, 54432, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4222b0f000
  finit_module(3, "port=8082", 0) = 0
  ```
* According to [Linux kernel module loading mechanism](https://hackmd.io/@sysprog/linux2020-fibdrv#-Linux-%E6%A0%B8%E5%BF%83%E6%A8%A1%E7%B5%84%E6%8E%9B%E8%BC%89%E6%A9%9F%E5%88%B6), `finit_module()` goes on to call `load_module()` during module loading. `load_module()` in turn calls `strndup_user()`, presumably to copy the parameter string given on the command line from user space into kernel space. (==WIP...==)
  ```cpp
  /* Allocate and load the module: note that size of section 0 is always
     zero, and we rely on this for optional sections. */
  static int load_module(struct load_info *info, const char __user *uargs,
                         int flags)
  {
      ...
      /* Now copy in args */
      mod->args = strndup_user(uargs, ~0UL >> 1);
      if (IS_ERR(mod->args)) {
          err = PTR_ERR(mod->args);
          goto free_arch_cleanup;
      }
      ...
  ```
* Next it calls `parse_args()`, presumably to process the module parameters that were just copied into the kernel. (==WIP...==)
  ```cpp
      /* Module is ready to execute: parsing args may do that.
       */
      after_dashes = parse_args(mod->name, mod->args, mod->kp, mod->num_kp,
                                -32768, 32767, mod,
                                unknown_module_param_cb);
  ```

### Comparison with the user-space socket API (`sys/socket.h`)
* Creating a new socket endpoint: with the user-space socket API, `socket()` returns the new socket as a file descriptor. With the kernel socket API, the caller declares a `struct socket *` and passes its address as the fourth argument of `sock_create()`.
* Binding a socket to an address: the user-space `bind()` takes a socket descriptor, whereas the kernel's `kernel_bind()` takes a `struct socket *`.
* Turning the socket into a passive listening socket: same difference as above.
* Accepting a client connection:
    * One difference is the same as above: user space uses a socket descriptor, while the kernel always uses a `struct socket *`.
    * After accepting a connection, the user-space `accept()` returns a new socket descriptor for exchanging data with the client. The kernel's `kernel_accept()` returns an error code instead; the caller declares the new socket and passes it in as an argument.
    * The user-space `accept()` takes a `struct sockaddr` argument, from which the peer's IP address and port number can be read once someone connects. `kernel_accept()` has no such argument; how to obtain the peer's information in the kernel is ==still to be investigated==.

### How kHttpd works
* After the module is loaded, it creates a kernel thread (`http_server_daemon`) that waits for client connections on the specified port (8081 by default).
* Once `kernel_accept()` accepts a client connection, it yields another `struct socket`, which the web server uses to receive the HTTP request. As with the user-space socket API (`sys/socket.h`), one socket listens for connections and a separate socket carries data once a connection is established. The accepted socket still uses the same local port as the listening socket; connections are told apart by the client's address and port.
* When a client connects, another kernel thread (`http_server_worker`) is created. It reads the HTTP request with `kernel_recvmsg()` and parses it with `http_parser_execute()` to extract the `URI`. The `on_message_complete` callback is then invoked, and `http_server_response()` sends the HTTP response back to the client. Finally, `kernel_sock_shutdown()` shuts down the socket.
* Contents of an HTTP request: as explained in [CS:APP chapter 11](https://hackmd.io/@sysprog/CSAPP-ch11?type=view), an HTTP request consists of a `method`, a `URI`, a `version`, and `HTTP header` lines. In the example below, `GET` is the method, the `URI` is the part of the URL after the domain name (here `/`, because no path was entered), `HTTP/1.1` is the version, and everything else is HTTP headers.
  ```shell
  GET / HTTP/1.1
  Host: localhost:8081
  Connection: keep-alive
  Cache-Control: max-age=0
  Upgrade-Insecure-Requests: 1
  User-Agent: Mozilla/5.0 (X11; Linux x86_64)
  AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed
  ...
  ```

### Integrating [fibdrv](https://hackmd.io/@sysprog/linux2020-fibdrv) into kHttpd
#### Parsing the URI path of the HTTP request
* The plan is to check, when khttpd sends its response (`http_server_response()`), whether the URI path matches the form `/fib/<number>`, and if so to compute the Fibonacci number.
* The parsing uses `strstr()` to find the string `/fib/` in `request_url`, then `kstrtoll()` to convert the rest of the string into a number. This tells us which Fibonacci number the user wants computed.
  ```cpp
  static int http_server_response(struct http_request *request, int keep_alive)
  {
      char *response;
      char *match = strstr(request->request_url, FIB_URL_PATH);

      pr_info("requested_url = %s\n", request->request_url);
      if (match) {
          long long fib_seq_idx;

          match += strlen(FIB_URL_PATH);
          if (kstrtoll(match, 10, &fib_seq_idx) == 0) {
              /* fib_seq_idx now holds the index of the Fibonacci number the user wants. */
          }
      }
  ```

#### Integrating fibdrv
* The earlier [fibdrv homework](https://hackmd.io/@Yu-Wei-Chang/SkYVm4xtL) represents big numbers with a `BigN` structure. Besides the Fibonacci function itself, the `BigN` addition, multiplication, and subtraction functions must therefore be ported as well, because the fast-doubling computation uses them.
* That homework passed the `BigN` structure to user space and left the conversion to a string to the application, so the string-conversion function must also be ported into `kHttpd`.
* See the [commit](https://github.com/Yu-Wei-Chang/khttpd/commit/44e7e2a48dd7ce8055bd96281b0fa76c58b6900f) for details.

### Packing the Fibonacci number into the HTTP response
* Expected HTTP response contents for now:
    * On a `GET` method:
        * If the URL path has the form `/fib/<number>`, the string `Fibonacci(<number>) = <Fibonacci sequence>` is packed into the response.
        * Otherwise the behavior is unchanged, and the reply is `HTTP_RESPONSE_200_DUMMY`.
    * On any other method: the behavior is unchanged, and the reply is `501 Not Implemented`.
* A local character array holds the HTTP response, `snprintf()` formats the number into a string, and `http_server_send()` sends the message.
* See the [commit](https://github.com/Yu-Wei-Chang/khttpd/commit/0c82a88d8dceff941c05cde27b7fb7fdccd6d2f4) for details.

### Rewriting kHttpd with [CMWQ](https://www.kernel.org/doc/html/v4.15/core-api/workqueue.html)
* The original kHttpd calls `kthread_run()` to create a new kthread for each incoming connection to handle its HTTP request. The plan is to introduce the cmwq mechanism and create a work item for each HTTP request instead.
* Following
[kecho](https://github.com/sysprog21/kecho)'s implementation, cmwq was introduced into kHttpd; the change is in this [commit](https://github.com/Yu-Wei-Chang/khttpd/commit/7f3cb760a4768441322d732e9c6d2ca8df8d9ab9).
* Performance comparison: with cmwq, throughput rose from about 9,430 to about 12,723 requests/sec (roughly 35% higher) in the two runs below.
    * Original approach, creating a `kthread` for each HTTP request:
      ```shell
      $ ./htstress -n 100000 -c 100 -t 4 http://localhost:8081/fib/100
      0 requests
      ...
      90000 requests

      requests:      100000
      good requests: 100000 [100%]
      bad requests:  0 [0%]
      socker errors: 0 [0%]
      seconds:       10.605
      requests/sec:  9429.879
      ```
    * Improved approach, creating a `work item` and handling each HTTP request through a `workqueue`:
      ```shell
      $ ./htstress -n 100000 -c 100 -t 4 http://localhost:8081/fib/100
      0 requests
      ...
      90000 requests

      requests:      100000
      good requests: 100000 [100%]
      bad requests:  0 [0%]
      socker errors: 0 [0%]
      seconds:       7.860
      requests/sec:  12723.386
      ```
* The reference [Linux workqueue explained](https://iter01.com/427752.html) ==shows that a workqueue is itself a wrapper around kernel threads==. Even so, ==it is not entirely clear to me why cmwq is more efficient== than handling HTTP requests with kernel threads directly. My guesses at the reasons:
    * Creating and destroying kernel threads costs resources. Spawning a kthread for every incoming HTTP request and releasing it once the request is handled is wasteful.
    * The kernel creates two worker-pools for each CPU, one at normal priority and one at high priority, each with its own kthreads. In addition there are unbound worker-pools, not tied to any CPU, which also have kthreads. (In the listing below, names starting with `u` belong to unbound worker-pools, the others belong to per-CPU worker-pools, and an `H` in the name indicates high priority.)
      ```shell
      $ ps -ef | grep "kworker"
      root         5     2  0 09:39 ?        00:00:00 [kworker/0:0-eve]
      root         6     2  0 09:39 ?        00:00:00 [kworker/0:0H-kb]
      root        13     2  0 09:39 ?        00:00:00 [kworker/0:1-eve]
      root        20     2  0 09:39 ?        00:00:00 [kworker/1:0H-kb]
      root        32     2  0 09:39 ?        00:00:00 [kworker/3:0H-kb]
      root       149     2  0 09:39 ?        00:00:00 [kworker/u17:0-r]
      ...
      ```
    * The flags passed to `alloc_workqueue()` select which worker-pool is used. The workqueue then reuses the kthreads the kernel has already created to handle HTTP requests, sparing us the cost of creating and destroying threads ourselves.

###### tags: `Linux核心課程筆記 - Homework`
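
As a side note on the `/fib/<number>` parsing step in `http_server_response()`, here is a minimal user-space sketch of a stricter variant that can be tried outside the kernel. It is not the code in the commits above: `parse_fib_index()` is a hypothetical helper, `strtoll()` stands in for the kernel's `kstrtoll()`, and two choices are my own assumptions, namely that the prefix must sit at the very start of the URL (rather than anywhere, as with `strstr()`) and that negative indexes are rejected.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FIB_URL_PATH "/fib/"

/*
 * Hypothetical helper (not part of khttpd): extract <number> from a URL
 * of the form /fib/<number>.
 * Returns 0 and stores the index in *idx on success, -1 otherwise.
 */
static int parse_fib_index(const char *url, long long *idx)
{
    const size_t prefix_len = strlen(FIB_URL_PATH);
    char *end;
    long long value;

    /* Assumption: the prefix must be at the start of the URL. */
    if (strncmp(url, FIB_URL_PATH, prefix_len) != 0)
        return -1;
    url += prefix_len;

    /* strtoll() would skip whitespace and accept a sign; demand a digit. */
    if (*url < '0' || *url > '9')
        return -1;

    errno = 0;
    value = strtoll(url, &end, 10);
    /* Reject overflow and any trailing characters after the digits. */
    if (errno == ERANGE || *end != '\0')
        return -1;

    *idx = value;
    return 0;
}
```

With this sketch, `parse_fib_index("/fib/100", &n)` succeeds and stores 100 in `n`, while inputs such as `/fib/`, `/fib/-3`, `/fib/12abc`, or `/foo/fib/7` are rejected. Whether khttpd should be this strict is a design choice; the original `strstr()` match is more permissive.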