# 2023q1 Homework7 (ktcp)
contributed by < `ItisCaleb` >
## Development environment
```bash
$ gcc --version
gcc (GCC) 12.2.1 20230201
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core(TM) i5-12400
CPU family: 6
Model: 151
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 2
CPU(s) scaling MHz: 14%
CPU max MHz: 5600.0000
CPU min MHz: 800.0000
BogoMIPS: 4993.00
```
## CMWQ
[CMWQ document](https://www.kernel.org/doc/html/v4.15/core-api/workqueue.html)
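The original workqueue implementation provided two kinds of workqueue (wq): multi-threaded (MT) and single-threaded (ST).
An MT wq had to maintain as many worker threads as there are CPUs, so as the number of CPUs and wqs grew, the system could easily run out of the default 32K PID space.
An ST wq, by contrast, had only one worker thread.
Both shared the same problem: every wq maintained its own worker threads, so an MT wq could execute only one work item at a time on each CPU, and an ST wq only one at a time across the whole system.
Besides wasting resources on the many idle threads, this design also gave a poor level of concurrency.
CMWQ was designed to solve these problems while still providing an API compatible with the original workqueue API.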
> Concurrency Managed Workqueue (cmwq) is a reimplementation of wq with focus on the following goals.
>* Maintain compatibility with the original workqueue API.
>* Use per-CPU unified worker pools shared by all wq to provide flexible level of concurrency on demand without wasting a lot of resource.
>* Automatically regulate worker pool and level of concurrency so that the API users don’t need to worry about such details.
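
The PID pressure described above is simple arithmetic, and the following is a minimal sketch of it. The helper names (`mt_worker_threads`, `st_worker_threads`, `max_mt_workqueues`) are hypothetical and do not come from the kernel; the model also assumes that worker threads are the only PID consumers, which is not true on a real system.

```c
/* Worker threads needed when every MT wq keeps one worker per CPU. */
static long mt_worker_threads(long nr_wq, long nr_cpus)
{
    return nr_wq * nr_cpus;
}

/* Worker threads needed when every ST wq keeps exactly one worker. */
static long st_worker_threads(long nr_wq)
{
    return nr_wq;
}

/*
 * Upper bound on the number of MT wqs whose workers still fit in a PID
 * space of pid_max entries (assumption: nothing else consumes PIDs).
 */
static long max_mt_workqueues(long pid_max, long nr_cpus)
{
    return pid_max / nr_cpus;
}
```

On the 12 logical CPUs listed in the environment section, 100 MT wqs would already need 1200 worker threads under this model, whereas 100 ST wqs would need 100.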
### Introducing CMWQ
[commit ce98dde: Introduce CMWQ to rewrite worker](https://github.com/ItisCaleb/khttpd/commit/ce98dde88fd0edcb0b43abd91fa5ada82d635a63)
First, add two new structs.
`http_service` stores the daemon's stopped state and the head of the work-item list.
`khttp` serves as the CMWQ work item; the embedded list node lets every work item be freed after the daemon stops.
```c
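struct http_service {
    bool is_stopped;         /* set once the daemon has stopped */
    struct list_head worker; /* head of the list of work items */
};
struct khttp {
    struct socket *sock;           /* connection served by this work item */
    struct list_head list;         /* node linked into http_service.worker */
    struct work_struct khttp_work; /* work item queued on the workqueue */
};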
```
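
To illustrate why the list node is embedded in the work item, here is a minimal userspace model of the bookkeeping. It is a sketch under stated assumptions, not the code from the commit: `struct list_head` and `container_of` are simplified stand-ins for the kernel versions in `<linux/list.h>`, the socket is replaced by a plain integer, and `model_service`, `model_item`, `track_item`, and `release_all` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *prev, *next;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

struct model_service {
    bool is_stopped;
    struct list_head worker; /* head of the list of tracked items */
};

struct model_item {
    int sock_fd;           /* stands in for struct socket * */
    struct list_head list; /* embedded node, as in struct khttp */
};

static void model_list_init(struct list_head *head)
{
    head->prev = head->next = head;
}

/* Allocate an item and link it at the tail of the service list. */
static struct model_item *track_item(struct model_service *svc, int fd)
{
    struct model_item *item = malloc(sizeof(*item));
    if (!item)
        return NULL;
    item->sock_fd = fd;
    item->list.prev = svc->worker.prev;
    item->list.next = &svc->worker;
    svc->worker.prev->next = &item->list;
    svc->worker.prev = &item->list;
    return item;
}

/* Free every tracked item; returns how many were released. */
static int release_all(struct model_service *svc)
{
    int freed = 0;
    struct list_head *cur = svc->worker.next;
    while (cur != &svc->worker) {
        struct list_head *next = cur->next; /* save before freeing */
        free(container_of(cur, struct model_item, list));
        freed++;
        cur = next;
    }
    model_list_init(&svc->worker);
    return freed;
}
```

Because the node lives inside the item, the service needs only the list head to reach every outstanding item at shutdown, without a separate array or extra allocation. Kernel code would typically use `list_add_tail` and `list_for_each_entry_safe` for the same traversal.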
Next, add a workqueue in `main.c`:
```c
struct workqueue_struct *khttp_wq;
```
Then call `alloc_workqueue` in the module's init function and `destroy_workqueue` in its exit function:
```c
khttp_wq = alloc_workqueue(MODULE_NAME, WQ_UNBOUND, 0);
```
```c
destroy_workqueue(khttp_wq);
```
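In `http_server.c`, the daemon no longer spawns a worker kthread per connection; it creates a work item and queues it on the workqueue instead.
When the daemon stops, it must also free all work items and their corresponding sockets.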
```diff
 while (!kthread_should_stop()) {
     int err = kernel_accept(param->listen_socket, &socket, 0);
     if (err < 0) {
         if (signal_pending(current))
             break;
         pr_err("kernel_accept() error: %d\n", err);
         continue;
     }
-    worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
-    if (IS_ERR(worker)) {
-        pr_err("can't create more worker process\n");
+    if (unlikely(!(work = create_work(socket)))) {
+        printk(KERN_ERR MODULE_NAME
+               ": create work error, connection closed\n");
+        kernel_sock_shutdown(socket, SHUT_RDWR);
+        sock_release(socket);
         continue;
     }
+    queue_work(khttp_wq, work);
 }
+daemon.is_stopped = true;
+free_work();
```
Throughput measured with htstress:
| type | requests/sec |
| -------- | -------- |
| original | 76858.575 |
| CMWQ | 176263.634 |
As the table shows, introducing CMWQ alone more than doubles throughput (roughly 2.3 times the original).
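
The comparison can be checked with a small helper. This is only an illustrative sketch; `speedup` is a hypothetical name, and the inputs are the two requests/sec figures from the table.

```c
/* Ratio of the new throughput to the baseline throughput. */
static double speedup(double baseline_rps, double new_rps)
{
    return new_rps / baseline_rps;
}
```

Calling `speedup(76858.575, 176263.634)` gives a ratio between 2.29 and 2.30.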