# 2023q1 Homework7 (ktcp)

contributed by < `ItisCaleb` >

## Development environment

```bash
$ gcc --version
gcc (GCC) 12.2.1 20230201

$ lscpu
Architecture:            x86_64
CPU op-mode(s):          32-bit, 64-bit
Address sizes:           39 bits physical, 48 bits virtual
Byte Order:              Little Endian
CPU(s):                  12
On-line CPU(s) list:     0-11
Vendor ID:               GenuineIntel
Model name:              12th Gen Intel(R) Core(TM) i5-12400
CPU family:              6
Model:                   151
Thread(s) per core:      2
Core(s) per socket:      6
Socket(s):               1
Stepping:                2
CPU(s) scaling MHz:      14%
CPU max MHz:             5600.0000
CPU min MHz:             800.0000
BogoMIPS:                4993.00
```

## CMWQ

[CMWQ document](https://www.kernel.org/doc/html/v4.15/core-api/workqueue.html)

The original workqueue implementation offered two kinds of workqueue (wq): multi-threaded (MT) and single-threaded (ST).

An MT wq has to maintain one worker thread per CPU. As the number of CPUs and wqs grows, this can easily exhaust the default 32K PID space. An ST wq has only a single worker thread for the whole system.

Both share the same problem: every wq maintains its own worker threads. As a result, an MT wq can execute only one work item at a time on each CPU, and an ST wq only one at a time in the entire system. Besides wasting resources on a large number of idle threads, the level of concurrency is also unsatisfactory.

CMWQ was designed to solve these problems while keeping an API compatible with the original workqueue API.

> Concurrency Managed Workqueue (cmwq) is a reimplementation of wq with focus on the following goals.
> * Maintain compatibility with the original workqueue API.
> * Use per-CPU unified worker pools shared by all wq to provide flexible level of concurrency on demand without wasting a lot of resource.
> * Automatically regulate worker pool and level of concurrency so that the API users don’t need to worry about such details.
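To make the PID pressure of the legacy MT design concrete, here is a small arithmetic sketch. It assumes the simple model described above (one dedicated worker thread per wq per CPU) and ignores every other process on the system. The function names are hypothetical and exist only for illustration.

```c
/* Worker threads required by the legacy multi-threaded (MT) design:
 * every workqueue keeps one dedicated worker thread on every CPU. */
long legacy_mt_threads(long num_workqueues, long num_cpus)
{
    return num_workqueues * num_cpus;
}

/* Upper bound on the number of MT workqueues that fit under a PID limit,
 * assuming nothing else on the system consumes PIDs. */
long max_mt_workqueues(long pid_limit, long num_cpus)
{
    return pid_limit / num_cpus;
}
```

On the 12-CPU machine listed above, 100 MT workqueues would already need 1200 worker threads, and a 32768-entry PID space would be fully consumed by about 2730 workqueues under this model.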
### Introducing CMWQ

[commit ce98dde: Introduce CMWQ to rewrite worker](https://github.com/ItisCaleb/khttpd/commit/ce98dde88fd0edcb0b43abd91fa5ada82d635a63)

First, add two new structs. `http_service` stores whether the daemon has stopped, together with the head of the work-item list. `khttp` serves as the CMWQ work item; it embeds a list node so that every work item can be freed after the daemon stops.

```c
struct http_service {
    bool is_stopped;
    struct list_head worker;
};

struct khttp {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttp_work;
};
```

Next, add a workqueue in `main.c`:

```c
struct workqueue_struct *khttp_wq;
```

Then call `alloc_workqueue` in the module's init path and `destroy_workqueue` in its exit path:

```c
khttp_wq = alloc_workqueue(MODULE_NAME, WQ_UNBOUND, 0);
```

```c
destroy_workqueue(khttp_wq);
```

In `http_server.c`, replace the original per-connection kernel thread with a work item that is queued on the workqueue. When the daemon stops, all work items and their corresponding sockets must be released as well.

```diff
     while (!kthread_should_stop()) {
         int err = kernel_accept(param->listen_socket, &socket, 0);
         if (err < 0) {
             if (signal_pending(current))
                 break;
             pr_err("kernel_accept() error: %d\n", err);
             continue;
         }
-        worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
-        if (IS_ERR(worker)) {
-            pr_err("can't create more worker process\n");
+        if (unlikely(!(work = create_work(socket)))) {
+            printk(KERN_ERR MODULE_NAME
+                   ": create work error, connection closed\n");
+            kernel_sock_shutdown(socket, SHUT_RDWR);
+            sock_release(socket);
             continue;
         }
+        queue_work(khttp_wq, work);
     }
+daemon.is_stopped = true;
+free_work();
```

Throughput measured with htstress:

| type     | requests/sec |
| -------- | ------------ |
| original | 76858.575    |
| CMWQ     | 176263.634   |

Introducing CMWQ alone raises throughput to roughly 2.3 times the original (176263.634 / 76858.575 ≈ 2.29).
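The diff above calls `create_work` and `free_work` without showing their bodies. The following is a user-space model of the bookkeeping pattern they rely on, not the code from the commit: each work item embeds a list node, is linked into a tracking list on creation, and is released in a single pass at shutdown. Everything here is an illustrative assumption: `struct list_node` stands in for the kernel's `struct list_head`, `struct work_item` for `struct khttp`, an integer `sock_id` replaces the socket, `malloc`/`free` replace the kernel allocator, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct list_head (circular doubly linked list). */
struct list_node {
    struct list_node *prev, *next;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* User-space model of struct khttp; the socket is reduced to an int id. */
struct work_item {
    int sock_id;
    struct list_node list;
};

/* An empty list is a head that points to itself. */
void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

/* Insert node right after head. */
void list_add_front(struct list_node *node, struct list_node *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/* Allocate a work item and link it into the tracking list.
 * Returns NULL if allocation fails, so the caller can close the connection. */
struct work_item *create_work_item(int sock_id, struct list_node *head)
{
    struct work_item *w = malloc(sizeof(*w));
    if (!w)
        return NULL;
    w->sock_id = sock_id;
    list_add_front(&w->list, head);
    return w;
}

/* Free every tracked item and reset the list; returns the number released. */
int free_all_work(struct list_node *head)
{
    int freed = 0;
    struct list_node *cur = head->next;
    while (cur != head) {
        struct list_node *next = cur->next; /* save before freeing cur */
        free(container_of(cur, struct work_item, list));
        cur = next;
        freed++;
    }
    list_init(head);
    return freed;
}
```

Embedding the node inside the item means tracking needs no separate allocation. In a kernel version of this pattern, the release loop would presumably also have to shut down each socket and wait for its pending work to finish before freeing the item.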