
2023q1 Homework7 (ktcp)

contributed by < ItisCaleb >

Development environment

$ gcc --version
gcc (GCC) 12.2.1 20230201

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            12th Gen Intel(R) Core(TM) i5-12400
    CPU family:          6
    Model:               151
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           1
    Stepping:            2
    CPU(s) scaling MHz:  14%
    CPU max MHz:         5600.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4993.00

CMWQ

CMWQ document
The original workqueue implementation had two kinds of workqueue (wq): MT (multi-threaded) wq and ST (single-threaded) wq.

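With an MT wq, every wq has to maintain as many worker threads as there are CPU cores. As the number of cores and wqs grows, this can easily exhaust the default 32K PID space.
An ST wq, in contrast, has only one worker thread.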

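The problem shared by both is that every wq maintains its own worker threads, so an MT wq can execute only one work item at a time on each core, while an ST wq can execute only one at a time across the whole system.
Besides wasting resources on many idle threads, the resulting level of concurrency is also unsatisfactory.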
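
To put a rough number on the PID pressure described above, here is a small user-space sketch. It assumes the pre-CMWQ model stated in the text (one worker thread per CPU for each MT wq, one thread for each ST wq). The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Worker threads kept alive by the pre-CMWQ design:
 * one per CPU for every MT wq, plus one for every ST wq. */
long legacy_worker_threads(long mt_wqs, long st_wqs, long ncpus)
{
    assert(mt_wqs >= 0 && st_wqs >= 0 && ncpus > 0);
    return mt_wqs * ncpus + st_wqs;
}

/* How many MT wqs it takes for their workers alone to fill a PID
 * space of the given size (every other process is ignored). */
long mt_wqs_to_fill_pids(long pid_space, long ncpus)
{
    assert(pid_space >= 0 && ncpus > 0);
    return pid_space / ncpus;
}
```

On the 12-CPU machine listed in the development environment, `mt_wqs_to_fill_pids(32768, 12)` evaluates to 2730, and the bound shrinks as the core count grows.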

CMWQ was designed to solve these problems while still providing an API compatible with the original workqueue API.

Concurrency Managed Workqueue (cmwq) is a reimplementation of wq with focus on the following goals.

  • Maintain compatibility with the original workqueue API.
  • Use per-CPU unified worker pools shared by all wq to provide flexible level of concurrency on demand without wasting a lot of resource.
  • Automatically regulate worker pool and level of concurrency so that the API users don’t need to worry about such details.

Introducing CMWQ

commit ce98dde: Introduce CMWQ to rewrite worker
First, add two new structs.
http_service stores the daemon's stopped state and the head of the list.
khttp serves as the CMWQ work item. The list node is embedded so that all work items can be released after the daemon stops.

struct http_service {
    bool is_stopped;
    struct list_head worker;
};

struct khttp {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttp_work;
};
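
A workqueue handler receives only a pointer to the `work_struct`, so the enclosing `struct khttp` has to be recovered from that embedded member. The following is a user-space sketch of that step. The stand-in type definitions and the helper name `work_to_khttp` are assumptions made for illustration; in the kernel the same idea is provided by the `container_of()` macro.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types (assumptions, not the
 * real kernel definitions). */
struct list_head { struct list_head *next, *prev; };
struct work_struct { void (*func)(struct work_struct *); };
struct socket { int fd; };

struct khttp {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttp_work;
};

/* Same idea as the kernel's container_of(): step back from a member
 * pointer to the start of the structure that contains it. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical helper: map the work_struct handed to a handler back
 * to the khttp item that embeds it. */
struct khttp *work_to_khttp(struct work_struct *w)
{
    assert(w != NULL);
    return container_of(w, struct khttp, khttp_work);
}
```

Embedding the `work_struct` this way means no separate lookup table is needed to find the socket that belongs to a queued work item.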

Next, add a workqueue in main.c

struct workqueue_struct *khttp_wq;

Then call alloc_workqueue in init and destroy_workqueue in exit

khttp_wq = alloc_workqueue(MODULE_NAME, WQ_UNBOUND, 0);
destroy_workqueue(khttp_wq);

In http_server.c, the original worker is changed to create a work item and push it onto the workqueue.
When the daemon stops, all work items and their corresponding sockets must also be released.

while (!kthread_should_stop()) {
    int err = kernel_accept(param->listen_socket, &socket, 0);
    if (err < 0) {
        if (signal_pending(current))
            break;
        pr_err("kernel_accept() error: %d\n", err);
        continue;
    }
-   worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
-   if (IS_ERR(worker)) {
-      pr_err("can't create more worker process\n");
+   if (unlikely(!(work = create_work(socket)))) {
+      printk(KERN_ERR MODULE_NAME
+          ": create work error, connection closed\n");
+      kernel_sock_shutdown(socket, SHUT_RDWR);
+      sock_release(socket);
       continue;
    }
+   queue_work(khttp_wq, work);
}
+daemon.is_stopped = true;
+free_work();
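
The bodies of create_work() and free_work() are not shown here, so the following is only a user-space simulation of the bookkeeping they imply: every created item is linked onto the service's list, and teardown walks that list and frees each item. All names ending in `_sim` are hypothetical, and an `int` stands in for the socket. The real module would presumably use the kernel's list helpers and release each socket with kernel_sock_shutdown() and sock_release() before freeing the item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct list_head {
    struct list_head *next, *prev;
};

struct khttp_sim {
    int sock_fd;              /* stands in for struct socket * */
    struct list_head list;    /* links the item into service_sim.worker */
};

struct service_sim {
    int is_stopped;
    struct list_head worker;  /* head of the list of live work items */
};

void service_init_sim(struct service_sim *svc)
{
    assert(svc != NULL);
    svc->is_stopped = 0;
    svc->worker.next = svc->worker.prev = &svc->worker;
}

/* Allocate an item and append it to the tail of the service's list. */
struct khttp_sim *create_work_sim(struct service_sim *svc, int sock_fd)
{
    struct khttp_sim *item = malloc(sizeof(*item));
    if (!item)
        return NULL;
    item->sock_fd = sock_fd;
    item->list.prev = svc->worker.prev;
    item->list.next = &svc->worker;
    svc->worker.prev->next = &item->list;
    svc->worker.prev = &item->list;
    return item;
}

/* Free every item on the list and return how many were freed. */
int free_work_sim(struct service_sim *svc)
{
    int freed = 0;
    struct list_head *pos = svc->worker.next;
    while (pos != &svc->worker) {
        struct list_head *next = pos->next;  /* save before freeing */
        struct khttp_sim *item = (struct khttp_sim *)
            ((char *) pos - offsetof(struct khttp_sim, list));
        free(item);
        freed++;
        pos = next;
    }
    svc->worker.next = svc->worker.prev = &svc->worker;
    return freed;
}
```

Saving `next` before freeing the current node is the important detail: without it the walk would read from memory that has already been released.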

Measuring performance with htstress

| type | requests/sec |
| --- | --- |
| original | 76858.575 |
| CMWQ | 176263.634 |

Introducing CMWQ alone more than doubled the throughput, to about 2.3 times the original requests/sec.