###### tags: `Linux`

# 2023q1 Homework7 (ktcp)

contributed by < [`shhung`](https://github.com/shhung/khttpd) >

## Experimental environment

```
shhung@shhung-linux:~$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

shhung@shhung-linux:~$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         43 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 5 2400G with Radeon Vega Graphics
    CPU family:          23
    Model:               17
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            0
    Frequency boost:     enabled
    CPU max MHz:         3600.0000
    CPU min MHz:         1600.0000
    BogoMIPS:            7186.57
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev sev_es
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):
  L1d:                   128 KiB (4 instances)
  L1i:                   256 KiB (4 instances)
  L2:                    2 MiB (4 instances)
  L3:                    4 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Mitigation; untrained return thunk; SMT vulnerable
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
```

## kecho

The advantages of CMWQ are:

1. It decouples workqueues from worker pools. Work can simply be placed on a queue without worrying about which worker will run it; the flags set on the workqueue decide how work is dispatched, and workers are switched as needed. This reduces worker idle time and raises system utilization.
2. If a task occupies system resources for a long time (or blocks), CMWQ dynamically creates a new worker thread so that the remaining work can be dispatched elsewhere (for example to another CPU), while still avoiding an excessive number of threads.
3. Different tasks can be executed more flexibly (the worker pools are shared by all workqueues), and work is executed according to its priority.

Running the [kthread-based kecho](https://github.com/OscarShiang/kecho/tree/kthread_impl) also shows, in the experimental results, the performance difference from the CMWQ-based kecho.

## Introducing CMWQ into khttpd

The original khttpd:

```shell
requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       3.346
requests/sec:  29888.897
```

After introducing CMWQ:

```shell
requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       1.848
requests/sec:  54116.156
```

Throughput rises from about 29.9k to about 54.1k requests per second, roughly 1.8 times the original.

### Approach

Following the CMWQ implementation in [kecho](https://github.com/sysprog21/kecho), the CMWQ mechanism is introduced into khttpd. The relevant workqueue functions are listed below.

```c
/* allocate a workqueue */
struct workqueue_struct *alloc_workqueue(const char *fmt,
                                         unsigned int flags,
                                         int max_active, ...);

/* queue work on a workqueue */
bool queue_work(struct workqueue_struct *wq, struct work_struct *work);

/* create a new workqueue worker; internal (static) to kernel/workqueue.c,
 * invoked by CMWQ itself rather than by module code */
struct worker *create_worker(struct worker_pool *pool);

/* safely terminate a workqueue */
void destroy_workqueue(struct workqueue_struct *wq);
```

The `work_struct` defined by CMWQ and a `list_head` for linked-list bookkeeping are packed, together with the socket, into the following per-connection structure:

```c
struct khttpd {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttpd_work;
};
```

With this structure, the original `kthread`-based approach can be converted to a CMWQ-based one with little effort.

### Code changes

#### http_server.h

```diff
+struct http_service {
+    bool is_stopped;
+    struct list_head worker;
+};
+
+struct khttpd {
+    struct socket *sock;
+    struct list_head list;
+    struct work_struct khttpd_work;
+};
```

#### main.c

In `khttpd_init`, call `alloc_workqueue()` to allocate the workqueue that will wait for work to be queued.

In `khttpd_exit`, call `destroy_workqueue()` to release the workqueue's resources.

#### http_server.c

`http_server_worker` is changed to obtain the information it needs from the `work_struct` through `container_of()`:

```diff
- static int http_server_worker(void *arg)
+ static void http_server_worker(struct work_struct *work)
```

Add `create_work`, which allocates the resources for a work item, and `free_work`, which releases the resources of all work items and closes their connections.

```c
/* Allocate a per-connection work item and register it on daemon.worker. */
static struct work_struct *create_work(struct socket *sk)
{
    struct khttpd *work;

    if (!(work = kmalloc(sizeof(struct khttpd), GFP_KERNEL)))
        return NULL;

    work->sock = sk;

    /* Bind the handler that the workqueue will invoke for this item. */
    INIT_WORK(&work->khttpd_work, http_server_worker);

    list_add(&work->list, &daemon.worker);

    return &work->khttpd_work;
}

/* Tear down every registered work item and close its connection. */
static void free_work(void)
{
    struct khttpd *l, *tar;

    /* cppcheck-suppress uninitvar */
    list_for_each_entry_safe (tar, l, &daemon.worker, list) {
        /* Shut the socket down first so a worker blocked on it can return,
         * then wait for that worker to finish before freeing. */
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        flush_work(&tar->khttpd_work);
        sock_release(tar->sock);
        kfree(tar);
    }
}
```
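
The `container_of()` step in `http_server_worker` may be easier to follow in isolation. Below is a minimal userspace sketch of that pattern, not the module's actual code: `struct work_struct` is reduced to a mock holding only a callback, an `int` stands in for `struct socket *`, and every name ending in `_mock` is hypothetical. The callback is run synchronously here, whereas a real workqueue would hand it to a worker thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Mock of struct work_struct: only the callback pointer is modelled. */
struct work_struct {
    void (*func)(struct work_struct *work);
};

/* Same embedding idea as struct khttpd; sock_fd is a hypothetical
 * replacement for the struct socket pointer. */
struct khttpd_mock {
    int sock_fd;
    struct work_struct khttpd_work;
};

/* Records which connection the handler saw, for inspection. */
static int last_served_fd = -1;

/* The handler receives only the embedded member and walks back to the
 * enclosing object, as http_server_worker does with container_of(). */
static void http_server_worker_mock(struct work_struct *work)
{
    struct khttpd_mock *worker =
        container_of(work, struct khttpd_mock, khttpd_work);
    last_served_fd = worker->sock_fd;
}

/* Allocate the enclosing object and hand back its embedded work item,
 * mirroring the shape of create_work(). */
static struct work_struct *create_work_mock(int fd)
{
    struct khttpd_mock *w = malloc(sizeof(*w));
    if (!w)
        return NULL;
    w->sock_fd = fd;
    w->khttpd_work.func = http_server_worker_mock;
    return &w->khttpd_work;
}

/* Run the callback synchronously, free the enclosing object, and return
 * the descriptor the handler observed. */
static int run_work_mock(struct work_struct *work)
{
    work->func(work);
    int fd = last_served_fd;
    free(container_of(work, struct khttpd_mock, khttpd_work));
    return fd;
}
```

Calling `run_work_mock(create_work_mock(42))` from a `main` returns 42: the handler is given nothing but the address of `khttpd_work`, yet it still reaches `sock_fd` in the enclosing structure. This is why embedding `work_struct` inside `struct khttpd` is enough to replace the `void *arg` of the `kthread` version.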