###### tags: `Linux`
# 2023q1 Homework7 (ktcp)
contributed by < [`shhung`](https://github.com/shhung/khttpd) >
## Experimental environment
```
shhung@shhung-linux:~$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
shhung@shhung-linux:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 2400G with Radeon Vega Graphics
CPU family: 23
Model: 17
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU max MHz: 3600.0000
CPU min MHz: 1600.0000
BogoMIPS: 7186.57
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc r
ep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor
ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp
_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit w
dt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ss
bd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni x
saveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nr
ip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold a
vic v_vmsave_vmload vgif overflow_recov succor smca sev sev_es
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 256 KiB (4 instances)
L2: 2 MiB (4 instances)
L3: 4 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Mitigation; untrained return thunk; SMT vulnerable
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIB
RS Not affected
Srbds: Not affected
Tsx async abort: Not affected
```
## kecho
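CMWQ (Concurrency Managed Workqueue) offers the following advantages:
1. Workqueues are decoupled from worker pools. Work items are simply placed on a queue without caring which worker will run them; workers are assigned according to the flags the workqueue was created with and switched when appropriate, which reduces worker idle time and raises system utilization.
2. When a work item occupies system resources for a long time (or blocks), CMWQ dynamically brings in another worker thread so that the remaining work keeps making progress, instead of keeping a large number of threads around in advance.
3. Work from different sources is executed more flexibly: the worker pools are shared by all workqueues, and work is served according to its priority.

Running the [kthread-based kecho](https://github.com/OscarShiang/kecho/tree/kthread_impl) also shows a measurable performance difference from the CMWQ-based kecho.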
## Introducing CMWQ into khttpd
Original khttpd:
```shell
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socket errors: 0 [0%]
seconds: 3.346
requests/sec: 29888.897
```
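Result after introducing CMWQ (throughput rises from about 29.9k to about 54.1k requests/sec, roughly 1.8×):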
```shell
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socket errors: 0 [0%]
seconds: 1.848
requests/sec: 54116.156
```
### Approach
Following the CMWQ implementation in [kecho](https://github.com/sysprog21/kecho), the same mechanism is brought into khttpd. The APIs involved are listed below:
```c
struct workqueue_struct *alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...); // allocate a workqueue
bool queue_work(struct workqueue_struct *wq, struct work_struct *work); // queue work on a workqueue
struct worker *create_worker(struct worker_pool *pool); // create a new workqueue worker (internal to kernel/workqueue.c; CMWQ calls it, modules do not)
void destroy_workqueue(struct workqueue_struct *wq); // safely terminate a workqueue
```
Each connection's worker is packed into the following structure, which combines the `work_struct` defined by CMWQ with a `list_head` node for the linked list:
```c
struct khttpd {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttpd_work;
};
```
With this structure in place, the original `kthread`-based implementation can be converted to CMWQ with only small changes.
### Code changes
#### http_server.h
```diff
+struct http_service {
+    bool is_stopped;
+    struct list_head worker;
+};
+
+struct khttpd {
+    struct socket *sock;
+    struct list_head list;
+    struct work_struct khttpd_work;
+};
```
#### main.c
In `khttpd_init`, call `alloc_workqueue()` to allocate the workqueue that work items will later be queued on.
In `khttpd_exit`, call `destroy_workqueue()` to release the workqueue's resources.
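
A minimal sketch of what these two changes might look like. The global name `khttpd_wq`, the workqueue name string, and the `WQ_UNBOUND` flag are assumptions made for illustration and are not taken from the repository; the existing socket and daemon setup is elided as comments. The fragment only builds as part of a kernel module, not as a userspace program.

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

/* Assumed name for the module-wide workqueue handle. */
struct workqueue_struct *khttpd_wq;

static int __init khttpd_init(void)
{
    /* ... existing listening-socket setup goes here ... */

    /*
     * Assumption: WQ_UNBOUND, so workers are not pinned to the queuing CPU.
     * max_active = 0 selects the kernel's default concurrency limit.
     */
    khttpd_wq = alloc_workqueue("khttpd", WQ_UNBOUND, 0);
    if (!khttpd_wq)
        return -ENOMEM;

    /* ... start the accepting daemon thread here ... */
    return 0;
}

static void __exit khttpd_exit(void)
{
    /* ... stop the daemon and close the listening socket first ... */

    /* Drains any remaining work items, then frees the workqueue. */
    destroy_workqueue(khttpd_wq);
}

module_init(khttpd_init);
module_exit(khttpd_exit);
MODULE_LICENSE("GPL");
```

Stopping the daemon before `destroy_workqueue()` matters in this sketch: once the workqueue is being destroyed, nothing should still be queuing work on it.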
#### http_server.c
`http_server_worker` now obtains the information it needs from the `work_struct` through `container_of()`:
```diff
- static int http_server_worker(void *arg)
+ static void http_server_worker(struct work_struct *work)
```
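
The new signature hands the worker only a pointer to the embedded `work_struct`, so the enclosing `struct khttpd` (and its socket) has to be recovered from that member pointer. The userspace sketch below illustrates the idea with a simplified `container_of` macro and hypothetical stand-in types (`struct work_item`, `struct conn`, `conn_worker`, `run_work`); none of these names come from khttpd, and the kernel's real macro adds type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct work_item {
    void (*func)(struct work_item *work);
};

/* Hypothetical stand-in for struct khttpd. */
struct conn {
    int sock_fd;           /* plays the role of `struct socket *sock` */
    struct work_item work; /* embedded member, like `khttpd_work` */
};

static int last_served_fd = -1;

/* Same shape as a CMWQ handler: it receives only the embedded member. */
static void conn_worker(struct work_item *work)
{
    /* Step back from the member to the structure that contains it. */
    struct conn *c = container_of(work, struct conn, work);
    last_served_fd = c->sock_fd;
}

/* Mimics the workqueue side: invoke the stored handler on the member. */
static int run_work(struct work_item *work)
{
    assert(work->func != NULL);
    work->func(work);
    return last_served_fd;
}
```

For example, with `struct conn c = {.sock_fd = 42, .work = {.func = conn_worker}};`, calling `run_work(&c.work)` lets `conn_worker` reach `c.sock_fd` even though it was only given `&c.work`.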
Add `create_work` to allocate the resources for a work item, and `free_work` to release every work item and close its connection:
```c
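/* Allocate a khttpd entry for an accepted socket and register it. */
static struct work_struct *create_work(struct socket *sk)
{
    struct khttpd *work;

    if (!(work = kmalloc(sizeof(struct khttpd), GFP_KERNEL)))
        return NULL;

    work->sock = sk;

    /* Bind the work item to the handler that serves this connection. */
    INIT_WORK(&work->khttpd_work, http_server_worker);

    /* Track the entry so free_work() can find it at shutdown. */
    list_add(&work->list, &daemon.worker);

    return &work->khttpd_work;
}

/* Shut down every tracked connection and free its entry. */
static void free_work(void)
{
    struct khttpd *l, *tar;

    /* cppcheck-suppress uninitvar */
    list_for_each_entry_safe (tar, l, &daemon.worker, list) {
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        /* Wait for the handler to finish before releasing its socket. */
        flush_work(&tar->khttpd_work);
        sock_release(tar->sock);
        kfree(tar);
    }
}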
```