# ktcp
contributed by < `chiacyu` >
## Notes on [CMWQ](https://www.kernel.org/doc/html/latest/core-api/workqueue.html)
The documentation points out several problems with the original workqueue implementation:
- the original workqueue could not migrate work items between different CPU cores
- the original multi-threaded workqueue had to keep one worker per CPU core, which could waste resources
- work items had to compete with one another for workers, which could add latency
CMWQ aims to stay compatible with the original API while making several changes:
- worker pools are shared by all workqueues
- the number of workers in a pool is kept at a base level, so idle workers do not tie up system resources
- when a work item occupies the CPU for too long, the scheduler steps in so that another work item can be serviced
Benchmarking with the `bench` tool that ships with `kecho`, the result for `user-echo-server.c` is:

`kecho` gives the following result:

Besides moving the work into kernel space, CMWQ also brings a quite significant performance improvement.
---
## CPU scheduler and workqueue/CMWQ
---
## Introducing CMWQ into ktcp
First, look at the performance before `cmwq` is introduced:
```c
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socket errors: 0 [0%]
seconds: 2.232
requests/sec: 44807.565
Complete
```
### Introducing cmwq
A few data structures are needed for the later steps:
```c
struct khttp {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttp_work;
};
```
The `khttp` structure records:
- `sock`: the socket of the accepted connection
- `list`: the node that links this connection into the daemon's linked list
- `khttp_work`: the work item submitted to the `workqueue` for this connection
```c
struct khttp_server_service {
    bool is_stopped;
    struct list_head worker;
};
```
The `khttp_server_service` structure records:
- `is_stopped`: whether the server has been asked to stop
- `worker`: the head node of the linked list of workers
```c
static struct work_struct *create_work(struct socket *sk)
{
    struct khttp *work;

    if (!(work = kmalloc(sizeof(struct khttp), GFP_KERNEL)))
        return NULL;
    work->sock = sk;
    INIT_WORK(&work->khttp_work, http_server_worker);
    list_add(&work->list, &daemon.worker);
    return &work->khttp_work;
}
```
`create_work()` is called whenever a new client connects: it allocates a `struct khttp`, initializes its work item with `INIT_WORK` (note that this creates a work item, not a kernel thread), and hangs the node behind the `daemon.worker` head with `list_add()`.
- [INIT_WORK](https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/rcutorture/formal/srcu-cbmc/src/workqueues.h#L76)
```c
static void free_work(void)
{
    struct khttp *l, *tar;
    /* cppcheck-suppress uninitvar */
    list_for_each_entry_safe (tar, l, &daemon.worker, list) {
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        flush_work(&tar->khttp_work);
        sock_release(tar->sock);
        kfree(tar);
    }
}
```
`free_work()` releases all resources by walking every `struct khttp` node with `list_for_each_entry_safe`:
- `kernel_sock_shutdown` shuts down the socket the work item is serving
- `flush_work` waits until the pending `work_struct` has finished executing
- `sock_release` releases the already shut down socket
Next, look at how each client connection is handled:
```c
static void http_server_worker(struct work_struct *work)
{
    struct khttp *worker = container_of(work, struct khttp, khttp_work);
    char *buf;
    struct http_parser parser;
    struct http_parser_settings setting = {
        .on_message_begin = http_parser_callback_message_begin,
        .on_url = http_parser_callback_request_url,
        .on_header_field = http_parser_callback_header_field,
        .on_header_value = http_parser_callback_header_value,
        .on_headers_complete = http_parser_callback_headers_complete,
        .on_body = http_parser_callback_body,
        .on_message_complete = http_parser_callback_message_complete};
    struct http_request request;
    struct socket *socket = worker->sock;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);

    buf = kzalloc(RECV_BUFFER_SIZE, GFP_KERNEL);
    if (!buf) {
        pr_err("can't allocate memory!\n");
        kernel_sock_shutdown(socket, SHUT_RDWR);
        sock_release(socket);
        return;
    }

    request.socket = socket;
    http_parser_init(&parser, HTTP_REQUEST);
    parser.data = &request;
    while (!daemon.is_stopped) {
        int ret = http_server_recv(socket, buf, RECV_BUFFER_SIZE - 1);
        if (ret <= 0) {
            if (ret)
                pr_err("recv error: %d\n", ret);
            break;
        }
        http_parser_execute(&parser, &setting, buf, ret);
        if (request.complete && !http_should_keep_alive(&parser))
            break;
        memset(buf, 0, RECV_BUFFER_SIZE);
    }
    kernel_sock_shutdown(socket, SHUT_RDWR);
    sock_release(socket);
    kfree(buf);
}
```
`http_server_worker()` handles each client connection:
- `container_of(work, struct khttp, khttp_work)` recovers the enclosing `struct khttp` from the embedded work item
- `struct socket *socket = worker->sock` fetches the socket this client is connected on
- the loop checks `daemon.is_stopped`; once the service stops (or the connection ends), the socket is shut down and `buf` is freed
```c
int http_server_daemon(void *arg)
{
    struct socket *socket;
    struct work_struct *work;
    struct http_server_param *param = (struct http_server_param *) arg;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);
    INIT_LIST_HEAD(&daemon.worker);
    while (!kthread_should_stop()) {
        int err = kernel_accept(param->listen_socket, &socket, 0);
        if (err < 0) {
            if (signal_pending(current))
                break;
            pr_err("kernel_accept() error: %d\n", err);
            continue;
        }
        if (unlikely(!(work = create_work(socket)))) {
            printk(KERN_ERR "khttp : create work error, connection closed\n");
            kernel_sock_shutdown(socket, SHUT_RDWR);
            sock_release(socket);
            continue;
        }
        /* start server worker */
        queue_work(khttp_wq, work);
    }
    printk("khttp : daemon shutdown in progress...\n");
    daemon.is_stopped = true;
    free_work();
    return 0;
}
```
`http_server_daemon()` runs the server daemon:
- `INIT_LIST_HEAD` initializes the `daemon.worker` head node, so later clients can be added to the list with `list_add()`
- `kthread_should_stop()` checks whether the thread has been asked to stop; while it has not, `kernel_accept` waits for a new connection
- once a new `socket` is accepted, `create_work` builds the work item that will serve the connection
- finally, `queue_work()` queues the work item onto the workqueue
Lastly, the server has to be registered as a Linux kernel module:
```c
static int __init khttpd_init(void)
{
    int err = open_listen_socket(port, backlog, &listen_socket);
    if (err < 0) {
        pr_err("can't open listen socket\n");
        return err;
    }
    param.listen_socket = listen_socket;
    khttp_wq = alloc_workqueue("khttp_wq", WQ_UNBOUND, 0);
    http_server = kthread_run(http_server_daemon, &param, KBUILD_MODNAME);
    if (IS_ERR(http_server)) {
        pr_err("can't start http server daemon\n");
        close_listen_socket(listen_socket);
        return PTR_ERR(http_server);
    }
    return 0;
}
```
In `khttpd_init()`:
- `open_listen_socket()` opens the listening `socket` that clients connect through
- `alloc_workqueue()` creates the `workqueue` the work items will run on
The performance after the modification looks like this:
```c
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socket errors: 0 [0%]
seconds: 1.377
requests/sec: 72631.558
Complete
```
---
## Introducing RCU to manage clients
For background on RCU, see [Linux 核心設計: RCU 同步機制](https://hackmd.io/@sysprog/linux-rcu) and [What is RCU, Fundamentally?](https://lwn.net/Articles/262464/). RCU fits workloads that are read-mostly, rarely written, and still demand data consistency, so as a first step RCU is introduced to manage the linked list of clients.
In `create_work()`, `list_add_rcu()` adds the new client:
```c
static struct work_struct *create_work(struct socket *sk)
{
    struct khttp *work;

    if (!(work = kmalloc(sizeof(struct khttp), GFP_KERNEL)))
        return NULL;
    work->sock = sk;
    INIT_WORK(&work->khttp_work, http_server_worker);
    list_add_rcu(&work->list, &daemon.worker);
    return &work->khttp_work;
}
```
In `free_work()`, `list_for_each_entry_rcu()` walks the list and releases each node:
```c
static void free_work(void)
{
    struct khttp *l, *tar;
    /* cppcheck-suppress uninitvar */
    rcu_read_lock();
    list_for_each_entry_rcu (tar, &daemon.worker, list) {
        /* NOTE: flush_work() may sleep and kfree() frees the node
         * immediately; neither is legal inside an RCU read-side
         * critical section, so this usage is suspect. */
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        flush_work(&tar->khttp_work);
        sock_release(tar->sock);
        kfree(tar);
    }
    rcu_read_unlock();
}
```
The result, however, is not what was expected, so tools such as `ftrace` are needed for analysis:
```c
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
socket errors: 0 [0%]
seconds: 1.668
requests/sec: 59951.823
Complete
```
## Tracing execution with ftrace
`ftrace` is the tracing facility provided by the Linux kernel; see [Debugging the kernel using Ftrace - part 1](https://lwn.net/Articles/365835/), [Debugging the kernel using Ftrace - part 2](https://lwn.net/Articles/366796/), and chapter 6 of "Demystifying the Linux CPU Scheduler" for details.
First check whether the running kernel provides `ftrace`:
```c
cat /boot/config-`uname -r` | grep CONFIG_HAVE_FUNCTION_TRACER
```
If the following appears, `ftrace` is available in this kernel build:
```
CONFIG_HAVE_FUNCTION_TRACER=y
```
Then go to `/sys/kernel/debug/tracing` and list its contents:
```c
root@chiacyu-msi:/sys/kernel/debug/tracing# ls
available_events max_graph_depth stack_max_size
available_filter_functions options stack_trace
available_tracers per_cpu stack_trace_filter
buffer_percent printk_formats synthetic_events
buffer_size_kb README timestamp_mode
buffer_total_size_kb saved_cmdlines trace
current_tracer saved_cmdlines_size trace_clock
dynamic_events saved_tgids trace_marker
dyn_ftrace_total_info set_event trace_marker_raw
enabled_functions set_event_notrace_pid trace_options
error_log set_event_pid trace_pipe
events set_ftrace_filter trace_stat
free_buffer set_ftrace_notrace tracing_cpumask
function_profile_enabled set_ftrace_notrace_pid tracing_max_latency
hwlat_detector set_ftrace_pid tracing_on
instances set_graph_function tracing_thresh
kprobe_events set_graph_notrace uprobe_events
kprobe_profile snapshot uprobe_profile
```
`ftrace` is driven by writing values into these files with `echo`. Start by looking at `available_filter_functions`, which records the functions `ftrace` can currently trace.
The `khttp.ko` module must first be registered into the kernel (e.g. with `insmod`); after that its functions show up:
```c
root@chiacyu-msi:/sys/kernel/debug/tracing# cat available_filter_functions | grep khttp
parse_url_char.part.0 [khttpd]
http_message_needs_eof [khttpd]
http_should_keep_alive [khttpd]
http_parser_execute [khttpd]
http_method_str [khttpd]
http_status_str [khttpd]
http_parser_init [khttpd]
http_parser_settings_init [khttpd]
http_errno_name [khttpd]
http_errno_description [khttpd]
http_parser_url_init [khttpd]
http_parser_parse_url [khttpd]
http_parser_pause [khttpd]
http_body_is_final [khttpd]
http_parser_version [khttpd]
http_parser_set_max_header_size [khttpd]
http_parser_callback_header_field [khttpd]
http_parser_callback_headers_complete [khttpd]
http_parser_callback_request_url [khttpd]
http_parser_callback_message_begin [khttpd]
http_parser_callback_body [khttpd]
http_server_recv.constprop.0 [khttpd]
http_server_worker [khttpd]
http_parser_callback_header_value [khttpd]
http_server_daemon [khttpd]
http_server_send.isra.0 [khttpd]
http_parser_callback_message_complete [khttpd]
```
A shell script can set up `ftrace`:
- `max_graph_depth` sets how deep into the call chain to measure
- `current_tracer` selects the tracer to use, here `function_graph`
- `set_graph_function` selects the function to observe, here `http_server_worker`
```c
#!/bin/bash
TRACE_DIR=/sys/kernel/debug/tracing
echo > $TRACE_DIR/set_ftrace_filter
echo > $TRACE_DIR/current_tracer
echo nop > $TRACE_DIR/current_tracer
echo function_graph > $TRACE_DIR/current_tracer
# depth of the function calls
echo 1 > $TRACE_DIR/max_graph_depth
echo http_server_worker > $TRACE_DIR/set_graph_function
echo 1 > $TRACE_DIR/tracing_on
./htstress -n 100 -c 1 -t 4 http://localhost:8081/
echo 0 > $TRACE_DIR/tracing_on
```
After the run, inspect the contents of `trace`:
```c
root@chiacyu-msi:/sys/kernel/debug/tracing# cat trace | head -20
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
10) | http_server_worker [khttpd]() {
10) | kernel_sigaction() {
10) 0.140 us | _raw_spin_lock_irq();
10) 0.110 us | _raw_spin_unlock_irq();
10) 1.130 us | }
10) | kernel_sigaction() {
10) 0.110 us | _raw_spin_lock_irq();
10) 0.120 us | _raw_spin_unlock_irq();
10) 0.620 us | }
10) | kmem_cache_alloc_trace() {
10) 0.110 us | __cond_resched();
10) 0.100 us | should_failslab();
10) 1.190 us | }
10) 0.130 us | http_parser_init [khttpd]();
10) | http_server_recv.constprop.0 [khttpd]() {
10) | kernel_recvmsg() {
```
Next, increase `max_graph_depth` and look at the result again:
```c
root@chiacyu-msi:/sys/kernel/debug/tracing# cat trace | head -300
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
5) | http_server_worker [khttpd]() {
5) | kernel_sigaction() {
5) 0.220 us | _raw_spin_lock_irq();
5) 0.150 us | _raw_spin_unlock_irq();
5) 0.951 us | }
5) | kernel_sigaction() {
5) 0.100 us | _raw_spin_lock_irq();
5) 0.170 us | _raw_spin_unlock_irq();
5) 0.540 us | }
5) | kmem_cache_alloc_trace() {
5) 0.090 us | __cond_resched();
5) 0.090 us | should_failslab();
5) 0.990 us | }
5) 0.100 us | http_parser_init [khttpd]();
5) | http_server_recv.constprop.0 [khttpd]() {
5) | kernel_recvmsg() {
5) | sock_recvmsg() {
5) | security_socket_recvmsg() {
5) 0.700 us | apparmor_socket_recvmsg();
5) 0.890 us | }
5) | inet_recvmsg() {
5) 1.550 us | tcp_recvmsg();
5) 1.821 us | }
5) 3.081 us | }
5) 3.251 us | }
5) 3.441 us | }
5) | kernel_sock_shutdown() {
5) | inet_shutdown() {
5) | lock_sock_nested() {
5) 0.090 us | __cond_resched();
5) 0.100 us | _raw_spin_lock_bh();
5) | _raw_spin_unlock_bh() {
5) 0.100 us | __local_bh_enable_ip();
5) 0.260 us | }
5) 0.770 us | }
5) | tcp_shutdown() {
5) | tcp_set_state() {
5) 0.100 us | inet_sk_state_store();
5) 0.290 us | }
5) | tcp_send_fin() {
5) 1.840 us | __alloc_skb();
5) 0.100 us | sk_forced_mem_schedule();
5) 0.400 us | tcp_current_mss();
5) + 43.448 us | __tcp_push_pending_frames();
5) + 46.338 us | }
5) + 46.968 us | }
5) | sock_def_wakeup() {
5) 0.100 us | __rcu_read_lock();
5) 0.100 us | __rcu_read_unlock();
5) 0.490 us | }
5) | release_sock() {
5) 0.090 us | _raw_spin_lock_bh();
5) | __release_sock() {
5) 0.130 us | _raw_spin_unlock_bh();
5) 4.471 us | tcp_v4_do_rcv();
5) 0.090 us | __cond_resched();
5) 0.090 us | _raw_spin_lock_bh();
5) 5.271 us | }
5) 0.100 us | tcp_release_cb();
5) | _raw_spin_unlock_bh() {
5) 0.090 us | __local_bh_enable_ip();
5) 0.260 us | }
5) 6.171 us | }
5) + 54.929 us | }
5) + 55.199 us | }
5) | sock_release() {
5) | inet_release() {
5) 0.110 us | ip_mc_drop_socket();
5) | tcp_close() {
5) | lock_sock_nested() {
5) 0.080 us | __cond_resched();
5) 0.090 us | _raw_spin_lock_bh();
5) 0.130 us | _raw_spin_unlock_bh();
5) 0.670 us | }
5) | __tcp_close() {
5) 0.150 us | __sk_mem_reclaim();
5) 0.090 us | _raw_write_lock_bh();
5) 0.120 us | _raw_write_unlock_bh();
5) 0.100 us | _raw_spin_lock();
5) 0.100 us | __release_sock();
5) 1.141 us | inet_csk_destroy_sock();
5) 0.090 us | _raw_spin_unlock();
5) 0.090 us | __local_bh_enable_ip();
5) 2.761 us | }
5) | release_sock() {
5) 0.090 us | _raw_spin_lock_bh();
5) 0.120 us | tcp_release_cb();
5) 0.130 us | _raw_spin_unlock_bh();
5) 0.740 us | }
5) | sk_free() {
5) 1.270 us | __sk_free();
5) 1.470 us | }
5) 6.561 us | }
5) 7.011 us | }
5) 0.110 us | module_put();
5) | iput() {
5) 0.080 us | _raw_spin_lock();
5) 0.110 us | _raw_spin_unlock();
5) | evict() {
5) | inode_wait_for_writeback() {
5) 0.120 us | _raw_spin_lock();
5) 0.171 us | __inode_wait_for_writeback();
5) 0.100 us | _raw_spin_unlock();
5) 0.741 us | }
5) | truncate_inode_pages_final() {
5) 0.100 us | truncate_inode_pages_range();
5) 0.300 us | }
5) | clear_inode() {
5) 0.090 us | _raw_spin_lock_irq();
5) 0.090 us | _raw_spin_unlock_irq();
5) 0.450 us | }
5) 0.090 us | _raw_spin_lock();
5) 0.180 us | wake_up_bit();
5) 0.090 us | _raw_spin_unlock();
5) | destroy_inode() {
5) 1.000 us | __destroy_inode();
5) 0.140 us | call_rcu();
5) 1.480 us | }
5) 4.121 us | }
5) 4.981 us | }
5) + 12.492 us | }
5) 0.280 us | kfree();
5) + 76.673 us | }
```
`__tcp_push_pending_frames` takes by far the longest time here.
## Checking for `Keep-Alive` support
Before testing, it helps to understand the HTTP request format; see [HTTP Messages](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages) for details. An HTTP request line has three parts:
- Method: the kind of request, e.g. `GET`, `POST`
- Request target: the location of the requested resource, usually a `URL`
- HTTP version: the version of HTTP in use
With `khttp` loaded, run `telnet localhost 8081` and send `GET / HTTP/1.0` and `GET / HTTP/1.1` respectively:
```c
(base) chiacyu@chiacyu-msi:~$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0
HTTP/1.1 200 OK
Server: khttpd
Content-Type: text/plain
Content-Length: 12
Connection: Close
Hello World!
Connection closed by foreign host.
```
```c
(base) chiacyu@chiacyu-msi:~$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
Server: khttpd
Content-Type: text/plain
Content-Length: 12
Connection: Keep-Alive
Hello World!
```
The output shows that `khttp` does currently support `Keep-Alive`.
## Using a `timer` to close timed-out connections
`khttp` currently has no `timer` mechanism to break idle connections; the implementation in [sehttpd](https://github.com/sysprog21/sehttpd) serves as a reference.
`sehttpd` manages all connections with a `priority queue` backed by a `min heap`; `prio_queue_min()` returns the connection whose `deadline` is closest.
```c
static inline void *prio_queue_min(prio_queue_t *ptr)
{
    return prio_queue_is_empty(ptr) ? NULL : ptr->priv[1];
}
```
`sehttpd` originally updates its notion of time with `gettimeofday()`, converting the result to milliseconds:
```c
static void time_update()
{
    struct timeval tv;
    int rc UNUSED = gettimeofday(&tv, NULL);
    assert(rc == 0 && "time_update: gettimeofday error");
    current_msec = tv.tv_sec * 1000 + tv.tv_usec / 1000;
}
```
Unfortunately `gettimeofday()` is not available in kernel space, so the current time has to be obtained differently. Here [ktime_get_real()](https://docs.kernel.org/core-api/timekeeping.html) is used, and `ktime_to_ms()` converts the result to milliseconds.
```c
static void time_update(void)
{
    ktime_t kt = ktime_get_real();
    current_msec = ktime_to_ms(kt);
}
```
Then, in `http_server_daemon()`, `timer_init()` initializes the `timer`, and `handle_expired_timers()` finds every connection past its deadline and releases them one by one.
```c
int http_server_daemon(void *arg)
{
    struct socket *socket;
    struct work_struct *work;
    struct http_server_param *param = (struct http_server_param *) arg;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);
    timer_init();
    INIT_LIST_HEAD(&daemon.worker);
    while (!kthread_should_stop()) {
        int time = find_timer();
        pr_info("wait time = %d\n", time);
        handle_expired_timers();
        int err = kernel_accept(param->listen_socket, &socket, 0);
        ...
```
A problem showed up during testing: `http_server_daemon()` stayed blocked in `kernel_accept()` and never returned to the top of the loop. `dmesg` showed that `"wait time = %d\n"` was no longer printed periodically, and timed-out client connections were not being closed. [Risheng1128](https://hackmd.io/@Risheng/linux2022-ktcp/https%3A%2F%2Fhackmd.io%2F%40Risheng%2Flinux2022-khttpd)'s report pointed out that the `socket` has to be switched to non-blocking mode; see this [commit](https://github.com/chiacyu/khttpd/commit/6c22e405c250058298224c023ce84d85dc5d9335) for details.
```c
@@ -247,7 +248,8 @@ int http_server_daemon(void *arg)
         pr_info("wait time = %d\n", time);
         handle_expired_timers();
-        int err = kernel_accept(param->listen_socket, &socket, 0);
+        int err = kernel_accept(param->listen_socket, &socket, SOCK_NONBLOCK);
         if (err < 0) {
             if (signal_pending(current))
                 break;
         ...
```
The [kernel_accept](https://www.kernel.org/doc/html/v5.6/networking/kapi.html) documentation describes `int kernel_accept(struct socket *sock, struct socket **newsock, int flags)`: the first parameter is the listening `socket`, the second receives the newly created connection's `socket`, and the last is a `flags` argument that sets the socket's properties.
>flags must be SOCK_CLOEXEC, SOCK_NONBLOCK or 0. If it fails, newsock is guaranteed to be NULL. Returns 0 or an error.
So the third argument needs to be changed to `SOCK_NONBLOCK`.
Testing again with `./htstress -n 10000 http://localhost:8081/`, `dmesg` shows the `timer` working as expected:
```c
[26107.946917] khttpd: handle_expired_timers() node->deleted: free node of socket 637491968
[26107.946968] khttpd: add_timer: prio_queue_insert successfully
[26107.946972] khttpd: requested_url = /
[26107.947031] khttpd: add_timer: prio_queue_insert successfully
[26107.947035] khttpd: requested_url = /
[26107.947044] khttpd: handle_expired_timers() node->deleted: free node of socket 1661477824
[26107.947094] khttpd: add_timer: prio_queue_insert successfully
[26107.947098] khttpd: requested_url = /
[26107.947108] khttpd: handle_expired_timers() node->deleted: free node of socket 1660977984
[26107.947154] khttpd: add_timer: prio_queue_insert successfully
[26107.947158] khttpd: requested_url = /
[26107.947168] khttpd: handle_expired_timers() node->deleted: free node of socket 1226516224
```
## Implementing directory listing
Kernel space provides `int iterate_dir(struct file *file, struct dir_context *ctx)` for this. [iterate_dir()](https://elixir.bootlin.com/linux/latest/source/fs/readdir.c#L40) takes two parameters: a `struct file *file` and a `struct dir_context *ctx`.
Opening a file in kernel space also needs a dedicated function: [filp_open(const char *filename, int flags, umode_t mode)](https://elixir.bootlin.com/linux/latest/source/fs/open.c#L1315) returns a `struct file` pointer. For now the root location `"/"` is opened. Next, look at the
[`struct dir_context *`](https://elixir.bootlin.com/linux/v4.8/source/include/linux/fs.h#L1644) structure: `typedef int (*filldir_t)(struct dir_context *, const char *, int, loff_t, u64, unsigned);` defines the callback type. Here `printdir()` is defined as the callback; `iterate_dir()` invokes it for each directory entry.
```c
static int printdir(struct dir_context *ctx, const char *name, int namlen,
                    loff_t offset, u64 ino, unsigned int d_type)
{
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    pr_info("Filename : %s\n", name);
    return 0;
}

void list_directory(void)
{
    char *path = "/";
    struct dir_context ctx = {.actor = &printdir};
    struct file *fp = filp_open(path, O_DIRECTORY, S_IRWXU | S_IRWXG | S_IRWXO);
    if (IS_ERR(fp)) {
        pr_err("Open file error\n");
        return;
    }
    iterate_dir(fp, &ctx);
    filp_close(fp, NULL);
}
```
Running `list_directory()` gives the output below: the entries under `/` are printed successfully. The next step is to turn this output into an `http` response.
```bash=
[ 2662.325454] khttpd: Filename : dev
[ 2662.325455] khttpd: Filename : cdrom
[ 2662.325455] khttpd: Filename : boot
[ 2662.325456] khttpd: Filename : proc
[ 2662.325456] khttpd: Filename : lib32
[ 2662.325457] khttpd: Filename : var
[ 2662.325457] khttpd: Filename : snap
[ 2662.325457] khttpd: Filename : mnt
[ 2662.325458] khttpd: Filename : etc
[ 2662.325458] khttpd: Filename : sbin
[ 2662.325458] khttpd: Filename : opt
[ 2662.325459] khttpd: Filename : lib64
[ 2662.325459] khttpd: Filename : sys
[ 2662.325459] khttpd: Filename : media
[ 2662.325460] khttpd: Filename : lib
[ 2662.325460] khttpd: Filename : tmp
[ 2662.325460] khttpd: Filename : libx32
[ 2662.325461] khttpd: Filename : root
[ 2662.325461] khttpd: Filename : swapfile
[ 2662.325461] khttpd: Filename : run
[ 2662.325462] khttpd: Filename : bin
[ 2662.325462] khttpd: Filename : home
[ 2662.325462] khttpd: Filename : srv
[ 2662.325463] khttpd: Filename : lost+found
[ 2662.325463] khttpd: Filename : usr
```
The format of an HTTP response is documented in [HTTP responses](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages). After modifying the code, it can be tested from a browser:
```c
static int printdir(struct dir_context *ctx,
                    const char *name,
                    int namlen,
                    loff_t offset,
                    u64 ino,
                    unsigned int d_type)
{
    char *buf;
    struct http_request *request = container_of(ctx, struct http_request, ctx);

    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    buf = kmalloc(BUFFER_SIZE, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;
    snprintf(buf, BUFFER_SIZE, "<li><a href=/%s/>%s</a></li>", name, name);
    http_server_send(request->socket, buf, strlen(buf));
    kfree(buf);
    return 0;
}

static void list_directory_info(struct http_request *request)
{
    char *path = "/";
    struct file *fp;
    char *response = kmalloc(BUFFER_SIZE, GFP_KERNEL);

    pr_info("Into : list_directory_info()\n");
    if (!response)
        return;
    if (request->method != HTTP_GET) {
        http_server_send(request->socket, HTTP_RESPONSE_501,
                         strlen(HTTP_RESPONSE_501));
        kfree(response);
        return;
    }
    request->ctx.actor = &printdir;
    fp = filp_open(path, O_RDONLY, 0);
    if (IS_ERR(fp)) {
        pr_err("Open file error\n");
        kfree(response);
        return;
    }
    snprintf(response, BUFFER_SIZE, "HTTP/1.1 200 OK \r\n%s%s%s",
             "Server: localhost\r\n", "Content-Type: text/html\r\n",
             "Keep-Alive: timeout=5, max=999\r\n\r\n");
    http_server_send(request->socket, response, strlen(response));
    snprintf(response, BUFFER_SIZE,
             "<!DOCTYPE html><html><head><title>Page "
             "Title</title></head><body><ul>");
    http_server_send(request->socket, response, strlen(response));
    iterate_dir(fp, &(request->ctx));
    snprintf(response, BUFFER_SIZE, "</ul></body></html>");
    http_server_send(request->socket, response, strlen(response));
    filp_close(fp, NULL);
    kfree(response);
}
```
Open a browser and enter `http://localhost:8081` in the URL bar; on success the page looks like this:

The served directory cannot be configured yet, so a `WWWROOT` option is introduced to achieve this. `#define DEFAULT_ROOT "/"` defines the default location, and the `module_param` macro makes `WWWROOT` settable at `insmod` time. For usage details, see [The Linux Kernel Module Programming Guide : 4.5 Passing Command Line Arguments to a Module](https://sysprog21.github.io/lkmpg/#passing-command-line-arguments-to-a-module).
```c
#define DEFAULT_ROOT "/"
...
char *WWWROOT = DEFAULT_ROOT;
module_param(WWWROOT, charp, 0000);
...
```
A `char *root` member is added to `khttp_server_service` to store the contents of `WWWROOT`, and `struct khttp_server_service daemon` is declared `extern`. `khttpd_init()` then assigns `WWWROOT` to `daemon.root`, so `list_directory_info()` can read it later.
```c
struct khttp_server_service {
    bool is_stopped;
    struct list_head worker;
    char *root;
};

extern struct khttp_server_service daemon;
```
```c
static int __init khttpd_init(void)
{
    int err = open_listen_socket(port, backlog, &listen_socket);
    if (err < 0) {
        pr_err("can't open listen socket\n");
        return err;
    }
    param.listen_socket = listen_socket;
    daemon.root = WWWROOT;
    khttp_wq = alloc_workqueue("khttp_wq", WQ_UNBOUND, 0);
    http_server = kthread_run(http_server_daemon, &param, KBUILD_MODNAME);
    if (IS_ERR(http_server)) {
        pr_err("can't start http server daemon\n");
        close_listen_socket(listen_socket);
        return PTR_ERR(http_server);
    }
    return 0;
}
```
```c
static void list_directory_info(struct http_request *request)
{
    char *response = kmalloc(BUFFER_SIZE, GFP_KERNEL);

    pr_info("Into : list_directory_info()\n");
    if (request->method != HTTP_GET) {
        http_server_send(request->socket, HTTP_RESPONSE_501,
                         strlen(HTTP_RESPONSE_501));
        kfree(response);
        return;
    }
    char *path = daemon.root;
    ...
```
When the user clicks a folder, `request_url` changes the target location: the default `request_url` is `/`, and clicking the `home` folder changes it to `/home`. We also need to tell whether the opened file is a directory or a regular file, which the file's `inode` reveals; see [fs.h](https://elixir.bootlin.com/linux/latest/source/include/linux/fs.h#L603) for the `inode` structure.
```c
struct inode {
    umode_t i_mode;
    unsigned short i_opflags;
    kuid_t i_uid;
    kgid_t i_gid;
    unsigned int i_flags;
    ...
```
The macros `S_ISREG(m)` and `S_ISDIR(m)` test the file type; the argument is the inode's `i_mode`. When `S_ISDIR(m)` is true, the opened file is a directory.
```c
#define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
#define S_ISDIR(m) (((m) & S_IFMT) == S_IFDIR)
```
Grab the `inode` of the opened `struct file *fp` and test its `i_mode` field:
```c
struct inode *inode = fp->f_inode;
if (S_ISDIR(inode->i_mode)) {
    snprintf(response, BUFFER_SIZE,
             "<!DOCTYPE html><html><head><title>Directory"
             "</title></head><body><ul>");
    http_server_send(request->socket, response, strlen(response));
    iterate_dir(fp, &(request->ctx));
    ...
} else if (S_ISREG(inode->i_mode)) {
    snprintf(response, BUFFER_SIZE,
             "<!DOCTYPE html><html><head>"
    ...
```
If the opened file is a `regular file`, its contents must be read into a `buffer` and sent back. Reading a file in kernel space goes through `kernel_read`; see [fs.h](https://elixir.bootlin.com/linux/latest/source/include/linux/fs.h#L2605) for its description.
```c
...
} else if (S_ISREG(inode->i_mode)) {
    snprintf(response, BUFFER_SIZE,
             "<!DOCTYPE html><html><head><title>Regular"
             " File</title></head><body><p>");
    http_server_send(request->socket, response, strlen(response));
    memset(response, '\0', BUFFER_SIZE);
    loff_t pos = 0;
    int ret = kernel_read(fp, response,
                          min_t(size_t, fp->f_inode->i_size, BUFFER_SIZE),
                          &pos);
    http_server_send(request->socket, response, ret);
    ...
```
After this, the page in the browser is interactive: clicking a folder navigates into it, and opening a text file shows its contents in the browser.

## Handling MIME types