# ktcp
contributed by < `chiacyu` >

## Notes on [CMWQ](https://www.kernel.org/doc/html/latest/core-api/workqueue.html)
The document points out several problems with the original workqueue implementation:
- The original workqueue could not move tasks between different CPU cores.
- The original multi-threaded workqueue had to keep as many workers as there are CPU cores, which could waste resources.
- Work items had to compete with one another, which could add latency.

CMWQ stays compatible with the original interface while changing the following:
- Worker pools are shared by all workqueues.
- The number of workers in a pool is kept at a baseline level so that idle workers do not tie up system resources.
- When a running work item blocks, the worker pool (through a hook into the scheduler) starts another worker so that pending work items can still be served.

A quick test with the `bench` tool shipped in `kecho`. The result for `user-echo-server.c`:
![](https://i.imgur.com/pzo5U80.png)
The result for `kecho`:
![](https://i.imgur.com/2YvfPtQ.png)

Besides the benefit of running in `kernel space`, CMWQ also brings a very significant improvement.

---
## CPU scheduler and workqueue/CMWQ

---
## Introducing CMWQ into ktcp
First, the performance before `cmwq` is introduced:
```
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests

requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       2.232
requests/sec:  44807.565

Complete
```

### Adding cmwq
A few data structures are needed first.
```c
struct khttp {
    struct socket *sock;
    struct list_head list;
    struct work_struct khttp_work;
};
```
The `khttp` structure records:
- `sock`: the `socket` of the connection
- `list`: the node that links this entry into a linked list
- `khttp_work`: the work item that is queued on the `workqueue` for this connection

```c
struct khttp_server_service {
    bool is_stopped;
    struct list_head worker;
};
```
The `khttp_server_service` structure records:
- `is_stopped`: the current state of the whole `server`
- `worker`: the head node of the linked list of work items

```c
static struct work_struct *create_work(struct socket *sk)
{
    struct khttp *work;

    if (!(work = kmalloc(sizeof(struct khttp), GFP_KERNEL)))
        return NULL;

    work->sock = sk;

    INIT_WORK(&work->khttp_work, http_server_worker);

    list_add(&work->list, &daemon.worker);

    return &work->khttp_work;
}
```
Each time a client connects, `create_work()` allocates a new work item (not a new thread) and uses `list_add` to link it right after the `daemon.worker` head node.
- [INIT_WORK](https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/rcutorture/formal/srcu-cbmc/src/workqueues.h#L76) binds the work item to its handler, `http_server_worker`.

```c
static void free_work(void)
{
    struct khttp *l, *tar;
    /* cppcheck-suppress uninitvar */

    list_for_each_entry_safe (tar, l, &daemon.worker, list) {
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        flush_work(&tar->khttp_work);
        sock_release(tar->sock);
        kfree(tar);
    }
}
```
`free_work()` releases all resources. It uses `list_for_each_entry_safe` to visit every `struct khttp` entry:
- `kernel_sock_shutdown` shuts down the `socket` that the work item is serving
- `flush_work` waits until the work item has finished executing
- `sock_release` releases the `socket` that has been shut down

Next, how each client connection is handled:
```c
static void http_server_worker(struct work_struct *work)
{
    struct khttp *worker = container_of(work, struct khttp, khttp_work);
    char *buf;
    struct http_parser parser;
    struct http_parser_settings setting = {
        .on_message_begin = http_parser_callback_message_begin,
        .on_url = http_parser_callback_request_url,
        .on_header_field = http_parser_callback_header_field,
        .on_header_value = http_parser_callback_header_value,
        .on_headers_complete = http_parser_callback_headers_complete,
        .on_body = http_parser_callback_body,
        .on_message_complete = http_parser_callback_message_complete};
    struct http_request request;
    struct socket *socket = worker->sock;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);

    buf = kzalloc(RECV_BUFFER_SIZE, GFP_KERNEL);
    if (!buf) {
        pr_err("can't allocate memory!\n");
        return; /* without a buffer there is nothing to receive into */
    }

    request.socket = socket;
    http_parser_init(&parser, HTTP_REQUEST);
    parser.data = &request;
    while (!daemon.is_stopped) {
        int ret;
        memset(buf, 0, RECV_BUFFER_SIZE - 1);
        ret = http_server_recv(socket, buf, RECV_BUFFER_SIZE - 1);
        if (ret <= 0) {
            if (ret)
                pr_err("recv error: %d\n", ret);
            break;
        }
        http_parser_execute(&parser, &setting, buf, ret);
        if (request.complete && !http_should_keep_alive(&parser))
            break;
        memset(buf, 0, RECV_BUFFER_SIZE);
    }
    kernel_sock_shutdown(socket, SHUT_RDWR);
    sock_release(socket);
    kfree(buf);
}
```
`http_server_worker()` handles one client connection:
- `container_of(work, struct khttp, khttp_work)` recovers the enclosing `struct khttp` from the `work` pointer
- `struct socket *socket = worker->sock` obtains the `socket` the client is connected to
- The loop checks `daemon.is_stopped`; once the service has stopped, the `socket` is closed and `buf` is freed

```c
int http_server_daemon(void *arg)
{
    struct socket *socket;
    struct work_struct *work;
    struct http_server_param *param = (struct http_server_param *) arg;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);

    INIT_LIST_HEAD(&daemon.worker);

    while (!kthread_should_stop()) {
        int err = kernel_accept(param->listen_socket, &socket, 0);
        if (err < 0) {
            if (signal_pending(current))
                break;
            pr_err("kernel_accept() error: %d\n", err);
            continue;
        }

        if (unlikely(!(work = create_work(socket)))) {
            printk(KERN_ERR "khttp : create work error, connection closed\n");
            kernel_sock_shutdown(socket, SHUT_RDWR);
            sock_release(socket);
            continue;
        }

        /* start server worker */
        queue_work(khttp_wq, work);
    }

    printk("khttp : daemon shutdown in progress...\n");

    daemon.is_stopped = true;
    free_work();

    return 0;
}
```
`http_server_daemon()` runs the server daemon:
- `INIT_LIST_HEAD` initializes the `daemon.worker` head node, so that later clients can be added to the linked list with `list_add()`
- `kthread_should_stop()` tells whether the thread has been asked to stop; as long as it has not, `kernel_accept` is used to accept new connections
- After a new `socket` is accepted, `create_work` creates the work item that will handle the connection
- Finally, `queue_work()` queues the work item on the `workqueue` `khttp_wq`

Last, the `server` is set up in the module's init function:
```c
static int __init khttpd_init(void)
{
    int err = open_listen_socket(port, backlog, &listen_socket);
    if (err < 0) {
        pr_err("can't open listen socket\n");
        return err;
    }
    param.listen_socket = listen_socket;

    khttp_wq = alloc_workqueue("khttp_wq", WQ_UNBOUND, 0);

    http_server = kthread_run(http_server_daemon, &param, KBUILD_MODNAME);
    if (IS_ERR(http_server)) {
        pr_err("can't start http server daemon\n");
        close_listen_socket(listen_socket);
        return PTR_ERR(http_server);
    }
    return 0;
}
```
In `khttpd_init()`:
- `open_listen_socket()` opens the listening `socket` through which clients connect
- `alloc_workqueue()` creates the `workqueue`

Performance after the change:
```
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests

requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       1.377
requests/sec:  72631.558

Complete
```
Throughput rises from 44807.565 to 72631.558 requests/sec.

---
## Using RCU to manage clients
For background on RCU, see [Linux 核心設計: RCU 同步機制](https://hackmd.io/@sysprog/linux-rcu) and [What is RCU, Fundamentally?](https://lwn.net/Articles/262464/).

RCU fits best where reads are very frequent, writes are rare, and readers can tolerate briefly seeing stale data. As a first step, RCU is used to manage the linked list of clients.

In `create_work()`, `list_add_rcu()` adds a new client:
```c
static struct work_struct *create_work(struct socket *sk)
{
    struct khttp *work;

    if (!(work = kmalloc(sizeof(struct khttp), GFP_KERNEL)))
        return NULL;

    work->sock = sk;

    INIT_WORK(&work->khttp_work, http_server_worker);

    list_add_rcu(&work->list, &daemon.worker);

    return &work->khttp_work;
}
```

In `free_work()`, `list_for_each_entry_rcu()` walks the linked list and releases the entries one by one:
```c
static void free_work(void)
{
    struct khttp *l, *tar;
    /* cppcheck-suppress uninitvar */
    rcu_read_lock();
    list_for_each_entry_rcu (tar, &daemon.worker, list) {
        kernel_sock_shutdown(tar->sock, SHUT_RDWR);
        flush_work(&tar->khttp_work);
        sock_release(tar->sock);
        kfree(tar);
    }
    rcu_read_unlock();
}
```
Note that this version is only a first attempt: `flush_work()` may sleep, which is not allowed inside an `rcu_read_lock()` section, and entries are freed while the list is still being traversed instead of being removed with `list_del_rcu()` and freed after a grace period.

The result is not as good as expected, so tools such as `ftrace` are needed for a closer analysis.
```
0 requests
10000 requests
20000 requests
30000 requests
40000 requests
50000 requests
60000 requests
70000 requests
80000 requests
90000 requests

requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       1.668
requests/sec:  59951.823

Complete
```

## Tracing execution with ftrace
`ftrace` is the tracing mechanism provided by the Linux kernel. See [Debugging the kernel using Ftrace - part 1](https://lwn.net/Articles/365835/) and [Debugging the kernel using Ftrace - part 2](https://lwn.net/Articles/366796/); chapter 6 of "Demystifying the Linux CPU Scheduler" also covers it.

First check whether the running kernel provides `ftrace`:
```shell
cat /boot/config-`uname -r` | grep CONFIG_HAVE_FUNCTION_TRACER
```
If the following line is printed, `ftrace` is available in this kernel:
```
CONFIG_HAVE_FUNCTION_TRACER=y
```
Then list the contents of `/sys/kernel/debug/tracing`:
```
root@chiacyu-msi:/sys/kernel/debug/tracing# ls
available_events            max_graph_depth         stack_max_size
available_filter_functions  options                 stack_trace
available_tracers           per_cpu                 stack_trace_filter
buffer_percent              printk_formats          synthetic_events
buffer_size_kb              README                  timestamp_mode
buffer_total_size_kb        saved_cmdlines          trace
current_tracer              saved_cmdlines_size     trace_clock
dynamic_events              saved_tgids             trace_marker
dyn_ftrace_total_info       set_event               trace_marker_raw
enabled_functions           set_event_notrace_pid   trace_options
error_log                   set_event_pid           trace_pipe
events                      set_ftrace_filter       trace_stat
free_buffer                 set_ftrace_notrace      tracing_cpumask
function_profile_enabled    set_ftrace_notrace_pid  tracing_max_latency
hwlat_detector              set_ftrace_pid          tracing_on
instances                   set_graph_function      tracing_thresh
kprobe_events               set_graph_notrace       uprobe_events
kprobe_profile              snapshot                uprobe_profile
```
`ftrace` is controlled by writing to these files with `echo`. Start by looking at `available_filter_functions`, which lists the functions that `ftrace` can currently trace. `khttp.ko` has to be loaded into the kernel first (for example with `insmod`); after that its functions show up:
```
root@chiacyu-msi:/sys/kernel/debug/tracing# cat available_filter_functions | grep khttp
parse_url_char.part.0 [khttpd]
http_message_needs_eof [khttpd]
http_should_keep_alive [khttpd]
http_parser_execute [khttpd]
http_method_str [khttpd]
http_status_str [khttpd]
http_parser_init [khttpd]
http_parser_settings_init [khttpd]
http_errno_name [khttpd]
http_errno_description [khttpd]
http_parser_url_init [khttpd]
http_parser_parse_url [khttpd]
http_parser_pause [khttpd]
http_body_is_final [khttpd]
http_parser_version [khttpd]
http_parser_set_max_header_size [khttpd]
http_parser_callback_header_field [khttpd]
http_parser_callback_headers_complete [khttpd]
http_parser_callback_request_url [khttpd]
http_parser_callback_message_begin [khttpd]
http_parser_callback_body [khttpd]
http_server_recv.constprop.0 [khttpd]
http_server_worker [khttpd]
http_parser_callback_header_value [khttpd]
http_server_daemon [khttpd]
http_server_send.isra.0 [khttpd]
http_parser_callback_message_complete [khttpd]
```

A shell script can be used to configure `ftrace`:
- `max_graph_depth` sets how deep the call graph is traced
- `current_tracer` selects the tracer in use, here `function_graph`
- `set_graph_function` selects the function to observe, here `http_server_worker`

```shell
#!/bin/bash
TRACE_DIR=/sys/kernel/debug/tracing

echo > $TRACE_DIR/set_ftrace_filter
echo > $TRACE_DIR/current_tracer
echo nop > $TRACE_DIR/current_tracer
echo function_graph > $TRACE_DIR/current_tracer

# depth of the function calls
# (relative path: this only takes effect when the script is run from $TRACE_DIR)
echo 1 > max_graph_depth
echo http_server_worker > $TRACE_DIR/set_graph_function

echo 1 > $TRACE_DIR/tracing_on

./htstress -n 100 -c 1 -t 4 http://localhost:8081/

echo 0 > $TRACE_DIR/tracing_on
```
After it finishes, look at the contents of `trace`:
```
root@chiacyu-msi:/sys/kernel/debug/tracing# cat trace | head -20
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 10)                 |  http_server_worker [khttpd]() {
 10)                 |    kernel_sigaction() {
 10)    0.140 us     |      _raw_spin_lock_irq();
 10)    0.110 us     |      _raw_spin_unlock_irq();
 10)    1.130 us     |    }
 10)                 |    kernel_sigaction() {
 10)    0.110 us     |      _raw_spin_lock_irq();
 10)    0.120 us     |      _raw_spin_unlock_irq();
 10)    0.620 us     |    }
 10)                 |    kmem_cache_alloc_trace() {
 10)    0.110 us     |      __cond_resched();
 10)    0.100 us     |      should_failslab();
 10)    1.190 us     |    }
 10)    0.130 us     |    http_parser_init [khttpd]();
 10)                 |    http_server_recv.constprop.0 [khttpd]() {
 10)                 |      kernel_recvmsg() {
```
Next, increase `max_graph_depth` and look at the result again:
```
root@chiacyu-msi:/sys/kernel/debug/tracing# cat trace | head -300
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
  5)                 |  http_server_worker [khttpd]() {
  5)                 |    kernel_sigaction() {
  5)    0.220 us     |      _raw_spin_lock_irq();
  5)    0.150 us     |      _raw_spin_unlock_irq();
  5)    0.951 us     |    }
  5)                 |    kernel_sigaction() {
  5)    0.100 us     |      _raw_spin_lock_irq();
  5)    0.170 us     |      _raw_spin_unlock_irq();
  5)    0.540 us     |    }
  5)                 |    kmem_cache_alloc_trace() {
  5)    0.090 us     |      __cond_resched();
  5)    0.090 us     |      should_failslab();
  5)    0.990 us     |    }
  5)    0.100 us     |    http_parser_init [khttpd]();
  5)                 |    http_server_recv.constprop.0 [khttpd]() {
  5)                 |      kernel_recvmsg() {
  5)                 |        sock_recvmsg() {
  5)                 |          security_socket_recvmsg() {
  5)    0.700 us     |            apparmor_socket_recvmsg();
  5)    0.890 us     |          }
  5)                 |          inet_recvmsg() {
  5)    1.550 us     |            tcp_recvmsg();
  5)    1.821 us     |          }
  5)    3.081 us     |        }
  5)    3.251 us     |      }
  5)    3.441 us     |    }
  5)                 |    kernel_sock_shutdown() {
  5)                 |      inet_shutdown() {
  5)                 |        lock_sock_nested() {
  5)    0.090 us     |          __cond_resched();
  5)    0.100 us     |          _raw_spin_lock_bh();
  5)                 |          _raw_spin_unlock_bh() {
  5)    0.100 us     |            __local_bh_enable_ip();
  5)    0.260 us     |          }
  5)    0.770 us     |        }
  5)                 |        tcp_shutdown() {
  5)                 |          tcp_set_state() {
  5)    0.100 us     |            inet_sk_state_store();
  5)    0.290 us     |          }
  5)                 |          tcp_send_fin() {
  5)    1.840 us     |            __alloc_skb();
  5)    0.100 us     |            sk_forced_mem_schedule();
  5)    0.400 us     |            tcp_current_mss();
  5) + 43.448 us     |            __tcp_push_pending_frames();
  5) + 46.338 us     |          }
  5) + 46.968 us     |        }
  5)                 |        sock_def_wakeup() {
  5)    0.100 us     |          __rcu_read_lock();
  5)    0.100 us     |          __rcu_read_unlock();
  5)    0.490 us     |        }
  5)                 |        release_sock() {
  5)    0.090 us     |          _raw_spin_lock_bh();
  5)                 |          __release_sock() {
  5)    0.130 us     |            _raw_spin_unlock_bh();
  5)    4.471 us     |            tcp_v4_do_rcv();
  5)    0.090 us     |            __cond_resched();
  5)    0.090 us     |            _raw_spin_lock_bh();
  5)    5.271 us     |          }
  5)    0.100 us     |          tcp_release_cb();
  5)                 |          _raw_spin_unlock_bh() {
  5)    0.090 us     |            __local_bh_enable_ip();
  5)    0.260 us     |          }
  5)    6.171 us     |        }
  5) + 54.929 us     |      }
  5) + 55.199 us     |    }
  5)                 |    sock_release() {
  5)                 |      inet_release() {
  5)    0.110 us     |        ip_mc_drop_socket();
  5)                 |        tcp_close() {
  5)                 |          lock_sock_nested() {
  5)    0.080 us     |            __cond_resched();
  5)    0.090 us     |            _raw_spin_lock_bh();
  5)    0.130 us     |            _raw_spin_unlock_bh();
  5)    0.670 us     |          }
  5)                 |          __tcp_close() {
  5)    0.150 us     |            __sk_mem_reclaim();
  5)    0.090 us     |            _raw_write_lock_bh();
  5)    0.120 us     |            _raw_write_unlock_bh();
  5)    0.100 us     |            _raw_spin_lock();
  5)    0.100 us     |            __release_sock();
  5)    1.141 us     |            inet_csk_destroy_sock();
  5)    0.090 us     |            _raw_spin_unlock();
  5)    0.090 us     |            __local_bh_enable_ip();
  5)    2.761 us     |          }
  5)                 |          release_sock() {
  5)    0.090 us     |            _raw_spin_lock_bh();
  5)    0.120 us     |            tcp_release_cb();
  5)    0.130 us     |            _raw_spin_unlock_bh();
  5)    0.740 us     |          }
  5)                 |          sk_free() {
  5)    1.270 us     |            __sk_free();
  5)    1.470 us     |          }
  5)    6.561 us     |        }
  5)    7.011 us     |      }
  5)    0.110 us     |      module_put();
  5)                 |      iput() {
  5)    0.080 us     |        _raw_spin_lock();
  5)    0.110 us     |        _raw_spin_unlock();
  5)                 |        evict() {
  5)                 |          inode_wait_for_writeback() {
  5)    0.120 us     |            _raw_spin_lock();
  5)    0.171 us     |            __inode_wait_for_writeback();
  5)    0.100 us     |            _raw_spin_unlock();
  5)    0.741 us     |          }
  5)                 |          truncate_inode_pages_final() {
  5)    0.100 us     |            truncate_inode_pages_range();
  5)    0.300 us     |          }
  5)                 |          clear_inode() {
  5)    0.090 us     |            _raw_spin_lock_irq();
  5)    0.090 us     |            _raw_spin_unlock_irq();
  5)    0.450 us     |          }
  5)    0.090 us     |          _raw_spin_lock();
  5)    0.180 us     |          wake_up_bit();
  5)    0.090 us     |          _raw_spin_unlock();
  5)                 |          destroy_inode() {
  5)    1.000 us     |            __destroy_inode();
  5)    0.140 us     |            call_rcu();
  5)    1.480 us     |          }
  5)    4.121 us     |        }
  5)    4.981 us     |      }
  5) + 12.492 us     |    }
  5)    0.280 us     |    kfree();
  5) + 76.673 us     |  }
```
`__tcp_push_pending_frames` takes the most time: 43.448 us out of the 76.673 us spent in `http_server_worker`.

## Checking for `keep-Alive` support
Before testing, it helps to understand the format of an HTTP request; see [HTTP Messages](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages) for details. The start line of an HTTP request has three parts:
- Method: the kind of request, such as `GET` or `POST`
- Request target: the location of the requested resource, usually in `URL` form
- HTTP version: the version of HTTP in use

After loading `khttp`, run `telnet localhost 8081` and enter `GET / HTTP/1.0` and `GET / HTTP/1.1` in turn.
```
(base) chiacyu@chiacyu-msi:~$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.1 200 OK
Server: khttpd
Content-Type: text/plain
Content-Length: 12
Connection: Close

Hello World!
Connection closed by foreign host.
```

```
(base) chiacyu@chiacyu-msi:~$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1

HTTP/1.1 200 OK
Server: khttpd
Content-Type: text/plain
Content-Length: 12
Connection: Keep-Alive

Hello World!
```
With HTTP/1.0 the server closes the connection after responding, while with HTTP/1.1 the connection stays open, so `khttp` already supports `Keep Alive`.

## Using a `timer` to close timed-out connections
`khttp` currently has no `timer` mechanism for closing connections. The implementation in [sehttpd](https://github.com/sysprog21/sehttpd) can serve as a reference.
`sehttpd` manages all connections with a `priority queue`. The `priority queue` is a `min heap`, and `prio_queue_min()` returns the connection closest to its `deadline`.
```c
static inline void *prio_queue_min(prio_queue_t *ptr)
{
    return prio_queue_is_empty(ptr) ? NULL : ptr->priv[1];
}
```
`sehttpd` updates its clock by calling `gettimeofday()` to get the current system time and converting it to `ms`.
```c
static void time_update()
{
    struct timeval tv;
    int rc UNUSED = gettimeofday(&tv, NULL);
    assert(rc == 0 && "time_update: gettimeofday error");
    current_msec = tv.tv_sec * 1000 + tv.tv_usec / 1000;
}
```
Unfortunately `gettimeofday()` cannot be called directly in `kernel space`, so the current system time has to be obtained another way. Here [ktime_get_real()](https://docs.kernel.org/core-api/timekeeping.html) is used, and the result is converted to `ms` with `ktime_to_ms()`.
```c
static void time_update(void)
{
    ktime_t kt = ktime_get_real();
    current_msec = ktime_to_ms(kt);
}
```
Next, `http_server_daemon()` calls `timer_init()` to initialize the `timer`, then `handle_expired_timers()` to find every connection past its deadline and release them one by one.
```c
int http_server_daemon(void *arg)
{
    struct socket *socket;
    struct work_struct *work;
    struct http_server_param *param = (struct http_server_param *) arg;

    allow_signal(SIGKILL);
    allow_signal(SIGTERM);
    timer_init();
    INIT_LIST_HEAD(&daemon.worker);

    while (!kthread_should_stop()) {
        int time = find_timer();
        pr_info("wait time = %d\n", time);
        handle_expired_timers();
        int err = kernel_accept(param->listen_socket, &socket, 0);
        ...
        ...
```

Testing revealed a problem: `http_server_daemon()` stays inside `kernel_accept()` and never returns to the top of the loop. The symptom was that `dmesg` did not show `"wait time = %d\n"` being printed repeatedly, and timed-out client connections were not closed.

While searching for a solution, the report by [Risheng1128](https://hackmd.io/@Risheng/linux2022-ktcp/https%3A%2F%2Fhackmd.io%2F%40Risheng%2Flinux2022-khttpd) showed that the `socket` has to be switched to non-blocking mode. See this [commit](https://github.com/chiacyu/khttpd/commit/6c22e405c250058298224c023ce84d85dc5d9335) for details.
```diff
@@ -247,7 +248,8 @@ int http_server_daemon(void *arg)
         pr_info("wait time = %d\n", time);
         handle_expired_timers();

-        int err = kernel_accept(param->listen_socket, &socket, 0);
+        // int err = kernel_accept(param->listen_socket, &socket, 0);
+        int err = kernel_accept(param->listen_socket, &socket, SOCK_NONBLOCK);
         if (err < 0) {
             if (signal_pending(current))
                 break;
        ...
        ...
```

According to the [kernel_accept](https://www.kernel.org/doc/html/v5.6/networking/kapi.html) documentation, `int kernel_accept(struct socket * sock, struct socket ** newsock, int flags)` takes three parameters: the listening `socket`, the `socket` for the new connection, and `flags`, which sets properties of the `socket`.

>flags must be SOCK_CLOEXEC, SOCK_NONBLOCK or 0. If it fails, newsock is guaranteed to be NULL. Returns 0 or an error.

So the third argument has to be changed to `SOCK_NONBLOCK`. Testing with `./htstress -n 10000 http://localhost:8081/` then shows in `dmesg` that the `timer` works as expected.

```
[26107.946917] khttpd: handle_expired_timers() node->deleted: free node of socket 637491968
[26107.946968] khttpd: add_timer: prio_queue_insert successfully
[26107.946972] khttpd: requested_url = /
[26107.947031] khttpd: add_timer: prio_queue_insert successfully
[26107.947035] khttpd: requested_url = /
[26107.947044] khttpd: handle_expired_timers() node->deleted: free node of socket 1661477824
[26107.947094] khttpd: add_timer: prio_queue_insert successfully
[26107.947098] khttpd: requested_url = /
[26107.947108] khttpd: handle_expired_timers() node->deleted: free node of socket 1660977984
[26107.947154] khttpd: add_timer: prio_queue_insert successfully
[26107.947158] khttpd: requested_url = /
[26107.947168] khttpd: handle_expired_timers() node->deleted: free node of socket 1226516224
```

## Implementing directory listing
In `kernel space` the function `int iterate_dir(struct file *file, struct dir_context *ctx)` is available. As its [definition](https://elixir.bootlin.com/linux/latest/source/fs/readdir.c#L40) shows, [int iterate_dir()](https://elixir.bootlin.com/linux/latest/source/fs/readdir.c#L40) takes two parameters: `struct file *file` and `struct dir_context *ctx`.

Opening a file in `kernel space` requires a different function from user space. Here [filp_open(const char *filename, int flags, umode_t mode)](https://elixir.bootlin.com/linux/latest/source/fs/open.c#L1315) is used; it returns a pointer to a `struct file`. For now the root directory `"/"` is opened. Next, look at the structure of [`struct dir_context *`](https://elixir.bootlin.com/linux/v4.8/source/include/linux/fs.h#L1644). Its `callback function` type is defined by `typedef int (*filldir_t)(struct dir_context *, const char *, int, loff_t, u64, unsigned);`. `printdir()` is defined as this `callback function`; `iterate_dir()` calls `printdir()` for each directory entry.
```c
static int printdir(struct dir_context *ctx,
                    const char *name,
                    int namlen,
                    loff_t offset,
                    u64 ino,
                    unsigned int d_type)
{
    /* skip the current- and parent-directory entries */
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    pr_info("Filename : %s\n", name);
    return 0;
}

void list_directory(void)
{
    char *path = "/";
    struct dir_context ctx = {.actor = &printdir};
    struct file *fp = filp_open(path, O_DIRECTORY, S_IRWXU | S_IRWXG | S_IRWXO);

    if (IS_ERR(fp)) {
        pr_err("Open file error\n");
        return; /* fp is an error pointer; do not pass it to iterate_dir() */
    }
    iterate_dir(fp, &ctx);
    filp_close(fp, NULL);
}
```
The output is shown below: the entries of the root directory are printed successfully. The next step is to convert them to the `http` data format.
```
[ 2662.325454] khttpd: Filename : dev
[ 2662.325455] khttpd: Filename : cdrom
[ 2662.325455] khttpd: Filename : boot
[ 2662.325456] khttpd: Filename : proc
[ 2662.325456] khttpd: Filename : lib32
[ 2662.325457] khttpd: Filename : var
[ 2662.325457] khttpd: Filename : snap
[ 2662.325457] khttpd: Filename : mnt
[ 2662.325458] khttpd: Filename : etc
[ 2662.325458] khttpd: Filename : sbin
[ 2662.325458] khttpd: Filename : opt
[ 2662.325459] khttpd: Filename : lib64
[ 2662.325459] khttpd: Filename : sys
[ 2662.325459] khttpd: Filename : media
[ 2662.325460] khttpd: Filename : lib
[ 2662.325460] khttpd: Filename : tmp
[ 2662.325460] khttpd: Filename : libx32
[ 2662.325461] khttpd: Filename : root
[ 2662.325461] khttpd: Filename : swapfile
[ 2662.325461] khttpd: Filename : run
[ 2662.325462] khttpd: Filename : bin
[ 2662.325462] khttpd: Filename : home
[ 2662.325462] khttpd: Filename : srv
[ 2662.325463] khttpd: Filename : lost+found
[ 2662.325463] khttpd: Filename : usr
```
For the format of an `Http` response, see [http response](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages). Once the code is updated, it can be tested from a browser.

```c
static int printdir(struct dir_context *ctx,
                    const char *name,
                    int namlen,
                    loff_t offset,
                    u64 ino,
                    unsigned int d_type)
{
    struct http_request *request =
        container_of(ctx, struct http_request, ctx);
    char *buf;

    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;

    /* allocate only after the early return so nothing leaks */
    buf = kmalloc(BUFFER_SIZE, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    snprintf(buf, BUFFER_SIZE, "<li><a href=/%s/>%s</a></li>", name, name);
    /* send only the formatted text, not the whole buffer */
    http_server_send(request->socket, buf, strlen(buf));
    kfree(buf);
    return 0;
}

static void list_directory_info(struct http_request *request)
{
    struct file *fp;
    char *path = "/";
    char *response;

    pr_info("Into : list_directory_info()\n");

    if (request->method != HTTP_GET) {
        /* HTTP_RESPONSE_501 was not allocated with kmalloc(): never kfree() it */
        http_server_send(request->socket, HTTP_RESPONSE_501,
                         strlen(HTTP_RESPONSE_501));
        return;
    }

    response = kmalloc(BUFFER_SIZE, GFP_KERNEL);
    if (!response)
        return;

    request->ctx.actor = &printdir;
    fp = filp_open(path, O_RDONLY, 0);
    if (IS_ERR(fp)) {
        pr_err("Open file error\n");
        kfree(response);
        return;
    }

    snprintf(response, BUFFER_SIZE, "HTTP/1.1 200 OK\r\n%s%s%s",
             "Server: localhost\r\n", "Content-Type: text/html\r\n",
             "Keep-Alive: timeout=5, max=999\r\n\r\n");
    http_server_send(request->socket, response, strlen(response));

    snprintf(response, BUFFER_SIZE,
             "<!DOCTYPE html><html><head><title>Page "
             "Title</title></head><body><ul>");
    http_server_send(request->socket, response, strlen(response));

    iterate_dir(fp, &(request->ctx));

    snprintf(response, BUFFER_SIZE, "</ul></body></html>");
    http_server_send(request->socket, response, strlen(response));

    filp_close(fp, NULL);
    kfree(response);
}
```
Open a browser and enter `http://localhost:8081` as the `URL`. On success the page looks like this:
![](https://hackmd.io/_uploads/SJyjTQOYh.png)

Clicking an entry does not work yet, so the next step is to add a `WWWROOT` option. `#define DEFAULT_ROOT "/"` defines the default location, and the `module_param` macro lets the `WWWROOT` variable be set at `insmod` time. For details see [The Linux Kernel Module Programming Guide : 4.5 Passing Command Line Arguments to a Module](https://sysprog21.github.io/lkmpg/#passing-command-line-arguments-to-a-module).
```c
#define DEFAULT_ROOT "/"
...
/* a module parameter is defined here, so it must not be declared extern */
static char *WWWROOT = DEFAULT_ROOT;
module_param(WWWROOT, charp, 0000);
...
```
A `char *root` member is added to `khttp_server_service` to hold the value of `WWWROOT`, and `struct khttp_server_service daemon` is declared `extern`.
Then `khttpd_init()` assigns `WWWROOT` to `daemon.root`, so that `list_directory_info()` can read it.
```c
struct khttp_server_service {
    bool is_stopped;
    struct list_head worker;
    char *root;
};

extern struct khttp_server_service daemon;
```

```c
static int __init khttpd_init(void)
{
    int err = open_listen_socket(port, backlog, &listen_socket);
    if (err < 0) {
        pr_err("can't open listen socket\n");
        return err;
    }
    param.listen_socket = listen_socket;
    daemon.root = WWWROOT;
    khttp_wq = alloc_workqueue("khttp_wq", WQ_UNBOUND, 0);
    http_server = kthread_run(http_server_daemon, &param, KBUILD_MODNAME);
    if (IS_ERR(http_server)) {
        pr_err("can't start http server daemon\n");
        close_listen_socket(listen_socket);
        return PTR_ERR(http_server);
    }
    return 0;
}
```

```c
static void list_directory_info(struct http_request *request)
{
    struct file *fp;
    char *path = daemon.root; /* previously the hard-coded "/" */
    char *response;
    ...
    ...
```
When the user clicks a directory, `request_url` changes the target location. The default `request_url` is `/`; clicking the `home` directory changes it to `/home`.
The server also has to decide whether the opened path is a directory or a regular file. The file type can be read from the `inode`, whose structure is defined in [fs.h](https://elixir.bootlin.com/linux/latest/source/include/linux/fs.h#L603).
```c
struct inode {
	umode_t			i_mode;
	unsigned short		i_opflags;
	kuid_t			i_uid;
	kgid_t			i_gid;
	unsigned int		i_flags;
    ...
```
The macros `S_ISREG(m)` and `S_ISDIR(m)` test the file type; their argument is `i_mode`. When `S_ISDIR(m)` is true, the opened file is a directory.
```c
#define S_ISREG(m)	(((m) & S_IFMT) == S_IFREG)
#define S_ISDIR(m)	(((m) & S_IFMT) == S_IFDIR)
```
First obtain the `inode` of `struct file *fp`, then test its `i_mode` member.
```c
    struct inode *inode = fp->f_inode;
    if (S_ISDIR(inode->i_mode)) {
        snprintf(response, BUFFER_SIZE,
                 "<!DOCTYPE html><html><head><title>Directory"
                 "</title></head><body><ul>");
        http_server_send(request->socket, response, strlen(response));
        iterate_dir(fp, &(request->ctx));
        ...
        ...
    } else if (S_ISREG(inode->i_mode)) {
        snprintf(response, BUFFER_SIZE,
                 "<!DOCTYPE html><html><head>"
        ...
```
If the opened file is a `regular file`, its contents have to be read into a `buffer` before being sent back. Reading a file in `kernel space` is done with `kernel_read`; see [fs.h](https://elixir.bootlin.com/linux/latest/source/include/linux/fs.h#L2605).
```c
    ...
    } else if (S_ISREG(inode->i_mode)) {
        loff_t pos = 0;
        /* never read more than the response buffer can hold */
        size_t len = min_t(size_t, i_size_read(inode), BUFFER_SIZE);
        ssize_t ret;

        snprintf(response, BUFFER_SIZE,
                 "<!DOCTYPE html><html><head><title>Regular"
                 " File</title></head><body><p>");
        http_server_send(request->socket, response, strlen(response));
        ret = kernel_read(fp, response, len, &pos);
        if (ret > 0)
            http_server_send(request->socket, response, ret);
    ...
```
After this, the browser can be used to navigate by clicking directories, and the contents of a text file are displayed when one is opened.
![](https://hackmd.io/_uploads/B1r7rLH93.png)

## Handling MIME types
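
The note does not yet describe how MIME types are handled. As a starting point only, the sketch below shows one common approach: choose a `Content-Type` by looking up the file extension in a small table. The names `mime_map`, `mime_table` and `mime_type_for`, and the table entries themselves, are assumptions made for illustration and are not taken from khttpd. It is written against the user-space `<string.h>` so it can be tried outside the kernel; inside a module the same `strrchr()` and `strcmp()` calls come from `<linux/string.h>`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical extension-to-type table; extend it as needed. */
struct mime_map {
    const char *ext;
    const char *type;
};

static const struct mime_map mime_table[] = {
    {".html", "text/html"},
    {".css", "text/css"},
    {".js", "text/javascript"},
    {".txt", "text/plain"},
    {".png", "image/png"},
    {".jpg", "image/jpeg"},
    {".pdf", "application/pdf"},
};

/* Return the MIME type for a path, or a generic fallback. */
static const char *mime_type_for(const char *path)
{
    /* look at the last path component only, so "/a.b/readme" has no extension */
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    const char *dot = strrchr(name, '.');
    size_t i;

    if (!dot)
        return "application/octet-stream";

    for (i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++) {
        if (strcmp(dot, mime_table[i].ext) == 0)
            return mime_table[i].type;
    }
    return "application/octet-stream";
}
```

The returned string could then replace the fixed `text/html` value in the `Content-Type` header that `list_directory_info()` sends for regular files.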