劉家成
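# ktcp

## Simple server

While studying [Linux 核心設計: 針對事件驅動的 I/O 模型演化](https://hackmd.io/@sysprog/linux-io-model/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Fevent-driven-server), I found the behavior of the epoll operations hard to follow, so I tried implementing a simple server that uses epoll.

First, as with the TCP server mentioned earlier, `listen` is used to wait for clients that want to connect.

Next, the fd has to be registered. The TCP socket created at the start is used only for accepting connections, so it is registered first:

```c
/* ev is assumed to have been filled in for socket_fd beforehand */
int epoll_fd = epoll_create(MAX_EVENTS);
if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, socket_fd, &ev) < 0) {
    printf("epoll_ctl error!\n");
    return -1;
}
```

After registration, whenever an event occurs on this fd, `epoll_wait` stops blocking and lets the program continue.

In the main loop:

```c
int nfds = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);
```

`nfds` is the number of events received. The number of connections in my experiments was small, so I never saw `nfds > 1`. However, `epoll_wait` can report up to `MAX_EVENTS` ready descriptors in a single call, so this should happen when several fds become ready at the same time.

When an event occurs on a registered fd, it is returned through `epoll_wait`. For example, if a client is trying to connect, the event is on the `socket_fd` registered above, and it can be handled with:

```c
connfd = accept(socket_fd, (struct sockaddr *) &clientAddr, &len);
ev.events = EPOLLIN;
ev.data.fd = connfd;
/* Watch the new connection as well */
if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, connfd, &ev) == -1) {
    printf("error");
    return -1;
}
```

This takes the established connection and adds it to the watch list with `epoll_ctl`, so that any later activity on the connection is reported.

If the event is not on `socket_fd`, it means an already-connected client has sent data, and `read` and `write` are enough to handle it.

This way a single thread can watch and serve many connections at once. Without this kind of mechanism (including `select` and the like), the server would probably need one thread per connection, and a growing number of clients would mean creating a large number of threads.

[github](https://github.com/fatcatorange/basic_server/commit/1506e062aa5716fb01b2c2360164bacb750fd928)

## Tracing HTTP packets with eBPF

Following [a tutorial written by a senior student](https://hackmd.io/@0xff07/r1f4B8aGI#Appendix-C), I first wrote a similar program for tracing:

```python
from bcc import BPF

prog = """
#include <uapi/linux/ptrace.h>

int probe_handler(struct pt_regs *ctx)
{
    u64 ts = bpf_ktime_get_ns();
    bpf_trace_printk("Enter http_server_worker at %llu\\n", ts);
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event="http_server_worker", fn_name="probe_handler")
b.trace_print()
```

The target here is `http_server_worker`, which appears in:

```c
worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
```

Its job is to serve a client in a new thread created after `accept`.

When a client connects while the bcc program is running, the event is intercepted and the handler runs.

client:

```shell
telnet localhost 1999
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
```

test.py:

```
b' khttpd-190263 [004] ...21 94274.459276: bpf_trace_printk: Enter http_server_worker at 94273956630619'
b''
```

I originally wanted to measure the cost of creating a thread using the method the tutorial applies to `fib_read`, but ran into a problem. In BPF I need to name the function to probe, and the thread is created like this:

```c
worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
```

I first chose to probe `http_server_worker`. The problem is that `http_server_worker` only returns when the connection is closed (that is, when the client disconnects), so with:

```python
from bcc import BPF

prog = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u64, u64);

int probe_handler(struct pt_regs *ctx)
{
    u64 ts = bpf_ktime_get_ns();
    u64 pid = bpf_get_current_pid_tgid();
    start.update(&pid, &ts);
    return 0;
}

int ret_handler(struct pt_regs *ctx)
{
    u64 ts = bpf_ktime_get_ns();
    u64 pid = bpf_get_current_pid_tgid();
    u64 *tsp = (start.lookup(&pid));
    if (tsp != 0) {
        bpf_trace_printk("duration: %llu\\n", ts - *tsp);
        start.delete(&pid);
    }
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event="http_server_worker", fn_name="probe_handler")
b.attach_kretprobe(event="http_server_worker", fn_name="ret_handler")
b.trace_print()
```

the measured time is the lifetime of the connection, not the time needed to create the thread.

Targeting `kthread_run` instead is not a good option either: `kthread_run` is used well beyond this module, so unrelated events might be captured. (It is also a macro in `<linux/kthread.h>` rather than a function, so there is no symbol of that name to probe.)

My idea was to add an empty function before and after `kthread_run`, for example:

```c
fun1();
worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
fun2();
```

The program then records the return time of `fun1` and the entry time of `fun2`. There is some error, but this should give a rough estimate of the creation cost and its share of the whole connection setup.

It did not behave as expected:

```c
kthread_start_check();
worker = kthread_run(http_server_worker, socket, KBUILD_MODNAME);
kthread_end_check();
```

Only the `kthread_start_check` probe fired here.

I later noticed that, for some reason, `kthread_end_check` seems to run only after the function started by `kthread_run` has finished?

:::info
I later found that many functions are visible but cannot be traced, and I don't know why.
Attaching to some functions produces the following error:

```shell
cannot attach kprobe, probe entry may not exist
```

Attaching to some other functions that I wrote myself does not produce this error, which suggests the probe was set successfully. On inspection, however, apart from `http_server_worker` none of these functions was captured by `attach_kprobe`, even though no error was reported (I confirmed with `printk` that they really ran).
:::

Something very odd shows up here: a function invoked through an ordinary call cannot be captured, but the same function started through `kthread_run` can?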
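The entry/return pairing used by the kprobe and kretprobe handlers above can be modeled in plain user-space C. The following is only a sketch under stated assumptions: `on_entry`, `on_return`, `tid_of`, `tgid_of` and the fixed-size `slots` table are hypothetical names standing in for the BPF hash map, and timestamps are passed in by the caller instead of coming from `bpf_ktime_get_ns()`. The one fact taken from the BPF helper itself is the layout of the value returned by `bpf_get_current_pid_tgid()`: the upper 32 bits hold the tgid and the lower 32 bits hold the thread id.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TASKS 64 /* hypothetical capacity of the model map */

struct slot {
    uint32_t tid;      /* key: thread id */
    uint64_t start_ns; /* value: timestamp taken at function entry */
    int used;
};

static struct slot slots[MAX_TASKS];

/* Split the 64-bit value: tgid in the upper half, thread id in the lower. */
static uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t) pid_tgid;
}

static uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t) (pid_tgid >> 32);
}

/* Entry probe: remember when this thread entered the function. */
int on_entry(uint64_t pid_tgid, uint64_t now_ns)
{
    uint32_t tid = tid_of(pid_tgid);
    int free_idx = -1;

    for (int i = 0; i < MAX_TASKS; i++) {
        if (slots[i].used && slots[i].tid == tid) {
            slots[i].start_ns = now_ns; /* overwrite, like a map update */
            return 0;
        }
        if (!slots[i].used && free_idx < 0)
            free_idx = i;
    }
    if (free_idx < 0)
        return -1; /* table full */
    slots[free_idx] = (struct slot){.tid = tid, .start_ns = now_ns, .used = 1};
    return 0;
}

/* Return probe: look up the entry timestamp and report the elapsed time. */
int on_return(uint64_t pid_tgid, uint64_t now_ns, uint64_t *duration_ns)
{
    uint32_t tid = tid_of(pid_tgid);

    assert(duration_ns);
    for (int i = 0; i < MAX_TASKS; i++) {
        if (slots[i].used && slots[i].tid == tid) {
            *duration_ns = now_ns - slots[i].start_ns;
            slots[i].used = 0; /* delete the entry once it is consumed */
            return 0;
        }
    }
    return -1; /* return seen without a matching entry */
}
```

Keying on the thread id keeps two threads of the same process from overwriting each other's start time, which matters here because every connection is served by its own kernel thread.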
From the tutorials and documents I looked through, it seems that only system calls get captured? So, to capture the event, I decided to create two threads with `kthread_run`: the first is a wrapper that gives the probe something to attach to, and the second runs `kthread_run(http_server_worker, socket, KBUILD_MODNAME);`. The thread creation cost is then measured with:

```python
from bcc import BPF

code = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);

int probe_handler(struct pt_regs *ctx)
{
    u64 ts = bpf_ktime_get_ns();
    /* Truncating to u32 keeps the lower half, which is the thread id */
    u32 tgid = bpf_get_current_pid_tgid();
    start.update(&tgid, &ts);
    return 0;
}

int end_function(struct pt_regs *ctx)
{
    u64 ts = bpf_ktime_get_ns();
    u32 tgid = bpf_get_current_pid_tgid();
    u64 *start_ts = start.lookup(&tgid);
    if (start_ts) {
        bpf_trace_printk("duration: %llu\\n", ts - *start_ts);
        start.delete(&tgid);
    }
    return 0;
}
"""

b = BPF(text=code)
b.attach_kprobe(event='my_thread_run', fn_name='probe_handler')
b.attach_kretprobe(event='my_thread_run', fn_name='end_function')

res = None  # avoid a NameError if the very first read fails
while True:
    try:
        print("listen..")
        res = b.trace_fields()
    except ValueError:
        print(res)
        continue
    # Field 5 is the message written by bpf_trace_printk
    print(res[5].decode("UTF-8"))
```

First, run the program and write its output to output.txt:

```shell
sudo python3 test.py >> output.txt
```

Next, test with the method described in the first half of the assignment description:

```shell
ab -n 10000 -c 10000 -k http://127.0.0.1:8081/
```

It turned out that this many threads cannot be created at once; only about 5000 could exist at the same time:

![image](https://hackmd.io/_uploads/BkMGcYl70.png)

Sending 1000 requests one at a time with curl:

```bash
#!/bin/bash
for ((i=1; i<=1000; i++))
do
    curl -s -o /dev/null http://localhost:8081/
done
```

![image](https://hackmd.io/_uploads/Hyq2hYgm0.png)

This result looks closer to the one in the assignment description.

With 10000 requests:

![image](https://hackmd.io/_uploads/r1KppKlXA.png)

## Introducing CMWQ

The goal here is to replace the `kthread_run` based code with CMWQ. First, allocate a workqueue:

```c
khttpd_wq = alloc_workqueue("khttpd", 0, 0);
```

In server.c, initialize `daemon_list`, which serves as the head of the list that tracks the work items:

```c
INIT_LIST_HEAD(&daemon_list.head);
```

Then, still in server.c and following [kecho](https://github.com/sysprog21/kecho/blob/master/echo_server.c), create the work item with `create_work`:

```c
if (unlikely(!(work = create_work(socket)))) {
    printk(KERN_ERR ": create work error, connection closed\n");
    kernel_sock_shutdown(socket, SHUT_RDWR);
    sock_release(socket);
    continue;
}
```

`create_work` builds a work item from the socket passed in:

```c
static struct work_struct *create_work(struct socket *sk)
{
    struct http_request *work;

    if (!(work = kmalloc(sizeof(struct http_request), GFP_KERNEL)))
        return NULL;

    work->socket = sk;

    /* Bind the handler that will run when the work item is scheduled */
    INIT_WORK(&work->khttpd_work, http_server_worker);

    list_add(&work->node, &daemon_list.head);

    return &work->khttpd_work;
}
```

The work item is added to the list with `list_add`. One difference from `kthread_run` is worth noting: `kthread_run` accepts a `void *` argument, so the caller can pass whatever the thread function needs and cast it back later.

With a workqueue, the handler always receives a `struct work_struct *`. The required arguments can therefore be collected in one struct that embeds a `struct work_struct` member:

```c
struct http_request {
    struct socket *socket;
    enum http_method method;
    char request_url[128];
    int complete;
    struct list_head node;
    struct work_struct khttpd_work;
};
```

Inside the handler, `container_of` recovers the enclosing struct and gives access to its fields. For example, `http_server_worker` used to receive a `void *` that represented the socket; with a workqueue the socket pointer is obtained with:

```c
struct socket *socket =
    container_of(w, struct http_request, khttpd_work)->socket;
```

It is worth mentioning this line in `create_work`:

```c
list_add(&work->node, &daemon_list.head);
```

The ksort code in assignment 6 also used `queue_work` but did not seem to have this part, so I tried removing the line, and the result was exactly the same.

This is because `create_work` only initializes the work item. It is actually queued later, by:

```c
queue_work(khttpd_wq, work);
```

Below is the comparison with and without CMWQ. The first run uses plain `kthread_run`:

```
./htstress http://localhost:8081 -t 3 -c 20 -n 200000
0 requests
20000 requests
40000 requests
60000 requests
80000 requests
100000 requests
120000 requests
140000 requests
160000 requests
180000 requests

requests:      200000
good requests: 200000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       2.419
requests/sec:  82688.125
```

The second run uses CMWQ. Requests per second more than doubled (about 2.4 times the `kthread_run` figure).

```
lhost:8081 -t 3 -c 20 -n 200000
0 requests
20000 requests
40000 requests
60000 requests
80000 requests
100000 requests
120000 requests
140000 requests
160000 requests
180000 requests

requests:      200000
good requests: 200000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       1.011
requests/sec:  197856.619
```

The [commit](https://github.com/fatcatorange/khttpd/commit/2288902a8a13b1aa93df578284062ed486fe3e38) for this part.

## Implementing directory listing

Adding this feature means changing `http_server_response`. The original `http_server_response` only checks whether the method is GET and, if so, returns HTTP_RESPONSE_200 (success) with the body hello world.

This function is called from:

```c
static int http_parser_callback_message_complete(http_parser *parser)
{
    struct http_request *request = parser->data;
    http_server_response(request, http_should_keep_alive(parser));
    request->complete = 1;
    return 0;
}
```

which in turn is bound in:

```c
struct http_parser_settings setting = {
    .on_message_begin = http_parser_callback_message_begin,
    .on_url = http_parser_callback_request_url,
    .on_header_field = http_parser_callback_header_field,
    .on_header_value = http_parser_callback_header_value,
    .on_headers_complete = http_parser_callback_headers_complete,
    .on_body = http_parser_callback_body,
    .on_message_complete = http_parser_callback_message_complete};
```

When the program runs `http_parser_execute(&parser, &setting, buf, ret);`, the parser invokes the callback that matches each parsing stage. In this case, `http_parser_callback_message_complete` runs once the whole HTTP request has been parsed.

For the directory display I followed [a senior student's approach](https://hackmd.io/@sysprog/BkSW8Z2Bn) and iterate over the target directory. The details are:

```c
static _Bool tracedir(struct dir_context *dir_context,
                      const char *name,
                      int namelen,
                      loff_t offset,
                      u64 ino,
                      unsigned int d_type)
{
    printk("%s\n", name);
    /* Skip the "." and ".." entries */
    if (strcmp(name, ".") && strcmp(name, "..")) {
        struct http_request *request =
            container_of(dir_context, struct http_request, dir_context);
        char buf[SEND_BUFFER_SIZE] = {0};

        snprintf(buf, SEND_BUFFER_SIZE,
                 "<tr><td><a href=\"%s\">%s</a></td></tr>\r\n", name, name);
        http_server_send(request->socket, buf, strlen(buf));
        printk("%s\n", buf);
    }
    return 1;
}
```

This is what happens for each entry visited: one table row containing the file name is sent to the client.

Next, `dir_context.actor` is set to this function so that it runs for every entry during iteration.

```c
static bool handle_directory(struct http_request *request)
{
    struct file *fp;
    char buf[SEND_BUFFER_SIZE] = {0};

    request->dir_context.actor = tracedir;
    if (request->method != HTTP_GET) {
        /* A blank line must separate the headers from the 21-byte body */
        snprintf(buf, SEND_BUFFER_SIZE, "HTTP/1.1 501 Not Implemented\r\n%s%s%s%s",
                 "Content-Type: text/plain\r\n", "Content-Length: 21\r\n",
                 "Connection: Close\r\n\r\n", "501 Not Implemented\r\n");
        http_server_send(request->socket, buf, strlen(buf));
        return false;
    }

    snprintf(buf, SEND_BUFFER_SIZE, "HTTP/1.1 200 OK\r\n%s%s%s",
             "Connection: Keep-Alive\r\n", "Content-Type: text/html\r\n",
             "Keep-Alive: timeout=5, max=1000\r\n\r\n");
    http_server_send(request->socket, buf, strlen(buf));

    snprintf(buf, SEND_BUFFER_SIZE, "%s%s%s%s", "<html><head><style>\r\n",
             "body{font-family: monospace; font-size: 15px;}\r\n",
             "td {padding: 1.5px 6px;}\r\n", "</style></head><body><table>\r\n");
    http_server_send(request->socket, buf, strlen(buf));

    fp = filp_open("/home/jason/linux-2024/khttpd/khttpd", O_RDONLY | O_DIRECTORY, 0);
    if (IS_ERR(fp)) {
        printk("open file failed");
        return false;
    }

    /* tracedir() is called once for every directory entry */
    iterate_dir(fp, &request->dir_context);
    snprintf(buf, SEND_BUFFER_SIZE, "</table></body></html>\r\n");
    http_server_send(request->socket, buf, strlen(buf));
    filp_close(fp, NULL);
    return true;
}
```

Then `http_server_response` is changed to call `handle_directory`:

```c
static int http_server_response(struct http_request *request, int keep_alive)
{
    // pr_info("requested_url = %s\n", request->request_url);
    int ret = handle_directory(request);
    if (ret > 0)
        return -1;
    return 0;
}
```

There is still a problem. After entering `127.0.0.1:8081` in a browser, the directory is received:

![image](https://hackmd.io/_uploads/Syq9N_EmR.png)

but the browser keeps loading, apparently still waiting for data. A likely cause is that the response announces `Connection: Keep-Alive` without a `Content-Length` header or chunked transfer encoding, so the browser cannot tell where the body ends.

![image](https://hackmd.io/_uploads/S13kBuEmA.png)

## Specifying the directory to open

As before, `module_param` is used, so the path can be given when the module is loaded:

```c
module_param_string(WWWROOT, WWWROOT, PATH_SIZE, 0);
..
daemon_list.dir_path = WWWROOT;
```

The hard-coded path is replaced with `daemon_list.dir_path`:

```c
fp = filp_open(daemon_list.dir_path, O_RDONLY | O_DIRECTORY, 0);
```

## Opening files by path

First, the server has to decide whether the current path is a directory or a file. This can be checked with:

```c
S_ISDIR(fp->f_inode->i_mode)
```

and

```c
S_ISREG(fp->f_inode->i_mode)
```

If it is a directory, the same approach as before applies and `iterate_dir` walks the directory:

```c
if (S_ISDIR(fp->f_inode->i_mode)) {
    char buf[SEND_BUFFER_SIZE] = {0};
    snprintf(buf, SEND_BUFFER_SIZE, "HTTP/1.1 200 OK\r\n%s%s%s",
             "Connection: Keep-Alive\r\n", "Content-Type: text/html\r\n",
             "Keep-Alive: timeout=5, max=1000\r\n\r\n");
    http_server_send(request->socket, buf, strlen(buf));

    snprintf(buf, SEND_BUFFER_SIZE, "%s%s%s%s", "<html><head><style>\r\n",
             "body{font-family: monospace; font-size: 15px;}\r\n",
             "td {padding: 1.5px 6px;}\r\n", "</style></head><body><table>\r\n");
    http_server_send(request->socket, buf, strlen(buf));

    iterate_dir(fp, &request->dir_context);

    snprintf(buf, SEND_BUFFER_SIZE, "</table></body></html>\r\n");
    http_server_send(request->socket, buf, strlen(buf));
}
```

Note that for a regular file, the file size has to be obtained first so that a buffer can be allocated:

```c
char *read_data = kmalloc(fp->f_inode->i_size, GFP_KERNEL);
```

The file is then read with:

```c
/* Read the whole file into the buffer allocated above */
kernel_read(fp, read_data, fp->f_inode->i_size, 0);
```

Clicking a directory now enters that directory, and clicking a file shows its contents:

![image](https://hackmd.io/_uploads/BJBaAp4QC.png)

One problem remains: inside a deeper directory, clicking a folder or file results in a file-not-found error. The path is assembled from pieces, so a file located at ../khttpd/bcc/FAQ.txt ends up with the path ../khttpd/FAQ.txt.

### Fixing the problem

The root cause is the wrong `href` in the response:

```c
snprintf(buf, SEND_BUFFER_SIZE,
         "<tr><td><a href=\"%s\">%s</a></td></tr>\r\n", name, name);
```

Here `name` is only the file name. Using it directly as the path causes the behavior described above, so the name has to be joined with the path that precedes it:

```c
strcpy(des, request->request_url);
strcat(des, "/");
strcat(des, name);
```

Note that at the top-level directory `request->request_url` is already `/`, so that case has to be excluded. The complete code is:

```c
/* +2 leaves room for the "/" separator and the terminating NUL */
char *des = kmalloc(strlen(request->request_url) + strlen(name) + 2, GFP_KERNEL);
if (strcmp(request->request_url, "/") != 0) {
    strcpy(des, request->request_url);
    strcat(des, "/");
    strcat(des, name);
} else {
    strcpy(des, name);
}
snprintf(buf, SEND_BUFFER_SIZE,
         "<tr><td><a href=\"%s\">%s</a></td></tr>\r\n", des, name);
```

With this change, clicking a directory correctly leads to files in deeper levels:

![image](https://hackmd.io/_uploads/rkhzKMPQR.png)

### Going back to the parent directory

At the moment the only way back is the browser's back button, so I tried adding an entry for it. Previously the row was emitted only when the name was neither `..` nor `.`. A small change lets `..` through:

```diff
-if (strcmp(name, ".") && strcmp(name, ".."))
+if (strcmp(name, ".") )
```

This raises a few issues. The first page does not need this entry, and `..` is not necessarily the first entry returned for a directory, although in the user interface it should appear at the top:

![image](https://hackmd.io/_uploads/BJ29K7DXA.png)

The current idea is to insert a `..` row directly whenever the current path is not `/`, and, following the senior student's approach, to strip a trailing `/` from the URL:

```c
static int http_parser_callback_request_url(http_parser *parser,
                                            const char *p,
                                            size_t len)
{
    struct http_request *request = parser->data;
    /* Drop a trailing slash so that "/" becomes the empty string */
    if (p[len - 1] == '/')
        len--;
    strncat(request->request_url, p, len);
    return 0;
}
```

Before `iterate_dir`:

```c
/* Emit the ".." row only when not at the top-level directory */
if (strcmp(request->request_url, "")) {
    snprintf(buf, SEND_BUFFER_SIZE,
             "<tr><td><a href=\"%s%s\">%s</a></td></tr>\r\n",
             request->request_url, "/..", "..");
    http_server_send(request->socket, buf, strlen(buf));
}
iterate_dir(fp, &request->dir_context);
```

After this, `..` appears at the top of the listing:

![image](https://hackmd.io/_uploads/rJlFWVDXC.png)

## Measuring performance

Because directory traversal was added, the server is bound to respond more slowly. Comparing against the earlier numbers:

Now:

```
requests:      200000
good requests: 200000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       14.281
requests/sec:  14004.417
```

Before:

```
requests:      200000
good requests: 200000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       1.011
requests/sec:  197856.619
```

I tried using Ftrace to investigate. First, list the functions that can be traced:

```shell
sudo cat /sys/kernel/debug/tracing/available_filter_functions | grep khttpd
```

```shell
parse_url_char [khttpd]
http_message_needs_eof [khttpd]
http_should_keep_alive [khttpd]
http_parser_execute [khttpd]
http_method_str [khttpd]
http_status_str [khttpd]
http_parser_init [khttpd]
..
```

Following the [assignment description](https://hackmd.io/@sysprog/linux2024-ktcp/%2F%40sysprog%2Flinux2024-ktcp-c#%E4%BD%BF%E7%94%A8-Ftrace-%E8%A7%80%E5%AF%9F-kHTTPd), a shell script is used to find out which part takes the most time:

```shell
#!/bin/bash
TRACE_DIR=/sys/kernel/debug/tracing

# clear
echo 0 > $TRACE_DIR/tracing_on
echo > $TRACE_DIR/set_graph_function
echo > $TRACE_DIR/set_ftrace_filter
echo nop > $TRACE_DIR/current_tracer

# setting
echo function_graph > $TRACE_DIR/current_tracer
echo 3 > $TRACE_DIR/max_graph_depth
echo http_server_worker > $TRACE_DIR/set_graph_function

# execute
echo 1 > $TRACE_DIR/tracing_on
./htstress localhost:8081 -n 2000
echo 0 > $TRACE_DIR/tracing_on
```

The result is below. A great deal of time is spent in `http_parser_callback_message_complete`:

```shell
http_parser_execute [khttpd]() {
12)   0.090 us    |  http_parser_callback_message_begin [khttpd]();
12)   0.105 us    |  parse_url_char [khttpd]();
12)   0.098 us    |  http_parser_callback_request_url [khttpd]();
12)   0.072 us    |  http_parser_callback_header_field [khttpd]();
12)   0.070 us    |  http_parser_callback_header_value [khttpd]();
12)   0.064 us    |  http_parser_callback_headers_complete [khttpd]();
12)   0.066 us    |  http_message_needs_eof [khttpd]();
12)   0.069 us    |  http_should_keep_alive [khttpd]();
12) ! 345.720 us  |  http_parser_callback_message_complete [khttpd]();
12) ! 347.870 us  |  }
```

This function calls `http_server_response`, which deeper down calls `handle_directory`, which in turn calls `tracedir`. Suspecting that the cost comes from directory traversal, I increased `max_graph_depth` so that the trace reaches `tracedir`.

The deeper trace clearly shows that this is indeed the largest cost:

```
20)   4.146 us    |  _printk();
20)   7.140 us    |  filp_open();
20) + 33.563 us   |  http_server_send.isra.0 [khttpd]();
20) + 21.726 us   |  http_server_send.isra.0 [khttpd]();
20) + 21.099 us   |  http_server_send.isra.0 [khttpd]();
20) ! 545.853 us  |  iterate_dir();
20) + 13.296 us   |  http_server_send.isra.0 [khttpd]();
20) + 11.487 us   |  kernel_sock_shutdown();
20)   2.934 us    |  filp_close();
20) ! 663.411 us  |  }
20) ! 663.823 us  |  }
20) ! 668.401 us  |  }
```

The main reason for the poor performance is most likely that the directory has to be read on every request. If the content were cached, later clients asking for the same content could be served from the cache, which should reduce the load.

### Introducing a cache

Here the Linux kernel hashtable is used to write a simple `hash_insert` and `hash_check`:

```c
DEFINE_READ_MOSTLY_HASHTABLE(ht, 8);

void init_hash_table(void)
{
    hash_init(ht);
}

void hash_insert(const char *request, char *data)
{
    u32 key = jhash(request, strlen(request), 0);
    struct hash_content *content =
        kmalloc(sizeof(struct hash_content), GFP_KERNEL);
    content->data = kmalloc(strlen(data) + 1, GFP_KERNEL);
    content->request = kmalloc(strlen(request) + 1, GFP_KERNEL);
    memcpy(content->data, data, strlen(data) + 1);
    /* Copy the request using its own length, not the length of data */
    memcpy(content->request, request, strlen(request) + 1);
    hash_add(ht, &content->node, key);
}

void hash_check(const char *request)
{
    u32 key = jhash(request, strlen(request), 0);
    struct hash_content *now;
    rcu_read_lock();
    /* Only entries in the bucket selected by key are visited */
    hash_for_each_possible(ht, now, node, key)
    {
        if (strcmp(request, now->request) == 0) {
            printk("now request: %s\n", request);
        }
    }
    rcu_read_unlock();
}
```

`hash_insert` takes two values, the request URL and the data to store. It turns the request into a hash key with `jhash` and inserts the data under that key.

`hash_check` examines the request URL in the same way: it converts the URL into a key and uses `hash_for_each_possible` to check whether the list for that key contains the entry.

Under heavy traffic, a traversal could be interrupted by a switch to another thread, and the node being visited could be deleted by that thread. Since writes only happen the first time a directory is loaded, would switching to `hash_for_each_possible_rcu` be better?
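The check-then-insert pattern described above can be tried out in user space before moving it into the module. The following is a minimal sketch, not the module's code: `cache_lookup`, `cache_insert`, `struct cache_entry` and the FNV-1a hash are hypothetical stand-ins for `hash_check`, `hash_insert`, `struct hash_content` and the kernel's `jhash`, and there is no locking or RCU, so it only models the single-threaded logic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_BITS 8
#define CACHE_BUCKETS (1u << CACHE_BITS) /* 256 buckets, as with ht above */

struct cache_entry {
    char *url;                /* key: the request URL */
    char *body;               /* value: the cached response text */
    struct cache_entry *next; /* chain of entries sharing one bucket */
};

static struct cache_entry *buckets[CACHE_BUCKETS];

/* FNV-1a string hash, used here only as a stand-in for jhash() */
static uint32_t hash_str(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Return the cached body for url, or NULL on a cache miss. */
const char *cache_lookup(const char *url)
{
    struct cache_entry *e = buckets[hash_str(url) % CACHE_BUCKETS];

    /* Different URLs can share a bucket, so the key is compared too */
    for (; e; e = e->next)
        if (strcmp(e->url, url) == 0)
            return e->body;
    return NULL;
}

/* Store private copies of url and body; returns 0 on success, -1 on failure. */
int cache_insert(const char *url, const char *body)
{
    uint32_t idx;
    struct cache_entry *e;

    assert(url && body);
    idx = hash_str(url) % CACHE_BUCKETS;
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->url = dup_str(url);
    e->body = dup_str(body);
    if (!e->url || !e->body) {
        free(e->url);
        free(e->body);
        free(e);
        return -1;
    }
    e->next = buckets[idx];
    buckets[idx] = e;
    return 0;
}
```

A request handler would call `cache_lookup` first and fall back to the directory walk, followed by `cache_insert`, only on a miss. The string comparison inside the bucket is what keeps two URLs that hash to the same bucket from being confused with each other.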
This is the current `hash_check`. It checks whether the current `request_url` is cached and returns true if so. Otherwise it returns false and the caller performs the insertion:

```c
void hash_insert(const char *request, char *data)
{
    u32 original_key = jhash(request, strlen(request), 0);
    u8 key = (u8) (original_key % 256);
    struct hash_content *content =
        kmalloc(sizeof(struct hash_content), GFP_KERNEL);
    content->data = kmalloc(strlen(data) + 1, GFP_KERNEL);
    content->request = kmalloc(strlen(request) + 1, GFP_KERNEL);
    memcpy(content->data, data, strlen(data) + 1);
    printk("finished input data");
    memcpy(content->request, request, strlen(request) + 1);
    printk("finished copy request");
    hash_add_rcu(ht, &content->node, key);
    printk("finished hash add");
}

bool hash_check(const char *request)
{
    u32 original_key = jhash(request, strlen(request), 0);
    u8 key = (u8) (original_key % 256);
    struct hash_content *now;
    rcu_read_lock();
    hash_for_each_possible_rcu(ht, now, node, key)
    {
        if (strcmp(request, now->request) == 0) {
            /* Use the entry before leaving the RCU read-side section */
            printk("find request!: %s %s\n", request, now->data);
            rcu_read_unlock();
            return true;
        }
    }
    rcu_read_unlock();
    printk("finished hash check");
    return false;
}
```

In `http_server.c`:

```c
if (!hash_check(request->request_url))
    hash_insert(request->request_url, buf);
```

The goal now is to cache the `html` tags generated in `tracedir`, so that when a client visits the same page again the cached data can be sent directly, without walking the directory through `tracedir` again.

The difficulty is that if all the tags are stored in a single string, I cannot know how long the string must be when the directory contains many files. My idea is therefore to store each record in a linked list.

A `tag_list` that records the directory entries is added to `http_request`:

```diff
 struct http_request {
     struct socket *socket;
     enum http_method method;
     char request_url[128];
     int complete;
     struct dir_context dir_context;
     struct list_head node;
     struct work_struct khttpd_work;
+    struct list_head *tag_list;
 };
```

First, check whether the requested directory has been visited before. If not, walk it with `iterate_dir`. If it has, `hash_check` (which now takes a second argument) stores the `list_head` of that directory's data in `head`:

```c
if (!hash_check(request->request_url, &head)) {
    head = kmalloc(sizeof(struct list_head), GFP_KERNEL);
    INIT_LIST_HEAD(head);
    request->tag_list = head;
    iterate_dir(fp, &request->dir_context);
    hash_insert(request->request_url, head);
}
```

In `tracedir`, each assembled tag is appended to the list:

```c
snprintf(buf, SEND_BUFFER_SIZE,
         "<tr><td><a href=\"%s\">%s</a></td></tr>\r\n", des, name);
struct tag_content *content = kmalloc(sizeof(struct tag_content), GFP_KERNEL);
INIT_LIST_HEAD(&content->tag_list);
/* strscpy always NUL-terminates; kmalloc does not zero the buffer */
strscpy(content->url, buf, sizeof(content->url));
list_add_tail(&content->tag_list, request->tag_list);
```

Once `tracedir` has walked the whole directory, `request->tag_list` holds a complete list. Each node has the following layout:

```c
struct tag_content {
    struct list_head tag_list;  // links the node into the list
    char url[SEND_BUFFER_SIZE]; // "<tr><td><a href=\"%s\">%s</a></td></tr>\r\n", des, name);
};
```

The head of this list is then stored in the hash table:

```c
hash_insert(request->request_url, head);
```

If another client visits the same page later, the entry is already in the hash table, so the stored head can be used directly:

```c
list_for_each_entry (now_content, head, tag_list) {
    http_server_send(request->socket, now_content->url,
                     strlen(now_content->url));
}
```

to send the response, which saves the cost of walking the directory again.

Link to the complete change: [commit 05c3622](https://github.com/fatcatorange/khttpd/commit/dada9ea2862f09f03fc3e0874298526f629f5e21)

Finally, the function that clears the hash table:

```c
void hash_clear(void)
{
    struct hash_content *entry = NULL;
    struct hlist_node *tmp = NULL;
    struct tag_content *now;
    struct tag_content *tag_temp;
    unsigned int bucket;

    hash_for_each_safe(ht, bucket, tmp, entry, node)
    {
        /* Free every cached tag that belongs to this directory */
        list_for_each_entry_safe (now, tag_temp, entry->head, tag_list) {
            list_del(&now->tag_list);
            kfree(now);
        }
        hash_del(&entry->node);
        kfree(entry);
    }
}
```

`hash_for_each_safe` walks the entire hash table. Each `hash_content` stores the head of the list of HTML tag strings, so `list_for_each_entry_safe` is used to free every node. Race conditions are not a concern here, because this function only runs when the module is unloaded.

### Checking whether performance improved

The experiment results are as follows:

```shell
./htstress http://localhost:8081 -t 3 -c 20 -n 200000
0 requests
20000 requests
40000 requests
60000 requests
80000 requests
100000 requests
120000 requests
140000 requests
160000 requests
180000 requests

requests:      200000
good requests: 200000 [100%]
bad requests:  0 [0%]
socket errors: 0 [0%]
seconds:       5.873
requests/sec:  34051.496
```

Compared with the figure before the hash-based cache was introduced:

```
requests/sec: 14004.417
```

throughput more than doubled (about 2.4 times), a substantial improvement in server performance.

## Adding a timer to close timed-out connections

## Cserv

### hook

Sometimes a system call needs to be adjusted slightly. cserv does it like this:

```c
ssize_t read(int fd, void *buf, size_t count)
{
    ssize_t n;
    while ((n = real_sys_read(fd, buf, count)) < 0) {
        if (EINTR == errno)
            continue;
        if (!fd_not_ready())
            return -1;
        if (add_fd_event(fd, EVENT_READABLE, event_rw_callback, current_coro()))
            return -2;
        schedule_timeout(READ_TIMEOUT);
        del_fd_event(fd, EVENT_READABLE);
        if (is_wakeup_by_timeout()) {
            errno = ETIME;
            return -3;
        }
    }
    return n;
}
```

This `fd` has already been set to nonblocking, so the call only checks whether there is anything to read; if there is not, it will
