K09: sehttpd

tags: `linux2022`

Instructor: jserv / Course forum: 2022 System Software Course

Back to the "Linux Kernel Design" course schedule

Lecture recording

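Objectives

Study "Linux Kernel Design: The Evolution of Event-Driven I/O Models" and the corresponding Linux system calls;
Build an efficient event-driven web server, and in the process gain a deep understanding of how Linux system calls, processes, and threads work;
Learn to perform OS-level dynamic analysis with eBPF, and develop analysis tools tailored to the web server;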

seHTTPd

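seHTTPd is an efficient web server that covers concurrent processing, I/O event models, epoll, the Reactor pattern, and the design considerations of a web server in an event-driven architecture. See "Efficient Web Server Development" for details.

Packages to install beforehand (eBPF is used for the analysis later):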

$ sudo apt install wget
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4052245BD4284CDD
$ echo "deb https://repo.iovisor.org/apt/$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
$ sudo apt-get update
$ sudo apt-get install bcc-tools linux-headers-$(uname -r)
$ sudo apt install apache2-utils

Get the source code and build it:

$ git clone https://github.com/sysprog21/sehttpd
$ cd sehttpd
$ make

You should now see an executable named sehttpd. Next, run the built-in test suite:

$ make check

Stress testing seHTTPd

First, we can use the "classic" approach and stress-test seHTTPd with ApacheBench (ab). Run the following command in one terminal window:

$ ./sehttpd

Switch to a web browser and open http://127.0.0.1:8081/. The browser should display the following output:

Welcome!
If you see this page, the seHTTPd web server is successfully working.

Then run the following command in another terminal window:

$ ab -n 10000 -c 500 -k http://127.0.0.1:8081/

Reference output (it is normal for your numbers to differ significantly):

Server Software:        seHTTPd
Server Hostname:        127.0.0.1
Server Port:            8081

Document Path:          /
Document Length:        241 bytes

Concurrency Level:      500
Time taken for tests:   0.927 seconds
Complete requests:      10000
Failed requests:        0
Keep-Alive requests:    10000
Total transferred:      4180000 bytes
HTML transferred:       2410000 bytes
Requests per second:    10784.81 [#/sec] (mean)
Time per request:       46.361 [ms] (mean)
Time per request:       0.093 [ms] (mean, across all concurrent requests)
Transfer rate:          4402.39 [Kbytes/sec] received

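Note the following options:

-k: "Enable the HTTP KeepAlive feature", i.e., perform multiple requests within one HTTP session
-c: concurrency, i.e., the number of requests to perform at a time
-n: the total number of requests to perform during the stress test

For the meaning of each output field, read the documentation of ab - Apache HTTP server benchmarking tool.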
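The relationship between these options and the summary lines can be checked by hand. The sketch below recomputes three of the reported figures from -n, -c, and the elapsed time, using the formulas given in the ab documentation. The struct and function names are illustrative and not part of ab.

```c
#include <assert.h>

/* Derived figures from an ab run; field names are illustrative only. */
struct ab_summary {
    double requests_per_sec;   /* completed requests / elapsed seconds */
    double ms_per_request;     /* mean, as seen by one concurrent client */
    double ms_per_request_all; /* mean, across all concurrent requests */
};

/* requests = -n, concurrency = -c, seconds = "Time taken for tests". */
struct ab_summary ab_summarize(unsigned requests,
                               unsigned concurrency,
                               double seconds)
{
    assert(requests > 0 && concurrency > 0 && seconds > 0.0);

    struct ab_summary s;
    s.requests_per_sec = requests / seconds;
    s.ms_per_request_all = seconds * 1000.0 / requests;
    /* Each of the `concurrency` clients waits this long per request. */
    s.ms_per_request = s.ms_per_request_all * concurrency;
    return s;
}
```

For the sample run above (10000 requests, concurrency 500, 0.927 s), this gives roughly 10787 requests per second, 46.35 ms per request, and 0.0927 ms across all concurrent requests. The small gap from the printed 10784.81 is presumably because ab computes from the unrounded elapsed time.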

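Note that ab cannot effectively reflect multi-threaded behavior (ab itself consumes 100% of a single core), which is why khttpd ships htstress.c. The latter provides a -t option that distributes the load according to the number of CPUs available in the test environment.

By the standards of today's GNU/Linux or FreeBSD, the implementation of ab - Apache HTTP server benchmarking tool is dated and does not reflect system characteristics. Besides htstress, you can also use wrk, which describes itself as follows: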

wrk is a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue.

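Another tool for load-testing HTTP servers is httperf.

Exception handling

If you leave seHTTPd running instead of shutting it down right away, then after a longer wait and repeated ab runs as above (varying the values given to -n and -c), you may run into errors such as the following (partial list):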

Segmentation fault;
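The message [ERROR] (src/http.c:253: errno: Resource temporarily unavailable) rc != 0

Run $ grep -r log_err to search the source code and learn about the existing exception handling (note: it contains several defects, so do not copy it uncritically).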
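"Resource temporarily unavailable" is the strerror text for EAGAIN. On a non-blocking descriptor this usually means "nothing to do right now" rather than a real failure, so an event-driven server typically treats it separately from other errors. The following is a minimal sketch of that classification, not seHTTPd's actual code; the names io_status, read_once, and set_nonblocking are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum io_status { IO_DATA, IO_WOULD_BLOCK, IO_EOF, IO_ERROR };

/* Switch a descriptor to non-blocking mode; returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read once from a non-blocking descriptor and classify the outcome. */
enum io_status read_once(int fd, char *buf, size_t cap, size_t *got)
{
    assert(buf && got && cap > 0);
    *got = 0;
    for (;;) {
        ssize_t n = read(fd, buf, cap);
        if (n > 0) {
            *got = (size_t) n;
            return IO_DATA;
        }
        if (n == 0)
            return IO_EOF; /* peer closed the connection */
        if (errno == EINTR)
            continue; /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return IO_WOULD_BLOCK; /* not an error: wait for next event */
        return IO_ERROR; /* a genuine failure worth logging */
    }
}
```

A caller would go back to the event loop on IO_WOULD_BLOCK, close the connection on IO_EOF, and log only on IO_ERROR.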

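Tracing HTTP packets with eBPF

First, read "Linux Kernel Design: Observing Operating System Behavior through eBPF" to understand the kernel's dynamic tracing mechanism. It is non-intrusive: without modifying the kernel internals, the application, the business logic, or any system configuration, we can quickly and precisely obtain the information we want.

The ebpf directory of the seHTTPd source tree provides a simple HTTP packet analysis tool. It is built on eBPF and runs with the tools provided by IO Visor.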

Conceptual diagram: (image not available)

Usage (first run $ ./sehttpd in another terminal window):

$ cd ebpf
$ sudo python http-parse-sample.py

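Note: by default this tool monitors the eth0 network interface. If your default network interface is not eth0, decide which interface to monitor based on the output of the ip tool. For example, run the following command on a GNU/Linux machine: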

$ ip link

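You will see several entries. If Docker is running in your environment, the list can be long, but don't worry: exclude the entries beginning with lo, tun, virbr, docker, br-, and veth, and what remains is a network interface such as enp5s0 (the name depends on your network hardware). The command above then becomes: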

$ sudo python http-parse-sample.py -i enp5s0

Then open a web browser and visit and refresh http://127.0.0.1:8081/ several times. You should see output similar to the following in the terminal running the Python program:

TCP src port '51670' TCP dst port '8081'     
¢GET / HTTP/1.1
IP hdr length '20'
IP src '192.168.50.97' IP dst '61.70.212.51'
TCP src port '8081' TCP dst port '51670'
ÌHTTP/1.1 304 Not Modified

For an overview of how the program above works, see Appendix C.
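To make the "IP hdr length" and port lines in the output more concrete, here is a user-space sketch of the header arithmetic a packet filter of this kind has to perform before it can look at the HTTP text. It assumes an Ethernet frame carrying IPv4 and TCP, and it is not the code of http-parse-sample.py or its eBPF program; the function names and the short list of recognized prefixes are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_HLEN 14 /* destination MAC + source MAC + EtherType */

/* Offset of the TCP payload inside an Ethernet/IPv4/TCP frame, or -1. */
int tcp_payload_offset(const uint8_t *frame, size_t len)
{
    assert(frame);
    if (len < DEMO_ETH_HLEN + 20)
        return -1;
    if (frame[12] != 0x08 || frame[13] != 0x00) /* EtherType: IPv4 */
        return -1;

    const uint8_t *ip = frame + DEMO_ETH_HLEN;
    if ((ip[0] >> 4) != 4) /* IP version */
        return -1;
    size_t ip_hlen = (size_t) (ip[0] & 0x0f) * 4; /* IHL, 32-bit words */
    if (ip_hlen < 20 || ip[9] != 6) /* protocol 6 = TCP */
        return -1;
    if (len < DEMO_ETH_HLEN + ip_hlen + 20)
        return -1;

    const uint8_t *tcp = ip + ip_hlen;
    size_t tcp_hlen = (size_t) (tcp[12] >> 4) * 4; /* data offset */
    if (tcp_hlen < 20)
        return -1;

    size_t off = DEMO_ETH_HLEN + ip_hlen + tcp_hlen;
    return off <= len ? (int) off : -1;
}

/* 1 if the payload starts like an HTTP request or response, else 0. */
int looks_like_http(const uint8_t *frame, size_t len)
{
    int off = tcp_payload_offset(frame, len);
    if (off < 0 || len - (size_t) off < 4)
        return 0;
    const uint8_t *p = frame + off;
    return memcmp(p, "GET ", 4) == 0 || memcmp(p, "POST", 4) == 0 ||
           memcmp(p, "HTTP", 4) == 0;
}
```

An in-kernel eBPF filter performs the same kind of offset computation on the packet buffer, with the added constraint that every access must pass the verifier's bounds checks.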

For tracing the fibdrv kernel module with eBPF, see 0xff07's notes

Corresponding code

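Thread pool and concurrency issues

seHTTPd currently relies on non-blocking I/O and an I/O multiplexer, so it still performs efficiently with a single thread. To exploit the hardware more fully, however, we must go multi-threaded and overcome the concurrency issues that come with it. Read "Concurrency and Multi-threaded Programming" beforehand.

Introducing a thread pool demands great care; otherwise it easily backfires: more threads are allocated, yet the server ends up slower than the single-threaded implementation. Study An Introduction to Lock-Free Programming thoroughly to acquire the background in lockless/lock-free programming.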
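As a small taste of the lock-free style, the sketch below lets workers claim job indices from a shared counter with a compare-and-swap retry loop instead of a mutex. It uses C11 atomics and is only an illustration: job_board and its functions are hypothetical names, and a real thread pool also needs a way to add jobs and to put idle workers to sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct job_board {
    atomic_size_t next; /* index of the next unclaimed job */
    size_t count;       /* total number of jobs, fixed after init */
};

void job_board_init(struct job_board *b, size_t count)
{
    assert(b);
    atomic_init(&b->next, 0);
    b->count = count;
}

/* Claim one job index; returns false once every job has been taken.
 * Intended to be called from several threads at once: for each index,
 * the compare-and-swap succeeds for exactly one caller.
 */
bool job_board_claim(struct job_board *b, size_t *index)
{
    assert(b && index);
    size_t cur = atomic_load_explicit(&b->next, memory_order_relaxed);
    while (cur < b->count) {
        /* On failure, cur is refreshed with the value currently stored,
         * so the loop retries against up-to-date state.
         */
        if (atomic_compare_exchange_weak_explicit(
                &b->next, &cur, cur + 1, memory_order_acq_rel,
                memory_order_relaxed)) {
            *index = cur;
            return true;
        }
    }
    return false;
}
```

Each worker thread would loop on job_board_claim() and process the job at the returned index. No thread ever blocks on another, which is the property the lock-free literature is concerned with.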

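Multi-threaded programs are hard to debug, so make good use of ThreadSanitizer, which is integrated into gcc and clang. Edit the Makefile and add the following after the LDFLAGS = line: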

# ThreadSanitizer
CFLAGS += -fsanitize=thread
LDFLAGS += -fsanitize=thread

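Then $ make clean all builds a seHTTPd executable with ThreadSanitizer support.

Rewriting switch-case with computed goto

The article Computed goto for efficient dispatch tables explains that a switch still has to check the value it dispatches on (a bounds check), so computed goto tends to execute faster than switch-case. Giving each handler its own indirect jump can also reduce the branch misses that a single shared switch dispatch incurs.

Using Labels as Values, a GNU extension, add conditions to http_parser.c:http_parse_request_line as a table of label addresses, so that state can index the table and goto can jump directly to the corresponding code.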

const void *conditions[] = {&&c_start,
                            &&c_method,
                            &&c_spaces_before_uri,
                            &&c_after_slash_in_uri,
                            &&c_http,
                            &&c_http_H,
                            &&c_http_HT,
                            &&c_http_HTT,
                            &&c_http_HTTP,
                            &&c_first_major_digit,
                            &&c_major_digit,
                            &&c_first_minor_digit,
                            &&c_minor_digit,
                            &&c_spaces_after_digit,
                            &&c_almost_done};

Then replace the switch with a goto:

goto *conditions[state];
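For readers who have not used Labels as Values before, here is a self-contained toy parser built the same way: an enum of states, a table of label addresses in the same order, and a dispatch step that reads one character and jumps. It requires gcc or clang. The function parse_method, its three states, and the DISPATCH macro are made up for this illustration and are much simpler than http_parse_request_line.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the method length if line looks like "METHOD /...", else -1. */
int parse_method(const char *line)
{
    assert(line);
    enum { s_start, s_method, s_space } state = s_start;
    /* The table order must match the enum order above. */
    static void *const conditions[] = {&&c_start, &&c_method, &&c_space};
    size_t i = 0;
    int len = 0;
    char ch;

/* Fetch the next character, then jump to the handler of the state. */
#define DISPATCH()               \
    do {                         \
        ch = line[i++];          \
        goto *conditions[state]; \
    } while (0)

    DISPATCH();

c_start:
    if (ch < 'A' || ch > 'Z')
        return -1;
    len = 1;
    state = s_method;
    DISPATCH();

c_method:
    if (ch == ' ') {
        state = s_space;
        DISPATCH();
    }
    if (ch < 'A' || ch > 'Z') /* also rejects the terminating NUL */
        return -1;
    len++;
    DISPATCH();

c_space:
    return ch == '/' ? len : -1;

#undef DISPATCH
}
```

Every handler ends either in a return or in its own DISPATCH(), so there is no central switch that all iterations pass through.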

Next, use perf to observe the branch prediction behavior after the computed goto rewrite:

$ sudo perf record -e branch-misses:u,branch-instructions:u ./sehttpd

$ ./htstress -n 100000 -c 1 -t 4 http://localhost:8081/

requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socker errors: 0 [0%]
seconds:       3.839
requests/sec:  26047.527

$ sudo perf report
Available samples
12K branch-misses:u
14K branch-instructions:u

The switch-case version:

$ sudo perf record -e branch-misses:u,branch-instructions:u ./sehttpd

$ ./htstress -n 100000 -c 1 -t 4 http://localhost:8081/

requests:      100000
good requests: 100000 [100%]
bad requests:  0 [0%]
socker errors: 0 [0%]
seconds:       4.000
requests/sec:  24997.963

$ sudo perf report
Available samples
13K branch-misses:u
14K branch-instructions:u

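The computed goto version records fewer branch-miss samples than the switch-case version (12K vs. 13K) and finishes the same workload sooner (3.839 s vs. 4.000 s), which suggests that the reduced branch misses account for the improvement.

Because the names used in the declarations of state and conditions overlap heavily, the declarations are then simplified with X macros: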

#define state_code(X)                                                     \
    X(_start), X(_method), X(_spaces_before_uri), X(_after_slash_in_uri), \
        X(_http), X(_http_H), X(_http_HT), X(_http_HTT), X(_http_HTTP),   \
        X(_first_major_digit), X(_major_digit), X(_first_minor_digit),    \
        X(_minor_digit), X(_spaces_after_digit), X(_almost_done)
        
#define define_enum(name, code) enum { code(enum_entry) } name
#define enum_entry(entry) s##entry

#define define_label_array(name, code) void *name[] = {code(label_entry)}
#define label_entry(entry) &&c##entry

These macros combine the enumerators with the labels: a macro is passed as the argument to state_code, and the expansion yields the same declarations as above.
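The expansion is easier to see on a shortened list. The sketch below applies the same X macro idea to three states and generates an enum plus a table of state names; the name table stands in for the label table because labels only exist inside a function. demo_states, name_entry, and demo_state_name are illustrative names that do not appear in seHTTPd.

```c
#include <assert.h>
#include <string.h>

/* Shortened, illustrative state list in the style of state_code. */
#define demo_states(X) X(_start), X(_method), X(_almost_done)

#define enum_entry(entry) s##entry /* _start -> s_start */
#define name_entry(entry) #entry   /* _start -> "_start" */

/* Expands to: enum demo_state { s_start, s_method, s_almost_done, ... } */
enum demo_state { demo_states(enum_entry), demo_state_count };

/* Expands to: { "_start", "_method", "_almost_done" } */
static const char *const demo_state_names[] = {demo_states(name_entry)};

/* Look up the printable name of a state, e.g. for debug logging. */
const char *demo_state_name(enum demo_state s)
{
    assert(s < demo_state_count);
    return demo_state_names[s];
}
```

Because both tables come from one list, adding a state in demo_states keeps the enum and the table in step automatically, which is the point of the technique.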

Design a dispatch() macro to perform the jump in each case.

#define dispatch(i)                            \
    do {                                       \
        if (i >= r->last)                      \
            interrupt_parse();                 \
        p = (uint8_t *) &r->buf[pi % MAX_BUF]; \
        ch = *p;                               \
        goto *conditions[state];               \
    } while (0)

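Because dispatch also takes over the job of the for loop, it must advance the string iteration before dispatching through computed goto.

However, pi is not incremented on entry to the for loop. To keep the behavior of dispatch consistent with the original for loop both on entry and on later iterations, pi is passed in as the macro argument, so the two cases can be handled differently.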

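Self-check list

"Efficient Web Server Development" describes the two working modes of epoll (level-triggered vs. edge-triggered). With reference to the seHTTPd source code, explain how the epoll working mode is configured and what to consider when implementing a web server
→ explain with the help of code experiments

Hint: see the experiment code test_epoll_lt_and_et
Why does seHTTPd have a timer internally? What are the considerations and its concrete role? Why does the timer use a priority queue? Can you find similar usage in the Linux kernel source?
What are the benefits of a lock-free/lockless thread pool? In highly concurrent applications (such as web servers), how can a thread pool be put to good use?
Explain how the aforementioned http-parse-sample.py works, and what the advantages are of analyzing packets with an eBPF program inside the Linux kernel.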
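As a starting point for the epoll question, the following sketch shows the observable difference between the two modes on a pipe: one byte is written and never read, then epoll_wait() is polled several times. It is a separate, minimal experiment and not the test_epoll_lt_and_et code; the function name count_ready_reports is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Count how many of `rounds` consecutive epoll_wait() calls report the
 * pipe as readable while the pending byte is never consumed.
 * extra_flags is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns -1 if the setup fails.
 */
int count_ready_reports(uint32_t extra_flags, int rounds)
{
    assert(rounds > 0);
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    int reports = -1;
    int epfd = epoll_create1(0);
    if (epfd >= 0) {
        struct epoll_event ev = {.events = EPOLLIN | extra_flags,
                                 .data.fd = fds[0]};
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
            write(fds[1], "x", 1) == 1) {
            reports = 0;
            for (int i = 0; i < rounds; i++) {
                struct epoll_event out;
                /* Timeout 0: return immediately instead of blocking. */
                if (epoll_wait(epfd, &out, 1, 0) > 0)
                    reports++;
            }
        }
        close(epfd);
    }
    close(fds[0]);
    close(fds[1]);
    return reports;
}
```

With level-triggered mode, every round reports the descriptor because data is still pending. With EPOLLET, only the first round does, which is why edge-triggered handlers normally read until EAGAIN before returning to the event loop.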

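Assignment requirements

Answer every question in the "Self-check list" above, attaching the corresponding references and necessary code. First-hand material (including experiments you design yourself) is preferred.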

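If you forked GitHub sysprog21/sehttpd before April 19, 2022, follow the article Alternatives to forking into the same account to deal with the old repository, then fork again

Fork sehttpd on GitHub. The goal is to fix the runtime defects of seHTTPd and improve its performance and robustness. The following must be carried out thoroughly:
- Modify htstress.c and put it into the git repository. Borrow some features from wrk and httperf to strengthen the testing of the seHTTPd server (besides performance, cover correctness tests too, including file access), especially regarding concurrency and multi-threading;
- Strengthen the exception handling in seHTTPd so that it can run continuously for long periods; fix defects in program logic, memory management, and I/O event model handling;
- With reference to the eBPF example above, implement tools dedicated to dynamically analyzing Linux kernel events, system calls, the time spent in key operations, and so on while seHTTPd runs;
- Introduce a thread pool into seHTTPd and examine its effect on web server performance. Quantify each factor thoroughly, especially the implementation of non-blocking I/O combined with event-driven programming in a multi-threaded environment;
- Provide basic directory listing: WWWROOT can be specified, as in httpd, for example
- Rewrite the file transfer code in src/http.c (the serve_static function) with the sendfile system call, and consider the cost of file system I/O in a highly concurrent environment and the room for improvement;
- Try rewriting seHTTPd with io_uring and compare its efficiency with the original epoll-based version
- Use $ grep -r TODO * to find the to-do items in seHTTPd, list them, and explain concrete ways to carry them out, e.g., items involving dynamic memory management or the handling of long-idle connections.
  
  Using mimalloc is an improvement direction worth noting; see the notes "mimalloc implementation and improvements" and "mimalloc experiments"
- Review the use of Valgrind and Massif (built on top of it) from lab0 to analyze the memory overhead of seHTTPd; design experiments that keep seHTTPd running for long periods under various workloads.
- Install the Nginx server in your development environment and compare the performance of your enhanced seHTTPd against Nginx; try to explain the room for improvement on Linux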
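For the sendfile item in the list above, a minimal sketch of the call pattern may help. It is Linux-specific, sends a whole regular file to an already open descriptor, and is not a drop-in replacement for serve_static: the function name send_file_to is hypothetical, and a non-blocking socket would additionally need to remember the offset and resume when EAGAIN is returned.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Stream the regular file at `path` to out_fd without copying it through
 * a user-space buffer. Returns the number of bytes sent, or -1 on error.
 */
long send_file_to(int out_fd, const char *path)
{
    assert(path);
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0 || !S_ISREG(st.st_mode)) {
        close(in_fd);
        return -1;
    }

    off_t offset = 0; /* advanced by the kernel on each successful call */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t) (st.st_size - offset));
        if (n > 0)
            continue;
        if (n < 0 && errno == EINTR)
            continue; /* interrupted by a signal: retry */
        break; /* error (including EAGAIN) or unexpected end of file */
    }

    close(in_fd);
    return offset == st.st_size ? (long) offset : -1;
}
```

Compared with a read()/write() loop, the data stays in the kernel, which removes one copy and one pair of system calls per chunk. How much that matters under high concurrency is one of the things the assignment asks you to measure.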

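Submission

Edit the shared notes in the Homework6 assignment area, and record your observations, the explanations required above, a discussion of use cases, and each step of the performance improvements in a newly created note, making good use of gnuplot for plots

Deadline

On or before May 10, 2022

The earlier you show activity on GitHub and the earlier you undergo code review, the higher the score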
