2019q1 Homework5 (daemon)

# 2019q1 Homework5 (daemon)

contributed by < `jeffcarl67` >

## Environment

* 4.19.36-1-MANJARO
* gcc (GCC) 8.3.0

## Shortcomings

`fastecho` reads extra characters along with the input. For example, when I type `a` on my machine, `dmesg` shows:

```
[ 6526.225454] fastecho: get request = a \xfd\xf3O\xd9▒\xe0\xd1&%tU
```

The string `\xfd\xf3O\xd9▒\xe0\xd1&%tU` was not typed by the user. Now look at the output statement:

```c=
printk(MODULE_NAME ": send request = %s\n", buf);
```

The `%s` specifier makes `printk` keep printing characters until it reaches a `'\0'`, so my guess is that the extra characters are stale data left in the memory obtained by

```c=
buf = kmalloc(BUF_SIZE, GFP_KERNEL);
```

For a fix, the documentation of [kernel_recvmsg](https://www.kernel.org/doc/htmldocs/networking/API-kernel-recvmsg.html) says:

> The returned value is the total number of bytes received, or an error.

Since `kernel_recvmsg` returns the number of bytes read when no error occurs, I added the following statement:

```c=
if (length > 0 && length < size)
    buf[length] = '\0';
```

Appending `'\0'` after each string that is read stops `printk` from printing extra characters. With this change, however, the characters echoed back by `fastecho` get overwritten by the next input. My guess is that this happens because the echoed data does not contain a newline `'\n'`. Appending a `'\n'` to the end of the buffer before calling `send_request` fixes the overwritten output, as follows:

```c=
int length;
...
while (!kthread_should_stop()) {
    ...
    length = strlen(buf);
    if (length < BUF_SIZE - 1) {
        buf[length] = '\n';
        buf[length + 1] = '\0';
    }
    res = send_request(sock, buf, strlen(buf));
    ...
```

## Testing

### Message echo

I used a single-threaded [python script](https://github.com/jeffcarl67/kecho/blob/master/echo_test.py) to measure how long `fastecho` takes for different numbers of requests when there is only one client. The results are as follows:

![](https://i.imgur.com/3jTnixP.png)

`fastecho` behaves very consistently: the elapsed time is positively correlated with the number of requests, and `dmesg` shows that `fastecho` behaves roughly as expected.

### Client connections

This experiment measures how well `fastecho` handles connections when there are many clients. Each client sends one message, reads one message back, and then closes the connection. The tool is a modified version of [htstress](https://github.com/arut/htstress); the modified version is here: [htstress.c](https://github.com/jeffcarl67/kecho/blob/master/htstress.c)

`htstress` was originally designed to benchmark HTTP servers, which usually close the connection once they have finished sending data, whereas `fastecho` never closes the connection on its own. With the original `htstress`, both `htstress` and `fastecho` therefore block after the first exchange, each waiting for the other to send data. I modified `htstress` so that it closes the connection after completing one echo. Oddly, with the original version, `kernel_sendmsg` returns 0 after the first echo completes, which causes the kernel thread to exit; I have not yet found the cause.

The command-line options of `htstress` are:

```
Usage: htstress [options] [http://]hostname[:port]/path
Options:
   -n, --number       total number of requests (0 for inifinite, Ctrl-C to abort)
   -c, --concurrency  number of concurrent connections
   -t, --threads      number of threads (set this to the number of CPU cores)
   -d, --debug        debug HTTP response
   --help             display this message
```

Running `./htstress -n 10000 -c 1 -t 1 localhost:12345` gives:

```
requests:      10000
good requests: 10000 [100%]
bad requests:  0 [0%]
seconds:       0.721
requests/sec:  13870.280
```

Running `./htstress -n 10000 -c 1 -t 4 localhost:12345` gives:

```
requests:      10000
good requests: 10000 [100%]
bad requests:  0 [0%]
seconds:       1.675
requests/sec:  5971.672
```

With multiple threads, where several clients connect at the same time, throughput actually drops (from 13870.280 to 5971.672 requests/sec), which suggests that `fastecho` still does not handle the multithreaded case well.

## workqueue

Following [kernelhttp](https://github.com/oiz5201618/kernelhttp), I added a `workqueue` in a similar way and then ran `./htstress -n 10000 -c 10 -t 4 localhost:12345` to see whether performance changes with the `workqueue`. The result:

```
requests:      10000
good requests: 10000 [100%]
bad requests:  0 [0%]
seconds:       0.470
requests/sec:  21296.896
```

Performance improves significantly after switching to the `workqueue`; my guess is that the `workqueue` makes better use of multiple cores.

## kernelhttp

I modified `server_npy.c`, changing the original function `int myserver(void)` into `int http_server_daemon(void *arg)` so that it runs as a kernel thread once `fastecho` is loaded. Its functionality is unchanged. The modified version is at [server_npy.c](https://github.com/jeffcarl67/kecho/blob/master/server_npy.c)

I used [ApacheBench](https://httpd.apache.org/docs/2.4/programs/ab.html) to measure the performance of `kernelhttp` inside `fastecho`. Running `ab -n 10000 -c 10 http://localhost:8888/` gives:

```
Server Software:
Server Hostname:        localhost
Server Port:            8888

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      10
Time taken for tests:   0.509 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      980000 bytes
HTML transferred:       120000 bytes
Requests per second:    19662.20 [#/sec] (mean)
Time per request:       0.509 [ms] (mean)
Time per request:       0.051 [ms] (mean, across all concurrent requests)
Transfer rate:          1881.73 [Kbytes/sec] received
```