sched-ext Research
- userspace tools and scheduler implementation : scx
- kernel branch : sched_ext
Building & Setup
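If your OS is Ubuntu 22.04, you can first try installing by following the steps in sched_ext/.github/workflows/test-kernel.yml. The host machine I used for this experiment runs Ubuntu 20.04 focal, where many packages conflict or are too old, so the steps below rely on a number of workarounds.
First, get the source code:
Next, build with the `make` command and explicit arguments. Do not use `vng --build` directly, because the `.config` file it generates does not match what we need.
Generate the default `.config` file:
Then edit the required options:
For the options to enable, see the configuration in Linux 核心設計: Scheduler(7): sched_ext. Note in particular that the BTF options must be enabled, otherwise the later bpftool build will fail.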
You can inspect these options with the following command. `CONFIG_DEBUG_INFO_BTF=y` is required; the others can be chosen as needed.
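The check described above can be sketched as a small shell function. This is a minimal sketch under stated assumptions: the function name `check_kernel_config` is hypothetical, and apart from `CONFIG_DEBUG_INFO_BTF` the option list is an assumption taken from the upstream sched_ext documentation rather than from this note.

```shell
#!/bin/sh
# Report which of the listed options are set to "y" in a kernel .config.
# Only CONFIG_DEBUG_INFO_BTF is named in this note; the rest of the list is
# an assumption based on the upstream sched_ext documentation.
check_kernel_config() {
    local config_file="${1:-.config}"
    local missing=0
    local opt
    for opt in CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_BPF_JIT \
               CONFIG_DEBUG_INFO_BTF CONFIG_SCHED_CLASS_EXT; do
        # -x matches the whole line, so "# CONFIG_X is not set" never matches.
        if grep -qx "${opt}=y" "$config_file"; then
            echo "ok      ${opt}"
        else
            echo "missing ${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

From the kernel source directory it could be called as `check_kernel_config .config`; a non-zero exit status means at least one option is not enabled.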
If everything has gone well so far, you can build the kernel:
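The configuration and build steps up to this point can be sketched as follows. This is a sketch, not necessarily the exact commands used in this setup: `run` and `configure_and_build_kernel` are hypothetical helper names, and every option other than `DEBUG_INFO_BTF` is an assumption (the DWARF choice is included because BTF depends on debug info being enabled). By default the function only prints the commands; set `DRY_RUN=0` and call it from the top of the kernel source tree to execute them.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN=0 (default: print only).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Must be called from the top of the kernel source tree.
configure_and_build_kernel() {
    local njobs
    njobs=$(nproc 2>/dev/null || echo 1)
    # 1. Start from the architecture's default configuration.
    run make defconfig &&
    # 2. Enable BPF, BTF and sched_ext; BTF needs DWARF debug info.
    run scripts/config --disable DEBUG_INFO_NONE \
        --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT \
        --enable BPF --enable BPF_SYSCALL --enable BPF_JIT \
        --enable DEBUG_INFO_BTF --enable SCHED_CLASS_EXT &&
    # 3. Let Kconfig resolve dependencies of the edited options.
    run make olddefconfig &&
    # 4. Build with one job per CPU.
    run make -j"$njobs"
}
```

Because `make olddefconfig` silently drops options whose dependencies are not met, it is worth re-checking `.config` for `CONFIG_DEBUG_INFO_BTF=y` before starting the long build.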
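After the kernel build completes, we build the userspace tools, that is, the scx project. Again, first get the project source (do not put it under the `/sched_ext` directory; it is better to go back to your home directory and run `git clone` there).
You should first consult scx : Build & Install to learn which tools need to be installed beforehand. Note that because my OS is Ubuntu 20.04 focal, the packaged bpftool is too old and causes build problems, so I first went back to the `sched_ext` directory and built a newer bpftool.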
Then run the following commands:
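The bpftool and scx build steps can be sketched like this. It is only a sketch under assumptions: `run` and `build_bpftool_and_scx` are hypothetical names, the two directory arguments are placeholders for wherever the kernel tree and the scx checkout live, the `--prefix "$HOME"` install location is an assumption, and it assumes the scx build picks up `bpftool` from `PATH`. Consult scx : Build & Install for the authoritative steps. As written, the function only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN=0 (default: print only).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Usage: build_bpftool_and_scx <kernel source dir> <scx source dir>
# The body runs in a subshell "( ... )" so the PATH change stays local.
build_bpftool_and_scx() (
    kernel_dir="$1"
    scx_dir="$2"
    # Build a recent bpftool from the kernel tree instead of the distro one.
    run make -C "$kernel_dir/tools/bpf/bpftool" || exit 1
    # Put the freshly built bpftool ahead of any older one on PATH.
    PATH="$kernel_dir/tools/bpf/bpftool:$PATH"
    export PATH
    # Standard meson workflow: configure, compile, install.
    run meson setup "$scx_dir/build" "$scx_dir" --prefix "$HOME" || exit 1
    run meson compile -C "$scx_dir/build" || exit 1
    run meson install -C "$scx_dir/build"
)
```

Running the body in a subshell keeps the modified `PATH` from leaking into the interactive shell, so the older system bpftool stays the default everywhere else.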
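If everything has succeeded so far, you can add a shell helper function as follows:
Add the following snippet:
Once that is done, try booting KVM with the kernel you built:
Inside the virtual machine, run the following test:
You should see a result similar to the following: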
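This means scx is running successfully!
You can also refer to Righi's blog post Getting started with sched-ext development.
Setting up a sched_ext-enabled kernel on the host machine
Reproducing the experiment requires running benchmarks, which is not suitable inside a virtual environment. I therefore set up a physical host machine based on Ubuntu 24.04 with an NVIDIA GeForce GTX 1080 graphics card, and followed scx/INSTALL.md/ubuntu to make the kernel support scx.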
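After booting, you can use the following command to check whether the `scx` service is running. By default, `scx_rustland` is the scheduler that runs first.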
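Besides the service status, the kernel itself reports whether a sched_ext scheduler is loaded. The sketch below reads that state from sysfs; the function name `scx_current` is hypothetical, and the paths `/sys/kernel/sched_ext/state` and `/sys/kernel/sched_ext/root/ops` are taken from the upstream sched_ext documentation, not from this note, so they are an assumption about the kernel in use.

```shell
#!/bin/sh
# Print the sched_ext state and the name of the loaded BPF scheduler, if any.
# The optional argument overrides the sysfs directory (useful for testing).
scx_current() {
    local base="${1:-/sys/kernel/sched_ext}"
    local state
    local ops="none"
    if [ ! -r "$base/state" ]; then
        echo "sched_ext not available"
        return 1
    fi
    state=$(cat "$base/state")
    # root/ops only exists while a scheduler is attached.
    if [ -r "$base/root/ops" ]; then
        ops=$(cat "$base/root/ops")
    fi
    echo "state=$state scheduler=$ops"
}
```

Calling `scx_current` with no argument on a sched_ext kernel prints one line with the state and the attached scheduler; on a kernel without sched_ext it returns a non-zero status.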
You can change which scheduler is used by editing `/etc/default/scx`. For example, to use `scx_lavd`, modify `/etc/default/scx` as follows:
Then restart the `scx` service and check its status with the following commands:
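The edit-and-restart cycle can be sketched as a helper. This is a sketch under assumptions: `set_scx_scheduler` is a hypothetical name, and it assumes the scheduler is selected by a `SCX_SCHEDULER=` line in `/etc/default/scx`, as in the scx service file; check the file on your system before relying on it.

```shell
#!/bin/sh
# Usage: set_scx_scheduler <scheduler name> [config file]
# Rewrites (or appends) the SCX_SCHEDULER= line; other lines are untouched.
set_scx_scheduler() {
    local scheduler="$1"
    local conf="${2:-/etc/default/scx}"
    if [ -z "$scheduler" ]; then
        echo "usage: set_scx_scheduler <name> [config file]" >&2
        return 2
    fi
    if grep -q '^SCX_SCHEDULER=' "$conf"; then
        # Scheduler names such as scx_lavd contain no "|", so it is a safe delimiter.
        sed -i "s|^SCX_SCHEDULER=.*|SCX_SCHEDULER=${scheduler}|" "$conf"
    else
        echo "SCX_SCHEDULER=${scheduler}" >> "$conf"
    fi
}

# Typical use on the host (needs root because of the file location):
#   sudo sh -c '. ./set_scx_scheduler.sh; set_scx_scheduler scx_lavd'
#   sudo systemctl restart scx
#   systemctl status scx
```

Taking the config path as an optional argument makes it possible to try the function on a copy of the file before touching the real one.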
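The main profiling tool the speaker uses is vapormark. The FPS measurement tool it relies on, MangoHud, has to be installed separately. After installing it, first test that MangoHud works in a Steam game and confirm the state of the GPU in the system. Then add `mangohud %command%` to the game's launch options in Steam's advanced settings. When the game is started again, you should see the following screen (the game I ran here is Layers of Fear).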
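Measurement comparison
Measure with MangoHud and ginsight, then plot the results.
Enabling the default Linux kernel scheduler
First, stop the `scx` service so that the system goes back to using the default scheduler.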
FPS
Linux kernel default
scx_lavd scheduler
CPU load
Linux kernel default
scx_lavd scheduler
GPU load
Linux kernel default



scx_lavd scheduler



RAM used
Linux kernel default



scx_lavd scheduler



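The FPS measurements were not steadier with `scx_lavd`, contrary to what the talk described. Although the Low 1% value is higher with `scx_lavd`, the overall plot is not smoother. CPU load and GPU load, by contrast, differ clearly: both oscillate more under `scx_lavd`. I asked the speaker, Changwoo Min, and he replied that both schedulers already perform well enough for this game, but that a higher Low 1% FPS with `scx_lavd` is expected. Whether a clear gap shows up depends on the choice of game, the machine's performance, and whether other heavy tasks are running in the background.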
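To make the average versus Low 1% comparison concrete, the sketch below computes both from a list of FPS samples. It is only an illustration: `fps_summary` is a hypothetical name, the input is assumed to be one FPS value per line (extracting that column from a MangoHud log is not shown), and the percentile rule used here (the sample at the 1st percentile of the sorted values, rounding the index down) is one common definition that may differ from what MangoHud reports.

```shell
#!/bin/sh
# Read one FPS sample per line on stdin; print "avg=<mean> low1=<1st percentile>".
fps_summary() {
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) { print "no samples"; exit 1 }
            # 1-based index of the 1st-percentile sample, rounding down.
            idx = int((NR - 1) * 0.01) + 1
            printf "avg=%.1f low1=%.1f\n", sum / NR, v[idx]
        }'
}
```

Two runs with similar averages can still differ in the Low 1% value, because that value depends only on the slowest frames; this is why a scheduler can improve Low 1% without the overall FPS curve looking smoother.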
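The results of running another game, Left 4 Dead 2, are shown below (most Steam games do not run on Ubuntu 24.04, so I could only pick older titles like these). Five YouTube videos and an NBA live stream were playing in the background. While MangoHud measured the game's FPS, `schedmon` recorded and analyzed the system's scheduling behavior. The results for `scx_lavd` and the default scheduler are as follows.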
scx_lavd
FPS


CPU load


GPU load


default_scheduler
FPS


CPU load


GPU load


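The analysis again shows a higher Low 1% FPS with the system's default scheduler, which differs from what the speaker described. I also used `schedmon` to analyze how each scheduler affects individual tasks and the system as a whole. This tool takes a long time to run, so only part of the results are shown below; the full set of figures is too large to include here.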
scx_lavd
- system wide task stats data

- system wide idle stats data

- steam's wait_time, sched_delay, and runtimes

- IPC:CSteamEngin[7888/7664]'s wait_time, sched_delay, and runtimes

- Distribution of waiters

- Distribution of wakers

- 40-percentile of waker-waiter

default_scheduler
- system wide task stats data

- system wide idle stats data

- steam[4324/0]'s wait_time, sched_delay, and runtimes

- IPC:CSteamEngin[15020/14894]'s wait_time, sched_delay, and runtimes

- Distribution of waiters

- Distribution of wakers

- 40-percentile of waker-waiter
