sched-ext Research

  • userspace tools and scheduler implementation : scx
  • kernel branch : sched_ext

Building & Setup

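If your machine runs Ubuntu 22.04, you can first try installing by following the steps in sched_ext/.github/workflows/test-kernel.yml. My experiment host runs Ubuntu 20.04 (focal), where many packages conflict or are too old, so the steps below rely on several workarounds.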

First, get the source code:

$ git clone git@github.com:sched-ext/sched_ext.git
$ cd sched_ext

Next, build with make and explicit arguments. Do not use vng --build directly, because the .config file it generates does not match what we need.

Generate the default .config file:

$ make CC=clang-19 LLVM=1 defconfig

Then edit the required options:

$ make CC=clang-19 LLVM=1 menuconfig

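For the options to enable, see the configuration described in Linux 核心設計: Scheduler(7): sched_ext. Make sure the BTF options are enabled; otherwise the later bpftool build will fail.
You can check them with the following command: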

$ cat .config | grep BTF
CONFIG_DEBUG_INFO_BTF=y
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
CONFIG_MODULE_ALLOW_BTF_MISMATCH=y
CONFIG_PROBE_EVENTS_BTF_ARGS=y

CONFIG_DEBUG_INFO_BTF=y is required; the other options can be enabled at your discretion.
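Because a missing BTF option only shows up later as a bpftool build failure, it can help to check the config before starting a long kernel build. The following is a minimal sketch; `config_has` is a hypothetical helper name introduced here for illustration, and it only checks for options set to `=y`.

```shell
# Hypothetical helper (not part of sched_ext or scx):
# succeed only if the given kernel config file sets an option to =y.
config_has() {
    # $1 = path to the config file, $2 = option name
    # -x matches the whole line, -q stays quiet, -s hides "no such file" errors
    grep -qsx "$2=y" "$1"
}

if config_has .config CONFIG_DEBUG_INFO_BTF; then
    echo "BTF enabled"
else
    echo "BTF missing: enable CONFIG_DEBUG_INFO_BTF before building" >&2
fi
```

Run it from the top of the kernel source tree, where .config lives.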
If everything looks fine so far, build the kernel:

$ make CC=clang-19 LLVM=1 -j$(nproc)

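Once the kernel build finishes, build the userspace tools, that is, the scx project. Again, start by fetching the source (do not put it under the sched_ext directory; I recommend going back to your home directory before running git clone).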

$ git clone git@github.com:sched-ext/scx.git
$ cd scx

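Read scx : Build & Install first to learn which tools must be installed beforehand. Note that because my OS is Ubuntu 20.04 (focal), the packaged bpftool is too old and breaks the build, so I first went back to the sched_ext directory and built a newer bpftool: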

$ make CC=clang-19 -j $(nproc) -C tools/bpf/bpftool/

Then, back in the scx directory, run the following commands:

$ meson setup --wipe build -Dkernel_headers=../sched_ext/usr/include --prefix ~ -D bpftool=./../sched_ext/tools/bpf/bpftool/bpftool
$ meson compile -C build -j 1

If all of this succeeded, add a shell helper function as follows:

$ vim ~/.bashrc

Add the following snippet:

# Helper to test scx scheduler
scx() {
    sudo tmux new-session \; split-window -v \; send-keys -t 0 "$*" Enter
}

After that, try booting a KVM guest with the kernel you built:

$ vng -v -n user --config sched-ext.config --rw

Once inside the virtual machine, run the following test:

$ cd ../scx
$ scx sudo ./build/scheds/rust/scx_rustland/debug/scx_rustland

You should see a result similar to the following:

[Screenshot: scx_rustland running in the tmux session]

This means scx is running successfully!
You can also refer to Andrea Righi's blog post, Getting started with sched-ext development.

Reproducing Optimizing Scheduler for Linux Gaming

Setting up a sched_ext-enabled kernel on the host machine

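Reproducing the experiment requires running benchmarks, which is not suitable in a virtual environment. I therefore set up a physical host machine running Ubuntu 24.04 with an Nvidia GeForce GTX 1080 graphics card, and followed the Ubuntu instructions in scx/INSTALL.md to make the kernel support scx.
After booting, the following command shows whether the scx service is currently running; by default, scx_rustland is the scheduler that starts.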

$ sudo systemctl status scx.service

You can change which scheduler is used by editing /etc/default/scx. For example, to use scx_lavd, modify /etc/default/scx as follows:

# List of scx_schedulers: scx_central scx_flatcg scx_lavd scx_layered scx_nest scx_pair scx_qmap scx_rlfifo scx_rustland scx_rusty scx_simple scx_userland
-SCX_SCHEDULER=scx_rustland
+SCX_SCHEDULER=scx_lavd

# Set custom flags for each scheduler, below is an example of how to use
#SCX_FLAGS='-s 10000 -n'
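The same edit can be scripted. The sketch below assumes the file contains a single line starting with `SCX_SCHEDULER=`, as in the excerpt above; `set_scx_scheduler` is a hypothetical helper name, and the demonstration runs against a scratch copy rather than the real /etc/default/scx.

```shell
# Hypothetical helper: rewrite the SCX_SCHEDULER= line of an scx defaults file.
set_scx_scheduler() {
    # $1 = scheduler name, $2 = file to edit (defaults to /etc/default/scx)
    # Uses GNU sed's in-place editing, which is what Ubuntu ships.
    sed -i "s/^SCX_SCHEDULER=.*/SCX_SCHEDULER=$1/" "${2:-/etc/default/scx}"
}

# Demonstrate on a scratch copy instead of the real configuration file.
demo=$(mktemp)
printf '%s\n' 'SCX_SCHEDULER=scx_rustland' "#SCX_FLAGS='-s 10000 -n'" > "$demo"
set_scx_scheduler scx_lavd "$demo"
cat "$demo"
```

Editing the real /etc/default/scx requires root, so the function would have to run in a root shell.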

Then restart the scx service and check its status:

$ sudo systemctl restart scx.service
$ sudo systemctl status scx.service
● scx.service - Start scx_scheduler
     Loaded: loaded (/usr/lib/systemd/system/scx.service; enabled; preset: enab>
     Active: active (running) since Wed 2024-05-15 09:57:44 CST; 5s ago
   Main PID: 12051 (scx_lavd)
      Tasks: 1 (limit: 38278)
     Memory: 11.5M (peak: 23.4M)
        CPU: 61ms
     CGroup: /system.slice/scx.service
             └─12051 scx_lavd
...

Installing profiling tools

The speaker's main profiling tool is vapormark. Its FPS measurements come from MangoHud, which must be installed separately. After installation, first test whether MangoHud works in a Steam game and check the state of the GPU on the system. Then add mangohud %command% in the Steam game's advanced settings. The next time you launch the game, you should see the following screen (the game I ran here is Layers of Fears).

[Screenshot: MangoHud overlay in Layers of Fears]

Measurements and comparison

Measure with MangoHud and ginsight, then plot the results.

Enabling the default Linux kernel scheduler

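First, turn off the scx service so that the system goes back to the default scheduler. Note that systemctl disable only prevents the service from starting at the next boot; to stop an already running scheduler immediately, use systemctl stop or systemctl disable --now.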

$ sudo systemctl disable scx.service
Removed "/etc/systemd/system/multi-user.target.wants/scx.service"
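To confirm which scheduler is actually active after changing the service, one option is to read the sched_ext state from sysfs. This is a sketch under the assumption that the running kernel exposes /sys/kernel/sched_ext with a `state` file and, while a BPF scheduler is loaded, a `root/ops` file, as mainline sched_ext does; older out-of-tree builds may differ. `scx_status` is a hypothetical helper name.

```shell
# Hypothetical helper: report the sched_ext state from sysfs.
# Assumes the /sys/kernel/sched_ext layout of mainline sched_ext kernels.
scx_status() {
    dir=${1:-/sys/kernel/sched_ext}   # optional override, handy for testing
    if [ ! -r "$dir/state" ]; then
        echo "sched_ext: not available"
        return 0
    fi
    state=$(cat "$dir/state")
    if [ -r "$dir/root/ops" ]; then
        # root/ops holds the name of the loaded BPF scheduler
        echo "sched_ext: $state ($(cat "$dir/root/ops"))"
    else
        echo "sched_ext: $state"
    fi
}

scx_status
```

When no BPF scheduler is loaded, the state should read as disabled, which means the kernel's default scheduler is in use.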

FPS

Linux kernel default

[image]
[image]
[image]

scx_lavd scheduler

[image]
[image]
[image]

CPU load

Linux kernel default

[image]
[image]
[image]

scx_lavd scheduler

[image]
[image]
[image]

GPU load

Linux kernel default

[image]
[image]
[image]

scx_lavd scheduler

[image]
[image]

scx_lavd-ginsight-gpu_load-violin

RAM used

Linux kernel default
lof_result-ginsight-ram_used-cdf
lof_result-ginsight-ram_used-ts
lof_result-ginsight-ram_used-violin

scx_lavd scheduler
scx_lavd-ginsight-ram_used-cdf
scx_lavd-ginsight-ram_used-ts
scx_lavd-ginsight-ram_used-violin

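Contrary to the talk, the FPS measurements are not smoother with scx_lavd. The Low 1% value is higher with scx_lavd, but the overall plots do not look any steadier. CPU load and GPU load, by contrast, differ clearly: both oscillate more noticeably under scx_lavd. I asked the speaker, Changwoo Min, about this. He replied that both schedulers already perform well enough for this game, but that a higher Low 1% FPS with scx_lavd is expected. Whether a clear gap appears depends on the choice of game, the machine's performance, and whether other heavy tasks are running in the background.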
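For readers unfamiliar with the Low 1% metric, the sketch below shows one common way to compute it from raw FPS samples: take the 1st percentile using the nearest-rank method. This is an assumption for illustration; MangoHud and vapormark may define and compute the value differently, and `low_1_percent` is a hypothetical helper name.

```shell
# Hypothetical helper: 1st percentile ("1% low") of FPS samples.
# Reads one numeric sample per line on stdin.
low_1_percent() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            # Nearest-rank percentile: rank = ceil(0.01 * N), at least 1
            rank = int((NR + 99) / 100)
            print v[rank]
        }'
}

# With only five samples, the 1st percentile is simply the lowest one.
printf '%s\n' 60 58 61 12 59 | low_1_percent   # prints 12
```

A higher value means the worst frames are less bad, which is why the metric is used as a proxy for stutter rather than average throughput.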

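I then ran another game, Left 4 Dead 2 (most Steam games would not run on Ubuntu 24.04, so only older titles like these were an option), while playing five YouTube videos and an NBA live stream in the background. While MangoHud measured the game's FPS, I also used schedmon to record and analyze the system's scheduling behavior. The results for scx_lavd and the default scheduler are shown below.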

scx_lavd

FPS
scx_lavd_heavy-ginsight-fps-ts
scx_lavd_heavy-ginsight-fps-cdf

CPU load
scx_lavd_heavy-ginsight-cpu_load-ts
scx_lavd_heavy-ginsight-cpu_load-cdf

GPU load
scx_lavd_heavy-ginsight-gpu_load-ts
scx_lavd_heavy-ginsight-gpu_load-cdf

default_scheduler

FPS
default_sched_heavy-ginsight-fps-ts
default_sched_heavy-ginsight-fps-cdf

CPU load
default_sched_heavy-ginsight-cpu_load-ts
default_sched_heavy-ginsight-cpu_load-cdf

GPU load
default_sched_heavy-ginsight-gpu_load-ts
default_sched_heavy-ginsight-gpu_load-cdf

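In this run, the default scheduler has the higher low 1% FPS, which differs from what the speaker said. I also used schedmon to analyze how each scheduler affects individual tasks and the system as a whole. This analysis tool takes a long time to run, so only part of the results is shown below; there are too many plots to include them all here.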

scx_lavd

  • system wide task stats data
    image
  • system wide idle stats data
    image
  • steam's wait_time, sched_delay, and runtimes
    image
  • IPC:CSteamEngin[7888/7664]'s wait_time, sched_delay, and runtimes
    image
  • Distribution of waiters
    image
  • Distribution of wakers
    image
  • 40-percentile of waker-waiter
    image

default_scheduler

  • system wide task stats data
    image
  • system wide idle stats data
    image
  • steam[4324/0]'s wait_time, sched_delay, and runtimes
    image
  • IPC:CSteamEngin[15020/14894]'s wait_time, sched_delay, and runtimes
    image
  • Distribution of waiters
    image
  • Distribution of wakers
    image
  • 40-percentile of waker-waiter
    image