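# Using eBPF and perf on a Raspberry Pi 3 Model B

The August 2020 release of Raspberry Pi OS (formerly Raspbian) ships with Linux kernel 5.4, which means more complete eBPF support. This note uses a Raspberry Pi 3 Model B as the hardware and briefly walks through using eBPF and `perf` on top of the 2020-08 release of Raspberry Pi OS.

[TOC]

## Part 1: Installing Raspberry Pi OS

The default kernel config of Raspberry Pi OS does not enable several eBPF-related options, so a new kernel has to be built. The approaches I have found so far:

1. Build a new kernel on Raspberry Pi OS itself: there is an official [guide](https://www.raspberrypi.org/documentation/linux/kernel/building.md).
2. Cross-compile on Ubuntu: the [same](https://www.raspberrypi.org/documentation/linux/kernel/building.md) official guide also explains how to cross-compile the kernel.
3. Buildroot: [here](https://www.linuxembedded.fr/2019/05/bcc-integration-into-buildroot/) is an example that builds BCC directly into the image. I am still experimenting with this one.

This note uses method 1. First, install a working Raspberry Pi OS.

### Step 1: Flashing the image

Find a suitable image on the [official site](https://www.raspberrypi.org/downloads/raspberry-pi-os/) and flash it to an SD card.

> If you plan to compile directly on the Raspberry Pi, a card of 8 GB or more is recommended, since the kernel source code also has to be downloaded.

For flashing instructions, see the *Writing the image* section of the official [*Installing operating system images*](https://www.raspberrypi.org/documentation/installation/installing-images/README.md). I used *Raspberry Pi OS (32-bit) Lite*, the 2020-08-20 release (see the [release notes](http://downloads.raspberrypi.org/raspios_armhf/release_notes.txt) for details).

> Which image to pick is a personal choice. If you need a desktop environment, for example because you want to operate the Raspberry Pi directly with a monitor and keyboard attached, consider the desktop version. If you plan to connect remotely over SSH, you will not see the desktop anyway, so Lite is a reasonable choice.

The rest of this note uses Lite and sets it up headless (no external keyboard or display). The steps below can be read alongside the official [*Setting up a Raspberry Pi headless*](https://www.raspberrypi.org/documentation/configuration/wireless/headless.md) article and the [*SSH (Secure Shell)*](https://www.raspberrypi.org/documentation/remote-access/ssh/README.md) section of [*Remote Access*](https://www.raspberrypi.org/documentation/remote-access/README.md).

### Step 2: Wi-Fi

> The following steps set up the Raspberry Pi headless, that is, without attaching a display, keyboard, or other hardware to it.
>
> If you have your own preferred workflow (for example, using VNC with the Raspberry Pi desktop, or attaching a keyboard and display), you can skip these steps and go straight to building the kernel. Anything that gets the system installed and usable is fine.

In the top level of the SD card, create a file named `wpa_supplicant.conf` with the following contents:

```=
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev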
update_config=1
country=TW

network={
    ssid="<Name of your wireless LAN>"
    psk="<Password for your wireless LAN>"
}
```

Here `ssid` is the name of the Wi-Fi network to join and `psk` is its password. To avoid storing the password in plain text, see the *Adding the network details to the Raspberry Pi* section of [*Setting up a wireless LAN via the command line*](https://www.raspberrypi.org/documentation/configuration/wireless/wireless-cli.md), which explains how to generate an encrypted PSK with `wpa_passphrase`.

### Step 3: Enabling SSH

Create an empty file named `ssh` in the top level of the SD card.

### Step 4: Finding the Raspberry Pi's IP address

See the [*IP Address*](https://www.raspberrypi.org/documentation/remote-access/ip-address.md) section of the official documentation. For example, using `ping`:

```shell=
$ ping raspberrypi.local
```

If the Raspberry Pi is reachable, output like the following appears:

```shell=
$ ping raspberrypi.local
PING raspberrypi.local (172.20.10.4): 56 data bytes
64 bytes from 172.20.10.4: icmp_seq=0 ttl=64 time=9.253 ms
64 bytes from 172.20.10.4: icmp_seq=1 ttl=64 time=9.160 ms
64 bytes from 172.20.10.4: icmp_seq=2 ttl=64 time=10.580 ms
64 bytes from 172.20.10.4: icmp_seq=3 ttl=64 time=43.283 ms
64 bytes from 172.20.10.4: icmp_seq=4 ttl=64 time=8.124 ms
64 bytes from 172.20.10.4: icmp_seq=5 ttl=64 time=8.192 ms
64 bytes from 172.20.10.4: icmp_seq=6 ttl=64 time=14.043 ms
64 bytes from 172.20.10.4: icmp_seq=7 ttl=64 time=12.170 ms
64 bytes from 172.20.10.4: icmp_seq=8 ttl=64 time=12.216 ms
64 bytes from 172.20.10.4: icmp_seq=9 ttl=64 time=11.172 ms
^C
--- raspberrypi.local ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 8.124/13.819/43.283/9.988 ms
```

### Step 5: Connecting over SSH

```=
$ ssh pi@<IP Address>
```

The address found above is `172.20.10.4`, so:

```=
$ ssh pi@172.20.10.4
```

The following appears:

```=
$ ssh pi@172.20.10.4
The authenticity of host '172.20.10.4 (172.20.10.4)' can't be established.
ECDSA key fingerprint is SHA256:CktI0Jinn0n21IqFXe3++BSUzKiemWGEPFXDpcg+CPE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
```

After typing `yes`, the following output appears:

```
Warning: Permanently added '172.20.10.4' (ECDSA) to the list of known hosts.
pi@172.20.10.4's password:
```

Then enter the password. The default password is `raspberry`, so type `raspberry` and you are logged in:

```
Linux raspberrypi 5.4.51-v7+ #1333 SMP Mon Aug 10 16:45:19 BST 2020 armv7l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

SSH is enabled and the default password for the 'pi' user has not been changed.
This is a security risk - please login as the 'pi' user and type 'passwd' to set a new password.
```

### Step 6: raspi-config

Once you can log in, adjust a few settings first:

```shell=
$ sudo raspi-config
```

The following screen appears:

![](https://i.imgur.com/5GFkKIM.png)

1. Change the password: this is the first entry, *Change User Password Change password for the 'pi' user*:
   ![](https://i.imgur.com/5GFkKIM.png)
2. Enable SSH: under *Interfacing Options* you will see this entry:
   ![](https://i.imgur.com/rLWDhSI.png)
   Selecting it asks whether to enable SSH; choose yes.
3. Expand the filesystem: the *A1 Expand Filesystem Ensures that all of the SD card storage is available to the OS* entry under *Advanced Options*:
   ![](https://i.imgur.com/63GX25d.png)
4. Update the related packages: this is the last entry:
   ![](https://i.imgur.com/G5DASRw.png)
5.
   Under *Boot Options* there is a *Wait for Network at Boot* entry, which can sometimes be useful.
   ![](https://i.imgur.com/M43CA0W.png)

## Part 2: Building the kernel

> After checking the kernel config options listed in the BCC documentation, I found that the default Raspberry Pi OS kernel config does not enable the BPF JIT compiler, so the kernel has to be rebuilt.

### Step 7: Inspecting the kernel config

Before rebuilding, it is worth looking at the current kernel config.

```shell=
pi@raspberrypi:~ $ sudo modprobe configs
pi@raspberrypi:~ $ zcat /proc/config.gz > .config
pi@raspberrypi:~ $ cat .config
```

This prints a large number of options. Use `grep` to find the interesting ones. For example, the BPF-related options:

```=
pi@raspberrypi:~ $ cat .config | grep BPF
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_BPF_JIT is not set
# CONFIG_BPF_STREAM_PARSER is not set
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_LIRC_MODE2=y
# CONFIG_NBPFAXI_DMA is not set
CONFIG_BPF_EVENTS=y
# CONFIG_TEST_BPF is not set
```

Several of them are not enabled, notably `CONFIG_BPF_JIT`, `CONFIG_NET_CLS_BPF`, and `CONFIG_NET_ACT_BPF`. As another example, the `perf` options:

```=
$ cat .config | grep PERF_EVENT
```

These are all enabled:

```
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
CONFIG_HW_PERF_EVENTS=y
```

Or the `FTRACE` options:

```=
$ cat .config | grep FTRACE
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE=y
# CONFIG_FTRACE_SYSCALLS is not set
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
```

### Step 8: Building the kernel

See the official [documentation](https://www.raspberrypi.org/documentation/linux/kernel/building.md). First install the required packages:

```=
$ sudo apt install git bc bison flex libssl-dev make
```

Then, inside the kernel source tree (obtained as described in that document), start from the default configuration:

```
$ cd linux/
$ KERNEL=kernel7
$ make bcm2709_defconfig
```

Next, adjust the kernel config. Besides editing the `.config` file by hand, you can use `menuconfig`. Install `libncurses-dev` first:

```
$ sudo apt install libncurses-dev
```

Then run:

```
$ make menuconfig
```

A screen like the following appears:

![](https://i.imgur.com/SCRoOz4.png)

Now find the kernel options that eBPF needs. The [BCC](https://github.com/iovisor/bcc/blob/master/INSTALL.md#kernel-configuration) installation instructions list the options that have to be enabled:

```config=
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# [optional, for tc filters]
CONFIG_NET_CLS_BPF=m
# [optional, for tc actions]
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
# [for Linux kernel versions 4.1 through 4.6]
CONFIG_HAVE_BPF_JIT=y
# [for Linux kernel versions 4.7 and later]
CONFIG_HAVE_EBPF_JIT=y
# [optional, for kprobes]
CONFIG_BPF_EVENTS=y
```

Besides these, I also wanted ftrace syscall tracing and uprobes, so I additionally enabled:

```config=
CONFIG_FTRACE_SYSCALLS=y
CONFIG_UPROBE_EVENTS=y
```

menuconfig has a search function that locates an option by name. For example, to find where `CONFIG_NET_CLS_BPF` is set, press `/` in the menu and a search screen appears:

![](https://i.imgur.com/QDO8KIP.png)

Type the option you are looking for (the leading `CONFIG` can be omitted). For example:

![](https://i.imgur.com/3xbWN0x.png)

The result shows which submenu the option lives in.

![](https://i.imgur.com/wOi8gMw.png)

Press `m` or `y` to set each option as required.

Finally, go to Local Version under General Setup and choose a version suffix you like. Mine is `-v7-with-eBPF`.

> I later found that `uname -r` then also ends with `-v7-with-eBPF`, so when installing kernel *headers* later, the `5.4` version may have to be specified explicitly.

Then start the build:

```
$ make -j4 zImage modules dtbs
```

This takes a very long time: on a Raspberry Pi 3 Model B, the build took more than 3 hours.

### Step 9: Switching to the new kernel

Install the modules and copy the device tree files:

```shell=
$ sudo make modules_install
$ sudo cp arch/arm/boot/dts/*.dtb /boot/
$ sudo cp arch/arm/boot/dts/overlays/*.dtb* /boot/overlays/
$ sudo cp arch/arm/boot/dts/overlays/README /boot/overlays/
```

The built kernel is `arch/arm/boot/zImage`. Give it a name you like and copy it into `/boot`:

```
$ sudo cp arch/arm/boot/zImage /boot/<name>.img
```

For example, if `<name>` is `kernelebpf`:

```
$ sudo cp arch/arm/boot/zImage /boot/kernelebpf.img
```

Finally, set the `kernel` option in `/boot/config.txt` to point to the new kernel:

```=
$ sudo vim /boot/config.txt
```

Edit the file and append `kernel=<name>.img` at the end, where `<name>` is the name chosen above. In this example it looks like this:

```diff=
# For more options and information see
# http://rpf.io/configtxt
# Some settings may impact device functionality. See link above for details

[...]
# Additional overlays and parameters are documented /boot/overlays/README

# Enable audio (loads snd_bcm2835)
dtparam=audio=on

[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2

[all]
#dtoverlay=vc4-fkms-v3d
+# Boot to kernel with eBPF
+kernel=kernelebpf.img
```

Then reboot:

```=
$ sudo reboot
```

### Step 10: Verifying the result

After rebooting and logging in, check the kernel version:

```=
$ uname -a
```

It now reports the freshly built kernel:

```=
$ uname -a
Linux raspberrypi 5.4.59-v7-with-eBPF+ #1 SMP Tue Sep 1 16:59:18 BST 2020 armv7l GNU/Linux
```

> Alternatively, query the kernel config directly and confirm that the adjusted options are enabled.

## Part 3: Building ply and perf

Whether BCC was installed from the package manager or built from source, running the programs in its `tools` folder produced errors like the following:

```=
$ ./runqlen.py
In file included from <built-in>:3:
In file included from /virtual/include/bcc/helpers.h:53:
In file included from include/linux/log2.h:12:
In file included from include/linux/bitops.h:26:
In file included from arch/arm/include/asm/bitops.h:243:
In file included from include/asm-generic/bitops/lock.h:5:
In file included from include/linux/atomic.h:7:
In file included from arch/arm/include/asm/atomic.h:16:
arch/arm/include/asm/cmpxchg.h:128:2: error: "SMP is not supported on this platform"
#error "SMP is not supported on this platform"
[...]
In file included from <built-in>:3:
In file included from /virtual/include/bcc/helpers.h:54:
In file included from arch/arm/include/asm/page.h:160:
arch/arm/include/asm/memory.h:220:26: error: value '2164260864' out of range for constraint 'I'
        __pv_stub(x, t, "add", __PV_BITS_31_24);
                                ^~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:175:25: note: expanded from macro '__PV_BITS_31_24'
#define __PV_BITS_31_24 0x81000000
                        ^~~~~~~~~~
arch/arm/include/asm/memory.h:193:21: note: expanded from macro '__pv_stub'
        : "r" (from), "I" (type))
                           ^~~~
arch/arm/include/asm/memory.h:222:3: error: value '129' out of range for constraint 'I'
        __pv_stub_mov_hi(t);
        ^~~~~~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:202:9: note: expanded from macro '__pv_stub_mov_hi'
        : "I" (__PV_BITS_7_0))
               ^~~~~~~~~~~~~
arch/arm/include/asm/memory.h:176:23: note: expanded from macro '__PV_BITS_7_0'
#define __PV_BITS_7_0 0x81
                      ^~~~
arch/arm/include/asm/memory.h:223:3: error: value '2164260864' out of range for constraint 'I'
        __pv_add_carry_stub(x, t);
        ^~~~~~~~~~~~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:212:18: note: expanded from macro '__pv_add_carry_stub'
        : "r" (x), "I" (__PV_BITS_31_24) \
                        ^~~~~~~~~~~~~~~
[...]
```

Trying to install the `bpftrace` package with `snap` reports that the architecture is not supported:

```=
$ sudo snap install bpftrace
error: snap "bpftrace" is not available on stable for this architecture (armhf) but exists on other architectures (amd64, arm64, ppc64el, s390x).
```

So I decided to use [ply](https://github.com/iovisor/ply) instead for now. `ply` is used much like `bpftrace`, but with a simpler feature set.

### Step 11: Building ply

First install the build tools:

```
$ sudo apt install autogen autoconf libtool
```

Then download [ply](https://github.com/iovisor/ply#build-and-installation):

```
$ git clone https://github.com/iovisor/ply.git
```

And build it following the [readme](https://github.com/iovisor/ply#build-and-installation):

```
$ cd ply
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
```

During installation, output like the following may appear. It says that the libraries were installed in `/usr/local/lib`, so the dynamic linker has to be told about that directory:

```
[...]
libtool: finish: PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin" ldconfig -n /usr/local/lib
----------------------------------------------------------------------
Libraries have been installed in:
   /usr/local/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the '-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the 'LD_LIBRARY_PATH' environment variable
     during execution
   - add LIBDIR to the 'LD_RUN_PATH' environment variable
     during linking
   - use the '-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to '/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
make[4]: Nothing to be done for 'install-data-am'.
make[4]: Leaving directory '/home/pi/ply/src/libply'
[...]
```

Edit `/etc/ld.so.conf` (root privileges are needed to write to it):

```
$ sudo vim /etc/ld.so.conf
```

Add `/usr/local/lib` to it:

```diff=
include /etc/ld.so.conf.d/*.conf
+/usr/local/lib
```

Then run `ldconfig`:

```
$ sudo ldconfig
```

### Step 12: Testing ply

Try a one-liner that builds a histogram of the return values of `vfs_read`:

```
$ sudo ply 'kretprobe:vfs_read { @["size"] = quantize(retval); }'
```

The message `ply: active` appears:

```
$ sudo ply 'kretprobe:vfs_read { @["size"] = quantize(retval); }'
ply: active
```

After waiting a while, press Ctrl+C to stop the program, and the collected statistics are printed:

```
^Cply: deactivating

@:
{ size }:
	[   0,   1]      3 ┤█████████▋                      │
	[   2,   3]      1 ┤███▎                            │
	...
	[   8,  15]      2 ┤██████▍                         │
	[  16,  31]      1 ┤███▎                            │
	[  32,  63]      1 ┤███▎                            │
	...
	[ 256, 511]      1 ┤███▎                            │
	...
	[  1k,  2k)      1 ┤███▎                            │
```

### Step 13: Building perf

The package manager only seems to offer `perf_4.9`, while the running kernel is 5.4, so `perf` is built from source. First install the dependencies:

```shell
$ sudo apt-get -y install flex bison libdw-dev libnewt-dev binutils-dev libaudit-dev \
    libssl-dev python-dev systemtap-sdt-dev libiberty-dev libperl-dev \
    liblzma-dev libpython-dev 'libunwind-*' asciidoc xmlto
```

Then build `perf` from inside the Linux kernel source directory:

```
$ make -C tools/perf/
```

### Step 14: Testing perf

After the build, `perf` is located at `tools/perf/perf`. Check whether it can list events:

```=
$ ./tools/perf/perf list
```

The expected output looks like this:

```
List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
[...]
```
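
The manual `grep` checks from Step 7 and Step 10 can also be scripted, which is handy for confirming that the running kernel really has the options that were turned on in menuconfig. The following is only a minimal sketch: `check_kconfig` is a hypothetical helper name, the temporary path `/tmp/running.config` is an arbitrary choice, and the option list is simply the set of options discussed in this note. It assumes `/proc/config.gz` is available (that is, after `sudo modprobe configs`).

```shell
#!/bin/sh
# Sketch: report which kernel options are enabled (=y or =m) in a config file.
# check_kconfig is a hypothetical helper, not part of any existing tool.
# Usage: check_kconfig <config-file> OPTION...
check_kconfig() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        # An enabled option appears as "CONFIG_FOO=y" or "CONFIG_FOO=m";
        # a disabled one appears as "# CONFIG_FOO is not set" or not at all.
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "missing  $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example run against the running kernel, only if its config is exposed.
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz > /tmp/running.config
    check_kconfig /tmp/running.config \
        CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_BPF_JIT CONFIG_BPF_EVENTS \
        CONFIG_FTRACE_SYSCALLS CONFIG_UPROBE_EVENTS \
        || echo "some options are not enabled"
fi
```

The function returns a non-zero status when any option is missing, so it can also be used as a guard in other scripts before running eBPF tools.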