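# Using eBPF and perf on a Raspberry Pi 3 Model B
The August 2020 release of Raspberry Pi OS (formerly Raspbian) ships with Linux kernel 5.4, which means more complete eBPF support. This post uses a Raspberry Pi 3 Model B as the hardware and briefly shows how to use eBPF and `perf` on top of the 2020-08 release of Raspberry Pi OS.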
[TOC]
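## Part 1: Installing Raspberry Pi OS
Because the default kernel config of Raspberry Pi OS does not enable all of the eBPF-related options, a new kernel has to be built. I found a few ways to do that:
1. Build a new kernel on Raspberry Pi OS itself: there is an official [tutorial](https://www.raspberrypi.org/documentation/linux/kernel/building.md).
2. Cross-compile on Ubuntu: the [same](https://www.raspberrypi.org/documentation/linux/kernel/building.md) official tutorial also covers cross-compiling the kernel.
3. Buildroot: [this post](https://www.linuxembedded.fr/2019/05/bcc-integration-into-buildroot/) has an example that builds BCC straight into the image. I am still experimenting with this one.
The rest of this post uses method 1. First, install a working Raspberry Pi OS.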
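### Step 1: Flash the image
Find a suitable image on the [official website](https://www.raspberrypi.org/downloads/raspberry-pi-os/) and flash it to an SD card.
> If you plan to compile directly on the Raspberry Pi, use a card of 8 GB or larger, since the kernel source code also has to be downloaded.
For flashing instructions, see the *Writing the image* section of [*Installing operating system images*](https://www.raspberrypi.org/documentation/installation/installing-images/README.md) on the official site. I used *Raspberry Pi OS (32-bit) Lite*, version 2020-08-20 (see the [release notes](http://downloads.raspberrypi.org/raspios_armhf/release_notes.txt) for details).
> Which image to pick is a matter of preference. If you need a desktop environment, for example because you want to plug a monitor and keyboard directly into the Raspberry Pi, consider the desktop version. If you plan to connect remotely over SSH, the desktop is never visible anyway, so Lite is a reasonable choice.
The rest of this post uses Lite and sets it up headless (no external keyboard or monitor). The steps below can be read alongside the official [*Setting up a Raspberry Pi headless*](https://www.raspberrypi.org/documentation/configuration/wireless/headless.md) article and the [*SSH (Secure Shell)*](https://www.raspberrypi.org/documentation/remote-access/ssh/README.md) section of [*Remote Access*](https://www.raspberrypi.org/documentation/remote-access/README.md).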
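### Step 2: Wi-Fi
> The following steps set up the Raspberry Pi headless, that is, without attaching a monitor, keyboard, or other hardware to it.
>
> If you already have a preferred workflow (for example, using VNC with the Raspberry Pi desktop, or attaching a keyboard and monitor), you can skip these steps and go straight to building the kernel. All that matters is that the system is installed and usable.
In the top-level directory of the SD card, create a file named `wpa_supplicant.conf` with the following content: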
```=
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=TW
network={
ssid="<Name of your wireless LAN>"
psk="<Password for your wireless LAN>"
}
```
Here `ssid` is the name of the Wi-Fi network and `psk` is its password. To store the password in encrypted form instead of plain text, see the *Adding the network details to the Raspberry Pi* section of [*Setting up a wireless LAN via the command line*](https://www.raspberrypi.org/documentation/configuration/wireless/wireless-cli.md), which shows how to use `wpa_passphrase`.
### Step 3: Enable SSH
Create an empty file named `ssh` in the top-level directory of the SD card.
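The two files from Steps 2 and 3 can also be created from a shell on the machine that flashed the card. The following is a minimal sketch: the function name `prepare_headless_boot` and the mount point in the usage note are assumptions, not part of the official instructions.

```shell
# prepare_headless_boot BOOT_DIR SSID PSK   (hypothetical helper)
# Writes wpa_supplicant.conf and an empty ssh file into BOOT_DIR,
# the mounted top-level directory of the SD card.
prepare_headless_boot() {
    boot_dir="$1"; ssid="$2"; psk="$3"
    cat > "${boot_dir}/wpa_supplicant.conf" <<EOF
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=TW

network={
    ssid="${ssid}"
    psk="${psk}"
}
EOF
    # An empty file named ssh is all that Step 3 asks for.
    : > "${boot_dir}/ssh"
}
```

For example, if the card happens to be mounted at `/media/$USER/boot` (this path depends on your system), run `prepare_headless_boot "/media/$USER/boot" "<ssid>" "<password>"`.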
### Step 4: Find the Raspberry Pi's IP address
See the [*IP Address*](https://www.raspberrypi.org/documentation/remote-access/ip-address.md) section of the official documentation. For example, use `ping`:
```shell=
$ ping raspberrypi.local
```
If the connection succeeds, the output looks like this:
```shell=
$ ping raspberrypi.local
PING raspberrypi.local (172.20.10.4): 56 data bytes
64 bytes from 172.20.10.4: icmp_seq=0 ttl=64 time=9.253 ms
64 bytes from 172.20.10.4: icmp_seq=1 ttl=64 time=9.160 ms
64 bytes from 172.20.10.4: icmp_seq=2 ttl=64 time=10.580 ms
64 bytes from 172.20.10.4: icmp_seq=3 ttl=64 time=43.283 ms
64 bytes from 172.20.10.4: icmp_seq=4 ttl=64 time=8.124 ms
64 bytes from 172.20.10.4: icmp_seq=5 ttl=64 time=8.192 ms
64 bytes from 172.20.10.4: icmp_seq=6 ttl=64 time=14.043 ms
64 bytes from 172.20.10.4: icmp_seq=7 ttl=64 time=12.170 ms
64 bytes from 172.20.10.4: icmp_seq=8 ttl=64 time=12.216 ms
64 bytes from 172.20.10.4: icmp_seq=9 ttl=64 time=11.172 ms
^C
--- raspberrypi.local ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 8.124/13.819/43.283/9.988 ms
```
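If the address is needed in a script, it can be pulled out of the first line of the `ping` output. This is only a sketch: `ping_ip` is a hypothetical helper, and it assumes the `PING <host> (<address>)` header format shown above.

```shell
# ping_ip  (hypothetical helper)
# Reads ping output on stdin and prints the IPv4 address from the header line.
ping_ip() {
    sed -n 's/^PING [^ ]* (\([0-9.]*\)).*/\1/p' | head -n 1
}
```

Usage would be along the lines of `ping -c 1 raspberrypi.local | ping_ip`.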
### Step 5: Connect over SSH
```=
$ ssh pi@<IP Address>
```
The address found above is `172.20.10.4`, so:
```=
$ ssh pi@172.20.10.4
```
This prints:
```=
$ ssh pi@172.20.10.4
The authenticity of host '172.20.10.4 (172.20.10.4)' can't be established.
ECDSA key fingerprint is SHA256:CktI0Jinn0n21IqFXe3++BSUzKiemWGEPFXDpcg+CPE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
```
After you type `yes`, the following output appears:
```
Warning: Permanently added '172.20.10.4' (ECDSA) to the list of known hosts.
pi@172.20.10.4's password:
```
Then enter the password. The default password is `raspberry`, so type `raspberry` and you are logged in:
```
Linux raspberrypi 5.4.51-v7+ #1333 SMP Mon Aug 10 16:45:19 BST 2020 armv7l
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
SSH is enabled and the default password for the 'pi' user has not been changed.
This is a security risk - please login as the 'pi' user and type 'passwd' to set a new password.
```
### Step 6: raspi-config
Once you can log in, do some initial configuration:
```shell=
$ sudo raspi-config
```
The following screen appears:
![](https://i.imgur.com/5GFkKIM.png)
1. Change the password: this is the first option, *Change User Password Change password for the 'pi' user*:
![](https://i.imgur.com/5GFkKIM.png)
2. Enable SSH: under *Interfacing Options*, you will see this option:
![](https://i.imgur.com/rLWDhSI.png)
Select it, and answer yes when asked whether to enable it.
3. Expand the filesystem: the *A1 Expand Filesystem Ensures that all of the SD card storage is available to the OS* option under *Advanced Options*:
![](https://i.imgur.com/63GX25d.png)
4. Update the related packages: this is the last option:
![](https://i.imgur.com/G5DASRw.png)
5. Under *Boot Options* there is a *Wait for Network at Boot* option, which can sometimes be useful.
![](https://i.imgur.com/M43CA0W.png)
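## Part 2: Building the kernel
> After checking the kernel configuration options listed by BCC, I found that the default Raspberry Pi OS kernel config does not enable the BPF JIT compiler, so the kernel has to be rebuilt.
### Step 7: Inspect the kernel config
Before rebuilding, it is worth looking at the current kernel config.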
```shell=
pi@raspberrypi:~ $ sudo modprobe configs
pi@raspberrypi:~ $ zcat /proc/config.gz > .config
pi@raspberrypi:~ $ cat .config
```
This prints a long list of options. Use `grep` to find the ones of interest, for example the BPF-related options:
```=
pi@raspberrypi:~ $ cat .config | grep BPF
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_BPF_JIT is not set
# CONFIG_BPF_STREAM_PARSER is not set
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_LIRC_MODE2=y
# CONFIG_NBPFAXI_DMA is not set
CONFIG_BPF_EVENTS=y
# CONFIG_TEST_BPF is not set
```
This shows that many of these options, including `CONFIG_BPF_JIT`, are not enabled. Similarly, to look at the `perf` options:
```=
$ cat .config | grep PERF_EVENT
```
These are all enabled:
```
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
CONFIG_HW_PERF_EVENTS=y
```
Or the `FTRACE` options:
```=
$ cat .config | grep FTRACE
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE=y
# CONFIG_FTRACE_SYSCALLS is not set
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
```
### Step 8: Build the kernel
See the official [documentation](https://www.raspberrypi.org/documentation/linux/kernel/building.md). First install the required packages:
```=
$ sudo apt install git bc bison flex libssl-dev make
```
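Then, with the kernel source fetched into `linux/` as described in the official documentation, start from the default configuration: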
```
$ cd linux/
$ KERNEL=kernel7
$ make bcm2709_defconfig
```
Next, adjust the kernel config yourself. Besides editing the `.config` file by hand, you can use `menuconfig`. First install `libncurses-dev`:
```
$ sudo apt install libncurses-dev
```
Then run:
```
$ make menuconfig
```
A screen like the following appears:
![](https://i.imgur.com/SCRoOz4.png)
Next, find the kernel options that eBPF needs. The [BCC](https://github.com/iovisor/bcc/blob/master/INSTALL.md#kernel-configuration) installation instructions list the kernel options that must be enabled:
```config=
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# [optional, for tc filters]
CONFIG_NET_CLS_BPF=m
# [optional, for tc actions]
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
# [for Linux kernel versions 4.1 through 4.6]
CONFIG_HAVE_BPF_JIT=y
# [for Linux kernel versions 4.7 and later]
CONFIG_HAVE_EBPF_JIT=y
# [optional, for kprobes]
CONFIG_BPF_EVENTS=y
```
In addition to the above, I also wanted ftrace syscall tracing and uprobes, so I enabled these as well:
```config=
CONFIG_FTRACE_SYSCALLS=y
CONFIG_UPROBE_EVENTS=y
```
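Checking these options one `grep` at a time gets tedious, so the lookup can be wrapped in a small function. This is a sketch only; `check_kconfig` is a hypothetical helper name, and it treats both `=y` and `=m` as enabled.

```shell
# check_kconfig CONFIG_FILE OPTION...   (hypothetical helper)
# OPTION is given without the CONFIG_ prefix.
# Prints one line per option and returns 1 if any option is not y or m.
check_kconfig() {
    config="$1"; shift
    missing=0
    for opt in "$@"; do
        if grep -q "^CONFIG_${opt}=[ym]\$" "$config"; then
            echo "ok       CONFIG_${opt}"
        else
            echo "MISSING  CONFIG_${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_kconfig .config BPF BPF_SYSCALL BPF_JIT HAVE_EBPF_JIT BPF_EVENTS FTRACE_SYSCALLS UPROBE_EVENTS` can be run against the `.config` extracted in Step 7, and again after configuring the new kernel.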
menuconfig has a search function that locates an option by name. For example, to find where `CONFIG_NET_CLS_BPF` is set, press `/` to bring up the search screen:
![](https://i.imgur.com/QDO8KIP.png)
Type the option you are looking for (the leading `CONFIG` prefix can be omitted). For example:
![](https://i.imgur.com/3xbWN0x.png)
The result shows which submenu the option lives in.
![](https://i.imgur.com/wOi8gMw.png)
For each option, press `m` or `y` to configure it.
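For the "edit `.config` by hand" route mentioned earlier, the edit can be scripted. The following is a rough sketch, not a replacement for menuconfig: `set_kconfig` is a hypothetical helper, it relies on GNU `sed -i`, and it does not resolve dependencies between options, so the result should still be reviewed (for example by opening `make menuconfig` afterwards).

```shell
# set_kconfig CONFIG_FILE OPTION VALUE   (hypothetical helper)
# OPTION is given without the CONFIG_ prefix; VALUE is typically y or m.
set_kconfig() {
    file="$1"; opt="$2"; val="$3"
    # Drop any existing line for this option, whether set or "is not set".
    sed -i -e "/^CONFIG_${opt}=/d" \
           -e "/^# CONFIG_${opt} is not set\$/d" "$file"
    # Append the new value.
    echo "CONFIG_${opt}=${val}" >> "$file"
}
```

For example, `set_kconfig .config BPF_JIT y` followed by `set_kconfig .config NET_CLS_BPF m`.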
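Finally, go to Local Version under General Setup and set a version suffix you like; mine is `-v7-with-eBPF`. Then start compiling:
> I later found that the output of `uname -r` then also ends in `-v7-with-eBPF`, so when installing *headers* later you may need to specify the `5.4` version explicitly.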
```
$ make -j4 zImage modules dtbs
```
This takes a very long time: on a Raspberry Pi 3 Model B, compilation took more than 3 hours.
### Step 9: Install the new kernel
```shell=
$ sudo make modules_install
$ sudo cp arch/arm/boot/dts/*.dtb /boot/
$ sudo cp arch/arm/boot/dts/overlays/*.dtb* /boot/overlays/
$ sudo cp arch/arm/boot/dts/overlays/README /boot/overlays/
```
The compiled kernel is `arch/arm/boot/zImage`. Copy it to `/boot` under a name of your choice:
```
$ sudo cp arch/arm/boot/zImage /boot/<name>.img
```
For example, with `kernelebpf` as `<name>`:
```
$ sudo cp arch/arm/boot/zImage /boot/kernelebpf.img
```
Finally, set the `kernel` option in `/boot/config.txt` so that it points to the new kernel:
```=
$ sudo vim /boot/config.txt
```
Edit this file and append `kernel=<name>.img` at the end, where `<name>` is the name chosen above. For this example it looks like this:
```diff=
# For more options and information see
# http://rpf.io/configtxt
# Some settings may impact device functionality. See link above for details
[...]
# Additional overlays and parameters are documented /boot/overlays/README
# Enable audio (loads snd_bcm2835)
dtparam=audio=on
[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2
[all]
#dtoverlay=vc4-fkms-v3d
+# Boot to kernel with eBPF
+kernel=kernelebpf.img
```
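The same change can be made without an editor. The sketch below is an assumption-laden helper (`set_boot_kernel` is a hypothetical name, and it uses GNU `sed -i`): it rewrites an existing `kernel=` line if there is one and otherwise appends the two lines from the diff above. It does not look at section filters such as `[pi4]`, so check the file afterwards.

```shell
# set_boot_kernel CONFIG_TXT IMAGE   (hypothetical helper)
set_boot_kernel() {
    config_txt="$1"; image="$2"
    if grep -q "^kernel=" "$config_txt"; then
        # A kernel= line already exists: point it at the new image.
        sed -i "s|^kernel=.*|kernel=${image}|" "$config_txt"
    else
        # No kernel= line yet: append one at the end of the file.
        printf '\n# Boot to kernel with eBPF\nkernel=%s\n' "$image" >> "$config_txt"
    fi
}
```

Because `/boot/config.txt` is owned by root, this would have to run from a root shell, for example as `set_boot_kernel /boot/config.txt kernelebpf.img`. Running it twice leaves a single `kernel=` line.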
Then reboot:
```=
$ sudo reboot
```
### Step 10: Verify the result
After rebooting and logging in, check the kernel version:
```=
$ uname -a
```
It is now the kernel that was just built:
```=
$ uname -a
Linux raspberrypi 5.4.59-v7-with-eBPF+ #1 SMP Tue Sep 1 16:59:18 BST 2020 armv7l GNU/Linux
```
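> Alternatively, query the kernel config directly and confirm that the adjusted options are enabled.
## Part 3: Building ply and perf
Whether BCC was installed with the package manager or built from source, running the programs in its `tools` directory produced errors like the following: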
```=
$ ./runqlen.py
In file included from <built-in>:3:
In file included from /virtual/include/bcc/helpers.h:53:
In file included from include/linux/log2.h:12:
In file included from include/linux/bitops.h:26:
In file included from arch/arm/include/asm/bitops.h:243:
In file included from include/asm-generic/bitops/lock.h:5:
In file included from include/linux/atomic.h:7:
In file included from arch/arm/include/asm/atomic.h:16:
arch/arm/include/asm/cmpxchg.h:128:2: error: "SMP is not supported on this platform"
#error "SMP is not supported on this platform"
[...]
In file included from <built-in>:3:
In file included from /virtual/include/bcc/helpers.h:54:
In file included from arch/arm/include/asm/page.h:160:
arch/arm/include/asm/memory.h:220:26: error: value '2164260864' out of range for constraint
'I'
__pv_stub(x, t, "add", __PV_BITS_31_24);
^~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:175:25: note: expanded from macro '__PV_BITS_31_24'
#define __PV_BITS_31_24 0x81000000
^~~~~~~~~~
arch/arm/include/asm/memory.h:193:21: note: expanded from macro '__pv_stub'
: "r" (from), "I" (type))
^~~~
arch/arm/include/asm/memory.h:222:3: error: value '129' out of range for constraint 'I'
__pv_stub_mov_hi(t);
^~~~~~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:202:9: note: expanded from macro '__pv_stub_mov_hi'
: "I" (__PV_BITS_7_0))
^~~~~~~~~~~~~
arch/arm/include/asm/memory.h:176:23: note: expanded from macro '__PV_BITS_7_0'
#define __PV_BITS_7_0 0x81
^~~~
arch/arm/include/asm/memory.h:223:3: error: value '2164260864' out of range for constraint 'I'
__pv_add_carry_stub(x, t);
^~~~~~~~~~~~~~~~~~~~~~~~~
arch/arm/include/asm/memory.h:212:18: note: expanded from macro '__pv_add_carry_stub'
: "r" (x), "I" (__PV_BITS_31_24) \
^~~~~~~~~~~~~~~
[...]
```
And installing the `bpftrace` package with `snap` reports that the architecture is not supported:
```=
$ sudo snap install bpftrace
error: snap "bpftrace" is not available on stable for this architecture (armhf) but exists on
other architectures (amd64, arm64, ppc64el, s390x).
```
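So I decided to use [ply](https://github.com/iovisor/ply) as a temporary substitute. `ply` is used much like `bpftrace`, but with a smaller feature set.
### Step 11: Build ply
First install the build tools: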
```
$ sudo apt install autogen autoconf libtool
```
Next, download [ply](https://github.com/iovisor/ply#build-and-installation):
```
$ git clone https://github.com/iovisor/ply.git
```
Then build it following the [README](https://github.com/iovisor/ply#build-and-installation):
```
$ cd ply
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
```
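During installation, output like the following may appear. It means that the libraries were installed in `/usr/local/lib`, and that the dynamic linker has to be told about that directory: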
```
[...]
libtool: finish: PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin" ldconfig -n /usr/local/lib
----------------------------------------------------------------------
Libraries have been installed in:
/usr/local/lib
If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the '-LLIBDIR'
flag during linking and do at least one of the following:
- add LIBDIR to the 'LD_LIBRARY_PATH' environment variable
during execution
- add LIBDIR to the 'LD_RUN_PATH' environment variable
during linking
- use the '-Wl,-rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to '/etc/ld.so.conf'
See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
make[4]: Nothing to be done for 'install-data-am'.
make[4]: Leaving directory '/home/pi/ply/src/libply'
[...]
```
Edit `/etc/ld.so.conf`:
```
$ sudo vim /etc/ld.so.conf
```
Add `/usr/local/lib` to it:
```diff=
include /etc/ld.so.conf.d/*.conf
+/usr/local/lib
```
Then run `ldconfig`:
```
$ sudo ldconfig
```
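Since `/etc/ld.so.conf` already includes `/etc/ld.so.conf.d/*.conf`, a possible alternative to editing the main file is a drop-in file in that directory. This is a sketch under that assumption; the helper name `add_ld_dropin` and the file name `usr-local-lib.conf` are made up for illustration.

```shell
# add_ld_dropin CONF_D_DIR LIB_DIR   (hypothetical helper)
# Writes a one-line drop-in file naming LIB_DIR into CONF_D_DIR.
add_ld_dropin() {
    conf_d="$1"; lib_dir="$2"
    # The file name is arbitrary as long as it ends in .conf.
    printf '%s\n' "$lib_dir" > "${conf_d}/usr-local-lib.conf"
}
```

From a root shell this would be `add_ld_dropin /etc/ld.so.conf.d /usr/local/lib`, followed by `ldconfig` as above.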
### Step 12: Test ply
```
$ sudo ply 'kretprobe:vfs_read { @["size"] = quantize(retval); }'
```
The message `ply: active` appears:
```
$ sudo ply 'kretprobe:vfs_read { @["size"] = quantize(retval); }'
ply: active
```
Wait a while, then press Ctrl + C to stop the program. The collected statistics are printed:
```
^Cply: deactivating
@:
{ size }:
[ 0, 1] 3 ┤█████████▋ │
[ 2, 3] 1 ┤███▎ │
...
[ 8, 15] 2 ┤██████▍ │
[ 16, 31] 1 ┤███▎ │
[ 32, 63] 1 ┤███▎ │
...
[ 256, 511] 1 ┤███▎ │
...
[ 1k, 2k) 1 ┤███▎ │
```
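### Step 13: Build perf
The package manager appears to offer only `perf_4.9`, while the running kernel is 5.4, so `perf` has to be built from source. First install the dependencies: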
```shell
$ sudo apt-get -y install flex bison libdw-dev libnewt-dev binutils-dev libaudit-dev \
binutils-dev libssl-dev python-dev systemtap-sdt-dev libiberty-dev libperl-dev \
liblzma-dev libpython-dev libunwind-* asciidoc xmlto
```
Then build `perf`. Change into the Linux kernel source directory and run:
```
$ make -C tools/perf/
```
### Step 14: Test perf
After the build, `perf` is located at `tools/perf/perf`. Check whether it can list events:
```=
$ ./tools/perf/perf list
```
The expected output looks like this:
```
List of pre-defined events (to be used in -e):
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
[...]
```
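Once events are listed, a natural next check is counting a couple of them for a short command with `perf stat`. The wrapper below is a small sketch: `run_perf_stat` is a hypothetical helper that only verifies the freshly built binary exists before running it, and the `cycles` and `instructions` events are taken from the list above.

```shell
# run_perf_stat PERF_BINARY COMMAND [ARGS...]   (hypothetical helper)
# Counts cycles and instructions for COMMAND using the given perf binary.
run_perf_stat() {
    perf_bin="$1"; shift
    if [ ! -x "$perf_bin" ]; then
        echo "perf binary not found: $perf_bin" >&2
        return 1
    fi
    "$perf_bin" stat -e cycles,instructions -- "$@"
}
```

From the kernel source directory this could be run as `run_perf_stat ./tools/perf/perf sleep 1`. Depending on the system's `perf_event_paranoid` setting, it may need to be run as root.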