# Multi-NIC Pods with Canal and Multus on a Kubeadm Cluster

## Prerequisites

- [x] A Kubernetes cluster (1 control plane + 2 workers, "1m2w") installed with kubeadm, with the nodes spread across separate physical machines

## Installing Multus

* Run the following on every node to download the `multus` binary and place it in the `/opt/cni/bin` directory:

```
$ wget https://github.com/k8snetworkplumbingwg/multus-cni/releases/download/v4.2.1/multus-cni_4.2.1_linux_amd64.tar.gz
$ tar -zxvf multus-cni_4.2.1_linux_amd64.tar.gz
$ sudo cp multus-cni_4.2.1_linux_amd64/multus /opt/cni/bin
```

* From the control plane, deploy the following DaemonSet:

```
$ kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset.yml
```

* Check the `multus` deployment status:

```
$ kubectl -n kube-system get po -l app=multus
NAME                   READY   STATUS    RESTARTS   AGE
kube-multus-ds-42qkg   1/1     Running   0          4m41s
kube-multus-ds-djm94   1/1     Running   0          4m40s
kube-multus-ds-fdksz   1/1     Running   0          4m40s
```

## Single-node pod network test

* Running the `bridge` plugin binary directly confirms it is present on the node, along with the CNI protocol versions it supports:

```
$ sudo /opt/cni/bin/bridge
CNI bridge plugin v1.4.0
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0
```

The NetworkAttachmentDefinition below uses the `bridge` plugin with `host-local` IPAM. The key fields:

* `"subnet": "10.10.0.0/16"` : defines the IP range from which each pod's second NIC gets its address.
* `"type": "bridge"` : a Linux virtual bridge; every pod's second NIC is plugged into it, so this setup only lets pods communicate on the same node.
* `"bridge": "mynet0"` : the virtual bridge is named `mynet0`.
* `"cniVersion"` : the CNI spec version this configuration file uses.
* `"type": "host-local"` : the pod's second IP is allocated locally on the node by the CNI `host-local` IPAM plugin, from the subnet range you define.

```
$ cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-conf
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "mynet",
    "type": "bridge",
    "bridge": "mynet0",
    "ipam": {
      "type": "host-local",
      "subnet": "10.10.0.0/16",
      "rangeStart": "10.10.1.20",
      "rangeEnd": "10.10.3.50"
    }
  }'
EOF
```

```
$ kubectl get network-attachment-definitions
NAME          AGE
bridge-conf   7s
```

* Declare the second network on the pods with an `annotations` entry, and schedule them onto the same node:

```
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: samplepod1
  labels:
    app: samplepod
  annotations:
    k8s.v1.cni.cncf.io/networks: bridge-conf
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - samplepod
        topologyKey: "kubernetes.io/hostname"
  containers:
  - name: samplepod
    image: taiwanese/debug.alp
    tty: true
    imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Pod
metadata:
  name: samplepod2
  labels:
    app: samplepod
  annotations:
    k8s.v1.cni.cncf.io/networks: bridge-conf
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - samplepod
        topologyKey: "kubernetes.io/hostname"
  containers:
  - name: samplepod
    image: taiwanese/debug.alp
    tty: true
    imagePullPolicy: IfNotPresent
EOF
```

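* Multus also records what it attached in each pod's `k8s.v1.cni.cncf.io/network-status` annotation. As an optional sanity check, you can read it directly; this is a sketch assuming the default Multus deployment, and the exact JSON output will vary with your environment:

```
$ kubectl get pod samplepod1 \
    -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'
```
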
* Check both NICs inside the pods:

```
$ kubectl get po -owide
NAME         READY   STATUS    RESTARTS   AGE   IP               NODE   NOMINATED NODE   READINESS GATES
samplepod1   1/1     Running   0          32s   10.244.190.104   w1     <none>           <none>
samplepod2   1/1     Running   0          32s   10.244.190.105   w1     <none>           <none>

$ kubectl exec samplepod1 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if80: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether f6:7d:35:af:0e:1e brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.137/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f47d:35ff:feaf:e1e/64 scope link
       valid_lft forever preferred_lft forever
3: net1@if82: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether d6:6c:bf:0b:7f:eb brd ff:ff:ff:ff:ff:ff
    inet 10.10.1.20/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::d46c:bfff:fe0b:7feb/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec samplepod2 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if79: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 5e:ac:25:83:b4:51 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.138/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5cac:25ff:fe83:b451/64 scope link
       valid_lft forever preferred_lft forever
3: net1@if83: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether ce:9a:cb:9d:ae:03 brd ff:ff:ff:ff:ff:ff
    inet 10.10.1.21/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::cc9a:cbff:fe9d:ae03/64 scope link
       valid_lft forever preferred_lft forever
```

* Verify that the pods can reach each other over the second NIC:

```
$ kubectl exec samplepod1 -- ping -c 4 10.10.1.21
PING 10.10.1.21 (10.10.1.21): 56 data bytes
64 bytes from 10.10.1.21: seq=0 ttl=64 time=0.259 ms
64 bytes from 10.10.1.21: seq=1 ttl=64 time=0.180 ms
64 bytes from 10.10.1.21: seq=2 ttl=64 time=0.342 ms
64 bytes from 10.10.1.21: seq=3 ttl=64 time=0.187 ms

--- 10.10.1.21 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.180/0.242/0.342 ms
```

* Both pods' second NICs are attached to the `mynet0` virtual bridge on the node, which is why they can reach each other:

```
$ brctl show
bridge name     bridge id               STP enabled     interfaces
mynet0          8000.fe6641b7a5a2       no              veth8fc685fc
                                                        vethdb320cdd
```

* Clean up:

```
$ kubectl delete po samplepod1 samplepod2
$ kubectl delete network-attachment-definitions bridge-conf
```

## Cross-node pod network test

Download the `whereabouts` CNI plugin, an IP address management (IPAM) plugin used to allocate pod IP addresses across the cluster:

```
$ git clone https://github.com/k8snetworkplumbingwg/whereabouts && cd whereabouts
$ kubectl apply \
    -f doc/crds/daemonset-install.yaml \
    -f doc/crds/whereabouts.cni.cncf.io_ippools.yaml \
    -f doc/crds/whereabouts.cni.cncf.io_overlappingrangeipreservations.yaml

$ kubectl -n kube-system get po -l app=whereabouts
NAME                READY   STATUS    RESTARTS   AGE
whereabouts-h2pqb   1/1     Running   0          24s
whereabouts-km6bx   1/1     Running   0          24s
whereabouts-znrhx   1/1     Running   0          24s

$ sudo ls -l /opt/cni/bin/whereabouts
-rwxr-xr-x 1 root root 62247427 Jun 20 11:17 /opt/cni/bin/whereabouts
```

* Create an additional network based on `macvlan`. It lets pods on a node communicate, over the physical NIC, with pods on other nodes (and with the nodes themselves). Every pod attached to this macvlan-based additional network gets a unique MAC address. The key fields in the config below:
* `"master": "ens18"` : the name of the physical network interface (e.g. `eth0`, `ens18`) that the macvlan virtual NICs are created on top of, so be sure to replace it with your own interface name (see the check below).
* `"mode": "bridge"` : one of macvlan's operating modes and the default; the virtual interfaces can reach each other directly, but the host itself cannot talk to the macvlan interfaces through the parent NIC.
* `"dst": "10.10.0.0/16"` : adds a route so that traffic destined for the `10.10.0.0/16` network goes out the Multus-created `net1` NIC, while everything else still goes through Canal.

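* Because `master` must name a real interface on every node where these pods can run, and interface names differ between machines, it is worth checking before applying the definition below. Two standard iproute2 commands (`ens18` is simply this guide's environment):

```
# List the node's interfaces in brief form:
$ ip -br link show

# Or look at which interface carries the default route:
$ ip route | grep default
```
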
```
$ cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "ens18",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "10.10.0.0/16",
      "routes": [
        { "dst": "10.10.0.0/16" }
      ]
    }
  }'
EOF
```

```
$ kubectl get network-attachment-definitions
NAME           AGE
macvlan-conf   6s
```

* Declare the second network on the pods with an `annotations` entry, and schedule them onto different nodes:

```
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: samplepod1
  labels:
    app: samplepod
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-conf
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - samplepod
        topologyKey: "kubernetes.io/hostname"
  containers:
  - name: samplepod
    image: taiwanese/debug.alp
    tty: true
    imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Pod
metadata:
  name: samplepod2
  labels:
    app: samplepod
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-conf
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - samplepod
        topologyKey: "kubernetes.io/hostname"
  containers:
  - name: samplepod
    image: taiwanese/debug.alp
    tty: true
    imagePullPolicy: IfNotPresent
EOF
```

* Check both NICs inside the pods:

```
$ kubectl get po -owide
NAME         READY   STATUS    RESTARTS   AGE   IP            NODE   NOMINATED NODE   READINESS GATES
samplepod1   1/1     Running   0          10s   10.244.1.66   w1     <none>           <none>
samplepod2   1/1     Running   0          10s   10.244.2.45   w2     <none>           <none>

$ kubectl exec samplepod1 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if489: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether b6:7b:64:5e:09:65 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.29/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b47b:64ff:fe5e:965/64 scope link
       valid_lft forever preferred_lft forever
3: net1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 0e:44:33:10:fb:50 brd ff:ff:ff:ff:ff:ff
    inet 10.10.0.2/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::c44:33ff:fe10:fb50/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec samplepod2 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if521: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether aa:b2:91:7f:52:4c brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.48/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a8b2:91ff:fe7f:524c/64 scope link
       valid_lft forever preferred_lft forever
3: net1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether ba:ba:05:b0:51:d9 brd ff:ff:ff:ff:ff:ff
    inet 10.10.0.1/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::b8ba:5ff:feb0:51d9/64 scope link
       valid_lft forever preferred_lft forever
```

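* whereabouts tracks these cluster-wide allocations in the IPPool custom resources installed by the CRD manifests above. If you want to confirm where the `10.10.0.0/16` addresses are being reserved, a quick look (the pool's name and namespace depend on how whereabouts was installed, so the output will vary):

```
$ kubectl get ippools.whereabouts.cni.cncf.io -A
```
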
* Verify that the pods can communicate across nodes over the second NIC:

```
$ kubectl exec samplepod1 -- ping -c 2 -I net1 10.10.0.1
PING 10.10.0.1 (10.10.0.1): 56 data bytes
64 bytes from 10.10.0.1: seq=0 ttl=64 time=1.551 ms
64 bytes from 10.10.0.1: seq=1 ttl=64 time=0.707 ms

--- 10.10.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.707/1.129/1.551 ms
```

* And the pod can still reach the Internet (non-`10.10.0.0/16` traffic keeps going through Canal):

```
$ kubectl exec samplepod1 -- curl -Is www.google.com
HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-_PgWdSsyyXM5UX_qhXgYqA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Mon, 23 Jun 2025 14:39:47 GMT
```

* Clean up:

```
$ kubectl delete po samplepod1 samplepod2
```

## Performance test

* Create an `iperf` DaemonSet that does not declare the `multus` network:

```
$ echo 'apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: no-multus-iperf
  namespace: default
spec:
  selector:
    matchLabels:
      app: iperf3
  template:
    metadata:
      labels:
        app: iperf3
    spec:
      containers:
      - name: iperf3
        image: leodotcloud/swiss-army-knife
        command: ["iperf3"]
        args: ["-s", "-p 12345"]
        ports:
        - containerPort: 12345' | kubectl apply -f -
```

* Test the throughput from the pod on w1 to the pod on w2:

```
$ kubectl get po -owide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
no-multus-iperf-kthz7   1/1     Running   0          90s   10.244.1.3   w1     <none>           <none>
no-multus-iperf-lk7xm   1/1     Running   0          90s   10.244.2.3   w2     <none>           <none>
no-multus-iperf-lp9l5   1/1     Running   0          90s   10.244.0.6   m1     <none>           <none>
```

```
$ kubectl exec no-multus-iperf-kthz7 -- iperf3 -c 10.244.2.3 -p 12345
Connecting to host 10.244.2.3, port 12345
[ 4] local 10.244.1.3 port 41996 connected to 10.244.2.3 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[ 4]   0.00-1.00   sec   110 MBytes   923 Mbits/sec  194   1.32 MBytes
[ 4]   1.00-2.00   sec   120 MBytes  1.01 Gbits/sec    6   1.01 MBytes
[ 4]   2.00-3.00   sec   102 MBytes   859 Mbits/sec    0   1.08 MBytes
[ 4]   3.00-4.00   sec   110 MBytes   920 Mbits/sec    0   1.15 MBytes
[ 4]   4.00-5.00   sec   129 MBytes  1.08 Gbits/sec    0   1.22 MBytes
[ 4]   5.00-6.00   sec   147 MBytes  1.23 Gbits/sec    0   1.30 MBytes
[ 4]   6.00-7.00   sec   165 MBytes  1.38 Gbits/sec    0   1.38 MBytes
[ 4]   7.00-8.00   sec   166 MBytes  1.39 Gbits/sec    0   1.46 MBytes
[ 4]   8.00-9.00   sec   187 MBytes  1.57 Gbits/sec   22   1.11 MBytes
[ 4]   9.00-10.00  sec   160 MBytes  1.34 Gbits/sec    0   1.23 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[ 4]   0.00-10.00  sec  1.36 GBytes  1.17 Gbits/sec  222             sender
[ 4]   0.00-10.00  sec  1.36 GBytes  1.17 Gbits/sec                  receiver

iperf Done.
```

* This measures pod network performance across nodes and across physical machines.
* Load-testing over the Canal VXLAN network, the bandwidth is `1.17 Gbits/sec`.

![image](https://hackmd.io/_uploads/ByDd5cMEel.png)

* Clean up:

```
$ kubectl delete ds no-multus-iperf
```

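* Optionally, before the next test you can measure the raw node-to-node bandwidth to put both results in context. A minimal sketch, assuming `iperf3` is installed on the worker hosts themselves and using a placeholder for w2's physical address:

```
# On w2 (server side):
$ iperf3 -s -p 12345

# On w1 (client side); <w2-node-ip> is a placeholder for w2's physical IP:
$ iperf3 -c <w2-node-ip> -p 12345
```
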
* Create an `iperf` DaemonSet that does declare the `multus` network:

```
$ echo 'apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds-iperf
  namespace: default
spec:
  selector:
    matchLabels:
      app: iperf3
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf
      labels:
        app: iperf3
    spec:
      containers:
      - name: iperf3
        image: leodotcloud/swiss-army-knife
        command: ["iperf3"]
        args: ["-s", "-p 12345"]
        ports:
        - containerPort: 12345' | kubectl apply -f -
```

* Check the second NIC on each pod, then test the throughput from the pod on w1 to the pod on w2:

```
$ kubectl get po -owide
NAME             READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
ds-iperf-5v8jc   1/1     Running   0          8s    10.244.0.7   m1     <none>           <none>
ds-iperf-glrzl   1/1     Running   0          8s    10.244.1.4   w1     <none>           <none>
ds-iperf-j5vl8   1/1     Running   0          8s    10.244.2.4   w2     <none>           <none>

$ kubectl exec ds-iperf-glrzl -- ip a s net1
3: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 46:7f:0d:8a:aa:8b brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.0.1/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::447f:dff:fe8a:aa8b/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec ds-iperf-j5vl8 -- ip a s net1
3: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1e:7e:4c:6d:5d:f1 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.0.2/16 brd 10.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::1c7e:4cff:fe6d:5df1/64 scope link
       valid_lft forever preferred_lft forever
```

* This measures pod network performance across nodes and across physical machines.
* The physical link is 5 Gbit/s; load-testing over the second (macvlan) NIC, the cross-node pod bandwidth reaches `4.36 Gbits/sec`, nearly saturating the 5 Gbit/s link.

```
$ kubectl exec ds-iperf-glrzl -- iperf3 -c 10.10.0.2 -p 12345
Connecting to host 10.10.0.2, port 12345
[ 4] local 10.10.0.1 port 48188 connected to 10.10.0.2 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[ 4]   0.00-1.00   sec   540 MBytes  4.52 Gbits/sec  200   1.18 MBytes
[ 4]   1.00-2.00   sec   561 MBytes  4.71 Gbits/sec    0   1.45 MBytes
[ 4]   2.00-3.00   sec   536 MBytes  4.50 Gbits/sec    0   1.60 MBytes
[ 4]   3.00-4.00   sec   495 MBytes  4.15 Gbits/sec  103   1.32 MBytes
[ 4]   4.00-5.00   sec   494 MBytes  4.14 Gbits/sec   10   1.51 MBytes
[ 4]   5.00-6.00   sec   655 MBytes  5.50 Gbits/sec  177   1.27 MBytes
[ 4]   6.00-7.00   sec   502 MBytes  4.21 Gbits/sec    0   1.44 MBytes
[ 4]   7.00-8.00   sec   498 MBytes  4.17 Gbits/sec    0   1.61 MBytes
[ 4]   8.00-9.00   sec   469 MBytes  3.93 Gbits/sec   16   1.36 MBytes
[ 4]   9.00-10.00  sec   454 MBytes  3.81 Gbits/sec    0   1.51 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[ 4]   0.00-10.00  sec  5.08 GBytes  4.36 Gbits/sec  506             sender
[ 4]   0.00-10.00  sec  5.08 GBytes  4.36 Gbits/sec                  receiver

iperf Done.
```

![image](https://hackmd.io/_uploads/B1VUjcfEll.png)

* Clean up:

```
$ kubectl delete ds ds-iperf
```

## References

https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md
https://github.com/k8snetworkplumbingwg/multus-cni
https://github.com/k8snetworkplumbingwg/whereabouts