# Plumbr on K8s: Using macvlan + plumr as a sidecar

Let's run plumbr as a sidecar and send the tun traffic over a macvlan interface.

### General gist is...

* Create a secondary pod interface for macvlan (the "net1" interface)
* Plumr runs in an init container and creates a tun device with IP addressing related to the macvlan interface
* This is a kind of "sidecar" approach, and the plumr binary runs as a background process

### Requirements

* Kubernetes cluster with Multus CNI installed (and operational).

### Limitations

* Uses statically defined IP addressing
* Plumr is run as a sidecar "manually"
    * That is, with a yaml specification for it
    * This could be improved to be automated (addressing, too)
* Has rather loose security restrictions for the pod itself.
    * This can be addressed with CNI + a controller.
* Macvlan is not very friendly in public cloud networks due to "port security", which limits this approach in public clouds.
* If the plumr binary fails, the pod will keep running
    * This could be improved with a controller as well (a hedged probe sketch that at least surfaces the failure is included after the yaml below).

## Setup

Copy the `plumr` binary to each host; in this case it's in `/t/plumrbin/plumr`

Label two nodes similarly to this:

```
kubectl label node stein-node-1 plumbr-side=left
kubectl label node stein-node-2 plumbr-side=right
```

In this example, there is an additional interface (`eth1`) on each host, which macvlan uses as its master.
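Before creating the resources, a couple of quick sanity checks can help (these assume the node labels and interface name above):

```
# Confirm the node labels landed and that Multus' NetworkAttachmentDefinition CRD exists:
kubectl get nodes -L plumbr-side
kubectl get crd network-attachment-definitions.k8s.cni.cncf.io

# On each host, confirm the additional interface that macvlan will use as its master:
ip link show eth1
```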
## Yaml resources

`kubectl create` the following yaml resources...

```yaml=
apiVersion: v1
kind: ConfigMap
metadata:
  name: plumbr-entrypoint-config
data:
  plumbr-entrypoint.sh: |
    #!/bin/sh
    # Template the plumr pipeline config (tun <-> udp) and then run plumr.
    TUN_IP="${TUN_IP:-10.4.0.1}"
    UDP_REMOTE_IP="${UDP_REMOTE_IP:-192.168.122.111}"
    JSON_TEMPLATE='{
      "pipeline": [
        {
          "name": "tun",
          "tun": {
            "input": "tun_input",
            "output": "udp_input",
            "ip": "%s"
          }
        },
        {
          "name": "udp",
          "udp": {
            "input": "udp_input",
            "output": "tun_input",
            "remote_address": "%s:9999"
          }
        }
      ]
    }'
    echo "templating json..."
    JSON_CONTENT=$(printf "$JSON_TEMPLATE" "$TUN_IP" "$UDP_REMOTE_IP")
    echo "$JSON_CONTENT" > /shared-data/config.json
    echo "running plumr..."
    /plumrbin/plumr -f /shared-data/config.json
    sleep infinity
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "static",
        "capabilities": { "ips": true }
      }
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-a
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "macvlan-conf",
        "ips": [ "192.0.2.100/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    plumbr-side: left
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen1
    # The entrypoint is backgrounded so the init container completes while plumr keeps running.
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.1"
    - name: UDP_REMOTE_IP
      value: "192.0.2.200"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /home/fedora/plumrbin
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-b
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "macvlan-conf",
        "ips": [ "192.0.2.200/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    plumbr-side: right
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.2"
    - name: UDP_REMOTE_IP
      value: "192.0.2.100"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /home/fedora/plumrbin
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
```
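As noted in the limitations, if plumr dies the pod keeps running. A minimal, untested sketch of a `livenessProbe` that could be added to the `workload` container above to at least surface that condition (the probe command is an assumption; it only fails the container when the tun device disappears, and it won't bring plumr back by itself, since init containers don't re-run on a container restart):

```yaml
# Hypothetical addition to the "workload" container spec above (untested):
livenessProbe:
  exec:
    command: ["/bin/sh", "-c", "ip link show tun"]
  initialDelaySeconds: 15
  periodSeconds: 20
```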
Then you can exec into a pod...

```
[fedora@stein-master-1 ~]$ kubectl exec -it plumbr-example-a -- /bin/sh
Defaulted container "workload" out of: workload, run-plumr (init)
sh-5.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 1a:a6:6f:9f:95:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.2.18/24 brd 10.244.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::18a6:6fff:fe9f:95c6/64 scope link
       valid_lft forever preferred_lft forever
4: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether b6:a1:d1:35:dc:63 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.0.2.100/24 brd 192.0.2.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::b4a1:d1ff:fe35:dc63/64 scope link
       valid_lft forever preferred_lft forever
5: tun: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1452 qdisc fq_codel state UNKNOWN group default qlen 500
    link/none
    inet 10.4.0.1/24 scope global tun
       valid_lft forever preferred_lft forever
    inet6 fe80::7921:f1d4:3600:435c/64 scope link stable-privacy
       valid_lft forever preferred_lft forever
sh-5.2# ping 192.0.2.200
PING 192.0.2.200 (192.0.2.200) 56(84) bytes of data.
64 bytes from 192.0.2.200: icmp_seq=1 ttl=64 time=0.054 ms
64 bytes from 192.0.2.200: icmp_seq=2 ttl=64 time=0.054 ms
64 bytes from 192.0.2.200: icmp_seq=3 ttl=64 time=0.066 ms
^C
--- 192.0.2.200 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.054/0.058/0.066/0.005 ms
```

Note that we're seeing:

* `eth0`: the default pod-to-pod network
* `net1`: our additional interface, which uses macvlan
* `tun`: the `tun` device created by `plumr`

We also have connectivity on `net1`, as evidenced by the ping.

We can also review the shared disk resources (the json config and the log from the entrypoint script), which the workload shares with the plumr init container:

```
[fedora@stein-master-1 ~]$ kubectl exec -it plumbr-example-a -- /bin/sh
Defaulted container "workload" out of: workload, run-plumr (init)
sh-5.2# ls /shared-data/
config.json  entrypoint.log
sh-5.2# cat /shared-data/config.json
{
  "pipeline": [
    {
      "name": "tun",
      "tun": {
        "input": "tun_input",
        "output": "udp_input",
        "ip": "10.4.0.1"
      }
    },
    {
      "name": "udp",
      "udp": {
        "input": "udp_input",
        "output": "tun_input",
        "remote_address": "192.0.2.200:9999"
      }
    }
  ]
}
sh-5.2# cat /shared-data/entrypoint.log
templating json...
running plumr...
```

## Retrofitted for bridge...

NOTE: This is a WIP; it's modified for the bridge CNI, but I had something wrong with connectivity in my lab.

### Bridge addition...

Existing IPs...

* node2: `10.1.3.47/24`
* node1: `10.1.3.166/24`

```
nmcli connection add type bridge ifname brplumr0 con-name brplumr0
nmcli connection add type ethernet ifname eth1 master brplumr0
nmcli connection modify brplumr0 connection.autoconnect yes
nmcli connection up brplumr0
nmcli connection show
nmcli connection modify brplumr0 ipv4.addresses 10.1.3.200/24 ipv4.method manual
nmcli connection down brplumr0
nmcli connection up brplumr0
```

This is done on both nodes (changing the bridge IP accordingly on each).
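A few quick checks that the bridge came up as intended (connection and interface names as above):

```
# brplumr0 should be active, eth1 should be enslaved, and the bridge should carry its IP:
nmcli -f NAME,DEVICE,STATE connection show --active
ip link show master brplumr0
ip addr show brplumr0
```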
I can see a ping over the bridge...

```
[root@stein-node-2 fedora]# ping 10.1.3.200
PING 10.1.3.200 (10.1.3.200) 56(84) bytes of data.
64 bytes from 10.1.3.200: icmp_seq=1 ttl=64 time=0.221 ms
64 bytes from 10.1.3.200: icmp_seq=2 ttl=64 time=0.071 ms
```

Let's see about the yaml...

```yaml=
apiVersion: v1
kind: ConfigMap
metadata:
  name: plumbr-entrypoint-config
data:
  plumbr-entrypoint.sh: |
    #!/bin/sh
    TUN_IP="${TUN_IP:-10.4.0.1}"
    UDP_REMOTE_IP="${UDP_REMOTE_IP:-192.168.122.111}"
    JSON_TEMPLATE='{
      "pipeline": [
        {
          "name": "tun",
          "tun": {
            "input": "tun_input",
            "output": "udp_input",
            "ip": "%s"
          }
        },
        {
          "name": "udp",
          "udp": {
            "input": "udp_input",
            "output": "tun_input",
            "remote_address": "%s:9999"
          }
        }
      ]
    }'
    echo "templating json..."
    JSON_CONTENT=$(printf "$JSON_TEMPLATE" "$TUN_IP" "$UDP_REMOTE_IP")
    echo "$JSON_CONTENT" > /shared-data/config.json
    echo "running plumr..."
    /plumrbin/plumr -f /shared-data/config.json
    sleep infinity
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br-plumr-config
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "bridge",
      "bridge": "brplumr0",
      "mode": "bridge",
      "ipam": {
        "type": "static",
        "capabilities": { "ips": true }
      }
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-a
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "br-plumr-config",
        "ips": [ "192.0.2.100/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    plumbr-side: left
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.1"
    - name: UDP_REMOTE_IP
      value: "192.0.2.200"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /home/fedora/plumrbin
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-b
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "br-plumr-config",
        "ips": [ "192.0.2.200/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    plumbr-side: right
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.2"
    - name: UDP_REMOTE_IP
      value: "192.0.2.100"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /home/fedora/plumrbin
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
```
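Since connectivity is still an open question here (per the WIP note above), a few host-side checks that might help narrow it down (interface names per the setup above):

```
# The pod's veth should show up as a port on brplumr0:
bridge link show

# Watch whether ICMP from the pods actually hits the bridge, and whether it leaves via the uplink:
tcpdump -ni brplumr0 icmp
tcpdump -ni eth1 icmp
```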
### Connectivity

**IMPORTANT**: The bridging works by subnet, and it's set up like this...

* `10.10.0.0/16` - US
    * `10.10.1.0/24` - US (worker 1)
    * `10.10.2.0/24` - US (worker 2)
* `10.20.0.0/16` - APAC
    * `10.20.1.0/24` - APAC (worker 1)
    * `10.20.2.0/24` - APAC (worker 2)

"rosa label node"

```
nodeSelector:
  kubernetes.io/hostname: ip-10-1-246-201.ap-south-1.compute.internal
```

## More...

```
[fedora@stein-master-1 ~]$ kubectl exec -it plumbr-example-a -c workload -- /bin/sh
sh-5.2# iperf3 -s 0.0.0.0
```

and the other side...

```
sh-5.2# iperf3 -c 192.0.2.100 -u
```

I got nothing. I also pinged it plain, and I don't have connectivity either...

```
ping 192.0.2.100
```

## On Ali's AWS cluster

```
$ oc describe net-attach-def nad-w1
Name:         nad-w1
Namespace:    vlc
Labels:       <none>
Annotations:  <none>
API Version:  k8s.cni.cncf.io/v1
Kind:         NetworkAttachmentDefinition
Metadata:
  Creation Timestamp:  2023-09-11T14:32:45Z
  Generation:          1
  Managed Fields:
    API Version:  k8s.cni.cncf.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:config:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2023-09-11T14:32:45Z
  Resource Version:  125801
  UID:               b4b756e6-cf35-46a2-9271-207f78410bfc
Spec:
  Config:  { "cniVersion": "0.3.1", "name": "nad-w1", "type": "bridge", "isGateway": true, "ipam": { "type": "whereabouts", "range": "10.10.1.0/24", "exclude": [ "10.10.1.1/32", "10.10.1.254/32" ], "routes": [ { "dst": "10.10.2.0/24" }, { "dst": "10.20.0.0/16" } ] } }
Events:  <none>

$ oc get net-attach-def -A
NAMESPACE   NAME     AGE
vlc         nad-w1   31h
vlc         nad-w2   31h
```
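To confirm the routes from that NAD actually land in an attached pod, something like this should show them on `net1` (the pod name here is a placeholder for any pod attached to `nad-w1`):

```
# <pod-on-nad-w1> is a placeholder; expect entries for 10.10.2.0/24 and 10.20.0.0/16:
oc exec -it <pod-on-nad-w1> -- ip route show dev net1
oc exec -it <pod-on-nad-w1> -- ip -4 addr show net1
```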
# OHIO SIDE

```yaml=
apiVersion: v1
kind: ConfigMap
metadata:
  name: plumbr-entrypoint-config
data:
  plumbr-entrypoint.sh: |
    #!/bin/sh
    TUN_IP="${TUN_IP:-10.4.0.1}"
    UDP_REMOTE_IP="${UDP_REMOTE_IP:-192.168.122.111}"
    JSON_TEMPLATE='{
      "pipeline": [
        {
          "name": "tun",
          "tun": {
            "input": "tun_input",
            "output": "udp_input",
            "ip": "%s"
          }
        },
        {
          "name": "udp",
          "udp": {
            "input": "udp_input",
            "output": "tun_input",
            "remote_address": "%s:9999"
          }
        }
      ]
    }'
    echo "templating json..."
    JSON_CONTENT=$(printf "$JSON_TEMPLATE" "$TUN_IP" "$UDP_REMOTE_IP")
    echo "$JSON_CONTENT" > /shared-data/config.json
    echo "running plumr..."
    /plumrbin/plumr -f /shared-data/config.json
    sleep infinity
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br-plumr-config
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "bridge",
      "isGateway": true,
      "ipam": {
        "type": "static",
        "capabilities": { "ips": true },
        "routes": [
          { "dst": "10.10.2.0/24" },
          { "dst": "10.20.0.0/16" }
        ]
      }
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-a
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "br-plumr-config",
        "ips": [ "10.10.1.200/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    kubernetes.io/hostname: ip-10-0-132-142.us-east-2.compute.internal
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.1"
    - name: UDP_REMOTE_IP
      value: "10.20.1.200"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /tmp/plumr
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
```
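Once the Mumbai side (below) is also up, a quick check of the tunnel from the Ohio pod (untested here; names and addresses per the manifests above):

```
# The tun device should carry 10.4.0.1, and 10.4.0.2 is the Mumbai side's tun address:
oc exec -it plumbr-example-a -c workload -- ip addr show tun
oc exec -it plumbr-example-a -c workload -- ping -c 3 10.4.0.2
```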
# MUMBAI SIDE

```yaml=
apiVersion: v1
kind: ConfigMap
metadata:
  name: plumbr-entrypoint-config
data:
  plumbr-entrypoint.sh: |
    #!/bin/sh
    TUN_IP="${TUN_IP:-10.4.0.1}"
    UDP_REMOTE_IP="${UDP_REMOTE_IP:-192.168.122.111}"
    JSON_TEMPLATE='{
      "pipeline": [
        {
          "name": "tun",
          "tun": {
            "input": "tun_input",
            "output": "udp_input",
            "ip": "%s"
          }
        },
        {
          "name": "udp",
          "udp": {
            "input": "udp_input",
            "output": "tun_input",
            "remote_address": "%s:9999"
          }
        }
      ]
    }'
    echo "templating json..."
    JSON_CONTENT=$(printf "$JSON_TEMPLATE" "$TUN_IP" "$UDP_REMOTE_IP")
    echo "$JSON_CONTENT" > /shared-data/config.json
    echo "running plumr..."
    /plumrbin/plumr -f /shared-data/config.json
    sleep infinity
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br-plumr-config
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "bridge",
      "isGateway": true,
      "ipam": {
        "type": "static",
        "capabilities": { "ips": true },
        "routes": [
          { "dst": "10.20.2.0/24" },
          { "dst": "10.20.3.0/24" },
          { "dst": "10.10.0.0/16" }
        ]
      }
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: plumbr-example-b
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "br-plumr-config",
        "ips": [ "10.20.1.200/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    kubernetes.io/hostname: ip-10-1-241-95.ap-south-1.compute.internal
  initContainers:
  - name: run-plumr
    image: dougbtv/fedora-iptools:gen1
    command: ["/bin/sh", "-c", "/entrypoint/plumbr-entrypoint.sh > /shared-data/entrypoint.log 2>&1 &"]
    env:
    - name: TUN_IP
      value: "10.4.0.2"
    - name: UDP_REMOTE_IP
      value: "10.10.1.200"
    volumeMounts:
    - name: host-bin
      mountPath: /plumrbin/
    - name: plumbr-entrypoint-volume
      mountPath: /entrypoint
    - name: shared-volume
      mountPath: /shared-data
    securityContext:
      privileged: true
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
    - name: host-bin
      mountPath: /plumrbin/
    securityContext:
      privileged: true
  volumes:
  - name: host-bin
    hostPath:
      path: /tmp/plumr
      type: Directory
  - name: plumbr-entrypoint-volume
    configMap:
      name: plumbr-entrypoint-config
      defaultMode: 0744
  - name: shared-volume
    emptyDir: {}
```

## Stripped down

This seems to work...

Ohio:

```yaml=
apiVersion: v1
kind: Pod
metadata:
  name: mumbai-simplebridge
  annotations:
    k8s.v1.cni.cncf.io/networks: nad-w1
spec:
  shareProcessNamespace: true
  nodeSelector:
    kubernetes.io/hostname: ip-10-0-132-142.us-east-2.compute.internal
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
```

Mumbai:

```yaml=
apiVersion: v1
kind: Pod
metadata:
  name: mumbai-simplebridge
  annotations:
    k8s.v1.cni.cncf.io/networks: nad-w1
spec:
  shareProcessNamespace: true
  nodeSelector:
    kubernetes.io/hostname: ip-10-1-241-95.ap-south-1.compute.internal
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
```

And a variant for the Mumbai side using `br-plumr-config` with a static IP:

```yaml=
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br-plumr-config
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "nad-w1",
      "type": "bridge",
      "isGateway": true,
      "ipam": {
        "type": "static",
        "capabilities": { "ips": true },
        "routes": [
          { "dst": "10.20.2.0/24" },
          { "dst": "10.20.3.0/24" },
          { "dst": "10.10.0.0/16" }
        ]
      }
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: mumbai-stripped
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "br-plumr-config",
        "ips": [ "10.20.1.200/24" ]
      }
    ]'
spec:
  shareProcessNamespace: true
  nodeSelector:
    kubernetes.io/hostname: ip-10-1-241-95.ap-south-1.compute.internal
  containers:
  - name: workload
    image: dougbtv/fedora-iptools:gen2
    command: ["/bin/sh", "-c", "sleep 10000000000000000"]
```
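To verify the stripped-down pods, a hedged check along these lines: with whereabouts the `net1` addresses are dynamic, so read them off each side first (`<other-side-net1-address>` is a placeholder for whatever the opposite pod was assigned):

```
# On each side, confirm net1 and note its address:
oc exec -it mumbai-simplebridge -- ip -4 addr show net1

# Then ping across using the address observed on the other side:
oc exec -it mumbai-simplebridge -- ping -c 3 <other-side-net1-address>
```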