# Kube-vip on RKE2

The kube-vip project serves two purposes here:

1. A virtual IP (VIP) for the control plane
2. External IPs for Services of type LoadBalancer

Lab environment:

> VIP=192.168.11.107
> rke2-m1 IP=192.168.11.108
> rke2-m2 IP=192.168.11.109
> rke2-m3 IP=192.168.11.110

## Implementation

### Install RKE2 on rke2-m1

* Declare environment variables

```
# Adjust the interface name and each node's IP to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
$ export INTERFACE=eth1
$ export KUBE_VIP_VERSION=latest
```

* Write the RKE2 configuration

```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | tee /etc/rancher/rke2/config.yaml
node-name:
  - "rke2-m1"
token: my-shared-secret
tls-san:
  - ${RKE2_API_VIP}
  - ${RKE2_NODE_0_IP}
  - ${RKE2_NODE_1_IP}
  - ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF

$ cat /etc/rancher/rke2/config.yaml
node-name:
  - "rke2-m1"
token: my-shared-secret
tls-san:
  - 192.168.11.107
  - 192.168.11.108
  - 192.168.11.109
  - 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```

* Start the RKE2 cluster

```
$ sudo INSTALL_RKE2_CHANNEL=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```

* Copy the kubeconfig

```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME      STATUS   ROLES                       AGE   VERSION
rke2-m1   Ready    control-plane,etcd,master   2m    v1.31.3+rke2r1
```

* Create the kube-vip RBAC manifest

```
$ curl https://kube-vip.io/manifests/rbac.yaml > /var/lib/rancher/rke2/server/manifests/kube-vip-rbac.yaml
```

* Generate the kube-vip DaemonSet manifest

```
$ /var/lib/rancher/rke2/bin/crictl -r "unix:///run/k3s/containerd/containerd.sock" pull ghcr.io/kube-vip/kube-vip:$KUBE_VIP_VERSION
$ CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock ctr -n k8s.io run \
    --rm \
    --net-host \
    ghcr.io/kube-vip/kube-vip:$KUBE_VIP_VERSION vip /kube-vip manifest daemonset \
    --arp \
    --interface $INTERFACE \
    --address $RKE2_API_VIP \
    --controlplane \
    --leaderElection \
    --taint \
    --services \
    --inCluster | tee /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
```
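The generated `kube-vip.yaml` is what RKE2 will auto-apply from its manifests directory. As an optional sanity check, the flags passed above should show up as container environment variables in the DaemonSet; the variable names in the sketch below are typical for kube-vip but may differ between releases.

```
# Illustrative check only: the interface, VIP address and enabled features
# passed on the command line should appear as env vars in the manifest.
# (Env var names can vary between kube-vip versions.)
$ grep -E 'vip_interface|address|cp_enable|svc_enable|vip_leaderelection' \
    /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
```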
* Check that the kube-vip-ds pod is running

```
$ kubectl -n kube-system get po
NAME                                                   READY   STATUS      RESTARTS      AGE
cloud-controller-manager-rke2-m1                       1/1     Running     1 (10m ago)   10m
etcd-rke2-m1                                           1/1     Running     0             10m
helm-install-rke2-canal-kqm27                          0/1     Completed   0             10m
helm-install-rke2-coredns-mnm22                        0/1     Completed   0             10m
helm-install-rke2-ingress-nginx-2694w                  0/1     Completed   0             10m
helm-install-rke2-metrics-server-5mfg4                 0/1     Completed   0             10m
helm-install-rke2-snapshot-controller-crd-wprzq        0/1     Completed   0             10m
helm-install-rke2-snapshot-controller-nwlxk            0/1     Completed   1             10m
helm-install-rke2-snapshot-validation-webhook-x2zjb    0/1     Completed   0             10m
kube-apiserver-rke2-m1                                 1/1     Running     0             10m
kube-controller-manager-rke2-m1                        1/1     Running     0             10m
kube-proxy-rke2-m1                                     1/1     Running     0             10m
kube-scheduler-rke2-m1                                 1/1     Running     0             10m
kube-vip-ds-m6psp                                      1/1     Running     0             50s
rke2-canal-6n9vl                                       2/2     Running     0             10m
rke2-coredns-rke2-coredns-9579797d8-nwblk              1/1     Running     0             10m
rke2-coredns-rke2-coredns-autoscaler-78db5d674-t8szc   1/1     Running     0             10m
rke2-ingress-nginx-controller-h4587                    1/1     Running     0             9m52s
rke2-metrics-server-7c85d458bd-htktm                   1/1     Running     0             10m
rke2-snapshot-controller-65bc6fbd57-9hl2j              1/1     Running     0             10m
rke2-snapshot-validation-webhook-859c7896df-jjchr      1/1     Running     0             10m
```

* Check that eth1 now carries the 192.168.11.107 VIP

```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether bc:24:11:ce:bf:dd brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    altname ens18
    inet 192.168.11.108/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 192.168.11.107/32 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fece:bfdd/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
```
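At this point the control plane on rke2-m1 is already answering on the VIP. A quick, purely illustrative check from another machine on the same subnet: even an HTTP 401/403 body from the unauthenticated request below proves that the VIP is routed to a kube-apiserver.

```
# Run from a host outside the node (illustrative check)
$ ping -c 1 192.168.11.107
$ curl -k https://192.168.11.107:6443/version
```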
### Install RKE2 on rke2-m2

* Declare environment variables

```
# Adjust the interface name and each node's IP to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
```

* Write the RKE2 configuration

```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | tee /etc/rancher/rke2/config.yaml
server: https://${RKE2_API_VIP}:9345
node-name:
  - "rke2-m2"
token: my-shared-secret
tls-san:
  - ${RKE2_API_VIP}
  - ${RKE2_NODE_0_IP}
  - ${RKE2_NODE_1_IP}
  - ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF

$ cat /etc/rancher/rke2/config.yaml
node-name:
  - "rke2-m2"
token: my-shared-secret
tls-san:
  - 192.168.11.107
  - 192.168.11.108
  - 192.168.11.109
  - 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```

* Start the RKE2 cluster

```
$ sudo INSTALL_RKE2_CHANNEL=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```

* Copy the kubeconfig

```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME      STATUS   ROLES                       AGE     VERSION
rke2-m1   Ready    control-plane,etcd,master   2d23h   v1.31.3+rke2r1
rke2-m2   Ready    control-plane,etcd,master   38s     v1.31.3+rke2r1
```

* Check the kube-vip pod status

```
$ kubectl -n kube-system get po -l app.kubernetes.io/name=kube-vip-ds
NAME                READY   STATUS    RESTARTS      AGE
kube-vip-ds-m6psp   1/1     Running   4 (11m ago)   2d23h
kube-vip-ds-n49w7   1/1     Running   0             112s
```

* Check the eth1 interface

```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether bc:24:11:c1:eb:d9 brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    altname ens18
    inet 192.168.11.109/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fec1:ebd9/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
```

### Install RKE2 on rke2-m3

* Declare environment variables

```
# Adjust the interface name and each node's IP to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
```

* Write the RKE2 configuration

```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | tee /etc/rancher/rke2/config.yaml
server: https://${RKE2_API_VIP}:9345
node-name:
  - "rke2-m3"
token: my-shared-secret
tls-san:
  - ${RKE2_API_VIP}
  - ${RKE2_NODE_0_IP}
  - ${RKE2_NODE_1_IP}
  - ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF

$ cat /etc/rancher/rke2/config.yaml
node-name:
  - "rke2-m3"
token: my-shared-secret
tls-san:
  - 192.168.11.107
  - 192.168.11.108
  - 192.168.11.109
  - 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```

* Start the RKE2 cluster

```
$ sudo INSTALL_RKE2_CHANNEL=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```

* Copy the kubeconfig

```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME      STATUS   ROLES                       AGE     VERSION
rke2-m1   Ready    control-plane,etcd,master   2d23h   v1.31.3+rke2r1
rke2-m2   Ready    control-plane,etcd,master   5m56s   v1.31.3+rke2r1
rke2-m3   Ready    control-plane,etcd,master   21s     v1.31.3+rke2r1
```

* Check the kube-vip pod status

```
$ kubectl -n kube-system get po -l app.kubernetes.io/name=kube-vip-ds
NAME                READY   STATUS    RESTARTS      AGE
kube-vip-ds-2qpqb   1/1     Running   0             2m26s
kube-vip-ds-m6psp   1/1     Running   4 (17m ago)   2d23h
kube-vip-ds-n49w7   1/1     Running   0             7m57s
```

* Check the eth1 interface

```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether bc:24:11:c9:5d:4e brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    altname ens18
    inet 192.168.11.110/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fec9:5d4e/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
```

## Obtain a kubeconfig that uses the VIP

* Run the following on one of the masters, then copy the resulting kubeconfig to any machine that has kubectl installed.

```
$ sed "s|server: https://127.0.0.1:6443|server: https://$RKE2_API_VIP:6443|" /etc/rancher/rke2/rke2.yaml
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJlakNDQVIrZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWtNU0l3SUFZRFZRUUREQmx5YTJVeUxYTmwKY25abGNpMWpZVUF4TnpNNU5USXlOall6TUI0WERUSTFNREl4TkRBNE5EUXlNMW9YRFRNMU1ESXhNakE0TkRReQpNMW93SkRFaU1DQUdBMVVFQXd3WmNtdGxNaTF6WlhKMlpYSXRZMkZBTVRjek9UVXlNalkyTXpCWk1CTUdCeXFHClNNNDlBZ0VHQ0NxR1NNNDlBd0VIQTBJQUJJc3d5Mnk2M2VKMHdmVVdocVlYRzl6WjJaQ1dyNDI5amFPTkF0OVIKcWV0eHpQNTRtSVdjZXJJMDBXdlFmOG1lNnlRNk9JYnR2dFdSQmI3aHd2cXRmYnFqUWpCQU1BNEdBMVVkRHdFQgovd1FFQXdJQ3BEQVBCZ05WSFJNQkFmOEVCVEFEQVFIL01CMEdBMVVkRGdRV0JCVC9yWVN3NXR0bXpvOVRWMmpiClNDQ1gvMVhHc0RBS0JnZ3Foa2pPUFFRREFnTkpBREJHQWlFQTNoMU03blQvRzkyQzk5MUpNUGVxbTQxQnhFT1MKZlBZNXc3MjlPV0NGZm1nQ0lRQ3paMUZTTlIwTnZPOXBFMkljUW5wMWkzbkZ6MXlRcUFOWkRnajd4Y3A0Y2c9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    server: https://192.168.11.107:6443
......
```
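The command above only prints the rewritten kubeconfig to stdout. A minimal way to actually ship it to a workstation is sketched below; the temporary path, SSH user and destination are placeholders, adjust them to your environment.

```
# On the master: write the VIP-based kubeconfig to a file (path is an example)
$ sed "s|server: https://127.0.0.1:6443|server: https://$RKE2_API_VIP:6443|" \
    /etc/rancher/rke2/rke2.yaml > /tmp/rke2-vip.yaml

# On the workstation: fetch it and let kubectl use it
# ("root@192.168.11.108" and the destination path are placeholders)
$ scp root@192.168.11.108:/tmp/rke2-vip.yaml ~/.kube/config
```

With that kubeconfig in place, the workstation reaches the cluster through the VIP: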
```
$ kubectl get no
NAME      STATUS   ROLES                       AGE     VERSION
rke2-m1   Ready    control-plane,etcd,master   2d23h   v1.31.3+rke2r1
rke2-m2   Ready    control-plane,etcd,master   20m     v1.31.3+rke2r1
rke2-m3   Ready    control-plane,etcd,master   15m     v1.31.3+rke2r1
```

### Verify control-plane failover

* The VIP currently sits on rke2-m1; power rke2-m1 off

```
rke2-m1:~ # ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether bc:24:11:ce:bf:dd brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    altname ens18
    inet 192.168.11.108/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 192.168.11.107/32 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fece:bfdd/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
rke2-m1:~ # poweroff
```
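While rke2-m1 shuts down, it can be instructive to keep an eye on the API through the VIP from the workstation; a brief interruption is expected while the remaining kube-vip pods elect a new leader and start announcing the address via gratuitous ARP. A rough sketch:

```
# Illustrative watch loop, run on a workstation whose kubeconfig points at the VIP:
# prints "ok" while the API server behind 192.168.11.107 answers, "down" otherwise.
$ while true; do kubectl get --raw /readyz >/dev/null 2>&1 && echo "$(date +%T) ok" || echo "$(date +%T) down"; sleep 2; done
```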
* With rke2-m1 down, the cluster is still reachable through the VIP

```
$ kubectl get no
NAME      STATUS     ROLES                       AGE     VERSION
rke2-m1   NotReady   control-plane,etcd,master   2d23h   v1.31.3+rke2r1
rke2-m2   Ready      control-plane,etcd,master   24m     v1.31.3+rke2r1
rke2-m3   Ready      control-plane,etcd,master   19m     v1.31.3+rke2r1
```

* The kube-vip pod logs show that rke2-m3 has taken over as leader

```
$ kubectl -n kube-system logs kube-vip-ds-2qpqb
time="2025-02-17T08:18:29Z" level=info msg="Starting kube-vip.io [v0.8.9]"
time="2025-02-17T08:18:29Z" level=info msg="Build kube-vip.io [19e660d4a692fab29f407214b452f48d9a65425e]"
time="2025-02-17T08:18:29Z" level=info msg="namespace [kube-system], Mode: [ARP], Features(s): Control Plane:[true], Services:[true]"
time="2025-02-17T08:18:29Z" level=info msg="Using node name [rke2-m3]"
time="2025-02-17T08:18:29Z" level=info msg="prometheus HTTP server started"
time="2025-02-17T08:18:29Z" level=info msg="Starting Kube-vip Manager with the ARP engine"
time="2025-02-17T08:18:29Z" level=info msg="Starting UPNP Port Refresher"
time="2025-02-17T08:18:29Z" level=info msg="beginning services leadership, namespace [kube-system], lock name [plndr-svcs-lock], id [rke2-m3]"
I0217 08:18:29.723825 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-svcs-lock...
time="2025-02-17T08:18:29Z" level=info msg="Beginning cluster membership, namespace [kube-system], lock name [plndr-cp-lock], id [rke2-m3]"
I0217 08:18:29.723922 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-cp-lock...
time="2025-02-17T08:18:29Z" level=info msg="Node [rke2-m1] is assuming leadership of the cluster"
time="2025-02-17T08:18:29Z" level=info msg="new leader elected: rke2-m1"
time="2025-02-17T08:23:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
time="2025-02-17T08:28:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
time="2025-02-17T08:33:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
E0217 08:34:51.477325 1 leaderelection.go:436] error retrieving resource lock kube-system/plndr-cp-lock: etcdserver: leader changed
I0217 08:34:52.982930 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-svcs-lock
time="2025-02-17T08:34:52Z" level=info msg="(svcs) starting services watcher for all namespaces"
I0217 08:34:53.193399 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-cp-lock
time="2025-02-17T08:34:53Z" level=info msg="Node [rke2-m3] is assuming leadership of the cluster"
time="2025-02-17T08:34:53Z" level=info msg="Gratuitous Arp broadcast will repeat every 3 seconds for [192.168.11.107/eth1]"
```

* On rke2-m3, eth1 now carries the 192.168.11.107 VIP

```
rke2-m3:~ # ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether bc:24:11:c9:5d:4e brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    altname ens18
    inet 192.168.11.110/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 192.168.11.107/32 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fec9:5d4e/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
```

## LoadBalancer Services with kube-vip-cloud-provider

* Deploy kube-vip-cloud-provider

```
$ kubectl apply -f https://raw.githubusercontent.com/kube-vip/kube-vip-cloud-provider/main/manifest/kube-vip-cloud-controller.yaml
$ kubectl -n kube-system get po -l app=kube-vip
NAME                                      READY   STATUS    RESTARTS   AGE
kube-vip-cloud-provider-fb9c65946-85rd2   1/1     Running   0          5m26s
```

* Create an IP range; the usable addresses are `192.168.11.111-192.168.11.114`

```
$ kubectl create configmap --namespace kube-system kubevip --from-literal range-global=192.168.11.111-192.168.11.114
```

```
$ kubectl -n kube-system describe cm kubevip
Name:         kubevip
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
range-global:
----
192.168.11.111-192.168.11.114

BinaryData
====

Events:  <none>
```

### Verification

* Create a Deployment

```
$ echo 'apiVersion: apps/v1
kind: Deployment
metadata:
  name: s1.dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: s1.dep
  template:
    metadata:
      labels:
        app: s1.dep
    spec:
      containers:
      - name: app
        image: quay.io/flysangel/image:app.golang' | kubectl apply -f -
```

* Create a LoadBalancer Service

```
$ echo 'apiVersion: v1
kind: Service
metadata:
  name: s1
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: s1.dep
  type: LoadBalancer' | kubectl apply -f -
```

```
$ kubectl get pod,svc
NAME                          READY   STATUS    RESTARTS   AGE
pod/s1.dep-6f5cb5fc48-p5c9d   1/1     Running   0          12s
pod/s1.dep-6f5cb5fc48-rktxq   1/1     Running   0          12s

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
service/kubernetes   ClusterIP      10.43.0.1       <none>           443/TCP        3d
service/s1           LoadBalancer   10.43.241.107   192.168.11.111   80:30896/TCP   7s
```

* Access the service from outside the cluster

```
$ curl 192.168.11.111
{"message":"Hello Golang"}
```
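By default kube-vip-cloud-provider hands out the next free address from `range-global`. If a Service should receive a specific address from the pool instead, recent kube-vip releases honour the `kube-vip.io/loadbalancerIPs` annotation (the older `spec.loadBalancerIP` field generally works as well). A sketch, assuming `192.168.11.112` is still unallocated and reusing the deployment above; the Service name `s2` is just an example:

```
$ echo 'apiVersion: v1
kind: Service
metadata:
  name: s2
  annotations:
    kube-vip.io/loadbalancerIPs: "192.168.11.112"
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: s1.dep
  type: LoadBalancer' | kubectl apply -f -
```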
## References

* https://github.com/kube-vip/kube-vip-cloud-provider/blob/main/README.md
* https://portal.nutanix.com/page/documents/solutions/details?targetId=BP-2103-Rancher-SUSE-Nutanix:deploy-highly-available-rke2-with-kube-vip.html
* https://docs.expertflow.com/cx/4.4/rke2-deployment-in-high-availability-with-kube-vip
* https://kube-vip.io/docs/usage/kind/