# Kube-vip on RKE2
The kube-vip project provides two functions:
1. A virtual IP (VIP) for the control plane
2. External IPs for LoadBalancer Services

Lab environment:
> VIP=192.168.11.107
> rke2-m1 IP=192.168.11.108
> rke2-m2 IP=192.168.11.109
> rke2-m3 IP=192.168.11.110
## Implementation
### Install RKE2 on rke2-m1
* Declare environment variables
```
# Adjust the interface name and the node IPs to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
$ export INTERFACE=eth1
$ export KUBE_VIP_VERSION=latest
```
* Configure RKE2
```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
node-name:
- "rke2-m1"
token: my-shared-secret
tls-san:
- ${RKE2_API_VIP}
- ${RKE2_NODE_0_IP}
- ${RKE2_NODE_1_IP}
- ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF
$ cat /etc/rancher/rke2/config.yaml
node-name:
- "rke2-m1"
token: my-shared-secret
tls-san:
- 192.168.11.107
- 192.168.11.108
- 192.168.11.109
- 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```
* Start the RKE2 cluster
```
$ sudo INSTALL_RKE2_VERSION=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```
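* (Optional) The first server can take a few minutes to bootstrap; standard systemd commands are enough to watch its progress:
```
$ sudo systemctl status rke2-server
$ sudo journalctl -u rke2-server -f
```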
* Copy the kubeconfig
```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME STATUS ROLES AGE VERSION
rke2-m1 Ready control-plane,etcd,master 2m v1.31.3+rke2r1
```
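* If kubectl is run as a regular user, the kubeconfig copied with sudo is still owned by root and cannot be read. A minimal fix, assuming you keep using your current user:
```
$ sudo chown $(id -u):$(id -g) ~/.kube/config
$ chmod 600 ~/.kube/config
```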
* Create the kube-vip RBAC
```
$ curl -s https://kube-vip.io/manifests/rbac.yaml | sudo tee /var/lib/rancher/rke2/server/manifests/kube-vip-rbac.yaml
```
* Generate the kube-vip DaemonSet manifest
```
$ sudo /var/lib/rancher/rke2/bin/crictl -r "unix:///run/k3s/containerd/containerd.sock" pull ghcr.io/kube-vip/kube-vip:$KUBE_VIP_VERSION
$ sudo CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock ctr -n k8s.io run \
  --rm \
  --net-host \
  ghcr.io/kube-vip/kube-vip:$KUBE_VIP_VERSION vip /kube-vip manifest daemonset --arp --interface $INTERFACE --address $RKE2_API_VIP --controlplane --leaderElection --taint --services --inCluster | sudo tee /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
```
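* The `ctr` run above only generates the manifest; RKE2 picks it up from the manifests directory and creates the DaemonSet. A quick sanity check (the DaemonSet name matches the Pod names shown below):
```
$ sudo ls -l /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
$ kubectl -n kube-system get ds kube-vip-ds
```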
* Check that the kube-vip-ds Pod is running
```
$ kubectl -n kube-system get po
NAME READY STATUS RESTARTS AGE
cloud-controller-manager-rke2-m1 1/1 Running 1 (10m ago) 10m
etcd-rke2-m1 1/1 Running 0 10m
helm-install-rke2-canal-kqm27 0/1 Completed 0 10m
helm-install-rke2-coredns-mnm22 0/1 Completed 0 10m
helm-install-rke2-ingress-nginx-2694w 0/1 Completed 0 10m
helm-install-rke2-metrics-server-5mfg4 0/1 Completed 0 10m
helm-install-rke2-snapshot-controller-crd-wprzq 0/1 Completed 0 10m
helm-install-rke2-snapshot-controller-nwlxk 0/1 Completed 1 10m
helm-install-rke2-snapshot-validation-webhook-x2zjb 0/1 Completed 0 10m
kube-apiserver-rke2-m1 1/1 Running 0 10m
kube-controller-manager-rke2-m1 1/1 Running 0 10m
kube-proxy-rke2-m1 1/1 Running 0 10m
kube-scheduler-rke2-m1 1/1 Running 0 10m
kube-vip-ds-m6psp 1/1 Running 0 50s
rke2-canal-6n9vl 2/2 Running 0 10m
rke2-coredns-rke2-coredns-9579797d8-nwblk 1/1 Running 0 10m
rke2-coredns-rke2-coredns-autoscaler-78db5d674-t8szc 1/1 Running 0 10m
rke2-ingress-nginx-controller-h4587 1/1 Running 0 9m52s
rke2-metrics-server-7c85d458bd-htktm 1/1 Running 0 10m
rke2-snapshot-controller-65bc6fbd57-9hl2j 1/1 Running 0 10m
rke2-snapshot-validation-webhook-859c7896df-jjchr 1/1 Running 0 10m
```
* Check that eth1 now carries the VIP 192.168.11.107
```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:ce:bf:dd brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 192.168.11.108/24 brd 192.168.11.255 scope global eth1
valid_lft forever preferred_lft forever
inet 192.168.11.107/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::be24:11ff:fece:bfdd/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
```
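* From another machine on the same subnet, the VIP should already answer for the Kubernetes API. A quick check (`-k` skips certificate verification; if anonymous access to `/version` is disabled in your cluster the request returns 401/403, but a successful TLS handshake still proves the VIP is answering):
```
$ ping -c 1 192.168.11.107
$ curl -k https://192.168.11.107:6443/version
```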
### Install RKE2 on rke2-m2
* Declare environment variables
```
# Adjust the interface name and the node IPs to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
```
* Configure RKE2
```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
server: https://${RKE2_API_VIP}:9345
node-name:
- "rke2-m2"
token: my-shared-secret
tls-san:
- ${RKE2_API_VIP}
- ${RKE2_NODE_0_IP}
- ${RKE2_NODE_1_IP}
- ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF
$ cat /etc/rancher/rke2/config.yaml
server: https://192.168.11.107:9345
node-name:
- "rke2-m2"
token: my-shared-secret
tls-san:
- 192.168.11.107
- 192.168.11.108
- 192.168.11.109
- 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```
* Start the RKE2 cluster
```
$ sudo INSTALL_RKE2_VERSION=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```
* Copy the kubeconfig
```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME STATUS ROLES AGE VERSION
rke2-m1 Ready control-plane,etcd,master 2d23h v1.31.3+rke2r1
rke2-m2 Ready control-plane,etcd,master 38s v1.31.3+rke2r1
```
* Check the kube-vip Pod status
```
$ kubectl -n kube-system get po -l app.kubernetes.io/name=kube-vip-ds
NAME READY STATUS RESTARTS AGE
kube-vip-ds-m6psp 1/1 Running 4 (11m ago) 2d23h
kube-vip-ds-n49w7 1/1 Running 0 112s
```
* Check the eth1 interface
```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:c1:eb:d9 brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 192.168.11.109/24 brd 192.168.11.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::be24:11ff:fec1:ebd9/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
```
### Install RKE2 on rke2-m3
* Declare environment variables
```
# Adjust the interface name and the node IPs to match your environment
$ export RKE2_API_VIP=192.168.11.107
$ export RKE2_NODE_0_IP=192.168.11.108
$ export RKE2_NODE_1_IP=192.168.11.109
$ export RKE2_NODE_2_IP=192.168.11.110
```
* Configure RKE2
```
$ curl -sfL https://get.rke2.io --output install.sh
$ chmod +x install.sh
$ sudo mkdir -p /etc/rancher/rke2/
$ cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
server: https://${RKE2_API_VIP}:9345
node-name:
- "rke2-m3"
token: my-shared-secret
tls-san:
- ${RKE2_API_VIP}
- ${RKE2_NODE_0_IP}
- ${RKE2_NODE_1_IP}
- ${RKE2_NODE_2_IP}
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
EOF
$ cat /etc/rancher/rke2/config.yaml
server: https://192.168.11.107:9345
node-name:
- "rke2-m3"
token: my-shared-secret
tls-san:
- 192.168.11.107
- 192.168.11.108
- 192.168.11.109
- 192.168.11.110
etcd-extra-env: TZ=Asia/Taipei
kube-apiserver-extra-env: TZ=Asia/Taipei
kube-controller-manager-extra-env: TZ=Asia/Taipei
kube-proxy-extra-env: TZ=Asia/Taipei
kube-scheduler-extra-env: TZ=Asia/Taipei
cloud-controller-manager-extra-env: TZ=Asia/Taipei
```
* Start the RKE2 cluster
```
$ sudo INSTALL_RKE2_VERSION=v1.31.3+rke2r1 ./install.sh
$ export PATH=$PATH:/opt/rke2/bin
$ sudo systemctl enable --now rke2-server
```
* Copy the kubeconfig
```
$ mkdir ~/.kube
$ sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
$ sudo cp /var/lib/rancher/rke2/bin/* /usr/local/bin/
$ sudo cp /opt/rke2/bin/* /usr/local/bin/
$ kubectl get node
NAME STATUS ROLES AGE VERSION
rke2-m1 Ready control-plane,etcd,master 2d23h v1.31.3+rke2r1
rke2-m2 Ready control-plane,etcd,master 5m56s v1.31.3+rke2r1
rke2-m3 Ready control-plane,etcd,master 21s v1.31.3+rke2r1
```
* Check the kube-vip Pod status
```
$ kubectl -n kube-system get po -l app.kubernetes.io/name=kube-vip-ds
NAME READY STATUS RESTARTS AGE
kube-vip-ds-2qpqb 1/1 Running 0 2m26s
kube-vip-ds-m6psp 1/1 Running 4 (17m ago) 2d23h
kube-vip-ds-n49w7 1/1 Running 0 7m57s
```
* Check the eth1 interface
```
$ ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:c9:5d:4e brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 192.168.11.110/24 brd 192.168.11.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::be24:11ff:fec9:5d4e/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
```
## Get a kubeconfig that points at the VIP
* Run the following on one of the masters, then copy the resulting kubeconfig to any machine that has kubectl installed.
```
$ sed "s|server: https://127.0.0.1:6443|server: https://$RKE2_API_VIP:6443|" /etc/rancher/rke2/rke2.yaml
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJlakNDQVIrZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWtNU0l3SUFZRFZRUUREQmx5YTJVeUxYTmwKY25abGNpMWpZVUF4TnpNNU5USXlOall6TUI0WERUSTFNREl4TkRBNE5EUXlNMW9YRFRNMU1ESXhNakE0TkRReQpNMW93SkRFaU1DQUdBMVVFQXd3WmNtdGxNaTF6WlhKMlpYSXRZMkZBTVRjek9UVXlNalkyTXpCWk1CTUdCeXFHClNNNDlBZ0VHQ0NxR1NNNDlBd0VIQTBJQUJJc3d5Mnk2M2VKMHdmVVdocVlYRzl6WjJaQ1dyNDI5amFPTkF0OVIKcWV0eHpQNTRtSVdjZXJJMDBXdlFmOG1lNnlRNk9JYnR2dFdSQmI3aHd2cXRmYnFqUWpCQU1BNEdBMVVkRHdFQgovd1FFQXdJQ3BEQVBCZ05WSFJNQkFmOEVCVEFEQVFIL01CMEdBMVVkRGdRV0JCVC9yWVN3NXR0bXpvOVRWMmpiClNDQ1gvMVhHc0RBS0JnZ3Foa2pPUFFRREFnTkpBREJHQWlFQTNoMU03blQvRzkyQzk5MUpNUGVxbTQxQnhFT1MKZlBZNXc3MjlPV0NGZm1nQ0lRQ3paMUZTTlIwTnZPOXBFMkljUW5wMWkzbkZ6MXlRcUFOWkRnajd4Y3A0Y2c9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://192.168.11.107:6443
......
```
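* A minimal way to get this onto a workstation; the file name and destination below are only examples:
```
$ sudo sed "s|server: https://127.0.0.1:6443|server: https://$RKE2_API_VIP:6443|" /etc/rancher/rke2/rke2.yaml > rke2-vip.yaml
$ scp rke2-vip.yaml user@workstation:~/.kube/config   # assumes ~/.kube already exists on the workstation
```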
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke2-m1 Ready control-plane,etcd,master 2d23h v1.31.3+rke2r1
rke2-m2 Ready control-plane,etcd,master 20m v1.31.3+rke2r1
rke2-m3 Ready control-plane,etcd,master 15m v1.31.3+rke2r1
```
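* kube-vip elects the node that holds the control-plane VIP through the `plndr-cp-lock` lease (the same lock name appears in the Pod logs below), so the current holder can be checked at any time:
```
$ kubectl -n kube-system get lease plndr-cp-lock -o jsonpath='{.spec.holderIdentity}'
```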
### Verify control-plane failover
* The VIP is currently on rke2-m1; power off rke2-m1
```
rke2-m1:~ # ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:ce:bf:dd brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 192.168.11.108/24 brd 192.168.11.255 scope global eth1
valid_lft forever preferred_lft forever
inet 192.168.11.107/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::be24:11ff:fece:bfdd/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
rke2-m1:~ # poweroff
```
* After rke2-m1 is powered off, the cluster can still be reached through the VIP
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke2-m1 NotReady control-plane,etcd,master 2d23h v1.31.3+rke2r1
rke2-m2 Ready control-plane,etcd,master 24m v1.31.3+rke2r1
rke2-m3 Ready control-plane,etcd,master 19m v1.31.3+rke2r1
```
* The kube-vip Pod logs show rke2-m3 taking over leadership
```
$ kubectl -n kube-system logs kube-vip-ds-2qpqb
time="2025-02-17T08:18:29Z" level=info msg="Starting kube-vip.io [v0.8.9]"
time="2025-02-17T08:18:29Z" level=info msg="Build kube-vip.io [19e660d4a692fab29f407214b452f48d9a65425e]"
time="2025-02-17T08:18:29Z" level=info msg="namespace [kube-system], Mode: [ARP], Features(s): Control Plane:[true], Services:[true]"
time="2025-02-17T08:18:29Z" level=info msg="Using node name [rke2-m3]"
time="2025-02-17T08:18:29Z" level=info msg="prometheus HTTP server started"
time="2025-02-17T08:18:29Z" level=info msg="Starting Kube-vip Manager with the ARP engine"
time="2025-02-17T08:18:29Z" level=info msg="Starting UPNP Port Refresher"
time="2025-02-17T08:18:29Z" level=info msg="beginning services leadership, namespace [kube-system], lock name [plndr-svcs-lock], id [rke2-m3]"
I0217 08:18:29.723825 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-svcs-lock...
time="2025-02-17T08:18:29Z" level=info msg="Beginning cluster membership, namespace [kube-system], lock name [plndr-cp-lock], id [rke2-m3]"
I0217 08:18:29.723922 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-cp-lock...
time="2025-02-17T08:18:29Z" level=info msg="Node [rke2-m1] is assuming leadership of the cluster"
time="2025-02-17T08:18:29Z" level=info msg="new leader elected: rke2-m1"
time="2025-02-17T08:23:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
time="2025-02-17T08:28:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
time="2025-02-17T08:33:29Z" level=info msg="[UPNP] Refreshing 0 Instances"
E0217 08:34:51.477325 1 leaderelection.go:436] error retrieving resource lock kube-system/plndr-cp-lock: etcdserver: leader changed
I0217 08:34:52.982930 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-svcs-lock
time="2025-02-17T08:34:52Z" level=info msg="(svcs) starting services watcher for all namespaces"
I0217 08:34:53.193399 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-cp-lock
time="2025-02-17T08:34:53Z" level=info msg="Node [rke2-m3] is assuming leadership of the cluster"
time="2025-02-17T08:34:53Z" level=info msg="Gratuitous Arp broadcast will repeat every 3 seconds for [192.168.11.107/eth1]"
```
* On rke2-m3, eth1 now carries the VIP 192.168.11.107
```
rke2-m3:~ # ip a s eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:c9:5d:4e brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 192.168.11.110/24 brd 192.168.11.255 scope global eth1
valid_lft forever preferred_lft forever
inet 192.168.11.107/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::be24:11ff:fec9:5d4e/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
```
## Implement LoadBalancer Services (kube-vip-cloud-provider)
* Deploy kube-vip-cloud-provider
```
$ kubectl apply -f https://raw.githubusercontent.com/kube-vip/kube-vip-cloud-provider/main/manifest/kube-vip-cloud-controller.yaml
$ kubectl -n kube-system get po -l app=kube-vip
NAME READY STATUS RESTARTS AGE
kube-vip-cloud-provider-fb9c65946-85rd2 1/1 Running 0 5m26s
```
* Create an IP pool; the usable range is `192.168.11.111-192.168.11.114`
```
$ kubectl create configmap --namespace kube-system kubevip --from-literal range-global=192.168.11.111-192.168.11.114
```
```
$ kubectl -n kube-system describe cm kubevip
Name: kubevip
Namespace: kube-system
Labels: <none>
Annotations: <none>
Data
====
range-global:
----
192.168.11.111-192.168.11.114
BinaryData
====
Events: <none>
```
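* `range-global` is the simplest option. According to the kube-vip-cloud-provider README (linked in the references), the same ConfigMap also accepts per-namespace pools and CIDR notation; a sketch of those alternative keys (the values here are only examples):
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevip
  namespace: kube-system
data:
  cidr-global: 192.168.11.112/30                  # pool expressed as a CIDR instead of a range
  range-default: 192.168.11.111-192.168.11.112    # pool used only by Services in the default namespace
```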
### Verification
* Create a Deployment
```
$ echo 'apiVersion: apps/v1
kind: Deployment
metadata:
  name: s1.dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: s1.dep
  template:
    metadata:
      labels:
        app: s1.dep
    spec:
      containers:
      - name: app
        image: quay.io/flysangel/image:app.golang' | kubectl apply -f -
```
* Create a LoadBalancer Service
```
$ echo 'apiVersion: v1
kind: Service
metadata:
  name: s1
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: s1.dep
  type: LoadBalancer' | kubectl apply -f -
```
```
$ kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/s1.dep-6f5cb5fc48-p5c9d 1/1 Running 0 12s
pod/s1.dep-6f5cb5fc48-rktxq 1/1 Running 0 12s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 3d
service/s1 LoadBalancer 10.43.241.107 192.168.11.111 80:30896/TCP 7s
```
* Access the Service from outside the cluster
```
$ curl 192.168.11.111
{"message":"Hello Golang"}
```
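* A Service can also request a specific address from the pool instead of taking the next free one. A sketch using the long-standing `spec.loadBalancerIP` field (deprecated upstream but, to my knowledge, still honoured by kube-vip-cloud-provider; newer kube-vip releases also accept a `kube-vip.io/loadbalancerIPs` annotation):
```
$ echo 'apiVersion: v1
kind: Service
metadata:
  name: s2
spec:
  loadBalancerIP: 192.168.11.112
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: s1.dep
  type: LoadBalancer' | kubectl apply -f -
```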
## References
* https://github.com/kube-vip/kube-vip-cloud-provider/blob/main/README.md
* https://portal.nutanix.com/page/documents/solutions/details?targetId=BP-2103-Rancher-SUSE-Nutanix:deploy-highly-available-rke2-with-kube-vip.html
* https://docs.expertflow.com/cx/4.4/rke2-deployment-in-high-availability-with-kube-vip
* https://kube-vip.io/docs/usage/kind/