# RKE HA Cluster Deployment

## Install RKE

* Install Docker on SLES 15 SP6

```
$ sudo zypper mr -ea
# Install on all three nodes
$ sudo zypper in docker
$ sudo systemctl enable --now docker.service
```

* Generate an SSH key on the first node. RKE installs onto every node over passwordless SSH, so the key must be authorized on each of them.

```
$ ssh-keygen -t rsa -P ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh-copy-id root@192.168.11.53
$ ssh-copy-id root@192.168.11.54
```

* Download the rke v1.4.19 binary

```
$ wget https://github.com/rancher/rke/releases/download/v1.4.19/rke_linux-amd64
$ chmod +x rke_linux-amd64 && mv rke_linux-amd64 /usr/local/bin/rke
$ rke --version
rke version v1.4.19
```

* List the Kubernetes versions this RKE release supports

```
$ rke config --list-version --all
v1.25.16-rancher2-3
v1.24.17-rancher1-1
v1.26.15-rancher1-1
v1.27.14-rancher1-1
v1.23.16-rancher2-3
```

* Create `cluster.yaml`, which defines the following:

1. The IP address and SSH user name of each master node.
2. The roles of each node: every node acts as Kubernetes controlplane, Kubernetes worker, and etcd.
3. The network: Calico is used as the CNI plugin.
4. The etcd service, with automatic backups enabled (every 6 hours, keeping 10 snapshots).
5. The node names: rke-m1, rke-m2, and rke-m3.

```
$ vim cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.52
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m1
  - address: 192.168.11.53
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m2
  - address: 192.168.11.54
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m3
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```

* Deploy the RKE cluster

```
$ rke up --config cluster.yaml
INFO[0000] Running RKE version: v1.4.19
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [192.168.11.54]
INFO[0000] [dialer] Setup tunnel for host [192.168.11.53]
INFO[0000] [dialer] Setup tunnel for host [192.168.11.52]
......
```

* Set up the kubeconfig

```
$ mkdir -p ~/.kube
$ mv kube_config_cluster.yaml ~/.kube/config
```

* Install the latest stable kubectl

```
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
$ sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
$ rm kubectl
```

* Verify the cluster

```
$ kubectl get no
NAME     STATUS   ROLES                      AGE    VERSION
rke-m1   Ready    controlplane,etcd,worker   119s   v1.27.14
rke-m2   Ready    controlplane,etcd,worker   2m     v1.27.14
rke-m3   Ready    controlplane,etcd,worker   119s   v1.27.14
```

```
$ kubectl get po -A
NAMESPACE       NAME                                       READY   STATUS      RESTARTS   AGE
ingress-nginx   ingress-nginx-admission-create-9mwt6       0/1     Completed   0          95s
ingress-nginx   ingress-nginx-admission-patch-jl5n2        0/1     Completed   0          95s
ingress-nginx   nginx-ingress-controller-5j8th             1/1     Running     0          95s
ingress-nginx   nginx-ingress-controller-m7p5c             1/1     Running     0          95s
ingress-nginx   nginx-ingress-controller-zp69g             1/1     Running     0          95s
kube-system     calico-kube-controllers-6cc44b46bf-f445x   1/1     Running     0          2m6s
kube-system     calico-node-5hz9l                          1/1     Running     0          2m6s
kube-system     calico-node-94wmm                          1/1     Running     0          2m6s
kube-system     calico-node-b5nzg                          1/1     Running     0          2m6s
kube-system     coredns-75885c777c-2n8rv                   1/1     Running     0          49s
kube-system     coredns-75885c777c-qrscj                   1/1     Running     0          116s
kube-system     coredns-autoscaler-8687f86b76-jw4j6        1/1     Running     0          116s
kube-system     metrics-server-55fb56d747-bkw5c            1/1     Running     0          104s
kube-system     rke-coredns-addon-deploy-job-9vxrx         0/1     Completed   0          118s
kube-system     rke-ingress-controller-deploy-job-dtdrz    0/1     Completed   0          98s
kube-system     rke-metrics-addon-deploy-job-mswvf         0/1     Completed   0          108s
kube-system     rke-network-plugin-deploy-job-wmn8r        0/1     Completed   0          2m9s
```

* RKE runs the Kubernetes components as Docker containers rather than as static pods

```
$ docker ps -a
......
983d5c7338d1   rancher/hyperkube:v1.27.14-rancher1    "/opt/rke-tools/entr…"   6 minutes ago   Up 6 minutes   kube-proxy
4bacb61dbdbe   rancher/hyperkube:v1.27.14-rancher1    "/opt/rke-tools/entr…"   6 minutes ago   Up 6 minutes   kubelet
e4951243903f   rancher/hyperkube:v1.27.14-rancher1    "/opt/rke-tools/entr…"   6 minutes ago   Up 6 minutes   kube-scheduler
e88cf3162625   rancher/hyperkube:v1.27.14-rancher1    "/opt/rke-tools/entr…"   6 minutes ago   Up 6 minutes   kube-controller-manager
df34624d11ed   rancher/hyperkube:v1.27.14-rancher1    "/opt/rke-tools/entr…"   7 minutes ago   Up 7 minutes   kube-apiserver
33ae4f839ada   rancher/rke-tools:v0.1.96              "/bin/bash"              7 minutes ago   Created        service-sidekick
0ae05ae74a96   rancher/rke-tools:v0.1.96              "/docker-entrypoint.…"   7 minutes ago   Up 7 minutes   etcd-rolling-snapshots
4654c173bc42   rancher/mirrored-coreos-etcd:v3.5.10   "/usr/local/bin/etcd…"   7 minutes ago   Up 7 minutes   etcd
......
```

## Add a worker node

* Check the current state of the cluster and its configuration (a single-node cluster in this example)

```
$ kubectl get no
NAME     STATUS   ROLES                      AGE     VERSION
rke-m1   Ready    controlplane,etcd,worker   2m24s   v1.27.14

$ cat cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```

* Add an entry for the new worker node; the new lines are marked with `# add`

```
$ nano cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
  - address: 192.168.11.138     # add
    user: root                  # add
    role:                       # add
      - worker                  # add
    hostname_override: rke-w1   # add
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```

* Copy the SSH public key to the new node

```
$ ssh-copy-id root@192.168.11.138
```

* Update the cluster

```
$ rke up --update-only --config cluster.yaml
```

* The worker node has been added

```
$ kubectl get no
NAME     STATUS   ROLES                      AGE   VERSION
rke-m1   Ready    controlplane,etcd,worker   20m   v1.27.14
rke-w1   Ready    worker                     10m   v1.27.14
```

## Remove a worker node

* Delete the worker node's entry from `cluster.yaml`

```
$ nano cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```

* Update the cluster

```
$ rke up --update-only --config cluster.yaml
```

* The worker node has been removed

```
$ kubectl get no
NAME     STATUS   ROLES                      AGE   VERSION
rke-m1   Ready    controlplane,etcd,worker   22m   v1.27.14
```

## Clean up RKE containers

* After a worker node is removed from Kubernetes, the containers on that node still have to be cleaned up. Otherwise the node rejoins the original cluster the next time it reboots.

```
# Remove all containers, images, and volumes on the node
$ sudo docker rm -f $(sudo docker ps -qa)
$ sudo docker rmi -f $(sudo docker images -q)
$ sudo docker volume rm $(sudo docker volume ls -q)
# Unmount the kubelet tmpfs mounts and the kubelet/rancher directories
$ for mount in $(sudo mount | grep tmpfs | grep '/var/lib/kubelet' | awk '{ print $3 }') /var/lib/kubelet /var/lib/rancher; do sudo umount "$mount"; done
# Delete the leftover configuration, state, and log directories
$ sudo rm -rf /etc/ceph \
       /etc/cni \
       /etc/kubernetes \
       /etc/rancher \
       /opt/cni \
       /opt/rke \
       /run/secrets/kubernetes.io \
       /run/calico \
       /run/flannel \
       /var/lib/calico \
       /var/lib/etcd \
       /var/lib/cni \
       /var/lib/kubelet \
       /var/lib/rancher \
       /var/log/containers \
       /var/log/kube-audit \
       /var/log/pods \
       /var/run/calico
$ sudo reboot
```

* In an RKE cluster, an nginx-proxy container acts as a reverse proxy in front of the kube-apiserver instances on the control plane nodes, which is how access to the API server stays highly available.

```
# Check the container on a worker node
$ docker ps -a|grep nginx-proxy
c0a7a5aba7e1   registry.rancher.com/rancher/rke-tools:v0.1.100   "nginx-proxy CP_HOST…"   3 months ago   Up 25 hours   nginx-proxy
```
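
To make the reverse-proxy idea more concrete, the sketch below renders the kind of nginx `stream` configuration that could spread TCP connections across several kube-apiserver endpoints. It is an illustration only: the helper name `render_apiserver_upstream`, the comma-separated host list, and port 6443 are assumptions made for this example, and the configuration actually generated inside the rke-tools image may differ.

```shell
#!/bin/sh
# Illustrative sketch, not part of RKE: print an nginx "stream" config that
# load-balances TCP connections over several kube-apiserver endpoints.
# The function body runs in a subshell ( ... ) so the IFS change stays local.
render_apiserver_upstream() (
    hosts="$1"          # comma-separated control plane IPs (assumed format)
    port="${2:-6443}"   # assumed kube-apiserver port
    echo "stream {"
    echo "    upstream kube_apiserver {"
    IFS=','
    for host in $hosts; do
        # Skip empty fields such as the one left by a trailing comma
        if [ -n "$host" ]; then
            echo "        server ${host}:${port};"
        fi
    done
    echo "    }"
    echo "    server {"
    echo "        listen ${port};"
    echo "        proxy_pass kube_apiserver;"
    echo "    }"
    echo "}"
)

# Example with the three control plane addresses used earlier in this document
render_apiserver_upstream "192.168.11.52,192.168.11.53,192.168.11.54"
```

With a configuration of this shape, a kubelet that talks to the local proxy keeps working when one control plane node goes down, because nginx can send new connections to the remaining `server` entries.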