# RKE HA Cluster Deployment
## Install RKE
* Install Docker on SLES 15 SP6
```
$ sudo zypper mr -ea
# Install on all three nodes
$ sudo zypper in docker
$ sudo systemctl enable --now docker.service
```
* Generate an SSH key pair on the first node. RKE logs in to every node over passwordless SSH to perform the installation.
```
$ ssh-keygen -t rsa -P ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh-copy-id root@192.168.11.53
$ ssh-copy-id root@192.168.11.54
```
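Before running RKE it can help to confirm that passwordless login really works on every node. The following is a minimal sketch, not part of the original procedure: `check_nodes` is a hypothetical helper name, and it assumes the `root` user used above. `BatchMode=yes` makes `ssh` fail instead of prompting for a password, so a node that still needs a password (or whose host key has not been accepted yet) should be reported as `FAILED`.

```shell
# Print one "<node> ok" or "<node> FAILED" line per node.
# check_nodes is a hypothetical helper; root is the SSH user assumed here.
check_nodes() {
  for node in "$@"; do
    # BatchMode=yes disables password prompts; ConnectTimeout avoids long hangs.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$node" true; then
      echo "$node ok"
    else
      echo "$node FAILED"
    fi
  done
}

# Usage on the first node:
#   check_nodes 192.168.11.52 192.168.11.53 192.168.11.54
```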
* Download the rke v1.4.19 binary
```
$ wget https://github.com/rancher/rke/releases/download/v1.4.19/rke_linux-amd64
$ chmod +x rke_linux-amd64 && sudo mv rke_linux-amd64 /usr/local/bin/rke
$ rke --version
rke version v1.4.19
```
* List the k8s versions supported by this RKE release
```
$ rke config --list-version --all
v1.25.16-rancher2-3
v1.24.17-rancher1-1
v1.26.15-rancher1-1
v1.27.14-rancher1-1
v1.23.16-rancher2-3
```
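The listing above is not sorted, so the newest version is easy to miss. A small sketch of picking it out with a version-aware sort follows. It assumes GNU `sort` (for `-V`), and the version list is pasted in from the output above rather than read from `rke` so that the snippet runs anywhere.

```shell
# Versions copied from the `rke config --list-version --all` output above.
versions='v1.25.16-rancher2-3
v1.24.17-rancher1-1
v1.26.15-rancher1-1
v1.27.14-rancher1-1
v1.23.16-rancher2-3'

# sort -V compares version numbers component by component (GNU coreutils).
latest=$(printf '%s\n' "$versions" | sort -V | tail -n 1)
echo "$latest"
```

On a machine with `rke` installed, the same pipeline can be fed directly: `rke config --list-version --all | sort -V | tail -n 1`.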
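* Create `cluster.yaml` with the following settings:
1. The IP address and SSH user of each master node
2. Every node takes the k8s controlplane, k8s worker, and etcd roles
3. Calico is the CNI network plugin
4. etcd is enabled with automatic recurring snapshots (every 6 hours, keeping 10)
5. The nodes are named rke-m1, rke-m2, and rke-m3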
```
$ vim cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.52
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m1
  - address: 192.168.11.53
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m2
  - address: 192.168.11.54
    user: root
    role: [controlplane,worker,etcd]
    hostname_override: rke-m3
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```
```
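Three etcd nodes is a deliberate choice for HA: etcd needs a majority (quorum) of members to stay writable, so three members tolerate the loss of one. The sketch below counts the etcd members in a trimmed copy of the configuration above and derives the quorum. It assumes the inline `role: [...]` list style used here; the temporary file and variable names are illustrative only.

```shell
# Trimmed copy of the node list above, written to a temp file for the demo.
cluster_yaml=$(mktemp)
cat > "$cluster_yaml" <<'EOF'
nodes:
  - address: 192.168.11.52
    role: [controlplane,worker,etcd]
  - address: 192.168.11.53
    role: [controlplane,worker,etcd]
  - address: 192.168.11.54
    role: [controlplane,worker,etcd]
EOF

# Count role lines that include etcd (only works for the inline list style).
etcd_members=$(grep -c 'role:.*etcd' "$cluster_yaml")
quorum=$(( etcd_members / 2 + 1 ))       # majority of members
tolerated=$(( etcd_members - quorum ))   # members that may fail
echo "etcd members: $etcd_members, quorum: $quorum, tolerated failures: $tolerated"
rm -f "$cluster_yaml"
```

The same arithmetic shows why an even member count adds no resilience: four members need a quorum of three and still tolerate only one failure.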
* Deploy the RKE cluster
```
$ rke up --config cluster.yaml
INFO[0000] Running RKE version: v1.4.19
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [192.168.11.54]
INFO[0000] [dialer] Setup tunnel for host [192.168.11.53]
INFO[0000] [dialer] Setup tunnel for host [192.168.11.52]
......
```
* Set up kubeconfig
```
$ mkdir -p ~/.kube
$ mv kube_config_cluster.yaml ~/.kube/config
```
* Install the latest stable kubectl
```
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
$ sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
$ rm -r kubectl
```
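`stable.txt` resolves to the newest stable release, which may be several minor versions ahead of the v1.27 cluster deployed here, while kubectl is only supported within one minor version of the API server. As an alternative, the sketch below derives a matching kubectl version from the RKE version string and builds the download URL using the same URL pattern as above. The variable names are illustrative.

```shell
# RKE version string from cluster.yaml; the part before the first "-" is the
# upstream Kubernetes version.
rke_k8s_version="v1.27.14-rancher1-1"
kubectl_version="${rke_k8s_version%%-*}"

kubectl_url="https://dl.k8s.io/release/${kubectl_version}/bin/linux/amd64/kubectl"
echo "$kubectl_url"

# To download the pinned version instead of the latest stable:
#   curl -LO "$kubectl_url"
```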
* Verify the cluster
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke-m1 Ready controlplane,etcd,worker 119s v1.27.14
rke-m2 Ready controlplane,etcd,worker 2m v1.27.14
rke-m3 Ready controlplane,etcd,worker 119s v1.27.14
```
```
$ kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx ingress-nginx-admission-create-9mwt6 0/1 Completed 0 95s
ingress-nginx ingress-nginx-admission-patch-jl5n2 0/1 Completed 0 95s
ingress-nginx nginx-ingress-controller-5j8th 1/1 Running 0 95s
ingress-nginx nginx-ingress-controller-m7p5c 1/1 Running 0 95s
ingress-nginx nginx-ingress-controller-zp69g 1/1 Running 0 95s
kube-system calico-kube-controllers-6cc44b46bf-f445x 1/1 Running 0 2m6s
kube-system calico-node-5hz9l 1/1 Running 0 2m6s
kube-system calico-node-94wmm 1/1 Running 0 2m6s
kube-system calico-node-b5nzg 1/1 Running 0 2m6s
kube-system coredns-75885c777c-2n8rv 1/1 Running 0 49s
kube-system coredns-75885c777c-qrscj 1/1 Running 0 116s
kube-system coredns-autoscaler-8687f86b76-jw4j6 1/1 Running 0 116s
kube-system metrics-server-55fb56d747-bkw5c 1/1 Running 0 104s
kube-system rke-coredns-addon-deploy-job-9vxrx 0/1 Completed 0 118s
kube-system rke-ingress-controller-deploy-job-dtdrz 0/1 Completed 0 98s
kube-system rke-metrics-addon-deploy-job-mswvf 0/1 Completed 0 108s
kube-system rke-network-plugin-deploy-job-wmn8r 0/1 Completed 0 2m9s
```
* The k8s components run as Docker containers rather than as static pods
```
$ docker ps -a
......
983d5c7338d1 rancher/hyperkube:v1.27.14-rancher1 "/opt/rke-tools/entr…" 6 minutes ago Up 6 minutes kube-proxy
4bacb61dbdbe rancher/hyperkube:v1.27.14-rancher1 "/opt/rke-tools/entr…" 6 minutes ago Up 6 minutes kubelet
e4951243903f rancher/hyperkube:v1.27.14-rancher1 "/opt/rke-tools/entr…" 6 minutes ago Up 6 minutes kube-scheduler
e88cf3162625 rancher/hyperkube:v1.27.14-rancher1 "/opt/rke-tools/entr…" 6 minutes ago Up 6 minutes kube-controller-manager
df34624d11ed rancher/hyperkube:v1.27.14-rancher1 "/opt/rke-tools/entr…" 7 minutes ago Up 7 minutes kube-apiserver
33ae4f839ada rancher/rke-tools:v0.1.96 "/bin/bash" 7 minutes ago Created service-sidekick
0ae05ae74a96 rancher/rke-tools:v0.1.96 "/docker-entrypoint.…" 7 minutes ago Up 7 minutes etcd-rolling-snapshots
4654c173bc42 rancher/mirrored-coreos-etcd:v3.5.10 "/usr/local/bin/etcd…" 7 minutes ago Up 7 minutes etcd
......
```
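Because the components are plain Docker containers, a quick health check can compare the running container names against the expected set. The following is a rough sketch: `missing_components` is a hypothetical helper, the expected names are the ones visible in the listing above, and the sample input is pasted in so the snippet runs without Docker.

```shell
# Print each expected component whose name is absent from the given list
# of container names (one name per line).
missing_components() {
  names="$1"
  for component in etcd kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy; do
    # -x matches whole lines, so "etcd" does not match "etcd-rolling-snapshots".
    printf '%s\n' "$names" | grep -qx "$component" || echo "$component"
  done
}

# Sample names taken from the listing above. On a real node use:
#   running=$(docker ps --format '{{.Names}}')
running='kube-proxy
kubelet
kube-scheduler
kube-controller-manager
kube-apiserver
etcd-rolling-snapshots
etcd'

missing=$(missing_components "$running")
echo "missing: ${missing:-none}"
```

The expected set applies to a node that holds all three roles, as in this cluster; a worker-only node would not run etcd or the control plane containers.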
## Add a worker node
* Starting point: a single-node cluster (192.168.11.137) and its `cluster.yaml`
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke-m1 Ready controlplane,etcd,worker 2m24s v1.27.14
$ cat cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```
* Add the new worker node to `cluster.yaml` (the lines marked `# add`)
```
$ nano cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
  - address: 192.168.11.138      # add
    user: root                   # add
    role:                        # add
      - worker                   # add
    hostname_override: rke-w1    # add
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```
* Copy the SSH public key to the new node
```
$ ssh-copy-id root@192.168.11.138
```
* Update the cluster
```
$ rke up --update-only --config cluster.yaml
```
* The worker node has been added
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke-m1 Ready controlplane,etcd,worker 20m v1.27.14
rke-w1 Ready worker 10m v1.27.14
```
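When scripting this step, the `kubectl get no` output can be filtered for nodes that have not reached `Ready` yet. The following is a minimal sketch: `not_ready_nodes` is a hypothetical helper that reads the table on stdin, and the sample table is copied from the output above.

```shell
# Print the name of every node whose STATUS column is not exactly "Ready".
# The header line (NR == 1) is skipped.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Sample table from the output above. On a real cluster use:
#   kubectl get no | not_ready_nodes
sample='NAME     STATUS   ROLES                      AGE   VERSION
rke-m1   Ready    controlplane,etcd,worker   20m   v1.27.14
rke-w1   Ready    worker                     10m   v1.27.14'

not_ready=$(printf '%s\n' "$sample" | not_ready_nodes)
echo "not ready: ${not_ready:-none}"
```

Note that the exact-match comparison also flags cordoned nodes, whose status reads `Ready,SchedulingDisabled`.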
## Remove a worker node
* Remove the worker node's entry from `cluster.yaml`
```
$ nano cluster.yaml
cluster_name: rke-cluster
kubernetes_version: "v1.27.14-rancher1-1"
nodes:
  - address: 192.168.11.137
    user: root
    role:
      - controlplane
      - etcd
      - worker
    hostname_override: rke-m1
services:
  etcd:
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 10
network:
  plugin: calico
```
* Update the cluster
```
$ rke up --update-only --config cluster.yaml
```
* The worker node has been removed
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
rke-m1 Ready controlplane,etcd,worker 22m v1.27.14
```
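## Clean up RKE containers
* After removing a worker node from the k8s cluster, you still need to clean up the containers on that node. Otherwise the node rejoins the original cluster when it reboots.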
```
$ sudo docker rm -f $(sudo docker ps -qa)
$ sudo docker rmi -f $(sudo docker images -q)
$ sudo docker volume rm $(sudo docker volume ls -q)
$ for mount in $(sudo mount | grep tmpfs | grep '/var/lib/kubelet' | awk '{ print $3 }') /var/lib/kubelet /var/lib/rancher; do sudo umount "$mount"; done
$ sudo rm -rf /etc/ceph \
       /etc/cni \
       /etc/kubernetes \
       /etc/rancher \
       /opt/cni \
       /opt/rke \
       /run/secrets/kubernetes.io \
       /run/calico \
       /run/flannel \
       /var/lib/calico \
       /var/lib/etcd \
       /var/lib/cni \
       /var/lib/kubelet \
       /var/lib/rancher \
       /var/log/containers \
       /var/log/kube-audit \
       /var/log/pods \
       /var/run/calico
$ sudo reboot
```
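After the reboot, it may be worth checking that nothing from the list above survived, since a missed directory can leave stale state on the node. The sketch below is read-only and deletes nothing. `leftover_paths` is a hypothetical helper, and the paths passed to it are a subset of the ones removed above.

```shell
# Print every path from the argument list that still exists.
leftover_paths() {
  for path in "$@"; do
    if [ -e "$path" ]; then
      echo "$path"
    fi
  done
}

# Check a few of the directories removed above; no output means they are gone.
leftover_paths /etc/cni /etc/kubernetes /opt/cni /opt/rke \
  /var/lib/etcd /var/lib/cni /var/lib/kubelet /var/lib/rancher
```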
* An RKE cluster runs an nginx-proxy container that acts as a reverse proxy in front of the apiservers on multiple nodes, which provides high availability
```
# Check the container on a worker node
$ docker ps -a|grep nginx-proxy
c0a7a5aba7e1 registry.rancher.com/rancher/rke-tools:v0.1.100 "nginx-proxy CP_HOST…" 3 months ago Up 25 hours nginx-proxy
```