# Step-by-Step: Install HA Kubernetes with kubeadm in a Disconnected Environment

## 0. Preface

This article shows how to use `kubeadm` to install a Kubernetes cluster with 3 Control-plane Nodes and 3 Worker Nodes across multiple VMs, in a fully disconnected (air-gapped) environment.

### 0.1. Environment architecture plan

![image](https://hackmd.io/_uploads/HJVAgQ0aex.png)

## 1. Prepare the K8s fully-offline installation bundle

### 1.1. Prepare the K8s software packages

1. Log in to a RHEL 9 machine with at least 70 GB of disk space and normal access to the public Internet
2. Register RHEL 9
```
sudo rhc connect --activation-key ${ACTIVATION_KEY} --organization $ORGANIZATION
```
3. Set the Kubernetes and CRI-O versions
```
KUBERNETES_VERSION=v1.34
CRIO_VERSION=v1.34
```
4. Add the Kubernetes yum repository
```
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/$KUBERNETES_VERSION/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/$KUBERNETES_VERSION/rpm/repodata/repomd.xml.key
EOF
```
5. Add the CRI-O yum repository
```
cat <<EOF | sudo tee /etc/yum.repos.d/cri-o.repo
[cri-o]
name=CRI-O
baseurl=https://download.opensuse.org/repositories/isv:/cri-o:/stable:/$CRIO_VERSION/rpm/
enabled=1
gpgcheck=1
gpgkey=https://download.opensuse.org/repositories/isv:/cri-o:/stable:/$CRIO_VERSION/rpm/repodata/repomd.xml.key
EOF
```
6. Create the working directories
```
mkdir -p "$HOME"/k8s/{addon/{cni/cilium,csi/nfs-provisioner,ingress/ingress-nginx,metrics/metrics-server,logging/{fluentd,logging-operator},kube-vip/{ha-control-plane,k8s-lb-service}},img,pkg/{kubernetes,podman,skopeo},config}
```
7. The resulting directory structure
```
tree "$HOME"/k8s
```
Result:
```
/home/bigred/k8s
├── addon
│   ├── cni
│   │   └── cilium
│   ├── csi
│   │   └── nfs-provisioner
│   ├── ingress
│   │   └── ingress-nginx
│   ├── kube-vip
│   │   ├── ha-control-plane
│   │   └── k8s-lb-service
│   ├── logging
│   │   ├── fluentd
│   │   └── logging-operator
│   └── metrics
│       └── metrics-server
├── config
├── img
└── pkg
    ├── kubernetes
    ├── podman
    └── skopeo

21 directories, 0 files
```
8. Download the required packages into the designated directory
```
sudo yum install --downloadonly --downloaddir="$HOME"/k8s/pkg/kubernetes -y \
  crun \
  runc \
  cri-o \
  kubeadm \
  kubectl \
  kubelet
```
9. Review the downloaded packages
```
ls -lh "$HOME"/k8s/pkg/kubernetes
```
Result:
```
total 78M
-rw-r--r--. 1 root root 240K Oct 16 16:28 conntrack-tools-1.4.7-4.el9_5.x86_64.rpm
-rw-r--r--. 1 root root  66K Oct 16 16:28 container-selinux-2.235.0-1.el9_6.noarch.rpm
-rw-r--r--. 1 root root  21M Oct 16 16:28 cri-o-1.34.1-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root 7.3M Oct 16 16:28 cri-tools-1.34.0-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root 563K Oct 16 16:28 criu-3.19-1.el9.x86_64.rpm
-rw-r--r--. 1 root root  33K Oct 16 16:28 criu-libs-3.19-1.el9.x86_64.rpm
-rw-r--r--. 1 root root 244K Oct 16 16:28 crun-1.23.1-2.el9_6.x86_64.rpm
-rw-r--r--. 1 root root  13M Oct 16 16:28 kubeadm-1.34.1-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root  12M Oct 16 16:28 kubectl-1.34.1-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root  13M Oct 16 16:28 kubelet-1.34.1-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root 8.6M Oct 16 16:28 kubernetes-cni-1.7.1-150500.1.1.x86_64.rpm
-rw-r--r--. 1 root root  60K Oct 16 16:28 libnet-1.2-7.el9.x86_64.rpm
-rw-r--r--. 1 root root  26K Oct 16 16:28 libnetfilter_cthelper-1.0.0-22.el9.x86_64.rpm
-rw-r--r--. 1 root root  26K Oct 16 16:28 libnetfilter_cttimeout-1.0.0-19.el9.x86_64.rpm
-rw-r--r--. 1 root root  31K Oct 16 16:28 libnetfilter_queue-1.0.5-1.el9.x86_64.rpm
-rw-r--r--. 1 root root  38K Oct 16 16:28 protobuf-c-1.3.3-13.el9.x86_64.rpm
-rw-r--r--. 1 root root 3.6M Oct 16 16:28 runc-1.2.4-2.el9.x86_64.rpm
-rw-r--r--. 1 root root  42K Oct 16 16:28 yajl-2.1.0-25.el9.x86_64.rpm
```
10.
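
Before moving on to the Podman packages, it can be worth confirming that this RPM set resolves on its own. A minimal dry-run sketch, assuming the air-gapped nodes start from a comparable minimal RHEL 9 install; nothing is installed here because of `--assumeno`:

```
# Resolve the transaction using only the downloaded RPMs; any missing
# dependency shows up now instead of on the offline hosts.
sudo dnf localinstall --disablerepo="*" --assumeno "$HOME"/k8s/pkg/kubernetes/*.rpm
```
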
下載並準備 podman pkg ``` sudo yum install --downloadonly --downloaddir="$HOME"/k8s/pkg/podman -y \ podman ``` 11. 檢視已載入的套件 ``` ls -l "$HOME"/k8s/pkg/podman ``` 執行結果: ``` total 24048 -rw-r--r--. 1 bigred bigred 902510 Oct 16 16:59 aardvark-dns-1.14.0-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 55519 Oct 16 16:59 conmon-2.1.12-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 158417 Oct 16 16:59 containers-common-1-117.el9_6.x86_64.rpm -rw-r--r--. 1 bigred bigred 67105 Oct 16 16:59 container-selinux-2.235.0-1.el9_6.noarch.rpm -rw-r--r--. 1 bigred bigred 576009 Oct 16 16:59 criu-3.19-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 33643 Oct 16 16:59 criu-libs-3.19-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 249255 Oct 16 16:59 crun-1.23.1-2.el9_6.x86_64.rpm -rw-r--r--. 1 bigred bigred 58706 Oct 16 16:59 fuse3-3.10.2-9.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 95573 Oct 16 16:59 fuse3-libs-3.10.2-9.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 8750 Oct 16 16:59 fuse-common-3.10.2-9.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 71022 Oct 16 16:59 fuse-overlayfs-1.14-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 61278 Oct 16 16:59 libnet-1.2-7.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 71992 Oct 16 16:59 libslirp-4.4.0-8.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 3846970 Oct 16 16:59 netavark-1.14.1-1.el9_6.x86_64.rpm -rw-r--r--. 1 bigred bigred 268131 Oct 16 16:59 'passt-0^20250217.ga1e48a0-10.el9_6.x86_64.rpm' -rw-r--r--. 1 bigred bigred 27658 Oct 16 16:59 'passt-selinux-0^20250217.ga1e48a0-10.el9_6.noarch.rpm' -rw-r--r--. 1 bigred bigred 17803252 Oct 16 16:59 podman-5.4.0-13.el9_6.x86_64.rpm -rw-r--r--. 1 bigred bigred 38224 Oct 16 16:59 protobuf-c-1.3.3-13.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 90389 Oct 16 16:59 shadow-utils-subid-4.9-12.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 50387 Oct 16 16:59 slirp4netns-1.3.2-1.el9.x86_64.rpm -rw-r--r--. 1 bigred bigred 42487 Oct 16 16:59 yajl-2.1.0-25.el9.x86_64.rpm ``` 12. 下載並準備 skopeo pkg ``` sudo yum install --downloadonly --downloaddir="$HOME"/k8s/pkg/skopeo -y \ skopeo ``` 13. 檢視已載入的套件 ``` ls -l "$HOME"/k8s/pkg/skopeo ``` 執行結果: ``` total 9308 -rw-r--r--. 1 root root 9530108 Oct 17 10:19 skopeo-1.18.1-2.el9_6.x86_64.rpm ``` ### 1.2. 準備 kubeadm 要安裝 K8s 所需的 images 1. 列出安裝 kubeadm 要安裝 K8s 所需的 images ``` kubeadm config images list | tee ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` registry.k8s.io/kube-apiserver:v1.34.1 registry.k8s.io/kube-controller-manager:v1.34.1 registry.k8s.io/kube-scheduler:v1.34.1 registry.k8s.io/kube-proxy:v1.34.1 registry.k8s.io/coredns/coredns:v1.12.1 registry.k8s.io/pause:3.10.1 registry.k8s.io/etcd:3.6.4-0 ``` 2. 將 K8s images 備份到指定目錄中 ``` for i in $(cat ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(cat ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/all-K8s-Components-v1.34.1.tar ``` 3. 拷貝 K8s 所有 components 的 images 到 Harbor 的 library project 中 ``` for i in $(cat ${HOME}/k8s/img/img-list.txt); do skopeo copy docker://${i} docker://${i/registry.k8s.io/${HARBOR_FQDN}/library}; done ``` #### 1.2.3. 準備 ingress-nginx `v1.13.3` 所需之 images 和安裝 yaml 1. 
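
After the `skopeo copy` loop in section 1.2 finishes, a quick spot check against Harbor confirms the path rewrite (`registry.k8s.io/...` → `${HARBOR_FQDN}/library/...`) landed where expected. A minimal sketch using `kube-apiserver` as the sample image, assuming the Harbor CA and credentials are already trusted on this host (the same requirement the copy loop itself has):

```
# Inspect one copied image on the Harbor side; an error here means the
# rewrite or the push did not work as intended.
skopeo inspect --format '{{.Name}} {{.Digest}}' \
  docker://${HARBOR_FQDN}/library/kube-apiserver:v1.34.1
```
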
找到 ingress-nginx `v1.13.3` 所需之 images ``` INGRESS_NGINX_VERSION="v1.13.3" curl -s https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-${INGRESS_NGINX_VERSION}/deploy/static/provider/cloud/deploy.yaml | grep image: | sort -u | tr -s " " | cut -d " " -f 3 | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` registry.k8s.io/ingress-nginx/controller:v1.13.3@sha256:1b044f6dcac3afbb59e05d98463f1dec6f3d3fb99940bc12ca5d80270358e3bd registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.6.3@sha256:3d671cf20a35cd94efc5dcd484970779eb21e7938c98fbc3673693b8a117cf39 ``` 2. 將 Ingress-nginx Controller image 備份到指定目錄 ``` for i in $(grep ingress-nginx ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; podman tag ${i} ${i%@*};done podman save -m $(grep ingress-nginx ${HOME}/k8s/img/img-list.txt | cut -d "@" -f 1) > ${HOME}/k8s/img/ingress-nginx-controller-v1.13.3.tar ``` 3. 將安裝 yaml 下載至指定目錄 ``` curl -so ${HOME}/k8s/addon/ingress/ingress-nginx/deploy.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-${INGRESS_NGINX_VERSION}/deploy/static/provider/cloud/deploy.yaml ``` ### 1.3. 準備 kube-vip `v1.0.1` 所需之 images 1. 設定環境資訊 ``` export VIP="172.20.6.110" export VIPSUBSET="16" export INTERFACE="ens18" KVVERSION="v1.0.1" alias kube-vip="sudo podman run --network host --rm ghcr.io/kube-vip/kube-vip:$KVVERSION" ``` 2. 產生 kube-vip static pod yaml ``` kube-vip manifest pod \ --interface $INTERFACE \ --address $VIP \ --vipSubnet $VIPSUBSET \ --controlplane \ --arp \ --leaderElection | tee ${HOME}/k8s/addon/kube-vip/ha-control-plane/kube-vip-static-pod.yaml ```` > 上述參數沒有 `--service` 是因為 control-plane 的 static pod 不需要管理 loadbalancer type service 3. 產生 kube-vip daemonset yaml ``` kube-vip manifest daemonset \ --interface $INTERFACE \ --vipSubnet $VIPSUBSET \ --services \ --arp \ --leaderElection \ --servicesElection | tee ${HOME}/k8s/addon/kube-vip/k8s-lb-service/kube-vip-ds.yaml ``` 4. 下載 kube-vip Cloud Provider yaml 檔 ``` curl -so ${HOME}/k8s/addon/kube-vip/k8s-lb-service/kube-vip-cloud-controller.yaml https://raw.githubusercontent.com/kube-vip/kube-vip-cloud-provider/main/manifest/kube-vip-cloud-controller.yaml ``` 5. 下載 kube-vip daemonset 所需之 rbac.yaml ``` curl -so ${HOME}/k8s/addon/kube-vip/k8s-lb-service/kube-vip-rbac.yaml https://kube-vip.io/manifests/rbac.yaml ``` 5. 檢視 Kubevip 執行所需之 images ``` find ${HOME}/k8s/addon/kube-vip -name "*.yaml" -type f -exec cat {} \; | grep image: | tr -s " " | uniq | cut -d " " -f 3 | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` ghcr.io/kube-vip/kube-vip:v1.0.1 ghcr.io/kube-vip/kube-vip-cloud-provider:v0.0.12 ``` 6. 將 kube-vip images 備份到指定目錄 ``` for i in $(tail -n 2 ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(tail -n 2 ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/kube-vip-v1.0.1.tar ``` ### 1.4. 準備 Metrics Server `v0.8.0` 所需之 images 1. 定義 Metrics Server 版本變數 ``` METRICS_SERVER_VERSION="v0.8.0" ``` 2. 下載安裝 yaml 到指定路徑 ``` curl -sLo ${HOME}/k8s/addon/metrics/metrics-server/components.yaml https://github.com/kubernetes-sigs/metrics-server/releases/download/${METRICS_SERVER_VERSION}/components.yaml ``` 3. 檢視 Metrics Server 執行所需之 images ``` find ${HOME}/k8s/addon/metrics/metrics-server -name "*.yaml" -type f -exec cat {} \; | grep image: | tr -s " " | uniq | cut -d " " -f 3 | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` registry.k8s.io/metrics-server/metrics-server:v0.8.0 ``` 4. 
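
If you want the Metrics Server manifest to point at the private registry before it is packed, a `sed` in the same spirit as the fluentd edits later on can be applied now. This is optional and assumes the image will be pushed to Harbor under `library/metrics-server/`, as done in section 1.11:

```
# Repoint components.yaml at Harbor (hypothetical FQDN; adjust to your registry).
HARBOR_FQDN=harbor.example.com
sed -i "s|registry.k8s.io/metrics-server/|${HARBOR_FQDN}/library/metrics-server/|g" \
  ${HOME}/k8s/addon/metrics/metrics-server/components.yaml
```
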
將 Metrics Server image 備份到指定目錄 ``` for i in $(grep metrics-server ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(grep metrics-server ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/metrics-server-${METRICS_SERVER_VERSION}.tar ``` ### 1.5. 準備 Fluentd `v1.19.0` 所需之 images 1. 定義 Fluentd 版本變數 ``` FLUENTED_VERSION="v1.19.0-1.1" ``` 2. 下載安裝 yaml 到指定路徑 ``` curl -sLo ${HOME}/k8s/addon/logging/fluentd/fluentd-daemonset-syslog.yaml https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/refs/tags/${FLUENTED_VERSION}/fluentd-daemonset-syslog.yaml ``` 3. 修改 yaml ``` sed -i "s|version: v1|version: ${FLUENTED_VERSION}|g" fluentd-daemonset-syslog.yaml sed -i "s|v1-debian-syslog|v1.19.0-debian-syslog-amd64-1.1|g" fluentd-daemonset-syslog.yaml sed -i 's|fluent/fluentd-kubernetes-daemonset|docker.io/fluent/fluentd-kubernetes-daemonset|g' fluentd-daemonset-syslog.yaml ``` 3. 檢視 fluentd 執行所需之 images ``` find ${HOME}/k8s/addon/logging/fluentd -name "*.yaml" -type f -exec cat {} \; | grep image: | tr -s " " | uniq | cut -d " " -f 3 | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` docker.io/fluent/fluentd-kubernetes-daemonset:v1.19.0-debian-syslog-amd64-1.1 ``` 4. 將 fluentd image 備份到指定目錄 ``` for i in $(grep fluentd ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(grep fluentd ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/fluentd-kubernetes-daemonset-v1.19.0-debian-syslog-amd64-1.1.tar ``` ### 1.6. 準備 cilium `1.18.2` 所需之 images 1. 定義 helm 和 cilium 版本變數 ``` HELM_VERSION="v3.19.0" CILIUM_VERSION="1.18.2" ``` 2. 下載 helm Binary ``` curl -sLo ${HOME}/k8s/addon/cni/cilium/helm-${HELM_VERSION}-linux-amd64.tar.gz https://get.helm.sh/helm-${HELM_VERSION}-linux-amd64.tar.gz ``` 3. 添加 Cilium Helm repository ``` helm repo add cilium https://helm.cilium.io/ ``` 2. 下載 cilium chart file ``` helm fetch cilium/cilium --version=${CILIUM_VERSION} -d ${HOME}/k8s/addon/cni/cilium/ ``` 3. 設定 helm values yaml ``` echo 'bpf: masquerade: true hubble: relay: enabled: true ui: enabled: true service: type: NodePort ipam: mode: kubernetes ipv4NativeRoutingCIDR: 10.244.0.0/16 k8sServiceHost: 172.20.6.10 k8sServicePort: 6443 kubeProxyReplacement: true' > ${HOME}/k8s/addon/cni/cilium/custom-values.yaml ``` 3. 檢視 cilium 執行所需之 images ``` helm template ${HOME}/k8s/addon/cni/cilium/cilium-${CILIUM_VERSION}.tgz -f ${HOME}/k8s/addon/cni/cilium/custom-values.yaml | awk '$1 ~ /image:/ {print $2}' | sed -e 's/\"//g' | sort -u | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` quay.io/cilium/cilium-envoy:v1.34.7-1757592137-1a52bb680a956879722f48c591a2ca90f7791324@sha256:7932d656b63f6f866b6732099d33355184322123cfe1182e6f05175a3bc2e0e0 quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667 quay.io/cilium/hubble-relay:v1.18.2@sha256:6079308ee15e44dff476fb522612732f7c5c4407a1017bc3470916242b0405ac quay.io/cilium/hubble-ui-backend:v0.13.3@sha256:db1454e45dc39ca41fbf7cad31eec95d99e5b9949c39daaad0fa81ef29d56953 quay.io/cilium/hubble-ui:v0.13.3@sha256:661d5de7050182d495c6497ff0b007a7a1e379648e60830dd68c4d78ae21761d quay.io/cilium/operator-generic:v1.18.2@sha256:cb4e4ffc5789fd5ff6a534e3b1460623df61cba00f5ea1c7b40153b5efb81805 ``` 4. 將 cilium images 備份到指定目錄 ``` for i in $(grep "quay.io/cilium" ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; podman tag ${i} ${i%@*}; done podman save -m $(grep "quay.io/cilium" ${HOME}/k8s/img/img-list.txt | cut -d "@" -f 1) > ${HOME}/k8s/img/cilium-${CILIUM_VERSION}.tar ``` 5. 
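
Since `img-list.txt` keeps growing section by section, a small loop can re-check at any point that every reference in it is actually present in local Podman storage before the `.tar` archives are built. A minimal sketch; digested references are checked by their tag, matching how they were re-tagged above:

```
# Print any image that is listed but not yet pulled/tagged locally.
while read -r img; do
  podman image exists "${img%@*}" || echo "missing: ${img}"
done < ${HOME}/k8s/img/img-list.txt
```
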
下載 Cilium CLI ``` CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt) CLI_ARCH=amd64 if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum} sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum ``` 6. 下載 Hubble Client ``` HUBBLE_VERSION="$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)" HUBBLE_ARCH=amd64 if [ "$(uname -m)" = "aarch64" ]; then HUBBLE_ARCH=arm64; fi curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum} sha256sum --check hubble-linux-${HUBBLE_ARCH}.tar.gz.sha256sum ``` 10. 準備 cilium 測試網路效能 image ``` sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin cilium connectivity perf --print-image-artifacts | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` quay.io/cilium/network-perf:1751527436-c2462ae@sha256:0c491ed7ca63e6c526593b3a2d478f856410a50fbbce7fe2b64283c3015d752f ``` 11. 將 cilium 測試網路效能 images 備份到指定目錄 ``` for i in $(grep "network-perf" ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; podman tag ${i} ${i%@*}; done podman save $(grep "network-perf" ${HOME}/k8s/img/img-list.txt | cut -d "@" -f 1) > ${HOME}/k8s/img/cilium-network-perf.tar ``` ### 1.7. 準備 NFS subdir external provisioner `v4.0.2` 所需之 images 1. 定義 NFS provisioner 版本變數 ``` NFS_PROVISIONER_VERSION="v4.0.2" ``` 2. 下載安裝 yaml 到指定路徑 ``` git clone --no-checkout --depth=1 --filter=tree:0 https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner.git ${HOME}/k8s/addon/csi/nfs-provisioner/nfs-subdir-external-provisioner cd ${HOME}/k8s/addon/csi/nfs-provisioner/nfs-subdir-external-provisioner git sparse-checkout set --no-cone /deploy git checkout mv ${HOME}/k8s/addon/csi/nfs-provisioner/nfs-subdir-external-provisioner/deploy/ ${HOME}/k8s/addon/csi/nfs-provisioner/ sudo rm -r ${HOME}/k8s/addon/csi/nfs-provisioner/nfs-subdir-external-provisioner sed -i "s|busybox:stable|docker.io/library/busybox:stable|g" ${HOME}/k8s/addon/csi/nfs-provisioner/deploy/test-pod.yaml ``` 3. 建立 `kustomization.yaml` ``` echo 'namespace: nfs-provisioner bases: - ./deploy resources: - namespace.yaml patchesStrategicMerge: - patch_nfs_details.yaml' > ${HOME}/k8s/addon/csi/nfs-provisioner/kustomization.yaml ``` 4. 建立 `namespace.yaml` ``` echo '# namespace.yaml apiVersion: v1 kind: Namespace metadata: name: nfs-provisioner' > ${HOME}/k8s/addon/csi/nfs-provisioner/namespace.yaml ``` 5. 建立 `patch_nfs_details.yaml` ``` echo 'apiVersion: apps/v1 kind: Deployment metadata: labels: app: nfs-client-provisioner name: nfs-client-provisioner spec: template: spec: containers: - name: nfs-client-provisioner env: - name: NFS_SERVER value: 172.20.6.9 - name: NFS_PATH value: /k8s volumes: - name: nfs-client-root nfs: server: 172.20.6.9 path: /k8s' > ${HOME}/k8s/addon/csi/nfs-provisioner/patch_nfs_details.yaml ``` 6. 檢視 nfs-provisioner 執行所需之 images ``` find ${HOME}/k8s/addon/csi/nfs-provisioner -name "*.yaml" -type f -exec cat {} \; | grep image: | tr -s " " | uniq | cut -d " " -f 3 | tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2 docker.io/library/busybox:stable ``` 4. 
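
Before the bundle leaves the online host, the kustomization built above can be rendered once to confirm the namespace and the NFS patch apply cleanly. A minimal sketch, assuming `kubectl` (with its embedded kustomize) is available on this build machine; kustomize may warn about the deprecated `bases`/`patchesStrategicMerge` fields but should still build:

```
# Render the overlay and show only the fields of interest.
kubectl kustomize ${HOME}/k8s/addon/csi/nfs-provisioner | grep -E "^kind:|image:|server:|path:"
```
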
將 nfs-provisioner images 備份到指定目錄 ``` for i in $(grep -E "nfs-subdir-external-provisioner|busybox" ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(grep -E "nfs-subdir-external-provisioner|busybox" ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/nfs-provisioner-${NFS_PROVISIONER_VERSION}.tar ``` ### 1.8. 準備 Ingress NGINX `v1.13.3` 所需之 images 和 yaml 檔 1. 定義 Ingress NGINX 版本變數 ``` INGRESS_NGINX="v1.13.3" ``` 2. 下載部署 Ingress NGINX YAML ``` curl -sLo ${HOME}/k8s/addon/ingress/ingress-nginx/deployment.yaml \ https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-${INGRESS_NGINX}/deploy/static/provider/baremetal/deploy.yaml ``` 3. 將 deployment 轉成 daemonset ``` nano daemonSet.yaml ``` :::spoiler 檔案內容: ``` apiVersion: v1 kind: Namespace metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx name: ingress-nginx --- apiVersion: v1 automountServiceAccountToken: true kind: ServiceAccount metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx namespace: ingress-nginx --- apiVersion: v1 automountServiceAccountToken: true kind: ServiceAccount metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx namespace: ingress-nginx rules: - apiGroups: - "" resources: - namespaces verbs: - get - apiGroups: - "" resources: - configmaps - pods - secrets - endpoints verbs: - get - list - watch - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch - apiGroups: - coordination.k8s.io resourceNames: - ingress-nginx-leader resources: - leases verbs: - get - update - apiGroups: - coordination.k8s.io resources: - leases verbs: - create - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - discovery.k8s.io resources: - endpointslices verbs: - list - watch - get --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission namespace: ingress-nginx rules: - apiGroups: - "" resources: - secrets verbs: - get - create --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - endpoints - nodes - pods - secrets - namespaces verbs: - list - watch - apiGroups: - coordination.k8s.io resources: - leases verbs: - list 
- watch - apiGroups: - "" resources: - nodes verbs: - get - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - networking.k8s.io resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch - apiGroups: - discovery.k8s.io resources: - endpointslices verbs: - list - watch - get --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission rules: - apiGroups: - admissionregistration.k8s.io resources: - validatingwebhookconfigurations verbs: - get - update --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: v1 data: null kind: ConfigMap metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-controller namespace: ingress-nginx --- apiVersion: v1 kind: Service metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-controller namespace: ingress-nginx spec: externalTrafficPolicy: Local ipFamilies: - IPv4 
ipFamilyPolicy: SingleStack ports: - appProtocol: http name: http port: 80 protocol: TCP targetPort: http - appProtocol: https name: https port: 443 protocol: TCP targetPort: https selector: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx type: LoadBalancer --- apiVersion: v1 kind: Service metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-controller-admission namespace: ingress-nginx spec: ports: - appProtocol: https name: https-webhook port: 443 targetPort: webhook selector: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx type: ClusterIP --- apiVersion: apps/v1 kind: DaemonSet metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-controller namespace: ingress-nginx spec: selector: matchLabels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx template: metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 spec: automountServiceAccountToken: true containers: - args: - /nginx-ingress-controller - --election-id=ingress-nginx-leader - --controller-class=k8s.io/ingress-nginx - --ingress-class=nginx - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller - --validating-webhook=:8443 - --validating-webhook-certificate=/usr/local/certificates/cert - --validating-webhook-key=/usr/local/certificates/key - --enable-metrics=false - --report-node-internal-ip-address=true - --watch-ingress-without-class=true env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace - name: LD_PRELOAD value: /usr/local/lib/libmimalloc.so image: registry.k8s.io/ingress-nginx/controller:v1.13.3@sha256:1b044f6dcac3afbb59e05d98463f1dec6f3d3fb99940bc12ca5d80270358e3bd imagePullPolicy: IfNotPresent lifecycle: preStop: exec: command: - /wait-shutdown livenessProbe: failureThreshold: 5 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: controller ports: - containerPort: 80 name: http protocol: TCP hostPort: 80 - containerPort: 443 name: https protocol: TCP hostPort: 443 - containerPort: 8443 name: webhook protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 resources: requests: cpu: 100m memory: 90Mi securityContext: allowPrivilegeEscalation: false capabilities: add: - NET_BIND_SERVICE drop: - ALL readOnlyRootFilesystem: false runAsGroup: 82 runAsNonRoot: true runAsUser: 101 seccompProfile: type: RuntimeDefault volumeMounts: - mountPath: /usr/local/certificates/ name: webhook-cert readOnly: true dnsPolicy: ClusterFirst nodeSelector: kubernetes.io/os: linux serviceAccountName: ingress-nginx terminationGracePeriodSeconds: 300 volumes: - name: webhook-cert secret: secretName: ingress-nginx-admission --- apiVersion: 
batch/v1 kind: Job metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission-create namespace: ingress-nginx spec: template: metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission-create spec: automountServiceAccountToken: true containers: - args: - create - --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc - --namespace=$(POD_NAMESPACE) - --secret-name=ingress-nginx-admission env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.6.3@sha256:3d671cf20a35cd94efc5dcd484970779eb21e7938c98fbc3673693b8a117cf39 imagePullPolicy: IfNotPresent name: create securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL readOnlyRootFilesystem: true runAsGroup: 65532 runAsNonRoot: true runAsUser: 65532 seccompProfile: type: RuntimeDefault nodeSelector: kubernetes.io/os: linux restartPolicy: OnFailure serviceAccountName: ingress-nginx-admission ttlSecondsAfterFinished: 0 --- apiVersion: batch/v1 kind: Job metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission-patch namespace: ingress-nginx spec: template: metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission-patch spec: automountServiceAccountToken: true containers: - args: - patch - --webhook-name=ingress-nginx-admission - --namespace=$(POD_NAMESPACE) - --patch-mutating=false - --secret-name=ingress-nginx-admission - --patch-failure-policy=Fail env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.6.3@sha256:3d671cf20a35cd94efc5dcd484970779eb21e7938c98fbc3673693b8a117cf39 imagePullPolicy: IfNotPresent name: patch securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL readOnlyRootFilesystem: true runAsGroup: 65532 runAsNonRoot: true runAsUser: 65532 seccompProfile: type: RuntimeDefault nodeSelector: kubernetes.io/os: linux restartPolicy: OnFailure serviceAccountName: ingress-nginx-admission ttlSecondsAfterFinished: 0 --- apiVersion: networking.k8s.io/v1 kind: IngressClass metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: nginx spec: controller: k8s.io/ingress-nginx --- apiVersion: admissionregistration.k8s.io/v1 kind: ValidatingWebhookConfiguration metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.13.3 name: ingress-nginx-admission webhooks: - admissionReviewVersions: - v1 clientConfig: service: name: 
ingress-nginx-controller-admission namespace: ingress-nginx path: /networking/v1/ingresses port: 443 failurePolicy: Fail matchPolicy: Equivalent name: validate.nginx.ingress.kubernetes.io rules: - apiGroups: - networking.k8s.io apiVersions: - v1 operations: - CREATE - UPDATE resources: - ingresses sideEffects: None ``` ::: 4. 檢視 ingress nginx 執行所需之 images ``` find ${HOME}/k8s/addon/ingress/ingress-nginx \ -name "daemonSet.yaml" \ -type f \ -exec grep image: {} + | \ tr -s " " | \ cut -d " " -f 3 | \ sort -u | \ uniq | \ tee -a ${HOME}/k8s/img/img-list.txt ``` 執行結果: ``` registry.k8s.io/ingress-nginx/controller:v1.13.3@sha256:1b044f6dcac3afbb59e05d98463f1dec6f3d3fb99940bc12ca5d80270358e3bd registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.6.3@sha256:3d671cf20a35cd94efc5dcd484970779eb21e7938c98fbc3673693b8a117cf39 ``` 5. 將 ingress nginx images 備份到指定目錄 ``` for i in $(grep "ingress-nginx" ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; podman tag ${i} ${i%@*}; done podman save -m $(grep "ingress-nginx" ${HOME}/k8s/img/img-list.txt | cut -d "@" -f 1) > ${HOME}/k8s/img/ingress-nginx-controller-${INGRESS_NGINX}.tar ``` ### 1.9. 準備 logging operator `6.1.0` 所需之 images 1. 定義 logging operator 版本變數 ``` LOGGING_OPERATOR="6.1.0" ``` 2. 下載 logging operator chart file ``` helm pull oci://ghcr.io/kube-logging/helm-charts/logging-operator \ --version ${LOGGING_OPERATOR} \ -d ${HOME}/k8s/addon/logging/logging-operator/ ``` 3. 檢視 logging operator 執行所需之 images ``` echo 'ghcr.io/kube-logging/logging-operator:6.1.0 ghcr.io/axoflow/axosyslog:4.18.1 ghcr.io/kube-logging/logging-operator/fluentd:6.1.0-full ghcr.io/kube-logging/logging-operator/syslog-ng-reloader:6.1.0 ghcr.io/kube-logging/logging-operator/config-reloader:6.1.0 ghcr.io/kube-logging/logging-operator/fluentd-drain-watch:6.1.0 ghcr.io/kube-logging/logging-operator/node-exporter:6.1.0 ghcr.io/axoflow/axosyslog-metrics-exporter:0.0.13' >> ${HOME}/k8s/img/img-list.txt ``` 4. 將 logging operator images 備份到指定目錄 ``` for i in $(grep -E "kube-logging|axoflow" ${HOME}/k8s/img/img-list.txt); do podman pull ${i}; done podman save -m $(grep -E "kube-logging|axoflow" ${HOME}/k8s/img/img-list.txt) > ${HOME}/k8s/img/kube-logging-${LOGGING_OPERATOR}.tar ``` ### 1.10. 將整個工作目錄打包並匯入全離線的主機 1. 壓縮出全離線安裝包 ``` cd ~; tar -zcvf k8s-v1.34.1.tar.gz k8s/ ``` 2. 檢視全離線安裝包 ``` ls -lh ${HOME}/k8s-v1.34.1.tar.gz ``` 執行結果: ``` -rw-r--r--. 1 bigred bigred 1.5G Oct 17 17:25 /home/bigred/k8s-v1.34.1.tar.gz ``` 3. 將檔案上傳至所有全離線的 K8s 叢集主機,包含可上傳 image 至 Harbor 的主機 ``` scp ${HOME}/k8s-v1.34.1.tar.gz user_name@vm_ipaddress:. ``` ### 1.11. 將離線安裝所需之 Container Images 匯入 Harbor :::info 先決條件:已安裝好 Harbor ::: 1. 連線到可將 image tar 檔上傳至 Harbor 的主機 ``` ssh user@ipaddress ``` 2. 定義 Harbor FQDN ``` HARBOR_FQDN=harbor.example.com ``` 2. 下載 Harbor CA 憑證 ``` sudo curl -k \ -o /etc/pki/ca-trust/source/anchors/harbor-ca.crt \ https://${HARBOR_FQDN}/api/v2.0/systeminfo/getcert ``` 3. 更新系統的 CA 信任清單,讓 HTTPS 連線時能信任此憑證 ``` sudo update-ca-trust ``` 4. 設定 podman 信任 Harbor 自簽 CA 憑證 ``` sudo mkdir /etc/containers/certs.d/${HARBOR_FQDN} sudo curl -k \ -o /etc/containers/certs.d/${HARBOR_FQDN}/ca.crt \ https://${HARBOR_FQDN}/api/v2.0/systeminfo/getcert ``` 5. 登入 harbor ``` podman login ${HARBOR_FQDN} ``` 執行結果: ``` Username: admin Password: Login Succeeded! ``` 6. 
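
The upload in the next step is done with a helper script, `imgctl.sh`, that ships inside the bundle and is not reproduced in this article. As a rough sketch of the same idea only (not the actual script): load every archive, retag each reference in `img-list.txt` under the Harbor `library` project, and push it:

```
# Hypothetical equivalent of the upload helper.
HARBOR_FQDN=harbor.example.com
for t in ${HOME}/k8s/img/*.tar; do podman load -i "${t}"; done
while read -r img; do
  src=${img%@*}                         # drop any @sha256 digest suffix
  dst=${HARBOR_FQDN}/library/${src#*/}  # swap the source registry for Harbor
  podman tag "${src}" "${dst}" && podman push "${dst}"
done < ${HOME}/k8s/img/img-list.txt
```
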
上傳所有 container images 至 Harbor ``` cd ~/k8s/img; ./imgctl.sh ``` 正確執行結果: ``` Load All Images Successfully Currnt: 1/29 harbor.example.com/library/kube-apiserver:v1.34.1 push ok Currnt: 2/29 harbor.example.com/library/kube-controller-manager:v1.34.1 push ok Currnt: 3/29 harbor.example.com/library/kube-scheduler:v1.34.1 push ok Currnt: 4/29 harbor.example.com/library/kube-proxy:v1.34.1 push ok Currnt: 5/29 harbor.example.com/library/coredns/coredns:v1.12.1 push ok Currnt: 6/29 harbor.example.com/library/pause:3.10.1 push ok Currnt: 7/29 harbor.example.com/library/etcd:3.6.4-0 push ok Currnt: 8/29 harbor.example.com/library/ingress-nginx/controller:v1.13.3@sha256:1b044f6dcac3afbb59e05d98463f1dec6f3d3fb99940bc12ca5d80270358e3bd push ok Currnt: 9/29 harbor.example.com/library/ingress-nginx/kube-webhook-certgen:v1.6.3@sha256:3d671cf20a35cd94efc5dcd484970779eb21e7938c98fbc3673693b8a117cf39 push ok Currnt: 10/29 harbor.example.com/library/kube-vip/kube-vip:v1.0.1 push ok Currnt: 11/29 harbor.example.com/library/kube-vip/kube-vip-cloud-provider:v0.0.12 push ok Currnt: 12/29 harbor.example.com/library/metrics-server/metrics-server:v0.8.0 push ok Currnt: 13/29 harbor.example.com/library/fluent/fluentd-kubernetes-daemonset:v1.19.0-debian-syslog-amd64-1.1 push ok Currnt: 14/29 harbor.example.com/library/cilium/cilium-envoy:v1.34.7-1757592137-1a52bb680a956879722f48c591a2ca90f7791324@sha256:7932d656b63f6f866b6732099d33355184322123cfe1182e6f05175a3bc2e0e0 push ok Currnt: 15/29 harbor.example.com/library/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667 push ok Currnt: 16/29 harbor.example.com/library/cilium/hubble-relay:v1.18.2@sha256:6079308ee15e44dff476fb522612732f7c5c4407a1017bc3470916242b0405ac push ok Currnt: 17/29 harbor.example.com/library/cilium/hubble-ui-backend:v0.13.3@sha256:db1454e45dc39ca41fbf7cad31eec95d99e5b9949c39daaad0fa81ef29d56953 push ok Currnt: 18/29 harbor.example.com/library/cilium/hubble-ui:v0.13.3@sha256:661d5de7050182d495c6497ff0b007a7a1e379648e60830dd68c4d78ae21761d push ok Currnt: 19/29 harbor.example.com/library/cilium/operator-generic:v1.18.2@sha256:cb4e4ffc5789fd5ff6a534e3b1460623df61cba00f5ea1c7b40153b5efb81805 push ok Currnt: 20/29 harbor.example.com/library/sig-storage/nfs-subdir-external-provisioner:v4.0.2 push ok Currnt: 21/29 harbor.example.com/library/library/busybox:stable push ok Currnt: 22/29 harbor.example.com/library/kube-logging/logging-operator:6.1.0 push ok Currnt: 23/29 harbor.example.com/library/axoflow/axosyslog:4.18.1 push ok Currnt: 24/29 harbor.example.com/library/kube-logging/logging-operator/fluentd:6.1.0-full push ok Currnt: 25/29 harbor.example.com/library/kube-logging/logging-operator/syslog-ng-reloader:6.1.0 push ok Currnt: 26/29 harbor.example.com/library/kube-logging/logging-operator/config-reloader:6.1.0 push ok Currnt: 27/29 harbor.example.com/library/kube-logging/logging-operator/fluentd-drain-watch:6.1.0 push ok Currnt: 28/29 harbor.example.com/library/kube-logging/logging-operator/node-exporter:6.1.0 push ok Currnt: 29/29 harbor.example.com/library/axoflow/axosyslog-metrics-exporter:0.0.13 push ok ``` --- ## 2. 建置 HA K8s 前置作業 ### 2.1. 先決條件 1. 已將 K8s 全離線安裝包(檔名:`k8s-v1.34.1.tar.gz`)匯入 K8s 叢集中的每一台主機 3. 已將 Harbor 自簽 root CA 憑證匯入匯入 K8s 叢集中的每一台 (Harbor 如果不是自簽 root CA 則不用做到此步驟) ### 2.2. 連線至 K8s 的第一台 Control-plane 節點 1. ssh 連線至第一台 Control-plane 節點 ``` ssh user@ipaddress ``` ### 2.3. 關閉 Firewalld Systemd Service 1. 關閉 Firewalld Systemd Service ``` sudo systemctl disable --now firewalld.service ``` ### 2.4. 
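
Disabling firewalld is the simplest choice for a lab. If policy requires the firewall to stay on, opening the standard kubeadm ports is an alternative; the sketch below covers the control-plane set only and would still need the CNI-specific ports (for example Cilium health and VXLAN) added:

```
# Control-plane ports from the kubeadm port list; workers mainly need
# 10250/tcp and the NodePort range 30000-32767/tcp.
sudo firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd client/peer
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet API
sudo firewall-cmd --permanent --add-port=10257/tcp       # kube-controller-manager
sudo firewall-cmd --permanent --add-port=10259/tcp       # kube-scheduler
sudo firewall-cmd --reload
```
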
調整 SELinux 為 `permissive` mode 1. 調整 SELinux 為 `permissive` mode ``` # Set SELinux in permissive mode (effectively disabling it) sudo setenforce 0 sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config ``` ### 2.5. 確認 SWAP 已關閉 1. 確認 SWAP 已關閉 ``` free -mh ``` 執行結果: ``` total used free shared buff/cache available Mem: 7.5Gi 530Mi 6.2Gi 8.0Mi 1.0Gi 7.0Gi Swap: 0B 0B 0B ``` > 確認 swap 欄位的值都是 `0` B ### 2.6. 確認 Kernel 版本 >= 5.10 1. 確認 Kernel 版本 >= 5.10 ``` uname -r ``` 執行結果: ``` 5.14.0-570.52.1.el9_6.x86_64 ``` ### 2.7. 添加必要 Kernel Modules 1. 永久添加 `br_netfilter` 和 `overlay` Kernel Modules ``` cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf overlay br_netfilter EOF ``` 2. 立即啟用 `br_netfilter` 和 `overlay` Kernel Modules ``` sudo modprobe overlay sudo modprobe br_netfilter ``` 3. 驗證已載入 `br_netfilter` 和 `overlay` Kernel Modules ``` sudo lsmod | grep -E "overlay|br_netfilter" ``` 執行結果: ``` br_netfilter 36864 0 bridge 417792 1 br_netfilter overlay 229376 0 ``` ### 2.8. 添加必要 Kernel Parameters 1. 永久添加必要 Kernel Parameters ``` cat <<EOF | sudo tee /etc/sysctl.d/99-k8s-cri.conf net.bridge.bridge-nf-call-iptables = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.ipv4.ip_forward = 1 EOF ``` 2. 立即載入 Kernel Parameters ``` sudo sysctl --system ``` 3. 驗證 `net.bridge.bridge-nf-call-iptables`, `net.bridge.bridge-nf-call-ip6tables`, 和 `net.ipv4.ip_forward` 的值都設成 `1` ``` sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward ``` 執行結果: ``` net.bridge.bridge-nf-call-iptables = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.ipv4.ip_forward = 1 ``` ### 2.9. 解壓縮 K8s 全離線安裝包 1. 設定 K8s 全離線安裝包所在的目錄變數 ``` PREFIX="$HOME" ``` 2. 解壓縮 K8s 全離線安裝包 ``` tar -zxvf "$PREFIX"/k8s-v1.34.1.tar.gz ``` 執行結果: ``` ... tar: k8s/img/ingress-nginx-controller-v1.13.3.tar: time stamp 2025-10-20 14:18:13 is 166476.992186816 s in the future k8s/config/ tar: k8s/img: time stamp 2025-10-20 14:34:01 is 167424.992012128 s in the future ``` ### 2.10. 安裝所有必要軟體套件 1. 安裝所有必要軟體套件 ``` sudo yum localinstall "$HOME"/k8s/pkg/kubernetes/*.rpm ``` ### 2.11. 確認節點可以透過 DNS 解析到 harbor 1. 定義 Harbor FQDN 變數 ``` HARBOR_FQDN=harbor.example.com DOMAIN_NAME_SYSTEM_SERVER=172.20.6.11 ``` 2. 確認節點可以透過 DNS 解析到 harbor ``` dig @$DOMAIN_NAME_SYSTEM_SERVER $HARBOR_FQDN +short ``` 執行結果: ``` 172.20.7.32 ``` ### 2.12. 啟動 CRI-O Service 並設為開機自動啟動 1. 啟動 CRI-O Service 並設為開機自動啟動 ``` sudo systemctl enable --now crio.service ``` 2. 確認 CRI-O Service 狀態 ``` sudo systemctl status crio.service ``` 執行結果: ``` ● crio.service - Container Runtime Interface for OCI (CRI-O) Loaded: loaded (/usr/lib/systemd/system/crio.service; enabled; preset: disabled) Active: active (running) since Sat 2025-10-18 16:48:59 CST; 1min 18s ago Docs: https://github.com/cri-o/cri-o Main PID: 5734 (crio) Tasks: 7 Memory: 20.0M CPU: 126ms CGroup: /system.slice/crio.service └─5734 /usr/bin/crio Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314737723+08:00" le…cher" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314774402+08:00" le…face" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314808872+08:00" le…bled" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314814672+08:00" le…ated" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314820552+08:00" le… NRI" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314823302+08:00" le…p..." 
Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314825973+08:00" le…s..." Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314833213+08:00" le…tate" Oct 18 16:48:59 topgun-c1.kubeantony.com crio[5734]: time="2025-10-18T16:48:59.314882773+08:00" le…bled" Oct 18 16:48:59 topgun-c1.kubeantony.com systemd[1]: Started Container Runtime Interface for OCI (…I-O). Hint: Some lines were ellipsized, use -l to show in full. ``` ### 2.13. 設定 private registry 驗證 1. 將 Harbor 自簽 ca 憑證上傳至主機家目錄 2. 設定 Harbor 自簽 ca 憑證所在的目錄和 CA 憑證檔名為變數 ``` CA_PREFIX="$HOME" CA_FILE_NAME="ca.pem" ``` 3. 讓 RHEL 信任 Harbor 自簽 ca 憑證 ``` sudo cp "$CA_PREFIX"/"$CA_FILE_NAME" /etc/pki/ca-trust/source/anchors/harbor-ca.crt sudo update-ca-trust ``` 3. 編輯 CRI-O 設定檔,告訴它去哪裡讀取全域的認證資訊 ``` sudo nano /etc/crio/crio.conf.d/10-crio.conf ``` 檔案內容: > 新增 `global_auth_file` 那一行:指定 CRI-O 讀取 Docker 格式的全域認證檔案路徑 > 還有 `pause_image` 這行,如果沒指定這行預設還是會到 K8s 的 image repo 去拉 > 還有 `conmon_cgroup` 和 `cgroup_manager` 也要指定在 `crio.runtime` 區塊底下 ``` [crio.image] signature_policy = "/etc/crio/policy.json" global_auth_file = "/etc/crio/registries.d/auth.json" pause_image="harbor.example.com/library/pause:3.10.1" [crio.runtime] default_runtime = "crun" conmon_cgroup = "pod" cgroup_manager = "systemd" ``` 4. 建立存放認證檔案的目錄 ``` sudo mkdir -p /etc/crio/registries.d/ ``` 5. 設定 Harbor Registry 的 FQDN (網域名稱) 變數,方便後續使用 ``` HARBOR_FQDN=harbor.example.com ``` 6. 使用 podman 登入 Harbor,並將產生的認證 token 儲存到 CRI-O 指定的 auth.json 檔案中 ``` sudo podman login $HARBOR_FQDN --authfile /etc/crio/registries.d/auth.json ``` 7. 驗證認證檔案是否已成功建立 ``` sudo cat /etc/crio/registries.d/auth.json ``` 執行結果如下: ``` { "auths": { "harbor.example.com": { "auth": "YWRtaW46SGFyYm9yMTIzNDU=" } } } ``` 6. 編輯 system-wide 的 registry 設定檔,供 Podman, CRI-O, Skopeo 等工具共用 ``` sudo nano /etc/containers/registries.conf ``` 檔案內容如下: > 注意,Harbor FQDN 每個環境會不同。 ``` [[registry]] location = "harbor.example.com" insecure = false ``` 7. 重新啟動 CRI-O 服務,使其載入所有新的設定 (10-crio.conf 和 registries.conf) ``` sudo systemctl restart crio ``` 8. 使用 crictl (CRI 的客戶端工具) 測試是否能成功從私有 Harbor 拉取映像檔 ``` sudo crictl pull ${HARBOR_FQDN}/library/library/busybox:stable ``` 成功執行結果: ``` Image is up to date for harbor.example.com/library/library/busybox@sha256:f0d4f96113b0ebc3bc741b49cc158254ddd91bbba8dba20281b2dacd2da4a45f ``` 9. 清理測試,移除剛剛拉取的映像檔 ``` sudo crictl rmi ${HARBOR_FQDN}/library/library/busybox:stable ``` ### 2.14. 設定 kubelet service node ip 1. 指定節點 internal ip ``` INTERFACE=ens18 IPV4_IP=$(ip -4 a s $INTERFACE | awk '/inet / {print $2}' | cut -d'/' -f1) sudo sed -i \ -e '/^KUBELET_EXTRA_ARGS=$/s/$/\"--node-ip='"$IPV4_IP"'\"/' \ /etc/sysconfig/kubelet ``` 2. 檢查是否正確修改 ``` cat /etc/sysconfig/kubelet ``` 執行結果: ``` KUBELET_EXTRA_ARGS="--node-ip=172.20.6.111" ``` ### 2.15. 在執行 kubeadm 之前啟用 kubelet 服務 1. 啟動 kubelet 服務 ``` sudo systemctl enable --now kubelet.service ``` ### 2.16. 設定 NTP 校時 1. 編輯 chrony 設定檔 ``` sudo nano /etc/chrony.conf ``` 檔案內容如下: ``` #pool 2.rhel.pool.ntp.org iburst server 172.20.6.249 iburst ``` 2. 重啟 chronyd 服務 ``` sudo systemctl restart chronyd ``` 3. 確認是否成功校時 ``` sudo chronyc sources ``` 正確執行結果: ``` MS Name/IP address Stratum Poll Reach LastRx Last sample =============================================================================== ^* 172.20.6.249 3 6 377 20 -64us[ -237us] +/- 19ms ``` ### 2.16. 在其他所有 K8s 節點重複執行所有步驟 2.2. ~ 2.16. --- ## 3. 透過 kubeadm 開始建立 HA K8s ### 3.1. 設定 kube-vip (單台 Control-plane 可跳過此步驟) 1. 
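
Before copying the kube-vip static pod manifest in the next steps, it is worth making sure nothing on the network already answers on the planned VIP (`172.20.6.110` in this walkthrough); a duplicate address would make the control-plane endpoint unstable. A quick check:

```
# If something replies, the VIP is already taken and kube-vip must not use it.
ping -c 2 -W 1 172.20.6.110 && echo "VIP already in use" || echo "VIP looks free"
```
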
以可以執行 sudo 的 user 透過 ssh 連線至第一台 Control-plane node ``` ssh sudo_user@ipaddderss ``` 2. 將 kube-vip static pod 拷貝到 `/etc/kubernetes/manifests/` ``` sudo cp "$HOME"/k8s/addon/kube-vip/ha-control-plane/kube-vip-static-pod.yaml /etc/kubernetes/manifests/kube-vip.yaml ``` 2. 修改 kube-vip 的 Service Account,只需在第一台 Control-plane 需要修改 ``` sudo sed -i 's|path: /etc/kubernetes/admin.conf|path: /etc/kubernetes/super-admin.conf|g' \ /etc/kubernetes/manifests/kube-vip.yaml ``` 3. 設定 harbor fqdn ``` HARBOR_FQDN=harbor.example.com ``` 4. 修改 kube-vip 的 image repo ``` sudo sed -i "s|image: ghcr.io|image: ${HARBOR_FQDN}/library|g" \ /etc/kubernetes/manifests/kube-vip.yaml ``` ``` $ sudo nano /etc/kubernetes/manifests/kube-vip.yaml ...... spec: containers: ...... - name: vip_interface value: ens18 - name: vip_subnet value: "24" ...... - name: address value: 10.10.7.31 ``` ### 3.2. 編輯 kubeadm 初始化設定檔 1. 設定 kubeadm 初始化設定檔 ``` nano "$HOME"/k8s/config/kubeadm-init.yaml ``` > 注意每個環境需設定以下項目: > * `localAPIEndpoint.advertiseAddress`: API Server 宣告(Advertise)給叢集其他成員的 IP 位址。 > * `nodeRegistration.criSocket`: Kubelet 應使用的 CRI socket 路徑。 > * `nodeRegistration.name`: 此節點在 Kubernetes 叢集中的唯一識別名稱 (通常使用主機名稱)。 > * `clusterName`: K8s 叢集的名稱。 > * `controllerManager.extraVolumes`: 掛載額外的磁碟區 (Volume) 到 `kube-controller-manager` Pod 中。 > * `imageRepository`: 用於拉取 Kubernetes 核心組件 (如 API Server, etcd) images 的 Registry 位址。 > * `kubernetesVersion`: 要部署的 Kubernetes 確切版本。 > * `networking.serviceSubnet`: 分配給 Kubernetes Services 使用的虛擬 IP 位址範圍 (CIDR)。 > * `networking.podSubnet`: 分配給叢集中所有 Pods 使用的 IP 位址範圍 (CIDR)。 > * `proxy.disabled`: 停用節點上的 `kube-proxy` 組件 。 > * `controlPlaneEndpoint`: HA 架構的 K8s LoadBalance IP,供所有節點和使用者存取 Kubernetes API Server 的 IP 或 FQDN。 > * `imageMaximumGCAge`: 節點上未使用 images 在被垃圾回收 (GC) 前可保留的 **最長** 存留時間。 > * `imageMinimumGCAge`: 節點上未使用 images 在被垃圾回收 (GC) 前必須保留的 **最短** 存留時間。 > * `shutdownGracePeriod`: Kubelet 執行正常關機程序 (Node Shutdown) 的總寬限時間。 > * `shutdownGracePeriodCriticalPods`: 在正常關機期間,Kubelet 等待「關鍵 Pods (Critical Pods)」優雅終止的專用寬限時間。 > * `systemReserved.memory`: 為 OS 系統守護進程 (如 systemd, sshd) 保留的記憶體量,Kubelet 不會使用這部分。 > * `kubeReserved.memory`: 為 Kubernetes 系統組件 (如 kubelet, 容器執行階段) 保留的記憶體量,Pod 無法使用這部分。 檔案內容如下 : ``` apiVersion: kubeadm.k8s.io/v1beta4 kind: InitConfiguration localAPIEndpoint: advertiseAddress: 172.20.6.111 bindPort: 6443 nodeRegistration: criSocket: unix:///var/run/crio/crio.sock imagePullPolicy: IfNotPresent imagePullSerial: true name: topgun-c1.kubeantony.com taints: null timeouts: controlPlaneComponentHealthCheck: 4m0s discovery: 5m0s etcdAPICall: 2m0s kubeletHealthCheck: 4m0s kubernetesAPICall: 1m0s tlsBootstrap: 5m0s upgradeManifests: 5m0s --- apiServer: certSANs: - 127.0.0.1 apiVersion: kubeadm.k8s.io/v1beta4 caCertificateValidityPeriod: 87600h0m0s certificateValidityPeriod: 8760h0m0s certificatesDir: /etc/kubernetes/pki clusterName: topgun controllerManager: extraArgs: - name: bind-address value: "0.0.0.0" - name: secure-port value: "10257" extraVolumes: - name: tz-config hostPath: /etc/localtime mountPath: /etc/localtime readOnly: true dns: imageRepository: harbor.example.com/library/coredns encryptionAlgorithm: RSA-2048 etcd: local: dataDir: /var/lib/etcd extraArgs: - name: listen-metrics-urls value: http://0.0.0.0:2381 imageRepository: harbor.example.com/library kind: ClusterConfiguration kubernetesVersion: 1.34.1 networking: dnsDomain: cluster.local serviceSubnet: 10.96.0.0/16 podSubnet: 10.244.0.0/16 proxy: disabled: true scheduler: extraArgs: - name: bind-address value: "0.0.0.0" - name: 
secure-port value: "10259" controlPlaneEndpoint: 172.20.6.110:6443 --- apiVersion: kubelet.config.k8s.io/v1beta1 authentication: anonymous: enabled: false webhook: cacheTTL: 0s enabled: true x509: clientCAFile: /etc/kubernetes/pki/ca.crt authorization: mode: Webhook webhook: cacheAuthorizedTTL: 0s cacheUnauthorizedTTL: 0s cgroupDriver: systemd clusterDNS: - 10.96.0.10 clusterDomain: cluster.local containerRuntimeEndpoint: "" cpuManagerReconcilePeriod: 0s crashLoopBackOff: {} evictionPressureTransitionPeriod: 0s fileCheckFrequency: 0s healthzBindAddress: 127.0.0.1 healthzPort: 10248 httpCheckFrequency: 0s imageMaximumGCAge: "168h" imageMinimumGCAge: "2m0s" kind: KubeletConfiguration logging: flushFrequency: 0 options: json: infoBufferSize: "0" text: infoBufferSize: "0" verbosity: 0 memorySwap: {} nodeStatusReportFrequency: 0s nodeStatusUpdateFrequency: 0s rotateCertificates: true runtimeRequestTimeout: 0s shutdownGracePeriod: 30s shutdownGracePeriodCriticalPods: 10s staticPodPath: /etc/kubernetes/manifests streamingConnectionIdleTimeout: 0s syncFrequency: 0s volumeStatsAggPeriod: 0s maxPods: 110 systemReserved: memory: "1Gi" kubeReserved: memory: "2Gi" ``` ### 3.3. 初始化第一台 Kubernetes control plane 1. Dry run set up the Kubernetes control plane,用以檢視是否設定檔有誤 ``` sudo kubeadm init \ --dry-run \ --upload-certs \ --config="$HOME"/k8s/config/kubeadm-init.yaml ``` 正確執行結果 : ``` ... Then you can join any number of worker nodes by running the following on each as root: kubeadm join 172.20.6.110:6443 --token 0n2iwr.ugdsni61mrtuqv4l \ --discovery-token-ca-cert-hash sha256:0e9dd65a4b3ba662a1a9b9f448cc1a632b808f863dd892d46fa4005b731da5bb ``` 2. Set up the Kubernetes control plane ``` sudo kubeadm init \ --upload-certs \ --config="$HOME"/k8s/config/kubeadm-init.yaml ``` 正確執行結果 : ``` W1018 22:31:59.203382 8373 initconfiguration.go:332] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeadm.k8s.io", Version:"v1beta4", Kind:"ClusterConfiguration"}: strict decoding error: unknown field "apiServer.timeoutForControlPlane" [init] Using Kubernetes version: v1.34.1 [preflight] Running pre-flight checks [WARNING SystemVerification]: kernel release 5.14.0-570.52.1.el9_6.x86_64 is unsupported. Supported LTS versions from the 5.x series are 5.4, 5.10 and 5.15. Any 6.x version is also supported. 
For cgroups v2 support, the recommended version is 5.10 or newer [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action beforehand using 'kubeadm config images pull' [certs] Using certificateDir folder "/etc/kubernetes/pki" [certs] Generating "ca" certificate and key [certs] Generating "apiserver" certificate and key [certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local topgun-c1.kubeantony.com] and IPs [10.96.0.1 172.20.6.111 172.20.6.110 127.0.0.1] [certs] Generating "apiserver-kubelet-client" certificate and key [certs] Generating "front-proxy-ca" certificate and key [certs] Generating "front-proxy-client" certificate and key [certs] Generating "etcd/ca" certificate and key [certs] Generating "etcd/server" certificate and key [certs] etcd/server serving cert is signed for DNS names [localhost topgun-c1.kubeantony.com] and IPs [172.20.6.111 127.0.0.1 ::1] [certs] Generating "etcd/peer" certificate and key [certs] etcd/peer serving cert is signed for DNS names [localhost topgun-c1.kubeantony.com] and IPs [172.20.6.111 127.0.0.1 ::1] [certs] Generating "etcd/healthcheck-client" certificate and key [certs] Generating "apiserver-etcd-client" certificate and key [certs] Generating "sa" key and public key [kubeconfig] Using kubeconfig folder "/etc/kubernetes" [kubeconfig] Writing "admin.conf" kubeconfig file [kubeconfig] Writing "super-admin.conf" kubeconfig file [kubeconfig] Writing "kubelet.conf" kubeconfig file [kubeconfig] Writing "controller-manager.conf" kubeconfig file [kubeconfig] Writing "scheduler.conf" kubeconfig file [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" [control-plane] Using manifest folder "/etc/kubernetes/manifests" [control-plane] Creating static Pod manifest for "kube-apiserver" [control-plane] Creating static Pod manifest for "kube-controller-manager" [control-plane] Creating static Pod manifest for "kube-scheduler" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/instance-config.yaml" [patches] Applied patch of type "application/strategic-merge-patch+json" to target "kubeletconfiguration" [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Starting the kubelet [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests" [kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s [kubelet-check] The kubelet is healthy after 501.238442ms [control-plane-check] Waiting for healthy control plane components. 
This can take up to 4m0s [control-plane-check] Checking kube-apiserver at https://172.20.6.111:6443/livez [control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz [control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez [control-plane-check] kube-controller-manager is healthy after 1.860780055s [control-plane-check] kube-scheduler is healthy after 2.364803775s [control-plane-check] kube-apiserver is healthy after 4.00165287s [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace [kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace [upload-certs] Using certificate key: 681423a365cee2a5ff5b612831e42a97ff941c0c6ad0595298a6ff603b0fae5d [mark-control-plane] Marking the node topgun-c1.kubeantony.com as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] [mark-control-plane] Marking the node topgun-c1.kubeantony.com as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule] [bootstrap-token] Using token: z6kmo1.qp9d9u4xcb85xmjl [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key [addons] Applied essential addon: CoreDNS Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Alternatively, if you are the root user, you can run: export KUBECONFIG=/etc/kubernetes/admin.conf You should now deploy a pod network to the cluster. Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of control-plane nodes running the following command on each as root: kubeadm join 172.20.6.110:6443 --token z6kmo1.qp9d9u4xcb85xmjl \ --discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50 \ --control-plane --certificate-key 681423a365cee2a5ff5b612831e42a97ff941c0c6ad0595298a6ff603b0fae5d Please note that the certificate-key gives access to cluster sensitive data, keep it secret! As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward. 
Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.20.6.110:6443 --token z6kmo1.qp9d9u4xcb85xmjl \
	--discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50
```
### 3.4. Configure kubeconfig
1. Set up the kubeconfig
```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
printf "\nsource <(kubectl completion bash)\nalias k=kubectl\ncomplete -o default -F __start_kubectl k\n" >> ~/.bashrc
source ~/.bashrc
```
2. Once the first control-plane node is up, switch the `kube-vip` static pod back to the regular admin credentials
```
sudo sed -i 's|path: /etc/kubernetes/super-admin.conf|path: /etc/kubernetes/admin.conf|g' /etc/kubernetes/manifests/kube-vip.yaml
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```
### 3.5. Install the Cilium CNI
1. Install the helm CLI
```
cd ${HOME}/k8s/addon/cni/cilium/
tar -zxvf helm-v3.19.0-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/
```
2. Install the cilium and hubble CLIs
```
tar -zxvf cilium-linux-amd64.tar.gz -C /tmp
tar -zxvf hubble-linux-amd64.tar.gz -C /tmp
sudo mv /tmp/cilium /usr/bin/
sudo mv /tmp/hubble /usr/bin/
```
3. Edit the Cilium values file
```
nano custom-values.yaml
```
File content:
```
operator:
  image:
    repository: harbor.example.com/library/cilium/operator
    tag: v1.18.2
    useDigest: false
image:
  repository: harbor.example.com/library/cilium/cilium
  tag: v1.18.2
  useDigest: false
envoy:
  image:
    repository: harbor.example.com/library/cilium/cilium-envoy
    tag: v1.34.7-1757592137-1a52bb680a956879722f48c591a2ca90f7791324
    useDigest: false
bpf:
  masquerade: true
hubble:
  relay:
    enabled: true
    image:
      repository: harbor.example.com/library/cilium/hubble-relay
      tag: v1.18.2
      useDigest: false
  ui:
    enabled: true
    service:
      type: NodePort
    frontend:
      image:
        repository: harbor.example.com/library/cilium/hubble-ui
        tag: v0.13.3
        useDigest: false
    backend:
      image:
        repository: harbor.example.com/library/cilium/hubble-ui-backend
        tag: v0.13.3
        useDigest: false
ipam:
  mode: kubernetes
ipv4NativeRoutingCIDR: 10.244.0.0/16
k8sServiceHost: 172.20.6.110
k8sServicePort: 6443
kubeProxyReplacement: true
autoDirectNodeRoutes: true
routingMode: native
devices: ens18
```
4. Install Cilium
```
helm -n kube-system install cilium ./cilium-1.18.2.tgz -f custom-values.yaml
```
5. Confirm that Cilium installed successfully (an optional Hubble check is sketched after this step)
```
cilium status
```
Result:
```
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

DaemonSet              cilium                   Desired: 6, Ready: 6/6, Available: 6/6
DaemonSet              cilium-envoy             Desired: 6, Ready: 6/6, Available: 6/6
Deployment             cilium-operator          Desired: 2, Ready: 2/2, Available: 2/2
Deployment             hubble-relay             Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui                Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium                   Running: 6
                       cilium-envoy             Running: 6
                       cilium-operator          Running: 2
                       clustermesh-apiserver
                       hubble-relay             Running: 1
                       hubble-ui                Running: 1
Cluster Pods:          9/9 managed by Cilium
Helm chart version:
Image versions         cilium             harbor.example.com/library/cilium/cilium:v1.18.2: 6
                       cilium-envoy       harbor.example.com/library/cilium/cilium-envoy:v1.34.7-1757592137-1a52bb680a956879722f48c591a2ca90f7791324: 6
                       cilium-operator    harbor.example.com/library/cilium/operator-generic:v1.18.2: 2
                       hubble-relay       harbor.example.com/library/cilium/hubble-relay:v1.18.2: 1
                       hubble-ui          harbor.example.com/library/cilium/hubble-ui-backend:v0.13.3: 1
                       hubble-ui          harbor.example.com/library/cilium/hubble-ui:v0.13.3: 1
```
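Because `custom-values.yaml` enables Hubble (`hubble.relay.enabled: true` and a NodePort UI), flow visibility can be sanity-checked right after the install. The following is a minimal, optional sketch, assuming the `cilium` and `hubble` CLIs installed in steps 1 and 2; it reaches the relay through a temporary local port-forward (default local port 4245) rather than the NodePort.

```
# Open a local tunnel to hubble-relay (listens on 127.0.0.1:4245 by default)
cilium hubble port-forward &

# Relay health and current flow counters
hubble status

# Show the 20 most recent flows observed in the cluster
hubble observe --last 20

# Stop the temporary port-forward when done
kill %1
```

If flows are returned, both the datapath and the hubble-relay image pulled from Harbor are working; the NodePort assigned to the Hubble UI Service (typically `hubble-ui` in `kube-system`) can be looked up with `kubectl -n kube-system get svc` for browser access.

6. 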
確認 cilium 有啟用 eBPF host routing ``` cilium_pod=$(kubectl -n kube-system get pods --field-selector status.phase=Running -l k8s-app=cilium -o name | tail -n 1 | cut -d "/" -f 2) kubectl -n kube-system exec -it ${cilium_pod} -- cilium-dbg status ``` 執行結果: ``` KubeProxyReplacement: True [ens18 172.20.6.115 172.20.6.117 fe80::be24:11ff:fe2e:cc62 (Direct Routing)] Host firewall: Disabled SRv6: Disabled CNI Chaining: none CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist Cilium: Ok 1.18.2 (v1.18.2-5bd307a8) NodeMonitor: Listening for events on 2 CPUs with 64x4096 of shared memory Cilium health daemon: Ok IPAM: IPv4: 7/254 allocated from 10.244.5.0/24, IPv4 BIG TCP: Disabled IPv6 BIG TCP: Disabled BandwidthManager: Disabled Routing: Network: Native Host: BPF Attach Mode: TCX Device Mode: veth Masquerading: BPF [ens18] 10.244.0.0/16 [IPv4: Enabled, IPv6: Disabled] Controller Status: 47/47 healthy Proxy Status: OK, ip 10.244.5.177, 0 redirects active on ports 10000-20000, Envoy: external Global Identity Range: min 256, max 65535 Hubble: Ok Current/Max Flows: 2491/4095 (60.83%), Flows/s: 3.51 Metrics: Disabled Encryption: Disabled Cluster health: 6/6 reachable (2025-10-23T08:51:12Z) (Probe interval: 2m33.896961447s) Name IP Node Endpoints Modules Health: Stopped(12) Degraded(0) OK(78) ``` ### 3.6. 新增其他 Control-plane 節點進 K8s 叢集 (單台 Control-plane 可跳過此步驟) 1. 以可以執行 sudo 的 user 透過 ssh 連線至其他 Control-plane node ``` ssh sudo_user@ipaddderss ``` 2. 將 kube-vip static pod 拷貝到 `/etc/kubernetes/manifests/` ``` sudo cp "$HOME"/k8s/addon/kube-vip/ha-control-plane/kube-vip-static-pod.yaml /etc/kubernetes/manifests/kube-vip.yaml ``` 3. 設定 harbor fqdn ``` HARBOR_FQDN=harbor.example.com ``` 4. 修改 kube-vip 的 image repo ``` sudo sed -i "s|image: ghcr.io|image: ${HARBOR_FQDN}/library|g" \ /etc/kubernetes/manifests/kube-vip.yaml ``` 5. 獲得 Control-plane node Kubeadm Join Command 請在第一台 Control-plane node 執行以下命令 ``` kubeadm_init_config_file="${HOME}/k8s/config/kubeadm-init.yaml" certificate_key=$(sudo kubeadm init phase upload-certs --upload-certs --config ${kubeadm_init_config_file} | tail -n 1) join_cmd=$(sudo kubeadm token create --print-join-command) join_cp_cmd="sudo $join_cmd --control-plane --certificate-key $certificate_key" echo $join_cp_cmd ``` 執行結果: ``` sudo kubeadm join 172.20.6.110:6443 --token rl5cdl.8552g1j3ezrvxk21 --discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50 --control-plane --certificate-key d52af404ed5765485f28e0c5e7e886ae244a8861a78dc3b1ce1a0487194e8a1a ``` 6. 將 Control-plane 節點加入 K8s 叢集 請在要加入 K8s 的 Control-plane node 執行以下命令 ``` sudo kubeadm join 172.20.6.110:6443 --token rl5cdl.8552g1j3ezrvxk21 --discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50 --control-plane --certificate-key d52af404ed5765485f28e0c5e7e886ae244a8861a78dc3b1ce1a0487194e8a1a ``` > join node 使用的 token 會在 24 小時後自動過期 7. 重複以上 1. ~ 6. 步驟將其他 Control-plane node 加到 K8s Cluster ### 3.7. 新增 Worker 節點進 K8s 叢集 1. **在第一台 Controlplane** 將 join worker node 指令 print 出來 ``` sudo kubeadm token create --print-join-command ``` 執行結果: ``` kubeadm join 172.20.6.110:6443 --token inyh77.qp4dqe6wto9uhk8m --discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50 ``` 2. 以可以執行 sudo 的 user 透過 ssh 連線至 worker node ``` ssh sudo_user@ipaddderss ``` 3. 
將 worker node 加入 K8s 叢集 ``` sudo kubeadm join 172.20.6.110:6443 \ --token inyh77.qp4dqe6wto9uhk8m \ --discovery-token-ca-cert-hash sha256:4ea08e749828d599c6e3aefb69be7bbfbfe47bdab702907464fe033f22547e50 ``` 4. ssh 連線到其他 worker node,重複執行步驟 3,將其他 worker node 也加入 K8s 叢集 5. 回到第一台 Control-plane 執行以下命令,將所有 worker nodes 都貼 label ``` kubectl label node topgun-w1.kubeantony.com node-role.kubernetes.io/worker= kubectl label node topgun-w2.kubeantony.com node-role.kubernetes.io/worker= kubectl label node topgun-w3.kubeantony.com node-role.kubernetes.io/worker= ``` ### 3.8. 檢視 K8s 狀態是否正常 1. 檢視 K8s node 狀態 ``` kubectl get nodes ``` 執行結果: ``` NAME STATUS ROLES AGE VERSION topgun-c1.kubeantony.com Ready control-plane 3d15h v1.34.1 topgun-c2.kubeantony.com Ready control-plane 26h v1.34.1 topgun-c3.kubeantony.com Ready control-plane 3h24m v1.34.1 topgun-w1.kubeantony.com Ready worker 3d6h v1.34.1 topgun-w2.kubeantony.com Ready worker 17m v1.34.1 topgun-w3.kubeantony.com Ready worker 16m v1.34.1 ``` 2. 檢視 kube-system namespace 底下所有 pods 狀態 ``` kubectl -n kube-system get pods ``` 執行結果: ``` NAME READY STATUS RESTARTS AGE cilium-b74sh 1/1 Running 0 26h cilium-btg2h 1/1 Running 0 26h cilium-envoy-24skx 1/1 Running 0 26h cilium-envoy-llwv7 1/1 Running 0 17m cilium-envoy-nb5l2 1/1 Running 0 3h24m cilium-envoy-nvkz9 1/1 Running 0 26h cilium-envoy-vz6kj 1/1 Running 0 16m cilium-envoy-wm4xw 1/1 Running 0 26h cilium-gp9v9 1/1 Running 0 17m cilium-operator-794b864946-8j8j5 1/1 Running 1 (10h ago) 26h cilium-operator-794b864946-9whct 1/1 Running 0 26h cilium-vqp5v 1/1 Running 0 26h cilium-xgcch 1/1 Running 0 16m cilium-zj2xj 1/1 Running 0 3h24m coredns-f74588c75-86h95 1/1 Running 0 26h coredns-f74588c75-9992t 1/1 Running 0 26h etcd-topgun-c1.kubeantony.com 1/1 Running 0 3d15h etcd-topgun-c2.kubeantony.com 1/1 Running 0 26h etcd-topgun-c3.kubeantony.com 1/1 Running 0 3h24m hubble-relay-7d6865bf7b-7wvhl 1/1 Running 0 26h hubble-ui-c6998f8ff-6m4cn 2/2 Running 0 26h kube-apiserver-topgun-c1.kubeantony.com 1/1 Running 0 3d15h kube-apiserver-topgun-c2.kubeantony.com 1/1 Running 0 26h kube-apiserver-topgun-c3.kubeantony.com 1/1 Running 0 3h24m kube-controller-manager-topgun-c1.kubeantony.com 1/1 Running 1 (10h ago) 3d15h kube-controller-manager-topgun-c2.kubeantony.com 1/1 Running 0 26h kube-controller-manager-topgun-c3.kubeantony.com 1/1 Running 0 3h24m kube-scheduler-topgun-c1.kubeantony.com 1/1 Running 1 (10h ago) 3d15h kube-scheduler-topgun-c2.kubeantony.com 1/1 Running 0 26h kube-scheduler-topgun-c3.kubeantony.com 1/1 Running 0 3h24m kube-vip-topgun-c1.kubeantony.com 1/1 Running 1 (10h ago) 3d15h kube-vip-topgun-c3.kubeantony.com 1/1 Running 0 3h24m ``` ### 3.9. 檢測 K8s 網路延遲和吞吐量 1. 主要測試以下項目 - **吞吐量 - TCP Throughput (TCP_STREAM)**:有助於了解在特定配置下可達到的最大吞吐量。 - 此測試代表大量資料傳輸的應用程式,例如串流服務或執行資料上傳/下載的服務。 - **請求/回應速率 - Request/Response Rate (TCP_RR)**:主要測量處理單一網路封包來回轉發的延遲與效率。這項基準測試將在線路上產生每秒可能的最大封包數(packets per second),並著重測試處理單一網路封包的成本(資源消耗)。 - 此測試代表了那些維持持續性連線(persistent connections)、並與其他服務交換請求/回應類型互動的服務。這對於使用 REST 或 gRPC API 的服務來說很常見。 - **連線速率 - Connection Rate (TCP_CRR)**:它類似於 Request/Response Rate 測試,但會為每一次來回建立一個新的 TCP 連線。 - 此測試代表暴露於公網的服務,它會接收來自許多客戶端的連線。L4 代理或對外部端點開啟許多連線的服務,都是很好的例子。 1. 
透過 cilium 指令檢測 K8s 網路頻寬 ``` cilium connectivity perf \ --crr \ --host-to-pod \ --pod-to-host \ --performance-image quay.io/cilium/network-perf:1751527436-c2462ae \ --node-selector-client kubernetes.io/hostname=topgun-w1.kubeantony.com \ --node-selector-server kubernetes.io/hostname=topgun-w2.kubeantony.com ``` 執行結果如下: ``` 🏃[cilium-test-1] Running 1 tests ... [=] [cilium-test-1] Test [network-perf] [1/1] ................................ 🔥 Network Performance Test Summary [cilium-test-1]: -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 📋 Scenario | Node | Test | Duration | Min | Mean | Max | P50 | P90 | P99 | Transaction rate OP/s -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 📋 pod-to-pod | same-node | TCP_CRR | 10s | 50µs | 84.48µs | 11.99ms | 77µs | 96µs | 144µs | 11787.95 📋 pod-to-pod | same-node | TCP_RR | 10s | 11µs | 30.74µs | 4.659ms | 31µs | 35µs | 46µs | 32197.86 📋 pod-to-host | same-node | TCP_CRR | 10s | 61µs | 98.57µs | 21.094ms | 90µs | 109µs | 155µs | 10104.54 📋 pod-to-host | same-node | TCP_RR | 10s | 12µs | 31.86µs | 1.091ms | 32µs | 37µs | 49µs | 31071.90 📋 host-to-pod | same-node | TCP_CRR | 10s | 60µs | 100.33µs | 14.059ms | 90µs | 109µs | 160µs | 9930.36 📋 host-to-pod | same-node | TCP_RR | 10s | 12µs | 31.94µs | 892µs | 32µs | 36µs | 48µs | 30996.63 📋 host-to-host | same-node | TCP_CRR | 10s | 52µs | 76.41µs | 1.924ms | 73µs | 89µs | 127µs | 13016.02 📋 host-to-host | same-node | TCP_RR | 10s | 12µs | 31.89µs | 1.281ms | 32µs | 36µs | 48µs | 31043.05 📋 pod-to-pod | other-node | TCP_CRR | 10s | 242µs | 330.04µs | 13.257ms | 301µs | 342µs | 790µs | 3027.08 📋 pod-to-pod | other-node | TCP_RR | 10s | 87µs | 112.64µs | 4.085ms | 112µs | 124µs | 146µs | 8867.04 📋 pod-to-host | other-node | TCP_CRR | 10s | 243µs | 321.36µs | 15.197ms | 302µs | 340µs | 483µs | 3108.95 📋 pod-to-host | other-node | TCP_RR | 10s | 89µs | 113.52µs | 3.172ms | 113µs | 125µs | 147µs | 8797.96 📋 host-to-pod | other-node | TCP_CRR | 10s | 230µs | 311.48µs | 13.035ms | 287µs | 334µs | 635µs | 3207.10 📋 host-to-pod | other-node | TCP_RR | 10s | 88µs | 115.57µs | 4.126ms | 114µs | 128µs | 153µs | 8641.95 📋 host-to-host | other-node | TCP_CRR | 10s | 229µs | 303.13µs | 21.099ms | 285µs | 322µs | 417µs | 2739.19 📋 host-to-host | other-node | TCP_RR | 10s | 89µs | 114.78µs | 4.428ms | 114µs | 126µs | 148µs | 8702.07 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------- 📋 Scenario | Node | Test | Duration | Throughput Mb/s ---------------------------------------------------------------------------------------- 📋 pod-to-pod | same-node | TCP_STREAM | 10s | 18952.05 📋 pod-to-pod | same-node | TCP_STREAM_MULTI | 10s | 45634.84 📋 pod-to-host | same-node | TCP_STREAM | 10s | 16338.34 📋 pod-to-host | same-node | TCP_STREAM_MULTI | 10s | 41785.81 📋 host-to-pod | same-node | TCP_STREAM | 10s | 15643.64 📋 host-to-pod | same-node | TCP_STREAM_MULTI | 10s | 40812.09 📋 host-to-host | same-node | TCP_STREAM | 10s | 36117.05 📋 host-to-host | same-node | TCP_STREAM_MULTI | 10s | 82274.99 📋 
pod-to-pod | other-node | TCP_STREAM | 10s | 9407.21 📋 pod-to-pod | other-node | TCP_STREAM_MULTI | 10s | 9417.40 📋 pod-to-host | other-node | TCP_STREAM | 10s | 9411.27 📋 pod-to-host | other-node | TCP_STREAM_MULTI | 10s | 9411.89 📋 host-to-pod | other-node | TCP_STREAM | 10s | 9409.00 📋 host-to-pod | other-node | TCP_STREAM_MULTI | 10s | 9398.15 📋 host-to-host | other-node | TCP_STREAM | 10s | 9408.12 📋 host-to-host | other-node | TCP_STREAM_MULTI | 10s | 9416.04 ---------------------------------------------------------------------------------------- ✅ [cilium-test-1] All 1 tests (32 actions) successful, 0 tests skipped, 0 scenarios skipped. ``` 未開 ebpf host routing 的數值: ``` ... 🏃[cilium-test-1] Running 1 tests ... [=] [cilium-test-1] Test [network-perf] [1/1] ................................ 🔥 Network Performance Test Summary [cilium-test-1]: -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 📋 Scenario | Node | Test | Duration | Min | Mean | Max | P50 | P90 | P99 | Transaction rate OP/s -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 📋 pod-to-pod | same-node | TCP_CRR | 10s | 50µs | 82.85µs | 10.656ms | 78µs | 94µs | 124µs | 12017.63 📋 pod-to-pod | same-node | TCP_RR | 10s | 10µs | 30.71µs | 828µs | 30µs | 35µs | 46µs | 32231.92 📋 pod-to-host | same-node | TCP_CRR | 10s | 59µs | 94.56µs | 12.326ms | 91µs | 108µs | 137µs | 10532.25 📋 pod-to-host | same-node | TCP_RR | 10s | 12µs | 31.94µs | 2.572ms | 32µs | 37µs | 48µs | 30996.44 📋 host-to-pod | same-node | TCP_CRR | 10s | 56µs | 93.13µs | 15.588ms | 90µs | 107µs | 136µs | 10694.60 📋 host-to-pod | same-node | TCP_RR | 10s | 12µs | 31.27µs | 1.842ms | 31µs | 36µs | 48µs | 31651.76 📋 host-to-host | same-node | TCP_CRR | 10s | 53µs | 75µs | 1.479ms | 72µs | 88µs | 123µs | 13261.02 📋 host-to-host | same-node | TCP_RR | 10s | 12µs | 31.87µs | 583µs | 32µs | 36µs | 47µs | 31059.78 📋 pod-to-pod | other-node | TCP_CRR | 10s | 278µs | 368.98µs | 9.809ms | 339µs | 382µs | 965µs | 2707.98 📋 pod-to-pod | other-node | TCP_RR | 10s | 95µs | 123.71µs | 4.119ms | 122µs | 137µs | 163µs | 8073.77 📋 pod-to-host | other-node | TCP_CRR | 10s | 248µs | 371.53µs | 1.052362s | 298µs | 335µs | 500µs | 2006.19 📋 pod-to-host | other-node | TCP_RR | 10s | 90µs | 114.49µs | 2.982ms | 114µs | 126µs | 147µs | 8723.87 📋 host-to-pod | other-node | TCP_CRR | 10s | 279µs | 366.76µs | 12.3ms | 338µs | 375µs | 850µs | 2724.34 📋 host-to-pod | other-node | TCP_RR | 10s | 105µs | 135.79µs | 15.723ms | 133µs | 148µs | 176µs | 7355.48 📋 host-to-host | other-node | TCP_CRR | 10s | 229µs | 301.42µs | 15.967ms | 284µs | 314µs | 415µs | 2739.19 📋 host-to-host | other-node | TCP_RR | 10s | 88µs | 114.62µs | 4.249ms | 114µs | 127µs | 148µs | 8713.25 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------- 📋 Scenario | Node | Test | Duration | Throughput Mb/s ---------------------------------------------------------------------------------------- 📋 pod-to-pod | same-node | TCP_STREAM | 10s | 19819.33 📋 pod-to-pod | same-node | TCP_STREAM_MULTI | 10s | 
44767.93 📋 pod-to-host | same-node | TCP_STREAM | 10s | 17519.15 📋 pod-to-host | same-node | TCP_STREAM_MULTI | 10s | 41288.42 📋 host-to-pod | same-node | TCP_STREAM | 10s | 15667.62 📋 host-to-pod | same-node | TCP_STREAM_MULTI | 10s | 40686.99 📋 host-to-host | same-node | TCP_STREAM | 10s | 36663.82 📋 host-to-host | same-node | TCP_STREAM_MULTI | 10s | 83454.15 📋 pod-to-pod | other-node | TCP_STREAM | 10s | 5515.69 📋 pod-to-pod | other-node | TCP_STREAM_MULTI | 10s | 5462.09 📋 pod-to-host | other-node | TCP_STREAM | 10s | 9393.23 📋 pod-to-host | other-node | TCP_STREAM_MULTI | 10s | 9399.02 📋 host-to-pod | other-node | TCP_STREAM | 10s | 5456.02 📋 host-to-pod | other-node | TCP_STREAM_MULTI | 10s | 5423.52 📋 host-to-host | other-node | TCP_STREAM | 10s | 9410.65 📋 host-to-host | other-node | TCP_STREAM_MULTI | 10s | 9416.39 ---------------------------------------------------------------------------------------- ✅ [cilium-test-1] All 1 tests (32 actions) successful, 0 tests skipped, 0 scenarios skipped. ``` - 通用欄位 * `Scenario` (情境): 描述流量的來源和目的地。 * `pod-to-pod`: 從一個 Pod 到另一個 Pod。 * `pod-to-host`: 從一個 Pod 到其運行的節點 (Host)。 * `host-to-pod`: 從一個節點 (Host) 到一個 Pod。 * `host-to-host`: 從一個節點 (Host) 到另一個節點 (Host)。 * **`Node` (節點):** 描述來源和目的地的相對位置。 * `same-node`: 來源和目的地在**同一個** Kubernetes 節點上。 * `other-node`: 來源和目的地在**不同的** Kubernetes 節點上。 - 表 1 這張表格專注於延遲(完成一個操作需要多長時間)和事務處理速率(每秒可以完成多少操作),主要用於 `TCP_CRR` 和 `TCP_RR` 測試。時間單位是微秒 (`µs`) 或毫秒 (`ms`)。 * `Min` (最小延遲): 在測試期間觀察到的最快的單次事務(請求/回應或連接/關閉)延遲。 * `Mean` (平均延遲): 所有事務延遲的算術平均值。 * `Max` (最大延遲): 在測試期間觀察到的最慢的單次事務延遲。高 `Max` 值可能表示網路抖動或偶發性的效能瓶頸。 * `P50` (50百分位延遲): 也稱為中位數。50% 的事務延遲低於此值。與 `Mean` 相比,`P50` 受極端異常值的影響較小。 * `P90` (90百分位延遲): 90% 的事務延遲低於此值。這是一個常見的服務水平指標 (SLI),用於衡量絕大多數用戶的體驗。 * `P99` (99百分位延遲): 99% 的事務延遲低於此值。這有助於識別「長尾延遲」(long-tail latency),即影響少數請求的嚴重效能問題。 * `Transaction rate OP/s` (每秒事務處理速率): 每秒操作數 (Operations Per Second)。 * 這個值越高越好,表示處理事務的效率越高。 - 表 2 這張表格專注於頻寬(每秒可以傳輸多少數據),主要用於 `TCP_STREAM` 和 `TCP_STREAM_MULTI` 測試。 * `Throughput Mb/s` (吞吐量 Mb/s): - 每秒百萬位元 (Megabits per second)。這個值顯示了在測試期間,來源和目的地之間數據傳輸的平均速率。值越高,表示網路頻寬越大。 * `TCP_STREAM`: - 使用單個 TCP 連接測量最大數據傳輸吞吐量(頻寬)。 * `TCP_STREAM_MULTI`: - 使用 32 個並行 TCP 連接測量最大數據傳輸吞吐量。這通常能更好地佔滿網路鏈路,顯示系統的總頻寬上限。 --- ## 4. 安裝 K8s Addon ### 4.1. 安裝 Metrics server 1. 切換工作目錄 ``` cd ${HOME}/k8s/addon/metrics/metrics-server ``` 2. 編輯 Kustomization File ``` nano kustomization.yaml ``` 檔案內容: ``` apiVersion: kustomize.config.k8s.io/v1beta1 kind: Kustomization resources: - ./components.yaml images: - name: registry.k8s.io/metrics-server/metrics-server newName: harbor.example.com/library/metrics-server/metrics-server newTag: v0.8.0 patches: - target: kind: Deployment name: metrics-server patch: |- - op: add path: /spec/template/spec/containers/0/args/- value: --kubelet-insecure-tls ``` 3. 驗證 yaml 設定檔是否有誤 ``` kubectl apply -k . 
--dry-run=server ``` 正確執行結果: ``` serviceaccount/metrics-server created (server dry run) clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created (server dry run) clusterrole.rbac.authorization.k8s.io/system:metrics-server created (server dry run) rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created (server dry run) clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created (server dry run) clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created (server dry run) service/metrics-server created (server dry run) deployment.apps/metrics-server created (server dry run) apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created (server dry run) ``` 4. 部署 Metrics Server ``` kubectl apply -k . ``` 5. 檢視 Metrics Server pods 運作狀態 ``` kubectl -n kube-system get pods -l k8s-app=metrics-server ``` 執行結果: ``` NAME READY STATUS RESTARTS AGE metrics-server-69779649f-frvqz 1/1 Running 0 43s ``` 6. 檢視 K8s 所有 nodes 的 cpu 記憶體使用狀況 ``` kubectl top nodes ``` 執行結果: ``` NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%) topgun-c1.kubeantony.com 41m 2% 1705Mi 37% topgun-c2.kubeantony.com 34m 1% 1175Mi 26% topgun-c3.kubeantony.com 28m 1% 1057Mi 23% topgun-w1.kubeantony.com 12m 0% 1066Mi 23% topgun-w2.kubeantony.com 10m 0% 947Mi 21% topgun-w3.kubeantony.com 9m 0% 778Mi 17% ``` ### 4.2. 安裝 kube-vip On-Premises Cloud Controller 1. 切換工作目錄 ``` cd ${HOME}/k8s/addon/kube-vip/k8s-lb-service ``` 2. 將可分配的 IP Range 寫入 configmap ``` kubectl -n kube-system create configmap kubevip \ --from-literal range-global=172.20.6.117-172.20.6.120 \ --dry-run=client -o yaml > kube-vip-cm.yaml ``` 3. 編輯 Kustomization File ``` nano kustomization.yaml ``` 檔案內容: ``` apiVersion: kustomize.config.k8s.io/v1beta1 kind: Kustomization resources: - ./kube-vip-rbac.yaml - ./kube-vip-ds.yaml - ./kube-vip-cloud-controller.yaml - ./kube-vip-cm.yaml images: - name: ghcr.io/kube-vip/kube-vip newName: harbor.example.com/library/kube-vip/kube-vip newTag: v1.0.1 - name: ghcr.io/kube-vip/kube-vip-cloud-provider newName: harbor.example.com/library/kube-vip/kube-vip-cloud-provider newTag: v0.0.12 patches: - target: kind: DaemonSet name: kube-vip-ds namespace: kube-system patch: |- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-vip-ds spec: template: spec: serviceAccountName: kube-vip nodeSelector: node-role.kubernetes.io/worker: "" containers: - name: kube-vip env: - name: vip_interface value: "ens18" - name: vip_subnet value: "16" volumeMounts: - mountPath: /etc/kubernetes/admin.conf $patch: delete volumes: - name: kubeconfig $patch: delete - target: kind: ConfigMap name: kubevip namespace: kube-system patch: |- - op: add path: /data/range-global value: "172.20.6.117-172.20.6.120" ``` 4. 驗證 yaml 設定檔是否有誤 ``` kubectl apply -k . 
--dry-run=server ``` 正確執行結果: ``` serviceaccount/kube-vip created (server dry run) serviceaccount/kube-vip-cloud-controller created (server dry run) clusterrole.rbac.authorization.k8s.io/system:kube-vip-cloud-controller-role created (server dry run) clusterrole.rbac.authorization.k8s.io/system:kube-vip-role created (server dry run) clusterrolebinding.rbac.authorization.k8s.io/system:kube-vip-binding created (server dry run) clusterrolebinding.rbac.authorization.k8s.io/system:kube-vip-cloud-controller-binding created (server dry run) configmap/kubevip created (server dry run) Warning: spec.template.spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution[1].preference.matchExpressions[0].key: node-role.kubernetes.io/master is use "node-role.kubernetes.io/control-plane" instead deployment.apps/kube-vip-cloud-provider created (server dry run) daemonset.apps/kube-vip-ds created (server dry run) ``` 5. 部署 kube-vip On-Premises Cloud Controller ``` kubectl apply -k . ``` 6. 檢視 kube-vip On-Premises Cloud Controller pods 運作狀態 ``` kubectl -n kube-system get pods -l component=kube-vip-cloud-provider ``` 執行結果: ``` NAME READY STATUS RESTARTS AGE kube-vip-cloud-provider-d9b9446b9-6ll6t 1/1 Running 0 45s ``` 7. 檢視 kube-vip ds pods 運作狀態,並確認 pods 只跑在 worker nodes 上 ``` kubectl -n kube-system get pods -l app.kubernetes.io/name=kube-vip-ds -o wide ``` 執行結果: ``` NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-vip-ds-kglmc 1/1 Running 0 59s 172.20.6.114 topgun-w1.kubeantony.com <none> <none> kube-vip-ds-krst5 1/1 Running 0 59s 172.20.6.115 topgun-w2.kubeantony.com <none> <none> kube-vip-ds-r8cmc 1/1 Running 0 59s 172.20.6.116 topgun-w3.kubeantony.com <none> <none> ``` ### 4.3. 安裝 Ingress NGINX Controller 1. 切換工作目錄 ``` cd ${HOME}/k8s/addon/ingress/ingress-nginx ``` 2. 編輯 Kustomization File ``` nano kustomization.yaml ``` 檔案內容如下: ``` apiVersion: kustomize.config.k8s.io/v1beta1 kind: Kustomization resources: #- ./deployment.yaml - ./daemonSet.yaml images: - name: registry.k8s.io/ingress-nginx/controller newName: harbor.example.com/library/ingress-nginx/controller newTag: v1.13.3 - name: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.6.3 newName: harbor.example.com/library/ingress-nginx/kube-webhook-certgen newTag: v1.6.3 patches: - target: kind: DaemonSet name: ingress-nginx-controller patch: |- apiVersion: apps/v1 kind: DaemonSet metadata: name: ingress-nginx-controller spec: template: spec: nodeSelector: node-role.kubernetes.io/worker: "" ``` 3. 驗證 yaml 設定檔是否有誤 ``` kubectl apply -k . 
--dry-run=server
```
Expected result:
```
namespace/ingress-nginx created (server dry run)
clusterrole.rbac.authorization.k8s.io/ingress-nginx created (server dry run)
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created (server dry run)
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created (server dry run)
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created (server dry run)
ingressclass.networking.k8s.io/nginx created (server dry run)
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created (server dry run)
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
Error from server (NotFound): error when creating ".": namespaces "ingress-nginx" not found
```
> The `NotFound` errors are expected at this point: a server-side dry run only validates the `ingress-nginx` namespace without actually creating it, so the namespaced objects cannot be resolved yet. They apply cleanly in the next step.
5. Deploy the Ingress NGINX Controller
```
kubectl apply -k .
```
6. Check the Ingress NGINX Controller pod status and confirm that the pods run only on the worker nodes
```
kubectl -n ingress-nginx get pods -o wide
```
Result:
```
NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE                       NOMINATED NODE   READINESS GATES
ingress-nginx-controller-4tgxs   1/1     Running   0          90s   10.244.5.109   topgun-w2.kubeantony.com   <none>           <none>
ingress-nginx-controller-n9t75   1/1     Running   0          67s   10.244.6.115   topgun-w3.kubeantony.com   <none>           <none>
ingress-nginx-controller-thkd4   1/1     Running   0          44s   10.244.1.190   topgun-w1.kubeantony.com   <none>           <none>
```
7. Create a test directory
```
mkdir test
```
8. Edit the test Deployment (the file needs a `.yaml` extension so that `kubectl apply -f test/` picks it up)
```
nano test/test-deployment.yaml
```
File content:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd
  name: httpd
spec:
  replicas: 3
  selector:
    matchLabels:
      app: httpd
  strategy: {}
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - image: harbor.example.com/library/library/busybox:stable
        name: busybox
        command: ["/bin/sh", "-c"]
        args:
        - echo haha > /tmp/index.html; httpd -p 888 -h /tmp -f
```
9. Edit the K8s Service yaml
```
nano test/test-svc.yaml
```
File content:
```
apiVersion: v1
kind: Service
metadata:
  labels:
    app: httpd
  name: test-svc
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 888
  selector:
    app: httpd
```
10. Edit the K8s Ingress yaml
```
nano test/test-ing.yaml
```
File content:
```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: test.k8s.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: test-svc
            port:
              number: 80
```
11. Deploy the test Deployment, Service, and Ingress
```
kubectl apply -f test/
```
12. Check the pod status
```
kubectl get pods
```
Result:
```
NAME                     READY   STATUS    RESTARTS   AGE
httpd-77c9cfcf7d-2tqks   1/1     Running   0          70m
httpd-77c9cfcf7d-4g9fl   1/1     Running   0          70m
httpd-77c9cfcf7d-6w2l2   1/1     Running   0          70m
```
13. Confirm that the Ingress has picked up the worker node IPs
```
kubectl get ing
```
Result:
```
NAME              CLASS   HOSTS         ADDRESS                                  PORTS   AGE
minimal-ingress   nginx   test.k8s.io   172.20.6.114,172.20.6.115,172.20.6.116   80      66m
```
14. Verify that the web service is reachable
```
for i in {114..116}; do time curl -H "Host: test.k8s.io" http://172.20.6.${i};echo ;done
```
Expected result:
```
haha

real	0m0.006s
user	0m0.003s
sys	0m0.002s
haha

real	0m0.005s
user	0m0.002s
sys	0m0.002s
haha

real	0m0.005s
user	0m0.002s
sys	0m0.001s
```
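The kube-vip cloud controller deployed in 4.2 has not yet been exercised by a `LoadBalancer` Service. The sketch below is one way to verify that external IPs are handed out from the configured `range-global` (172.20.6.117-172.20.6.120); it reuses the `httpd` test Deployment and a hypothetical Service name `test-lb`, and assumes the 4.2 setup is in place.

```
# Expose the httpd test Deployment as a LoadBalancer Service;
# the kube-vip cloud provider should allocate an address from range-global
kubectl expose deployment httpd --name test-lb --type LoadBalancer --port 80 --target-port 888

# EXTERNAL-IP should show an address between 172.20.6.117 and 172.20.6.120
kubectl get svc test-lb

# The service should answer directly on the allocated VIP
curl http://$(kubectl get svc test-lb -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Clean up the test resources when finished
kubectl delete svc test-lb
kubectl delete -f test/
```

If `EXTERNAL-IP` stays in `<pending>`, check the `kube-vip-cloud-provider` and `kube-vip-ds` pod logs from section 4.2 before retrying.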