# iTHome K8s Summit 2025 - Kubernetes in Kubernetes (K3k) :::success K3k(Kubernetes in Kubernetes)允許使用者在自己的K8s叢集中建立自己的K8s叢集,建立叢集的方式是透過K3s,用輕量級的K8s版本,建立一個專屬自己的環境。 支援兩種模式: 1. Shared Mode,與host叢集共用環境資源。 2. Virtual Mode,完全虛擬出自己的環境,**不共用host資源**。 K3k Virtual Mode已經通過CNCF的一致性驗證,Share Mode正在認證中,希望在不久的將來也可以跟大家見面。 ::: 環境需求: :::info 1. OS: 主流Linux即可,本文採用SUSE SLES 15 SP6,也可以用OpenSUSE LEAP 15.6。 2. CPU: 4 core, RAM: 8G, Disk: 50G, NIC: 1 port ::: 注意事項: :::warning 1. 需要建立storageclass, 本文使用local-path-provisioner。 2. 需要使用helm。 3. 需要有個K8s,本文使用RKE2(Rancher's enterprise-ready next-generation Kubernetes distribution),由SUSE所提供的K8s發行版。 4. K3s(Lightweight Kubernetes),是SUSE所提供的輕量級K8s發行版。 ::: ## 1. RKE2 quick start ```shell curl -sfL https://get.rke2.io | INSTALL_RKE2_CHANNEL=v1.33.3+rke2r1 sh - [WARN] /usr/local is read-only or a mount point; installing to /opt/rke2 [INFO] finding release for channel v1.33.3+rke2r1 [INFO] using v1.33.3+rke2r1 as release [INFO] downloading checksums at https://github.com/rancher/rke2/releases/download/v1.33.3%2Brke2r1/sha256sum-amd64.txt [INFO] downloading tarball at https://github.com/rancher/rke2/releases/download/v1.33.3%2Brke2r1/rke2.linux-amd64.tar.gz [INFO] verifying tarball [INFO] unpacking tarball file to /opt/rke2 [INFO] updating tarball contents to reflect install path [INFO] moving systemd units to /etc/systemd/system [INFO] install complete; you may want to run: export PATH=$PATH:/opt/rke2/bin ``` 建立rke2資料夾 ``` mkdir -p /etc/rancher/rke2/ ``` 建立rke2 config file ```shell vim /etc/rancher/rke2/config.yaml ``` config.yaml: ```yaml node-name: - "your_node_name" ``` 數分鐘內會系統會啟用RKE2,會需要一些下載image的時間。 ```shell sudo systemctl enable --now rke2-server.service Created symlink /etc/systemd/system/multi-user.target.wants/rke2-server.service → /etc/systemd/system/rke2-server.service. ``` 直接從環境複製所需要的CLI與kubeconfig ```shell # cp /var/lib/rancher/rke2/bin/kubectl /usr/local/bin/ # mkdir .kube # cp /etc/rancher/rke2/rke2.yaml .kube/config #kubectl get node NAME STATUS ROLES AGE VERSION rke2 Ready control-plane,etcd,master 105s v1.33.3+rke2r1 ``` ## 2. 安裝helm 使用helm 3.19.0 ```shell # wget https://get.helm.sh/helm-v3.19.0-linux-amd64.tar.gz # tar -zxvf helm-v3.19.0-linux-amd64.tar.gz linux-amd64/ linux-amd64/README.md linux-amd64/LICENSE linux-amd64/helm # cp linux-amd64/helm /usr/local/bin/ # helm version version.BuildInfo{Version:"v3.19.0", GitCommit:"3d8990f0836691f0229297773f3524598f46bda6", GitTreeState:"clean", GoVersion:"go1.24.7"} ``` ## 3. 安裝local-path-provisioner 如果沒有storage class的話,啟用cluster會因為沒地方存etcd而fail。 ``` # kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.32/deploy/local-path-storage.yaml namespace/local-path-storage created serviceaccount/local-path-provisioner-service-account created role.rbac.authorization.k8s.io/local-path-provisioner-role created clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created rolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created deployment.apps/local-path-provisioner created storageclass.storage.k8s.io/local-path created configmap/local-path-config created # kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE local-path rancher.io/local-path Delete WaitForFirstConsumer false 9s ``` 改為預設的storageclass ```shell # kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' storageclass.storage.k8s.io/local-path patched # kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 110s ``` ## 3. 安裝K3k (Kubernetes in Kubernetes) K3k主要服務登場 ```shell # helm repo add k3k https://rancher.github.io/k3k "k3k" has been added to your repositories # helm repo update Hang tight while we grab the latest from your chart repositories... ...Successfully got an update from the "k3k" chart repository Update Complete. ⎈Happy Helming!⎈ # helm install --namespace k3k-system --create-namespace k3k k3k/k3k NAME: k3k LAST DEPLOYED: Fri Sep 19 09:32:09 2025 NAMESPACE: k3k-system STATUS: deployed REVISION: 1 TEST SUITE: None # kubectl get po -n k3k-system NAME READY STATUS RESTARTS AGE k3k-667f9f9849-x6qwb 1/1 Running 0 43s # kubectl -n k3k-system get po NAME READY STATUS RESTARTS AGE k3k-667f9f9849-x6qwb 1/1 Running 0 55s # kubectl -n k3k-system logs k3k-667f9f9849-x6qwb {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting k3k - Version: v0.3.4"} W0919 01:32:31.705342 1 client_config.go:659] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work. {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding cluster controller"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"starting port allocator"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding etcd pod controller"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding clusterpolicy controller"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting EventSource","controller":"k3k-pod-controller","source":"kind source: *v1.Pod"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting Controller","controller":"k3k-pod-controller"} {"level":"info","timestamp":"2025-09-19T01:32:31.705Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1alpha1.Cluster"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.StatefulSet"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.Service"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1alpha1.VirtualClusterPolicy"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.Namespace"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting Controller","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.NetworkPolicy"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.ResourceQuota"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.LimitRange"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.Namespace"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.Node"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1alpha1.Cluster"} {"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting Controller","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy"} {"level":"info","timestamp":"2025-09-19T01:32:31.811Z","msg":"Starting workers","controller":"k3k-pod-controller","worker count":50} {"level":"info","timestamp":"2025-09-19T01:32:31.815Z","msg":"Starting workers","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","worker count":50} {"level":"info","timestamp":"2025-09-19T01:32:31.818Z","msg":"Starting workers","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","worker count":50} ``` 建立cluster yaml file - shared cluster ```yaml apiVersion: k3k.io/v1alpha1 kind: Cluster metadata: name: single-server spec: mode: "shared" # share的話,會看到本機的K8s版本 servers: 1 agents: 3 version: v1.30.14-k3s2 # docker hub上沒有的話就不能用,但是Rancher部署的話可以有版本清單,可以用最新的版本 clusterCIDR: 10.30.0.0/16 serviceCIDR: 10.31.0.0/16 clusterDNS: 10.30.0.10 #persistence: local-path serverArgs: - "--write-kubeconfig-mode=777" ``` ```shell # kubectl -n k3k-system apply -f singlenode.yaml cluster.k3k.io/single-server configured # kubectl -n k3k-system logs k3k-single-server-server-0 time="2025-09-19T02:29:09Z" level=warning msg="Webhooks and apiserver aggregation may not function properly without an agent; please set egress-selector-mode to 'cluster' or 'pod'" time="2025-09-19T02:29:09Z" level=info msg="Starting k3s v1.30.14+k3s2 (071b1ead)" time="2025-09-19T02:29:09Z" level=info msg="Managed etcd cluster initializing" time="2025-09-19T02:29:09Z" level=info msg="generated self-signed CA certificate CN=k3s-client-ca@1758248949: notBefore=2025-09-19 02:29:09.343427407 +0000 UTC notAfter=2035-09-17 02:29:09.343427407 +0000 UTC" time="2025-09-19T02:29:09Z" level=info msg="certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1758248949: notBefore=2025-09-19 02:29:09 +0000 UTC notAfter=2026-09-19 02:29:09 +0000 UTC" ... ... ... time="2025-09-19T02:30:13Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59648" time="2025-09-19T02:30:14Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59662" time="2025-09-19T02:30:16Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59674" # kubectl -n k3k-system get pod NAME READY STATUS RESTARTS AGE coredns-645bdb8675-ms2x2-kube-system-single-server-636f72-7f3a3 1/1 Running 0 99s k3k-667f9f9849-x6qwb 1/1 Running 0 60m k3k-single-server-kubelet-jwb54 1/1 Running 3 (2m23s ago) 6m32s k3k-single-server-server-0 1/1 Running 0 3m25s ``` 確認建立成功 ```shell # k3kcli kubeconfig generate --namespace k3k-mycluster --name mycluster INFO[0000] waiting for cluster to be available.. INFO[0000] certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1758248223: notBefore=2025-09-19 02:17:03 +0000 UTC notAfter=2026-09-19 02:34:50 +0000 UTC INFO[0000] You can start using the cluster with: export KUBECONFIG=/root/k3k-mycluster-mycluster-kubeconfig.yaml kubectl cluster-info ``` 確認操作是否正常 ```shell # export KUBECONFIG=/root/k3k-mycluster-mycluster-kubeconfig.yaml # kubectl get po -A NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-5688667fd4-zn6hx 1/1 Running 0 21m # kubectl get no NAME STATUS ROLES AGE VERSION rke2 Ready agent 15m v1.33.3-k3s1 # kubectl create deploy web --image=nginx deployment.apps/web created demo-node2:~ # kubectl get pod -owide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES web-65d846d465-plwmn 1/1 Running 0 14s 10.42.0.35 rke2 <none> <none> # kubectl get pod -A -owide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default web-65d846d465-plwmn 1/1 Running 0 23s 10.42.0.35 rke2 <none> <none> kube-system coredns-5688667fd4-zn6hx 1/1 Running 0 21m 10.42.0.29 rke2 <none> <none> # kubectl expose deploy web --target-port=80 --port=80 --type=NodePort service/web exposed # kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 27d web NodePort 10.43.176.62 <none> 80:32719/TCP 5s ``` 外面的世界看起來是如何呢? ```shell # kubectl --kubeconfig .kube/config get po -A |grep web k3k-mycluster web-65d846d465-plwmn-default-mycluster-7765622d3635643834-0359f 1/1 Running 0 27d # kubectl --kubeconfig config get svc -A |grep 32719 k3k-mycluster web-default-mycluster-7765622b64656661756c742b6d79636c757-72ca1 NodePort 10.43.176.62 <none> 80:32719/TCP 55s ``` :::success 是不是感覺很多東西可以嘗試看看呢? ::: 建立cluster yaml file - virtual cluster ```yaml apiVersion: k3k.io/v1alpha1 kind: Cluster metadata: name: virtual-single-server spec: mode: "virtual" servers: 1 agents: 3 version: v1.30.14-k3s2 clusterCIDR: 10.30.0.0/16 serviceCIDR: 10.31.0.0/16 clusterDNS: 10.30.0.10 #persistence: local-path serverArgs: - "--write-kubeconfig-mode=777" ``` 如果這時候踩到open file上限,可以暫時解開一下 ```shell ulimit -n 65535 ```