# iTHome K8s Summit 2025 - Kubernetes in Kubernetes (K3k)
:::success
K3k (Kubernetes in Kubernetes) lets you create your own K8s clusters inside an existing K8s cluster. The nested clusters are built on K3s, a lightweight K8s distribution, giving every user a dedicated environment.
Two modes are supported:
1. Shared Mode: shares the host cluster's resources.
2. Virtual Mode: a fully virtualized environment of its own that does **not share host resources**.
K3k Virtual Mode has already passed CNCF conformance certification; Shared Mode is still in certification and will hopefully be available soon.
:::
Environment requirements:
:::info
1. OS: any mainstream Linux. This article uses SUSE SLES 15 SP6; openSUSE Leap 15.6 also works.
2. CPU: 4 cores, RAM: 8 GB, Disk: 50 GB, NIC: 1 port
:::
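Before installing anything, a quick sanity check that the host meets these numbers, for example:
```shell
nproc        # expect >= 4 cores
free -h      # expect >= 8G total RAM
df -h /      # expect >= 50G of disk
```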
Notes:
:::warning
1. A StorageClass must exist; this article uses local-path-provisioner.
2. Helm is required.
3. A host K8s cluster is required; this article uses RKE2 (Rancher's enterprise-ready, next-generation Kubernetes distribution), a K8s distribution from SUSE.
4. K3s (Lightweight Kubernetes) is SUSE's lightweight K8s distribution.
:::
## 1. RKE2 quick start
```shell
curl -sfL https://get.rke2.io | INSTALL_RKE2_CHANNEL=v1.33.3+rke2r1 sh -
[WARN] /usr/local is read-only or a mount point; installing to /opt/rke2
[INFO] finding release for channel v1.33.3+rke2r1
[INFO] using v1.33.3+rke2r1 as release
[INFO] downloading checksums at https://github.com/rancher/rke2/releases/download/v1.33.3%2Brke2r1/sha256sum-amd64.txt
[INFO] downloading tarball at https://github.com/rancher/rke2/releases/download/v1.33.3%2Brke2r1/rke2.linux-amd64.tar.gz
[INFO] verifying tarball
[INFO] unpacking tarball file to /opt/rke2
[INFO] updating tarball contents to reflect install path
[INFO] moving systemd units to /etc/systemd/system
[INFO] install complete; you may want to run: export PATH=$PATH:/opt/rke2/bin
```
Create the rke2 directory:
```shell
mkdir -p /etc/rancher/rke2/
```
Create the RKE2 config file:
```shell
vim /etc/rancher/rke2/config.yaml
```
config.yaml:
```yaml
node-name: "your_node_name"
```
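config.yaml accepts the same keys as the `rke2 server` command-line flags. For a multi-node setup you would typically also pin a join token and extra TLS SANs; a sketch with placeholder values (not part of the original walkthrough):
```yaml
# /etc/rancher/rke2/config.yaml - placeholder values for illustration
node-name: "your_node_name"
token: "my-shared-secret"   # join token for additional nodes
tls-san:
  - "rke2.example.com"      # extra SAN on the API server certificate
```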
Enable and start RKE2. The system comes up within a few minutes; it needs some time to download images.
```shell
sudo systemctl enable --now rke2-server.service
Created symlink /etc/systemd/system/multi-user.target.wants/rke2-server.service → /etc/systemd/system/rke2-server.service.
```
Copy the required CLI and kubeconfig straight from the installation:
```shell
# cp /var/lib/rancher/rke2/bin/kubectl /usr/local/bin/
# mkdir .kube
# cp /etc/rancher/rke2/rke2.yaml .kube/config
# kubectl get node
NAME STATUS ROLES AGE VERSION
rke2 Ready control-plane,etcd,master 105s v1.33.3+rke2r1
```
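Copying the files is optional; you can also point kubectl at the RKE2-managed paths directly, as a sketch:
```shell
# Use the binaries and kubeconfig where RKE2 installed them
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get node
```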
## 2. Install Helm
Use Helm 3.19.0:
```shell
# wget https://get.helm.sh/helm-v3.19.0-linux-amd64.tar.gz
# tar -zxvf helm-v3.19.0-linux-amd64.tar.gz
linux-amd64/
linux-amd64/README.md
linux-amd64/LICENSE
linux-amd64/helm
# cp linux-amd64/helm /usr/local/bin/
# helm version
version.BuildInfo{Version:"v3.19.0", GitCommit:"3d8990f0836691f0229297773f3524598f46bda6", GitTreeState:"clean", GoVersion:"go1.24.7"}
```
## 3. Install local-path-provisioner
Without a StorageClass, starting a cluster will fail because etcd has nowhere to store its data.
```shell
# kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.32/deploy/local-path-storage.yaml
namespace/local-path-storage created
serviceaccount/local-path-provisioner-service-account created
role.rbac.authorization.k8s.io/local-path-provisioner-role created
clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created
rolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created
clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created
deployment.apps/local-path-provisioner created
storageclass.storage.k8s.io/local-path created
configmap/local-path-config created
# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path rancher.io/local-path Delete WaitForFirstConsumer false 9s
```
Set it as the default StorageClass:
```shell
# kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/local-path patched
# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 110s
```
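Because local-path uses `WaitForFirstConsumer`, a bare PVC stays Pending until a pod mounts it; a throwaway check before handing the StorageClass to K3k (resource names here are arbitrary):
```shell
kubectl create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc           # arbitrary name; uses the default StorageClass
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
EOF
kubectl get pvc test-pvc   # expected: Pending until a consumer pod schedules
kubectl delete pvc test-pvc
```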
## 4. Install K3k (Kubernetes in Kubernetes)
Now the main service takes the stage: install the K3k controller.
```shell
# helm repo add k3k https://rancher.github.io/k3k
"k3k" has been added to your repositories
# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "k3k" chart repository
Update Complete. ⎈Happy Helming!⎈
# helm install --namespace k3k-system --create-namespace k3k k3k/k3k
NAME: k3k
LAST DEPLOYED: Fri Sep 19 09:32:09 2025
NAMESPACE: k3k-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
# kubectl get po -n k3k-system
NAME READY STATUS RESTARTS AGE
k3k-667f9f9849-x6qwb 1/1 Running 0 43s
# kubectl -n k3k-system logs k3k-667f9f9849-x6qwb
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting k3k - Version: v0.3.4"}
W0919 01:32:31.705342 1 client_config.go:659] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding cluster controller"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"starting port allocator"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding etcd pod controller"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"adding clusterpolicy controller"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting EventSource","controller":"k3k-pod-controller","source":"kind source: *v1.Pod"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","msg":"Starting Controller","controller":"k3k-pod-controller"}
{"level":"info","timestamp":"2025-09-19T01:32:31.705Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1alpha1.Cluster"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.StatefulSet"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.Service"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1alpha1.VirtualClusterPolicy"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","source":"kind source: *v1.Namespace"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting Controller","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.NetworkPolicy"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.ResourceQuota"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.LimitRange"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.Namespace"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1.Node"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting EventSource","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","source":"kind source: *v1alpha1.Cluster"}
{"level":"info","timestamp":"2025-09-19T01:32:31.706Z","msg":"Starting Controller","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy"}
{"level":"info","timestamp":"2025-09-19T01:32:31.811Z","msg":"Starting workers","controller":"k3k-pod-controller","worker count":50}
{"level":"info","timestamp":"2025-09-19T01:32:31.815Z","msg":"Starting workers","controller":"cluster","controllerGroup":"k3k.io","controllerKind":"Cluster","worker count":50}
{"level":"info","timestamp":"2025-09-19T01:32:31.818Z","msg":"Starting workers","controller":"virtualclusterpolicy","controllerGroup":"k3k.io","controllerKind":"VirtualClusterPolicy","worker count":50}
```
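The chart registers the CRDs the controller logs mention (Cluster and VirtualClusterPolicy); you can confirm they landed before creating any clusters:
```shell
kubectl api-resources --api-group=k3k.io
kubectl get crd | grep k3k.io
```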
Create the cluster YAML file - shared cluster:
```yaml
apiVersion: k3k.io/v1alpha1
kind: Cluster
metadata:
  name: single-server
spec:
  mode: "shared" # in shared mode you will see the host's K8s version
  servers: 1
  agents: 3
  version: v1.30.14-k3s2 # versions missing from Docker Hub cannot be used; a Rancher deployment offers a version list, so the latest version is available there
  clusterCIDR: 10.30.0.0/16
  serviceCIDR: 10.31.0.0/16
  clusterDNS: 10.30.0.10
  #persistence: local-path
  serverArgs:
    - "--write-kubeconfig-mode=777"
```
```shell
# kubectl -n k3k-system apply -f singlenode.yaml
cluster.k3k.io/single-server configured
# kubectl -n k3k-system logs k3k-single-server-server-0
time="2025-09-19T02:29:09Z" level=warning msg="Webhooks and apiserver aggregation may not function properly without an agent; please set egress-selector-mode to 'cluster' or 'pod'"
time="2025-09-19T02:29:09Z" level=info msg="Starting k3s v1.30.14+k3s2 (071b1ead)"
time="2025-09-19T02:29:09Z" level=info msg="Managed etcd cluster initializing"
time="2025-09-19T02:29:09Z" level=info msg="generated self-signed CA certificate CN=k3s-client-ca@1758248949: notBefore=2025-09-19 02:29:09.343427407 +0000 UTC notAfter=2035-09-17 02:29:09.343427407 +0000 UTC"
time="2025-09-19T02:29:09Z" level=info msg="certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1758248949: notBefore=2025-09-19 02:29:09 +0000 UTC notAfter=2026-09-19 02:29:09 +0000 UTC"
...
...
...
time="2025-09-19T02:30:13Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59648"
time="2025-09-19T02:30:14Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59662"
time="2025-09-19T02:30:16Z" level=info msg="Serving HTTP bootstrap from datastore for 10.42.0.19:59674"
# kubectl -n k3k-system get pod
NAME READY STATUS RESTARTS AGE
coredns-645bdb8675-ms2x2-kube-system-single-server-636f72-7f3a3 1/1 Running 0 99s
k3k-667f9f9849-x6qwb 1/1 Running 0 60m
k3k-single-server-kubelet-jwb54 1/1 Running 3 (2m23s ago) 6m32s
k3k-single-server-server-0 1/1 Running 0 3m25s
```
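The verification below uses `k3kcli`, which the steps above never installed. The binaries are published on the k3k GitHub releases page; a sketch assuming release v0.3.4 (matching the controller version in the logs) and the Linux amd64 asset name - check the releases page if either assumption is off:
```shell
# Asset name and version are assumptions; see https://github.com/rancher/k3k/releases
wget https://github.com/rancher/k3k/releases/download/v0.3.4/k3kcli-linux-amd64
install -m 0755 k3kcli-linux-amd64 /usr/local/bin/k3kcli
```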
Confirm the cluster was created successfully:
```shell
# k3kcli kubeconfig generate --namespace k3k-mycluster --name mycluster
INFO[0000] waiting for cluster to be available..
INFO[0000] certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1758248223: notBefore=2025-09-19 02:17:03 +0000 UTC notAfter=2026-09-19 02:34:50 +0000 UTC
INFO[0000] You can start using the cluster with:
export KUBECONFIG=/root/k3k-mycluster-mycluster-kubeconfig.yaml
kubectl cluster-info
```
Verify that basic operations work:
```shell
# export KUBECONFIG=/root/k3k-mycluster-mycluster-kubeconfig.yaml
# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5688667fd4-zn6hx 1/1 Running 0 21m
# kubectl get no
NAME STATUS ROLES AGE VERSION
rke2 Ready agent 15m v1.33.3-k3s1
# kubectl create deploy web --image=nginx
deployment.apps/web created
# kubectl get pod -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
web-65d846d465-plwmn 1/1 Running 0 14s 10.42.0.35 rke2 <none> <none>
# kubectl get pod -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default web-65d846d465-plwmn 1/1 Running 0 23s 10.42.0.35 rke2 <none> <none>
kube-system coredns-5688667fd4-zn6hx 1/1 Running 0 21m 10.42.0.29 rke2 <none> <none>
# kubectl expose deploy web --target-port=80 --port=80 --type=NodePort
service/web exposed
# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 27d
web NodePort 10.43.176.62 <none> 80:32719/TCP 5s
```
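Since the pod actually runs on the host node in shared mode, the NodePort from the output above (32719) should answer on the host's IP; a quick check (replace the placeholder address):
```shell
curl -s http://<host-node-ip>:32719 | head -n 4   # expect the nginx welcome page
```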
What does it look like from the outside world? In shared mode, the virtual cluster's workloads show up in the host cluster as pods and services with mangled names under the k3k-mycluster namespace:
```shell
# kubectl --kubeconfig .kube/config get po -A |grep web
k3k-mycluster web-65d846d465-plwmn-default-mycluster-7765622d3635643834-0359f 1/1 Running 0 27d
# kubectl --kubeconfig .kube/config get svc -A |grep 32719
k3k-mycluster web-default-mycluster-7765622b64656661756c742b6d79636c757-72ca1 NodePort 10.43.176.62 <none> 80:32719/TCP 55s
```
:::success
Doesn't it feel like there is a lot here worth experimenting with?
:::
Create the cluster YAML file - virtual cluster:
```yaml
apiVersion: k3k.io/v1alpha1
kind: Cluster
metadata:
  name: virtual-single-server
spec:
  mode: "virtual"
  servers: 1
  agents: 3
  version: v1.30.14-k3s2
  clusterCIDR: 10.30.0.0/16
  serviceCIDR: 10.31.0.0/16
  clusterDNS: 10.30.0.10
  #persistence: local-path
  serverArgs:
    - "--write-kubeconfig-mode=777"
```
If you hit the open-file limit at this point, you can raise it temporarily:
```shell
ulimit -n 65535
```
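Applying and verifying the virtual cluster then follows the same flow as the shared one; a sketch, assuming the manifest above was saved as virtual-single-server.yaml:
```shell
kubectl -n k3k-system apply -f virtual-single-server.yaml
kubectl -n k3k-system get pod   # server and agent pods appear as they start
```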