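# RKE2: Replacing kube-proxy with Cilium
* Replacing kube-proxy with Cilium can improve network performance: traffic is handled directly in the Linux kernel, which also avoids the bottleneck caused by large numbers of iptables rules.
## Implementation
* Create a custom RKE2 cluster through Rancher and choose Cilium as the CNI.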

* On the cluster management page, edit the YAML and change the configuration as follows:
```
rkeConfig:
  chartValues:
    rke2-cilium:
      k8sServiceHost: 127.0.0.1
      k8sServicePort: 6443
      kubeProxyReplacement: true
  ......
  machineGlobalConfig:
    cni: cilium
    disable-kube-proxy: true
    etcd-expose-metrics: false
```

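* The `chartValues` settings above are cleared on upgrade (the values under `machineGlobalConfig` are not), so the following configuration can be placed in Additional Manifest instead: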
```
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    k8sServiceHost: 127.0.0.1
    k8sServicePort: 6443
    kubeProxyReplacement: true
```

* After saving the configuration, create a cluster with three master nodes:
```
$ kubectl get no
NAME   STATUS   ROLES                              AGE     VERSION
m1     Ready    control-plane,etcd,master,worker   24m     v1.30.5+rke2r1
m2     Ready    control-plane,etcd,master,worker   2m42s   v1.30.5+rke2r1
m3     Ready    control-plane,etcd,master,worker   2m42s   v1.30.5+rke2r1
```
* Confirm that there are no kube-proxy pods:
```
$ kubectl -n kube-system get po
NAME READY STATUS RESTARTS AGE
cilium-4hjnw 1/1 Running 0 3m18s
cilium-4k7zj 1/1 Running 0 11m
cilium-operator-b6b99f988-2fvbs 1/1 Running 0 8m56s
cilium-operator-b6b99f988-hnmlx 1/1 Running 0 8m56s
cilium-rw5v6 1/1 Running 0 3m19s
cloud-controller-manager-m1 1/1 Running 0 24m
cloud-controller-manager-m2 1/1 Running 0 3m13s
cloud-controller-manager-m3 1/1 Running 0 2m56s
etcd-m1 1/1 Running 0 24m
etcd-m2 1/1 Running 0 3m7s
etcd-m3 1/1 Running 0 2m28s
helm-install-rke2-cilium-j84xk 0/1 Completed 0 11m
helm-install-rke2-coredns-5x5p6 0/1 Completed 0 24m
helm-install-rke2-ingress-nginx-tbtvq 0/1 Completed 0 24m
helm-install-rke2-metrics-server-tqlj7 0/1 Completed 0 24m
helm-install-rke2-snapshot-controller-crd-tqsgz 0/1 Completed 0 24m
helm-install-rke2-snapshot-controller-h6zt2 0/1 Completed 1 24m
helm-install-rke2-snapshot-validation-webhook-5ktrn 0/1 Completed 0 24m
kube-apiserver-m1 1/1 Running 0 24m
kube-apiserver-m2 1/1 Running 0 3m12s
kube-apiserver-m3 1/1 Running 0 2m33s
kube-controller-manager-m1 1/1 Running 0 24m
kube-controller-manager-m2 1/1 Running 0 3m13s
kube-controller-manager-m3 1/1 Running 0 2m56s
kube-scheduler-m1 1/1 Running 0 24m
kube-scheduler-m2 1/1 Running 0 3m13s
kube-scheduler-m3 1/1 Running 0 2m56s
rke2-coredns-rke2-coredns-849bdf86bd-vrk64 1/1 Running 0 24m
rke2-coredns-rke2-coredns-849bdf86bd-zk6zb 1/1 Running 0 3m14s
rke2-coredns-rke2-coredns-autoscaler-7b7dd89c46-h6xxs 1/1 Running 0 24m
rke2-ingress-nginx-controller-hwlct 1/1 Running 0 7m51s
rke2-ingress-nginx-controller-k8qmm 1/1 Running 0 86s
rke2-ingress-nginx-controller-sv9gf 1/1 Running 0 106s
rke2-metrics-server-5cd8496498-n954l 1/1 Running 0 8m32s
rke2-snapshot-controller-b4645b856-vwc97 1/1 Running 0 8m30s
rke2-snapshot-validation-webhook-6747d95cfc-lmgrw 1/1 Running 0 8m31s
```
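The check above can also be scripted rather than read by eye. The following is a minimal sketch: the helper name `check_no_kube_proxy` is hypothetical, and it assumes that kube-proxy pods, when present, have names starting with `kube-proxy`.

```shell
# Hypothetical helper: reads a pod listing on stdin (pod name at the start of
# each line) and fails if any pod name starts with "kube-proxy".
check_no_kube_proxy() {
  if grep -q '^kube-proxy'; then
    echo "kube-proxy pod found"
    return 1
  fi
  echo "no kube-proxy pod"
}
```

Against a live cluster it could be used as `kubectl -n kube-system get po --no-headers | check_no_kube_proxy`.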
## Verification
* Check that `KubeProxyReplacement` is `True`:
```
$ kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement
Defaulted container "cilium-agent" out of: cilium-agent, install-portmap-cni-plugin (init), config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
KubeProxyReplacement: True [eth0 192.168.11.161 fe80::1cfb:b1ff:fea7:1428 (Direct Routing)]
```
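For use in a script, the value can be extracted from the status output instead of grepping for the whole line. This is only a sketch: `kpr_status` is a hypothetical helper, and it assumes the `KubeProxyReplacement:` line has the layout shown above.

```shell
# Hypothetical helper: reads `cilium-dbg status` output on stdin and prints
# the value that follows "KubeProxyReplacement:" (for example "True").
kpr_status() {
  awk '$1 == "KubeProxyReplacement:" { print $2 }'
}
```

For example: `kubectl -n kube-system exec ds/cilium -- cilium-dbg status | kpr_status`.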
* Use `--verbose` to see the full details:
```
$ kubectl -n kube-system exec ds/cilium -- cilium-dbg status --verbose
......
KubeProxyReplacement Details:
  Status:                 True
  Socket LB:              Enabled
  Socket LB Tracing:      Enabled
  Socket LB Coverage:     Full
  Devices:                eth0 192.168.11.161 fe80::1cfb:b1ff:fea7:1428 (Direct Routing)
  Mode:                   SNAT
  Backend Selection:      Random
  Session Affinity:       Enabled
  Graceful Termination:   Enabled
  NAT46/64 Support:       Disabled
  XDP Acceleration:       Disabled
  Services:
  - ClusterIP:      Enabled
  - NodePort:       Enabled (Range: 30000-32767)
  - LoadBalancer:   Enabled
  - externalIPs:    Enabled
  - HostPort:       Enabled
......
```
* Create a test Deployment and Service:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: my-nginx
spec:
  selector:
    run: my-nginx
  ports:
  - port: 80
    targetPort: 80
```
```
$ kubectl get po,svc
NAME                           READY   STATUS    RESTARTS   AGE
pod/my-nginx-fdd6574f7-gkj77   1/1     Running   0          51s
pod/my-nginx-fdd6574f7-mhv8r   1/1     Running   0          51s

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.43.0.1     <none>        443/TCP   31m
service/my-nginx     ClusterIP   10.43.89.42   <none>        80/TCP    51s
```
* Verify that the service is reachable both through the Service IP and through its DNS name:
```
$ kubectl exec my-nginx-fdd6574f7-gkj77 -- curl 10.43.89.42
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 615 100 615 0 0 132k --:--:-- --:--:-- 0
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
$ kubectl exec my-nginx-fdd6574f7-gkj77 -- curl my-nginx.default.svc.cluster.local
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 615 100 615 0 0 44854 0 --:--:-- --:--:- 0
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
```
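curl writes its progress meter to stderr, so it can interleave with the HTML when both streams reach the terminal. A quieter, scriptable variant might look like the sketch below; `is_nginx_welcome` is a hypothetical helper, and it assumes the backend serves the default nginx welcome page.

```shell
# Hypothetical helper: succeeds if stdin contains the title of the default
# nginx welcome page, and fails otherwise.
is_nginx_welcome() {
  grep -q '<title>Welcome to nginx!</title>'
}
```

For example, with `curl -s` to suppress the progress meter: `kubectl exec deploy/my-nginx -- curl -s my-nginx.default.svc.cluster.local | is_nginx_welcome && echo reachable`.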
* Cilium keeps track of each Service IP and the information about its corresponding backends:
```
$ kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
Defaulted container "cilium-agent" out of: cilium-agent, install-portmap-cni-plugin (init), config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID   Frontend              Service Type   Backend
1    10.43.204.239:443     ClusterIP      1 => 10.42.0.231:444 (active)
                                          2 => 10.42.0.207:444 (active)
2    10.43.204.239:80      ClusterIP      1 => 10.42.0.231:80 (active)
                                          2 => 10.42.0.207:80 (active)
3    10.43.0.1:443         ClusterIP      1 => 192.168.11.161:6443 (active)
                                          2 => 192.168.11.163:6443 (active)
                                          3 => 192.168.11.162:6443 (active)
4    10.43.0.10:53         ClusterIP      1 => 10.42.0.107:53 (active)
                                          2 => 10.42.2.131:53 (active)
5    10.43.58.89:443       ClusterIP      1 => 10.42.0.119:10250 (active)
6    10.43.16.225:443      ClusterIP      1 => 10.42.0.198:8443 (active)
7    10.43.231.68:443      ClusterIP      1 => 10.42.0.157:8443 (active)
                                          2 => 10.42.2.22:8443 (active)
                                          3 => 10.42.1.58:8443 (active)
8    192.168.11.161:80     HostPort       1 => 10.42.0.157:80 (active)
9    0.0.0.0:80            HostPort       1 => 10.42.0.157:80 (active)
10   192.168.11.161:443    HostPort       1 => 10.42.0.157:443 (active)
11   0.0.0.0:443           HostPort       1 => 10.42.0.157:443 (active)
12   10.43.220.235:443     ClusterIP      1 => 10.42.0.189:9443 (active)
13   10.43.89.42:80        ClusterIP      1 => 10.42.2.98:80 (active)
                                          2 => 10.42.1.92:80 (active)
```
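To look up the backends of a single Service in this listing, the output can be filtered by frontend address. The sketch below is an illustration only: `backends_for` is a hypothetical helper, and it assumes the column layout shown above, where the first row of a service carries the frontend and the following rows list additional backends.

```shell
# Hypothetical helper: reads `cilium-dbg service list` output on stdin and
# prints one backend address per line for the frontend passed as $1.
backends_for() {
  awk -v want="$1" '
    # First row of a service: ID, frontend, type, index, "=>", backend, state.
    $5 == "=>" { fe = $2; if (fe == want) print $6; next }
    # Continuation row: index, "=>", backend, state.
    $2 == "=>" { if (fe == want) print $3 }
  '
}
```

For the `my-nginx` Service created earlier, `kubectl -n kube-system exec ds/cilium -- cilium-dbg service list | backends_for 10.43.89.42:80` should print the two pod addresses listed for that frontend.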
* Check that iptables contains no Service rules:
```
$ iptables-save | grep KUBE-SERVICES
```
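An empty result is easy to misread in a script, so it can help to print an explicit count. A minimal sketch follows; `count_kube_svc_rules` is a hypothetical helper, and it assumes that the `KUBE-SERVICES` and `KUBE-SVC-*` chains, which kube-proxy creates in iptables mode, are the ones of interest.

```shell
# Hypothetical helper: reads `iptables-save` output on stdin and prints how
# many lines mention the KUBE-SERVICES or KUBE-SVC-* chains.
count_kube_svc_rules() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -c -E 'KUBE-(SERVICES|SVC-)' || true
}
```

On a node, `iptables-save | count_kube_svc_rules` should then print `0` when no such rules exist.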
## References
* https://cilium.io/blog/2018/04/17/why-is-the-kernel-community-replacing-iptables/
* https://docs.cilium.io/en/v1.15/network/kubernetes/kubeproxy-free/#validate-the-setup
* https://github.com/rancher/rke2/issues/4862
* https://gist.github.com/PhilipSchmid/c15e2c06b32022eaa90ed9b9262968d8