# What Happens When a Kubernetes Node Exceeds 110 Pods?
<style>
.indent-title-1{
margin-left: 1em;
}
.indent-title-2{
margin-left: 2em;
}
.indent-title-3{
margin-left: 3em;
}
</style>
## Preface
<div class="indent-title-1">
This article examines what happens in Kubernetes when the number of Pods on a single node exceeds 110.
Expand the table of contents below to jump to a specific section.
:::warning
:::spoiler {state="open"} Table of Contents
[TOC]
:::
:::
</div>
## Test Environment
### 1M2W architecture (1 control-plane node, 2 worker nodes)
<div class="indent-title-1">
```
$ kubectl get nodes -o wide
```
Output:
```
NAME        STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
antony-m1   Ready    control-plane   46m   v1.29.0   172.20.0.31   <none>        Talos (v1.6.1)   6.1.69-talos     containerd://1.7.11
antony-w1   Ready    worker          46m   v1.29.0   172.20.0.32   <none>        Talos (v1.6.1)   6.1.69-talos     containerd://1.7.11
antony-w2   Ready    worker          46m   v1.29.0   172.20.0.33   <none>        Talos (v1.6.1)   6.1.69-talos     containerd://1.7.11
```
</div>
### Maximum number of Pods allowed per node
<div class="indent-title-1">
```
$ kubectl get nodes -o yaml | grep -A 11 "allocatable:$"
```
Output:
```
    allocatable:
      cpu: 9950m
      ephemeral-storage: "95171693615"
      hugepages-2Mi: "0"
      memory: 32550712Ki
      pods: "110"
    capacity:
      cpu: "10"
      ephemeral-storage: 101132Mi
      hugepages-2Mi: "0"
      memory: 32849720Ki
      pods: "110"
--
    allocatable:
      cpu: 9950m
      ephemeral-storage: "95171693615"
      hugepages-2Mi: "0"
      memory: 32550704Ki
      pods: "110"
    capacity:
      cpu: "10"
      ephemeral-storage: 101132Mi
      hugepages-2Mi: "0"
      memory: 32849712Ki
      pods: "110"
--
    allocatable:
      cpu: 9950m
      ephemeral-storage: "95171693615"
      hugepages-2Mi: "0"
      memory: 32550712Ki
      pods: "110"
    capacity:
      cpu: "10"
      ephemeral-storage: 101132Mi
      hugepages-2Mi: "0"
      memory: 32849720Ki
      pods: "110"
```
> Each of the three nodes can run at most 110 Pods.
</div>
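The per-node Pod limit can also be read without grepping the full YAML. The following is a minimal sketch, assuming `kubectl` is already configured for this cluster; the function name `node_pod_limits` and the column headers `NODE` and `MAX_PODS` are made up for illustration.

```shell
# Print each node's name next to its allocatable Pod count.
# node_pod_limits is a hypothetical helper; the column headers are arbitrary.
node_pod_limits() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,MAX_PODS:.status.allocatable.pods'
}
```

Calling `node_pod_limits` against the cluster above should list `110` for each of the three nodes.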
## 1. Create the test Deployment YAML file
<div class="indent-title-1">
```
$ cat <<EOF > deployment-w2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alp
  labels:
    node: w2
spec:
  replicas: 3
  selector:
    matchLabels:
      node: w2
  template:
    metadata:
      labels:
        node: w2
    spec:
      containers:
      - name: alpine
        image: quay.io/cloudwalker/alp.base:latest
        command: ["/bin/sleep", "infinity"]
        imagePullPolicy: IfNotPresent
      tolerations:
      - operator: Exists
      nodeName: antony-w2
EOF
```
</div>
## 2. Check how many Pods are on the w2 node
<div class="indent-title-1">
```!
$ kubectl get pods -n kube-system -o wide | grep 'antony-w2'
```
Output:
```
coredns-85b955d87b-jjkwh 1/1 Running 0 57m 10.244.0.7 antony-w2
kube-flannel-q95bd 1/1 Running 2 (62m ago) 79m 172.20.0.33 antony-w2
kube-proxy-h8k74 1/1 Running 2 (62m ago) 79m 172.20.0.33 antony-w2
```
> The antony-w2 node already runs 3 Pods.
</div>
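With a limit of 110 and 3 Pods already on the node, the Deployment can grow to 107 replicas before the node is full. The arithmetic is sketched below. `remaining_pod_slots` is a hypothetical helper, and the `kubectl` line in the comment is an assumption about how the count could be obtained on a live cluster, not something run in this test.

```shell
# remaining_pod_slots LIMIT USED: print how many more Pods fit on a node.
# Hypothetical helper for illustration only.
remaining_pod_slots() {
  echo $(( $1 - $2 ))
}

# Numbers from this test: 110 allowed, 3 already on antony-w2.
remaining_pod_slots 110 3   # prints 107

# One possible way to obtain USED on a live cluster (assumption, not run here):
#   kubectl get pods -A --field-selector spec.nodeName=antony-w2 --no-headers | wc -l
```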
## 3. Apply the Deployment YAML file
<div class="indent-title-1">
```
$ kubectl apply -f deployment-w2.yaml
```
</div>
## 4. Check the deployment status
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2
```
Output:
```
NAME READY STATUS RESTARTS AGE
alp-8d96b675-jmkdb 1/1 Running 0 7s
alp-8d96b675-t4x2v 1/1 Running 0 7s
alp-8d96b675-xmp2b 1/1 Running 0 7s
```
</div>
## 5. Scale the Deployment to 20 Pods
<div class="indent-title-1">
```!
$ kubectl scale deploy alp --replicas=20
```
</div>
## 6. Check the deployment status
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2
```
Output:
```
NAME READY STATUS RESTARTS AGE
alp-8d96b675-5b7gq 1/1 Running 0 34s
alp-8d96b675-824tt 1/1 Running 0 34s
alp-8d96b675-9q8jn 1/1 Running 0 34s
alp-8d96b675-cgbxq 1/1 Running 0 34s
alp-8d96b675-cp59z 1/1 Running 0 34s
alp-8d96b675-d8775 1/1 Running 0 34s
alp-8d96b675-dbh5r 1/1 Running 0 34s
alp-8d96b675-g58ks 1/1 Running 0 34s
alp-8d96b675-gjlqp 1/1 Running 0 34s
alp-8d96b675-jmkdb 1/1 Running 0 8m6s
alp-8d96b675-jtfx7 1/1 Running 0 34s
alp-8d96b675-khrsq 1/1 Running 0 34s
alp-8d96b675-l7sgf 1/1 Running 0 34s
alp-8d96b675-n7ghc 1/1 Running 0 34s
alp-8d96b675-p8cvs 1/1 Running 0 34s
alp-8d96b675-s9xd6 1/1 Running 0 34s
alp-8d96b675-t4x2v 1/1 Running 0 8m6s
alp-8d96b675-xmp2b 1/1 Running 0 8m6s
alp-8d96b675-z5dz7 1/1 Running 0 34s
alp-8d96b675-zv6p7 1/1 Running 0 34s
```
</div>
## 7. Scale the Deployment to 107 Pods
<div class="indent-title-1">
```!
$ kubectl scale deploy alp --replicas=107
```
</div>
## 8. Check the deployment status
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2 --no-headers=true | grep 'Running' | wc -l
```
Output:
```
107
```
</div>
## 9. Scale the Deployment to 108 Pods (the node total now exceeds 110)
<div class="indent-title-1">
```!
$ kubectl scale deploy alp --replicas=108
```
</div>
## 10. Check the deployment status
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2
```
Output:
```
NAME READY STATUS RESTARTS AGE
alp-779f49464f-24r7q 1/1 Running 0 2m57s
alp-779f49464f-26245 1/1 Running 0 2m58s
alp-779f49464f-2h85w 0/1 OutOfpods 0 2s
alp-779f49464f-2h896 1/1 Running 0 2m58s
alp-779f49464f-2hcp8 1/1 Running 0 2m58s
alp-779f49464f-2lcc4 1/1 Running 0 2m57s
alp-779f49464f-2ndqf 1/1 Running 0 2m58s
alp-779f49464f-2w2ct 1/1 Running 0 4m27s
alp-779f49464f-2x9fn 1/1 Running 0 4m27s
alp-779f49464f-2xb5q 0/1 OutOfpods 0 4s
alp-779f49464f-42gq5 1/1 Running 0 2m56s
alp-779f49464f-48zhp 0/1 OutOfpods 0 5s
alp-779f49464f-4hbtq 0/1 OutOfpods 0 2s
alp-779f49464f-4jvzs 1/1 Running 0 2m59s
alp-779f49464f-4lvgr 1/1 Running 0 2m59s
alp-779f49464f-4qmgk 0/1 OutOfpods 0 0s
... (remaining output omitted)
```
> At this point many Pods show the STATUS `OutOfpods`, and the number of Pods in that state keeps growing.
</div>
## 11. Count the Deployment Pods in the OutOfpods state
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2 | grep OutOfpods | wc -l
```
Output:
```
5824
```
</div>
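Instead of running one `grep` per status, the Pods can be tallied by status in a single pass. The following is a small sketch under the assumption that the STATUS value is the third whitespace-separated column of `kubectl get pods` output; `count_by_status` is a hypothetical helper name.

```shell
# count_by_status: read `kubectl get pods` output on stdin and print
# one "STATUS COUNT" line per distinct value of the third column.
# Hypothetical helper; a header row (first field NAME) is skipped if present.
count_by_status() {
  awk '$1 != "NAME" && NF >= 3 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}
```

It could be used as `kubectl get pods -l node=w2 | count_by_status`, which would print one line for `OutOfpods` and one for `Running`.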
## 12. Check the Deployment status
<div class="indent-title-1">
```!
$ kubectl get deploy alp
```
Output:
```
NAME READY UP-TO-DATE AVAILABLE AGE
alp 107/108 108 107 11m
```
> Pods in the `OutOfpods` state do not appear to be counted in the replica totals reported by the Deployment controller.
</div>
## 13. Check the ReplicaSet status
<div class="indent-title-1">
```!
$ kubectl get rs
```
Output:
```
NAME DESIRED CURRENT READY AGE
alp-779f49464f 108 108 107 16m
```
> Pods in the `OutOfpods` state do not appear to be counted by the ReplicaSet controller either.
</div>
## 14. Check node resource usage
<div class="indent-title-1">
```!
$ kubectl top nodes
```
Output:
```
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
antony-m1 243m 2% 1595Mi 5%
antony-w1 9m 0% 609Mi 1%
antony-w2 1227m 12% 1334Mi 4%
```
</div>
## 15. Check whether the Running Pods were affected
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2 | grep -i runn | wc -l
```
Output:
```
107
```
</div>
## 16. Inspect the details of an OutOfpods Pod
<div class="indent-title-1">
```!
$ kubectl describe pod alp-779f49464f-zspj6
```
Output:
```!
Name:             alp-779f49464f-zspj6
Namespace:        default
Priority:         0
Service Account:  default
Node:             antony-w2/
Start Time:       Mon, 25 Dec 2023 23:21:25 +0800
Labels:           node=w2
                  pod-template-hash=779f49464f
Annotations:      <none>
Status:           Failed
Reason:           OutOfpods
Message:          Pod was rejected: Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110
IP:
IPs:              <none>
Controlled By:    ReplicaSet/alp-779f49464f
Containers:
  alpine:
    Image:      quay.io/cloudwalker/alp.base:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sleep
      infinity
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qgkvc (ro)
Volumes:
  kube-api-access-qgkvc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 op=Exists
Events:
  Type     Reason     Age   From     Message
  ----     ------     ----  ----     -------
  Warning  OutOfpods  12m   kubelet  Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110
```
> The event reads: `Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110`
> The Pod was also never assigned an IP address.
</div>
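When only the rejection reason matters, the same fields can be pulled out directly instead of reading the full `describe` output. This is a hedged sketch: `pod_failure_summary` is a hypothetical helper, and it assumes the values shown above as Status, Reason, and Message are exposed as `.status.phase`, `.status.reason`, and `.status.message` on the Pod object.

```shell
# pod_failure_summary POD: print phase, reason and message of one Pod, tab-separated.
# Hypothetical helper; it reads three fields of the Pod's .status.
pod_failure_summary() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.phase}{"\t"}{.status.reason}{"\t"}{.status.message}{"\n"}'
}
```

For the Pod inspected above, `pod_failure_summary alp-779f49464f-zspj6` would be expected to print `Failed`, `OutOfpods`, and the rejection message on one line.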
## 17. Count the Deployment Pods in the OutOfpods state again
<div class="indent-title-1">
```!
$ kubectl get pods -l node=w2 | grep OutOfpods | wc -l
```
Output:
```
11141
```
> The number of Deployment Pods in the `OutOfpods` state keeps increasing; the test was stopped once it reached `12529`.
</div>
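## Conclusion
1. **Even when a node has plenty of CPU, memory, disk, and other hardware resources, once the kubelet finds that the node already holds 110 Pods, any newly created Pod on that node is rejected and never reaches `Running`.**
2. The number of Deployment Pods in the `OutOfpods` state keeps growing. It appears that this could continue until resources run out, but this was not confirmed because the test was stopped at `12529`.
3. The likely cause of this growth is a loop: the Deployment needs 108 Pods that are all `Running`, but every newly created Pod exceeds the `max-pods=110` limit and ends up as `OutOfpods`, so another replacement is created and rejected in turn, indefinitely. Because the manifest sets `nodeName`, the Pods bypass the scheduler and go straight to the kubelet on antony-w2, which is probably why they are rejected there rather than left waiting as `Pending`.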
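The loop suspected in point 3 can be illustrated with a toy model. This is not Kubernetes code, only a sketch of the reasoning under simplified assumptions: one Pod is created per round while fewer than the desired number are running, and the node admits a Pod only while it is below its limit. The function name `simulate_rejections` and its parameters are hypothetical.

```shell
# simulate_rejections CAPACITY USED RUNNING DESIRED ROUNDS
# Toy model of the suspected loop, not Kubernetes code: each round one Pod is
# created if fewer than DESIRED are running, and the node admits it only
# while USED is below CAPACITY. Prints "running=<n> failed=<n>".
simulate_rejections() {
  capacity=$1 used=$2 running=$3 desired=$4 rounds=$5
  failed=0
  i=0
  while [ "$i" -lt "$rounds" ]; do
    if [ "$running" -lt "$desired" ]; then
      if [ "$used" -lt "$capacity" ]; then
        used=$(( used + 1 ))
        running=$(( running + 1 ))
      else
        failed=$(( failed + 1 ))   # the rejected Pod is left behind as a failed object
      fi
    fi
    i=$(( i + 1 ))
  done
  echo "running=$running failed=$failed"
}

# Numbers from this test: the node is full (110 of 110), 107 replicas run, 108 are wanted.
simulate_rejections 110 110 107 108 5   # prints running=107 failed=5
```

In this model the failed count grows by one every round and the running count never reaches 108, which matches the pattern observed in steps 11 and 17.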