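# Pod Security Standards
* The Pod Security Standards define three policy levels for pod security: Privileged, Baseline, and Restricted, ordered from most permissive to most restrictive. The level that applies is chosen through labels on a namespace, and those labels also decide whether a non-compliant pod is rejected or only reported.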
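## Pod Security Standards levels
* Privileged: unrestricted.
* Baseline: a minimally restrictive policy that prevents known privilege escalations, for example by disallowing `hostPath` volumes and `privileged: true`.
* Restricted: the most restrictive level. It adds further requirements, for example that containers must run as a non-root user.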
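
To make the difference between the levels concrete, the manifest below is a sketch of a Pod that only the Privileged level would admit. The pod name, container name, and volume path are hypothetical; the comments mark the fields that the Baseline level (and therefore also Restricted) rejects.

```yaml
# Hypothetical Pod used only to illustrate Baseline violations.
apiVersion: v1
kind: Pod
metadata:
  name: baseline-violations-demo   # hypothetical name
spec:
  hostNetwork: true                # Baseline forbids sharing host namespaces
  containers:
  - name: demo                     # hypothetical name
    image: nginx
    securityContext:
      privileged: true             # Baseline forbids privileged containers
    volumeMounts:
    - name: host-logs
      mountPath: /host/var/log
  volumes:
  - name: host-logs
    hostPath:                      # Baseline forbids hostPath volumes
      path: /var/log               # hypothetical path
```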
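## Pod Security Admission Namespace Labels
* Pod Security Admission is controlled through namespace labels. Each namespace can carry the following labels to select a level:
  - `pod-security.kubernetes.io/enforce`: the enforced level. Pods that violate it are rejected.
  - `pod-security.kubernetes.io/warn`: the warning level. When a pod does not meet this level, the client receives a warning, but the pod is still admitted.
  - `pod-security.kubernetes.io/audit`: the audit level. When a pod does not meet this level, the violation is recorded for auditing, but the pod is still admitted.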
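
The three modes can be combined on a single namespace, and each mode accepts an optional `-version` companion label (`latest` or a Kubernetes minor version such as `v1.30`). A minimal sketch, assuming a hypothetical namespace named `mytest-staged` that rejects Baseline violations while only warning about and auditing Restricted violations:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mytest-staged   # hypothetical namespace name
  labels:
    # Reject pods that violate Baseline.
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    # Admit, but warn the client about Restricted violations.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Admit, but record Restricted violations for auditing.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
```

A setup like this can help reveal which workloads would fail under `restricted` before that level is actually enforced.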
## Hands-on
* Label a namespace with `baseline`. Pods created in that namespace can then no longer escalate privileges with `privileged: true`.
```
$ kubectl create ns mytest
$ kubectl patch namespace mytest -p '{"metadata": {"labels": {"pod-security.kubernetes.io/enforce": "baseline"}}}'
$ kubectl get ns mytest --show-labels
NAME     STATUS   AGE     LABELS
mytest   Active   9m12s   kubernetes.io/metadata.name=mytest,pod-security.kubernetes.io/enforce=baseline
# Try to create a privileged pod; the baseline level rejects it
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
  namespace: mytest
spec:
  containers:
  - name: privileged-container
    image: nginx
    securityContext:
      privileged: true' | kubectl apply -f -
Error from server (Forbidden): error when creating "STDIN": pods "privileged-pod" is forbidden: violates PodSecurity "baseline:latest": privileged (container "privileged-container" must not set securityContext.privileged=true)
# Remove the Pod Security Admission label
$ kubectl patch namespace mytest -p '{"metadata": {"labels": {"pod-security.kubernetes.io/enforce": null}}}'
```
* Test the `Restricted` level
```
$ kubectl patch namespace mytest -p '{"metadata": {"labels": {"pod-security.kubernetes.io/enforce": "restricted"}}}'
$ kubectl get ns mytest --show-labels
NAME     STATUS   AGE   LABELS
mytest   Active   12m   kubernetes.io/metadata.name=mytest,pod-security.kubernetes.io/enforce=restricted
# Restricted requires additional security settings, so this minimal pod is rejected
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: restricted-pod
  namespace: mytest
spec:
  containers:
  - name: restricted-container
    image: quay.io/cloudwalker/alp.kadm
    tty: true' | kubectl apply -f -
Error from server (Forbidden): error when creating "STDIN": pods "restricted-pod" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "restricted-container" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "restricted-container" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "restricted-container" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "restricted-container" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
```
* Corrected YAML that can be deployed
```
# seccomp is a Linux security feature that filters system calls (syscalls)
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: restricted-pod
  namespace: mytest
spec:
  containers:
  - name: restricted-container
    image: quay.io/cloudwalker/alp.kadm
    tty: true
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsUser: 1000
      runAsNonRoot: true
      seccompProfile:
        type: RuntimeDefault' | kubectl apply -f -
$ kubectl -n mytest get po
NAME             READY   STATUS    RESTARTS   AGE
restricted-pod   1/1     Running   0          4s
# Optional cleanup: remove the Pod Security Admission label.
# Skip this step if you continue with the nginx test below, which expects restricted to still be enforced.
$ kubectl patch namespace mytest -p '{"metadata": {"labels": {"pod-security.kubernetes.io/enforce": null}}}'
```
* Test deploying nginx (the namespace must still enforce `restricted`; re-apply the label if it was removed)
```
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: restricted-nginx-pod
  namespace: mytest
spec:
  containers:
  - name: restricted-nginx-container
    image: docker.io/taiwanese/nginx:1.29.3-alpine-otel-nonroot' | kubectl apply -f -
Error from server (Forbidden): error when creating "STDIN": pods "restricted-nginx-pod" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "restricted-nginx-container" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "restricted-nginx-container" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "restricted-nginx-container" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "restricted-nginx-container" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
```
* Configure nginx to run as a non-root user
```
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: restricted-nginx-pod
  namespace: mytest
  labels:
    app: restricted-nginx
spec:
  containers:
  - name: restricted-nginx-container
    image: docker.io/taiwanese/nginx:1.29.3-alpine-otel-nonroot
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - "ALL"
      runAsNonRoot: true
      runAsUser: 1001
      seccompProfile:
        type: "RuntimeDefault"' | kubectl apply -f -
$ kubectl -n mytest get po
NAME                   READY   STATUS    RESTARTS   AGE
restricted-nginx-pod   1/1     Running   0          18s
```
> Note: `runAsNonRoot` is checked against the numeric user ID that the app container runs as. If the image declares its default user by name instead of by UID, Kubernetes cannot verify that the user is non-root, so you must also set `runAsUser` to a numeric UID. Otherwise you will see an error like this:
>
> `Error: container has runAsNonRoot and image has non-numeric user (bigred), cannot verify user is non-root (pod: "restricted-nginx-pod_mytest(d40b3343-1027-48c2-9b7d-e9acac1c7524)", container: restricted-nginx-container)`
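
When every container in a pod should run as the same non-root user, `runAsNonRoot`, `runAsUser`, and `seccompProfile` can also be set once in the pod-level `securityContext`, while `allowPrivilegeEscalation` and `capabilities` exist only at the container level. The following is a sketch that reuses the nginx image and UID 1001 from the example above; the pod and container names are hypothetical, and it assumes UID 1001 is a valid user for that image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-nginx-podlevel   # hypothetical name
  namespace: mytest
spec:
  securityContext:                  # pod-level: applies to all containers
    runAsNonRoot: true
    runAsUser: 1001                 # numeric UID, so the non-root check can be verified
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: nginx                     # hypothetical name
    image: docker.io/taiwanese/nginx:1.29.3-alpine-otel-nonroot
    securityContext:                # these fields exist only per container
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
```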
## References
* https://kubernetes.io/docs/concepts/security/pod-security-standards/
* https://sean22492249.medium.com/kubernetes-pod-security-standard-%E4%BB%8B%E7%B4%B9-c2556cd3f72b
* https://hackmd.io/7h_u0MQPR0aKJR1xkXRN2A