# Secure Cluster Class for Cluster API
Create a secure cluster class that allows end users to spin up "secure by default"
clusters with sane, configurable defaults.
## Motivation
1. It creates a single place to add other security features.
1. It mitigates some threats
from the [CAPI Security assessment](https://github.com/kubernetes/sig-security/pull/40).
1. It helps with compliance guidelines
such as the [NSA/CISA Kubernetes hardening guide](https://www.cisa.gov/uscert/ncas/current-activity/2022/03/15/updated-kubernetes-hardening-guide).
### Goals
The MVP (Minimum Viable Product) goal is to
support [pod security admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/)
with the [baseline pod security standard](https://kubernetes.io/docs/concepts/security/pod-security-standards/#baseline)
enforced at the cluster level.
### Non-Goals/Future Work
Post-MVP work can include, but is not limited to, support for:
- [Seccomp](https://kubernetes.io/blog/2021/08/25/seccomp-default/)
- [AppArmor](https://github.com/kubernetes/enhancements/pull/1444)
- [User namespaces](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/127-user-namespaces)

Any complex pod security configurations that are *not* supported by the built-in pod
security admission controller are out of scope.
## Proposal
To enable pod security admission with the baseline pod security standard at the
cluster level, the API server must be passed the `admission-control-config-file`
flag (via `extraArgs`), pointing to an `AdmissionConfiguration` file that defines
the cluster-level pod security standard and its exemptions.

This file must be present on the control plane nodes where the API server runs.
If the API server runs as a pod, the file must either be mounted from the host
into the pod or generated within the pod before the API server binary is executed.

An example ClusterClass configuration for Cluster API Provider Docker can
be found [here](#example-cluster-class-configuration).
To auto-generate this file, we need to add a new feature to `clusterctl generate`
that takes the following parameters as input:
- Pod security standard name (`restricted`, `baseline`, `privileged`)
- Applicable mode (`enforce`, `warn`, `audit`)
- Applicable version (usually the default)

A few possible CLI UX options are as follows:
### Secure with sane defaults
```shell
clusterctl generate cluster capi-quickstart --flavor development-topology \
--secure pod-security \
--kubernetes-version v1.23.3 \
--control-plane-machine-count=3 \
--worker-machine-count=3 \
> capi-quickstart-secure-default.yaml
```
By default, `clusterctl` will enforce the `baseline` pod security standard, `audit`
and `warn` on `restricted`, and exempt the `kube-system` namespace. The `version`
will default to `latest`.
### Secure with configurable defaults
```shell
clusterctl generate cluster capi-quickstart --flavor development-topology \
--secure pod-security=baseline:enforce,restricted:warn,restricted:audit \
--kubernetes-version v1.23.3 \
--control-plane-machine-count=3 \
--worker-machine-count=3 \
> capi-quickstart-secure-configured.yaml
```
This needs to be confirmed for OpenAPI schema compatibility.
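
The mapping from the comma-separated `standard:mode` value to the `defaults` block of the pod security configuration can be sketched as follows. This is only an illustration under stated assumptions: the `--secure` flag is a proposal and does not exist in `clusterctl`, and `render_pss_defaults` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of clusterctl): translate a value such as
# "baseline:enforce,restricted:warn" into PodSecurityConfiguration defaults.
# $1: comma-separated standard:mode pairs; $2: version (assumed default: latest).
render_pss_defaults() {
  rest="$1"
  version="${2:-latest}"
  while [ -n "$rest" ]; do
    # Split off the first pair; the remainder stays in $rest.
    pair="${rest%%,*}"
    if [ "$pair" = "$rest" ]; then rest=""; else rest="${rest#*,}"; fi
    level="${pair%%:*}"
    mode="${pair#*:}"
    case "$level" in
      privileged|baseline|restricted) ;;
      *) echo "unknown pod security standard: $level" >&2; return 1 ;;
    esac
    case "$mode" in
      enforce|audit|warn) ;;
      *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf '%s: "%s"\n' "$mode" "$level"
    printf '%s-version: "%s"\n' "$mode" "$version"
  done
}

render_pss_defaults "baseline:enforce,restricted:warn,restricted:audit"
```

Rejecting unknown standards and modes up front keeps a typo in the flag value from silently producing a configuration that the API server would refuse at startup.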
### Secure with configurable defaults via environment variable
```shell
export POD_SECURITY_CONFIG='{"enforce":"baseline","warn":"restricted","audit":"restricted"}'
clusterctl generate cluster capi-quickstart --flavor development-topology \
--secure pod-security \
--kubernetes-version v1.23.3 \
--control-plane-machine-count=3 \
--worker-machine-count=3 \
> capi-quickstart-secure-env-configured.yaml
```
The outcome of any of the above UX options would be the generation of
the `cluster-level-pss.yaml` file, which must be accessible to the API server
during startup.
### Implementation Details/Notes/Constraints
#### Example Cluster Class configuration
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start-secure
  namespace: default
spec:
  controlPlane:
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: quick-start-secure-control-plane
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: quick-start-secure-control-plane
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      name: quick-start-secure-cluster
  patches:
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/imageRepository
        valueFrom:
          variable: imageRepository
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets the imageRepository used for the KubeadmControlPlane.
    name: imageRepository
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/etcd
        valueFrom:
          template: |
            local:
              imageTag: {{ .etcdImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the etcd image in the KubeadmControlPlane.
    name: etcdImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/dns
        valueFrom:
          template: |
            imageTag: {{ .coreDNSImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the CoreDNS image in the KubeadmControlPlane.
    name: coreDNSImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.machineDeployment.version }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          machineDeploymentClass:
            names:
            - default-worker
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.controlPlane.version }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          controlPlane: true
    description: Sets the container image that is used for running dockerMachines
      for the controlPlane and default-worker machineDeployments.
    name: customImage
  variables:
  - name: imageRepository
    required: true
    schema:
      openAPIV3Schema:
        default: k8s.gcr.io
        description: imageRepository sets the container registry to pull images from.
          If empty, `k8s.gcr.io` will be used by default.
        example: k8s.gcr.io
        type: string
  - name: etcdImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: etcdImageTag sets the tag for the etcd image.
        example: 3.5.1-0
        type: string
  - name: coreDNSImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: coreDNSImageTag sets the tag for the coreDNS image.
        example: v1.8.5
        type: string
  workers:
    machineDeployments:
    - class: default-worker
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: quick-start-secure-default-worker-bootstraptemplate
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: quick-start-secure-default-worker-machinetemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerClusterTemplate
metadata:
  name: quick-start-secure-cluster
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlaneTemplate
metadata:
  name: quick-start-secure-control-plane
  namespace: default
spec:
  template:
    spec:
      kubeadmConfigSpec:
        clusterConfiguration:
          apiServer:
            certSANs:
            - localhost
            - 127.0.0.1
            - 0.0.0.0
            extraArgs:
              admission-control-config-file: /etc/config/cluster-level-pss.yaml
            extraVolumes:
            - name: accf
              hostPath: /etc/config
              mountPath: /etc/config
              readOnly: false
              pathType: "DirectoryOrCreate"
          controllerManager:
            extraArgs:
              enable-hostpath-provisioner: "true"
        initConfiguration:
          nodeRegistration:
            criSocket: /var/run/containerd/containerd.sock
            kubeletExtraArgs:
              cgroup-driver: cgroupfs
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
        joinConfiguration:
          nodeRegistration:
            criSocket: /var/run/containerd/containerd.sock
            kubeletExtraArgs:
              cgroup-driver: cgroupfs
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: quick-start-secure-control-plane
  namespace: default
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
      - containerPath: /etc/config/cluster-level-pss.yaml
        hostPath: /tmp/pss/cluster-level-pss.yaml
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: quick-start-secure-default-worker-machinetemplate
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: quick-start-secure-default-worker-bootstraptemplate
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cgroup-driver: cgroupfs
            eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: capi-quickstart-secure
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    serviceDomain: k8s.test
    services:
      cidrBlocks:
      - 10.96.0.0/12
  topology:
    class: quick-start-secure
    controlPlane:
      metadata: {}
      replicas: 3
    variables:
    - name: imageRepository
      value: k8s.gcr.io
    - name: etcdImageTag
      value: ""
    - name: coreDNSImageTag
      value: ""
    version: v1.23.3
    workers:
      machineDeployments:
      - class: default-worker
        name: md-0
        replicas: 3
```
#### Demo shell script
```shell
cat /tmp/pss/cluster-level-pss.yaml
sleep 2
printf "\nLet's apply this pod security standard to a cluster created with secure cluster class\n"
printf "\n\n\n"
kubectl apply -f capi-quickstart-secure.yaml
sleep 2
kubectl get cluster
kubectl get kubeadmcontrolplane
sleep 5
clusterctl describe cluster capi-quickstart-secure
clusterctl get kubeconfig capi-quickstart-secure > capi-quickstart-secure.kubeconfig
# Point the kubeconfig to the exposed port of the load balancer, rather than the inaccessible container IP.
sed -i -e "s/server:.*/server: https:\/\/$(docker port capi-quickstart-secure-lb 6443/tcp | sed "s/0.0.0.0/127.0.0.1/")/g" ./capi-quickstart-secure.kubeconfig
sleep 5
kubectl apply --kubeconfig=./capi-quickstart-secure.kubeconfig \
  -f https://docs.projectcalico.org/v3.21/manifests/calico.yaml
kubectl get --kubeconfig=./capi-quickstart-secure.kubeconfig nodes
cat <<EOF > /tmp/pss/nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
EOF
echo "Let's wait for cluster to get ready"
sleep 10
kubectl get --kubeconfig=./capi-quickstart-secure.kubeconfig nodes
printf "\n\n\n"
echo "Creating pod that will throw warning on restricted pod security standard"
cat /tmp/pss/nginx-pod.yaml
printf "\n\n\n"
kubectl apply --kubeconfig=./capi-quickstart-secure.kubeconfig -f /tmp/pss/nginx-pod.yaml
sleep 2
printf "\nYay, as expected, during pod creation, restricted pod security standard threw a warning but it passed the baseline pod security standard\n\n"
printf "\n\n-------H A P P Y-------H O N K I N G----------\n\n \n"
sleep 2
echo "Clean up"
kubectl delete cluster capi-quickstart-secure
```
#### Pre-requisites
##### Cluster level Pod Security Admission configuration
Content of `/tmp/pss/cluster-level-pss.yaml`
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]
```
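
Because the control plane machine mounts this file from the host, applying the cluster manifest before the file exists is an easy mistake to make. A minimal preflight sketch follows, assuming the `/tmp/pss/cluster-level-pss.yaml` path used in this document; `preflight_pss_file` is a hypothetical helper name, and the checks are plain `grep` matches rather than real YAML validation.

```shell
#!/bin/sh
# Hypothetical preflight check: confirm the admission configuration file is
# readable and looks like a PodSecurity AdmissionConfiguration.
# $1: path to the file (assumed default: the path used in this document).
preflight_pss_file() {
  path="${1:-/tmp/pss/cluster-level-pss.yaml}"
  if [ ! -r "$path" ]; then
    echo "missing or unreadable: $path" >&2
    return 1
  fi
  if ! grep -q '^kind: AdmissionConfiguration' "$path"; then
    echo "$path is not an AdmissionConfiguration" >&2
    return 1
  fi
  if ! grep -q 'name: PodSecurity' "$path"; then
    echo "$path does not configure the PodSecurity plugin" >&2
    return 1
  fi
  echo "ok: $path"
}
```

One way to use it is to gate the apply step, for example `preflight_pss_file && kubectl apply -f capi-quickstart-secure.yaml`.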
##### Example pod yaml
This pod passes the baseline pod security standard but triggers a warning under restricted.

Content of `/tmp/pss/nginx-pod.yaml`
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
```
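
To complement the pod above, the enforced baseline level can also be exercised with a pod that should be rejected. The sketch below is an assumption-laden illustration: the `nginx-privileged` pod and the `/tmp/pss/privileged-pod.yaml` path are made up for this example, and the kubeconfig path is the one produced by the demo script. It uses a server-side dry run so that admission is evaluated without persisting anything.

```shell
#!/bin/sh
# Illustrative only: a privileged container is not allowed by the baseline
# pod security standard, so this pod should be rejected at admission.
mkdir -p /tmp/pss
cat <<'EOF' > /tmp/pss/privileged-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-privileged
spec:
  containers:
  - image: nginx
    name: nginx
    securityContext:
      privileged: true
EOF

# Assumed kubeconfig location (written by the demo script in this document).
KUBECONFIG_FILE=./capi-quickstart-secure.kubeconfig
if [ -f "$KUBECONFIG_FILE" ]; then
  # --dry-run=server sends the request through admission without creating the pod.
  if kubectl apply --kubeconfig="$KUBECONFIG_FILE" --dry-run=server \
      -f /tmp/pss/privileged-pod.yaml; then
    echo "unexpected: privileged pod was admitted"
  else
    echo "apply failed; check the error above for a PodSecurity violation"
  fi
fi
```

The pod is created in the `default` namespace, which is not in the exemption list, so the cluster-level defaults apply to it.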