# Secure Cluster Class for Cluster API

Create a secure cluster class that allows end users to spin up "secure by default" clusters with sane, configurable defaults.

## Motivation

1. It creates a single place to add other security features.
1. It mitigates some threats from the [CAPI Security assessment](https://github.com/kubernetes/sig-security/pull/40).
1. It helps with compliance guidelines like [NSA/CISA k8s hardening](https://www.cisa.gov/uscert/ncas/current-activity/2022/03/15/updated-kubernetes-hardening-guide).

### Goals

The MVP (Minimum Viable Product) goal is to support [pod security admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) with the [baseline pod security standard](https://kubernetes.io/docs/concepts/security/pod-security-standards/#baseline) enforced at the cluster level.

### Non-Goals/Future Work

Post-MVP work can include features like, but not limited to, support for:

- [Seccomp](https://kubernetes.io/blog/2021/08/25/seccomp-default/)
- [AppArmor](https://github.com/kubernetes/enhancements/pull/1444)
- [User namespaces](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/127-user-namespaces)

Any complex Pod Security configurations that are *not* supported by the built-in pod security admission controller are out of scope.

## Proposal

To enable pod security admission with the baseline pod security standard at the cluster level, the API server needs to be passed an `extraArgs` parameter that points to an `AdmissionConfiguration` file. This file defines the cluster-level pod security standard with exemptions. It needs to be present on the control plane nodes where the API server is running.
If the API server runs as a pod, this file needs to be mounted from the host into the pod or generated within the pod before the API server binary is executed.

An example of a ClusterClass configuration for Cluster API Provider Docker can be found [here](#example-cluster-class-configuration).

To auto-generate this file, we need to add a new feature to `clusterctl generate` that takes these parameters as input:

- Pod Security Standard name (restricted, baseline, privileged)
- Applicable mode (enforce, warn, audit)
- Applicable version (usually the default)

A few possible CLI UX options are as follows:

### Secure with sane defaults

```shell
clusterctl generate cluster capi-quickstart --flavor development-topology \
  --secure pod-security --kubernetes-version v1.23.3 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart-secure-default.yaml
```

By default, `clusterctl` will enforce the `baseline` pod security standard, `audit` and `warn` on `restricted`, exempt the `kube-system` namespace, and default `version` to `latest`.

### Secure with configurable defaults

```shell
clusterctl generate cluster capi-quickstart --flavor development-topology \
  --secure pod-security=baseline:enforce,restricted:warn,restricted:audit --kubernetes-version v1.23.3 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart-secure-configured.yaml
```

This format still needs to be confirmed for OpenAPI schema compatibility.

### Secure with configurable defaults via environment variable

```shell
# Keyed by mode so that no JSON key is repeated.
POD_SECURITY_CONFIG='{"enforce":"baseline","warn":"restricted","audit":"restricted"}' \
clusterctl generate cluster capi-quickstart --flavor development-topology \
  --secure pod-security --kubernetes-version v1.23.3 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart-secure-env-configured.yaml
```

The outcome of any of the above UX options would be the generation of the `cluster-level-pss.yaml` file, which must be accessible to the API server during startup.
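To make the generation step concrete, here is a minimal sketch of how a `level:mode` list could be turned into the `AdmissionConfiguration` file. It is not part of `clusterctl`: the function name `render_pss_config` and its argument layout are hypothetical, and the defaults simply mirror the ones described above (enforce `baseline`, audit and warn on `restricted`, version `latest`, `kube-system` exempt).

```shell
# Hypothetical helper, not an existing clusterctl feature.
# $1: comma-separated "level:mode" pairs, e.g. "baseline:enforce,restricted:warn"
# $2: pod security version to pin (defaults to "latest")
# The body runs in a subshell, so IFS, set -f and exit stay local to the call.
render_pss_config() (
  spec="${1:-}"
  version="${2:-latest}"

  # Defaults taken from the proposal text.
  enforce="baseline"
  audit="restricted"
  warn="restricted"

  IFS=','   # split the spec on commas
  set -f    # disable globbing while splitting
  for pair in $spec; do
    level="${pair%%:*}"
    mode="${pair#*:}"
    case "$level" in
      privileged|baseline|restricted) ;;
      *) echo "unknown pod security standard: $level" >&2; exit 1 ;;
    esac
    case "$mode" in
      enforce) enforce="$level" ;;
      audit)   audit="$level" ;;
      warn)    warn="$level" ;;
      *) echo "unknown mode: $mode" >&2; exit 1 ;;
    esac
  done

  cat <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "${enforce}"
      enforce-version: "${version}"
      audit: "${audit}"
      audit-version: "${version}"
      warn: "${warn}"
      warn-version: "${version}"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]
EOF
)
```

For example, `render_pss_config "baseline:enforce,restricted:warn,restricted:audit" > /tmp/pss/cluster-level-pss.yaml` would write a file equivalent to the defaults. Unknown levels or modes make the function fail rather than silently fall back, which seems the safer choice for a security setting.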

### Implementation Details/Notes/Constraints

#### Example Cluster Class configuration

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start-secure
  namespace: default
spec:
  controlPlane:
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: quick-start-secure-control-plane
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: quick-start-secure-control-plane
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      name: quick-start-secure-cluster
  patches:
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/imageRepository
        valueFrom:
          variable: imageRepository
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets the imageRepository used for the KubeadmControlPlane.
    name: imageRepository
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/etcd
        valueFrom:
          template: |
            local:
              imageTag: {{ .etcdImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the etcd image in the KubeadmControlPlane.
    name: etcdImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/dns
        valueFrom:
          template: |
            imageTag: {{ .coreDNSImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the CoreDNS image in the KubeadmControlPlane.
    name: coreDNSImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.machineDeployment.version }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          machineDeploymentClass:
            names:
            - default-worker
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.controlPlane.version }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          controlPlane: true
    description: Sets the container image that is used for running dockerMachines for the controlPlane and default-worker machineDeployments.
    name: customImage
  variables:
  - name: imageRepository
    required: true
    schema:
      openAPIV3Schema:
        default: k8s.gcr.io
        description: imageRepository sets the container registry to pull images from. If empty, `k8s.gcr.io` will be used by default.
        example: k8s.gcr.io
        type: string
  - name: etcdImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: etcdImageTag sets the tag for the etcd image.
        example: 3.5.1-0
        type: string
  - name: coreDNSImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: coreDNSImageTag sets the tag for the coreDNS image.
        example: v1.8.5
        type: string
  workers:
    machineDeployments:
    - class: default-worker
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: quick-start-secure-default-worker-bootstraptemplate
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: quick-start-secure-default-worker-machinetemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerClusterTemplate
metadata:
  name: quick-start-secure-cluster
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlaneTemplate
metadata:
  name: quick-start-secure-control-plane
  namespace: default
spec:
  template:
    spec:
      kubeadmConfigSpec:
        clusterConfiguration:
          apiServer:
            certSANs:
            - localhost
            - 127.0.0.1
            - 0.0.0.0
            extraArgs:
              admission-control-config-file: /etc/config/cluster-level-pss.yaml
            extraVolumes:
            - name: accf
              hostPath: /etc/config
              mountPath: /etc/config
              readOnly: false
              pathType: "DirectoryOrCreate"
          controllerManager:
            extraArgs:
              enable-hostpath-provisioner: "true"
        initConfiguration:
          nodeRegistration:
            criSocket: /var/run/containerd/containerd.sock
            kubeletExtraArgs:
              cgroup-driver: cgroupfs
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
        joinConfiguration:
          nodeRegistration:
            criSocket: /var/run/containerd/containerd.sock
            kubeletExtraArgs:
              cgroup-driver: cgroupfs
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: quick-start-secure-control-plane
  namespace: default
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
      - containerPath: /etc/config/cluster-level-pss.yaml
        hostPath: /tmp/pss/cluster-level-pss.yaml
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: quick-start-secure-default-worker-machinetemplate
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: quick-start-secure-default-worker-bootstraptemplate
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cgroup-driver: cgroupfs
            eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: capi-quickstart-secure
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    serviceDomain: k8s.test
    services:
      cidrBlocks:
      - 10.96.0.0/12
  topology:
    class: quick-start-secure
    controlPlane:
      metadata: {}
      replicas: 3
    variables:
    - name: imageRepository
      value: k8s.gcr.io
    - name: etcdImageTag
      value: ""
    - name: coreDNSImageTag
      value: ""
    version: v1.23.3
    workers:
      machineDeployments:
      - class: default-worker
        name: md-0
        replicas: 3
```

#### Demo shell script

```shell
# Show the cluster-level pod security configuration that will be mounted.
cat /tmp/pss/cluster-level-pss.yaml
sleep 2
printf "\nLet's apply this pod security standard to a cluster created with the secure cluster class\n\n\n"
kubectl apply -f capi-quickstart-secure.yaml
sleep 2
kubectl get cluster
kubectl get kubeadmcontrolplane
sleep 5
clusterctl describe cluster capi-quickstart-secure
clusterctl get kubeconfig capi-quickstart-secure > capi-quickstart-secure.kubeconfig
# Point the kubeconfig to the exposed port of the load balancer, rather than the inaccessible container IP.
sed -i -e "s/server:.*/server: https:\/\/$(docker port capi-quickstart-secure-lb 6443/tcp | sed "s/0.0.0.0/127.0.0.1/")/g" ./capi-quickstart-secure.kubeconfig
sleep 5
# Install a CNI so that the nodes can become Ready.
kubectl apply --kubeconfig=./capi-quickstart-secure.kubeconfig \
  -f https://docs.projectcalico.org/v3.21/manifests/calico.yaml
kubectl get --kubeconfig=./capi-quickstart-secure.kubeconfig nodes
cat <<EOF > /tmp/pss/nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
EOF
echo "Let's wait for the cluster to get ready"
sleep 10
kubectl get --kubeconfig=./capi-quickstart-secure.kubeconfig nodes
printf "\n\n"
echo "Creating a pod that will throw a warning on the restricted pod security standard"
cat /tmp/pss/nginx-pod.yaml
printf "\n\n"
kubectl apply --kubeconfig=./capi-quickstart-secure.kubeconfig -f /tmp/pss/nginx-pod.yaml
sleep 2
printf "\nYay, as expected, during pod creation, the restricted pod security standard threw a warning but the pod passed the baseline pod security standard\n\n"
printf "\n\n-------H A P P Y-------H O N K I N G----------\n\n"
sleep 2
echo "Clean up"
kubectl delete cluster capi-quickstart-secure
```

#### Pre-requisites

##### Cluster level Pod Security Admission configuration

Content of `/tmp/pss/cluster-level-pss.yaml`:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]
```

##### Example pod yaml

This pod passes the baseline pod security standard but triggers a warning on the restricted one.

Content of `/tmp/pss/nginx-pod.yaml`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
```