---
tags: psa
title: Pod-Security-Admission-Updates
authors:
- "@perdasilva"
reviewers:
- "@cmacedo"
- "@jlanford"
- "@bparees"
approvers:
- "@jlanford"
creation-date: 2022-07-12
last-updated: 2022-07-12
status: provisional
see-also:
replaces:
superseded-by:
---
# pod-security-admission-updates
## Release Signoff Checklist
- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
## Open Questions
## Summary
The Kubernetes API will deprecate [PodSecurityPolicy](https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/#what-is-podsecuritypolicy) in favor of [Pod Security Admission](https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/2579-psp-replacement) (PSA), which allows cluster admins to enforce [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) (PSS) through namespace labels. The PSS define three security policies of increasing restrictiveness. If a Pod does not meet the enforced security policy, it will not be scheduled. From OCP 4.12, the `restricted` (most restrictive) security policy will be enforced by default. This enhancement proposal (EP) proposes the changes to OLM workload specs (the controller and catalog source pods, and the bundle unpack job) needed to meet the `restricted` profile's requirements. Additionally, it proposes a new controller that relies on the [label syncer](https://github.com/openshift/enhancements/blob/master/enhancements/authentication/pod-security-admission-autolabeling.md) to ensure that operators shipped to OCP customers still work in OCP 4.12 without the need for author intervention.
## Motivation
The Kubernetes API, from v1.25, will deprecate [PodSecurityPolicy](https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/#what-is-podsecuritypolicy) in favor of [Pod Security Admission](https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/2579-psp-replacement): a new controller that enables cluster administrators to enforce [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) (PSS) via [namespace labels](https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-namespace-labels/). PSS offers three security policies in increasing level of restrictiveness, with `restricted`, as the name suggests, being the most restrictive. If a Pod does not meet the security policy enforced by the namespace, it will not be scheduled. From OpenShift 4.12, by default, namespaces will be labeled to enforce the `restricted` security policy. Therefore, OLM's workloads need to be updated to meet the `restricted` policy's criteria: specifically, the [OLM controller](https://github.com/openshift/operator-framework-olm/blob/master/manifests/0000_50_olm_07-olm-operator.deployment.yaml), [Catalog controller](https://github.com/openshift/operator-framework-olm/blob/master/manifests/0000_50_olm_08-catalog-operator.deployment.yaml), [CatalogSource pods](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L130), and [bundle unpack job](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/bundle/bundle_unpacker.go#L89). In the wake of these security changes, OLM will also need to ensure that the packages distributed and installed by OLM still work on OpenShift 4.12 without the need for author intervention, and that other tools and functions in the Operator Framework that make use of OLM or registry objects (for instance, `operator-sdk run bundle`) continue to work in a backwards-compatible way.
### Pod Security Admission
#### Namespace Labels
As described [here](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces), PSA provides the following modes:
| Mode | Description |
| -------- | -------- |
| enforce | Policy violations will cause the pod to be rejected |
| audit | Policy violations will trigger the addition of an audit annotation to the event recorded in the audit log, but are otherwise allowed |
| warn | Policy violations will trigger a user-facing warning, but are otherwise allowed |
For each mode, there are two labels that will govern PSA behavior in the namespace:
```yaml
# The per-mode level label indicates which policy level to apply for the mode.
#
# MODE must be one of `enforce`, `audit`, or `warn`.
# LEVEL must be one of `privileged`, `baseline`, or `restricted`.
pod-security.kubernetes.io/<MODE>: <LEVEL>
# Optional: per-mode version label that can be used to pin the policy to the
# version that shipped with a given Kubernetes minor version (for example v1.24).
#
# MODE must be one of `enforce`, `audit`, or `warn`.
# VERSION must be a valid Kubernetes minor version, or `latest`.
pod-security.kubernetes.io/<MODE>-version: <VERSION>
```
#### Required `securityContext` Changes
Below are the `pod.spec` `securityContext` changes that need to be made for workloads to conform to the `restricted` profile.
```yaml
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
    runAsNonRoot: true
  containers:
  - name: my-container
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
```
It should be noted that:
* `seccompProfile` is only available for Kubernetes v1.19+
* `runAsNonRoot` requires either that the container image define a user (via the `USER` directive in its Dockerfile), or that `runAsUser` be declared in the `securityContext`
#### Interaction with OpenShift Security Context Constraints (SCC)
To adjust to the PSA changes, OpenShift 4.11 introduces the `restricted-v2` [SCC](https://docs.openshift.com/container-platform/4.10/authentication/managing-security-context-constraints.html#default-sccs_configuring-internal-oauth). Permission to use this SCC is granted by default to all users on the cluster.
`restricted-v2` drops ALL capabilities, while the legacy `restricted` security constraint only dropped a subset of capabilities.
Another important aspect of `restricted-v2` is the [MustRunAsRange](https://docs.openshift.com/container-platform/4.10/authentication/managing-security-context-constraints.html#authorization-SCC-strategies_configuring-internal-oauth) constraint strategy, which requires `securityContext.runAsUser` to fall within a range defined by the SCC. However, if `securityContext.runAsUser` is omitted from the spec, a valid value (i.e. a user ID in the required range) will be injected.
Much like `securityContext.runAsUser`, when omitted, `securityContext.seccompProfile` will also be injected into the pod spec by SCC with a valid value.
#### Ensuring Workloads (still) Work
OCP 4.12 will include a [Pod Security Admission autolabeling controller](https://github.com/openshift/enhancements/blob/master/enhancements/authentication/pod-security-admission-autolabeling.md) that determines the right security profile to apply to a namespace based on the workloads it contains. It explicitly excludes all CVO-managed/standard OpenShift namespaces, but will sync any other `openshift-` namespaces that are explicitly labeled with `security.openshift.io/scc.podSecurityLabelSync=true`.
OLM operators are installed in `openshift-*` namespaces, including in some cases shared namespaces (e.g. `openshift-operators`). OLM will need to close this gap by adding an additional controller that labels `openshift-*` namespaces containing `ClusterServiceVersion` (CSV) resources.
### Goals
* Migrate OLM and all its workloads to the `restricted` PSS profile
* Ensure OLM and all its workloads fall under the `restricted-v2` OpenShift Security Context Constraints (SCC) profile
* Ensure both sqlite- and FBC-based catalogs work under the `restricted` PSS profile and `restricted-v2` SCC profile
* Ensure catalogs shipped by Red Hat for OpenShift 4.11 still work seamlessly (to support cluster upgrades)
* Provide a mechanism to ensure content shipped and installed by OLM works seamlessly on OpenShift 4.12 without the need for author or cluster admin intervention
* Ensure operator-framework tools still work
* Provide escape-hatches and migration instructions to users in cases where their workloads, especially older catalog sources, might not be runnable under the `restricted` PSS and `restricted-v2` SCC profiles
### Non-Goals
## Proposal
### Upstream
#### OLM Namespace
Update the [OLM namespace](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/chart/templates/0000_50_olm_00-namespace.yaml) to include the `pod-security.kubernetes.io/enforce=restricted` and `pod-security.kubernetes.io/enforce-version=latest` labels. Pinning the version to `latest` will break CI when graduating to future versions of Kubernetes, acting as a forcing function to make any necessary updates.
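As an illustrative sketch (assuming the upstream `olm` namespace name; the exact manifest lives in the OLM chart), the labeled namespace would look like:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: olm
  labels:
    # enforce the restricted PSS profile, pinned to the latest policy version
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
```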
#### OLM Images
Add `USER 1001` to [Dockerfile](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Dockerfile) and [Dockerfile.goreleaser](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Dockerfile.goreleaser)
Adding a user to the images means `runAsUser` will not need to be set, and makes downstreaming easier.
#### OLM Controllers
Update the [OLM controller](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/chart/templates/0000_50_olm_07-olm-operator.deployment.yaml) and [Catalog controller](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/chart/templates/0000_50_olm_08-catalog-operator.deployment.yaml) deployments' `pod.spec`:
```yaml
securityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
containers:
- name: container
  securityContext:
    privileged: false
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
```
#### CatalogSource Pods
The [`CatalogSource` pod](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L130) runs with the [`opm`](https://github.com/operator-framework/operator-registry) image that was current at the time the index was built. This can be one of two images: `quay.io/operator-framework/opm:latest` (most recent) or `quay.io/operator-framework/upstream-opm-builder:latest` (legacy). Furthermore, we currently support two types of catalogs: sqlite-based (legacy) and File-Based Catalogs (FBC) (current). From OCP 4.11, the catalogs shipped with OpenShift are FBC-based. A bug was recently discovered in sqlite-based catalogs, where a copy of the sqlite database was made at the root of the file system (`/`). When running as non-root, the process fails with `permission denied` when attempting to create this copy. The bug has since been fixed, but it will still plague any sqlite-based catalog built against an opm image that does not contain the fix. There is no way to resolve this issue within the `restricted` profile, since it would require the pod to run as the root user. In order to support these catalogs, a toggle will need to be added to the `CatalogSource` API (`spec.disableSecurity`) to allow users to switch off the new security settings. Users will still need the target namespace to support the PSA `baseline` or `privileged` security profiles.
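As a sketch, the proposed toggle would be used as follows (the `disableSecurity` field name is this proposal's suggestion and the image reference is hypothetical; the final API shape may differ):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: legacy-sqlite-catalog
  # the target namespace must allow the PSA baseline or privileged profile
  namespace: my-namespace
spec:
  sourceType: grpc
  # a sqlite-based index built against an older opm image
  image: example.com/my/legacy-index:v1
  # proposed toggle: opt out of the restricted securityContext settings
  disableSecurity: true
```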
The [`CatalogSource` pod](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L130) spec will need to be updated as follows:
```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1001
  seccompProfile:
    type: RuntimeDefault
containers:
- name: container
  securityContext:
    privileged: false
    # must be false to allow sqlite-based catalogs to create
    # a copy of the database
    readOnlyRootFilesystem: false
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
```
Note that `runAsUser` will need to be specified here in order to support older catalogs (whether sqlite- or FBC-based), since those `opm` images do not define a `USER`.
#### Bundle Unpack Job
The [Bundle Unpack job](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/bundle/bundle_unpacker.go#L89) is composed of the OLM, opm, and bundle (to be unpacked) images. While we can guarantee that the OLM and opm images are built with a `USER`, we cannot make the same guarantee for the bundle image. Therefore, `runAsUser` will also need to be included in the `securityContext`:
```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1001
  seccompProfile:
    type: RuntimeDefault
# for each container, incl. init containers
containers:
- name: container
  securityContext:
    privileged: false
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
```
Note: we still [reference](https://github.com/operator-framework/operator-lifecycle-manager/blob/83e3ebf96856643286e2e3e9438352c975923858/cmd/catalog/main.go#L30) `quay.io/operator-framework/upstream-opm-builder:latest` as the opm image used by the catalog source controller. We may also want to update this to `quay.io/operator-framework/opm:latest`.
#### operator-sdk run bundle
It is worth calling out that the pod backing `operator-sdk run bundle` for sqlite-based catalogs has already been [updated](https://github.com/operator-framework/operator-sdk/blame/master/internal/olm/operator/registry/index/registry_pod.go#L222) to `runAsNonRoot = false` and `runAsUser = 0`. This should ensure all legacy catalogs run without problems (as long as the namespace where the pod gets deployed does not enforce the `restricted` profile). Still, it may be worthwhile to print a warning explaining the risks and linking to the appropriate migration documentation.
### Downstream
#### OLM
For OLM, the same changes should be made in the downstream as described for the upstream with the following differences:
* Add the `openshift.io/scc: "restricted-v2"` label to the `openshift-operator-lifecycle-manager` and `openshift-operators` namespaces
* Remove `runAsUser` and `seccompProfile` from the `securityContext`s of the controller deployments and the CatalogSource and bundle unpack pods - SCC will inject these values into the pods
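Assuming SCC injects `runAsUser` and `seccompProfile` as described above, the downstream pod specs would reduce to a sketch along these lines:

```yaml
securityContext:
  runAsNonRoot: true
  # runAsUser and seccompProfile omitted: injected by the restricted-v2 SCC
containers:
- name: container
  securityContext:
    privileged: false
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
```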
#### operator-marketplace
Update the [`operator-marketplace`](https://github.com/operator-framework/operator-marketplace/blob/master/manifests/01_namespace.yaml) namespace with the `openshift.io/scc: "restricted-v2"` label, and remove the `pod-security.kubernetes.io/audit: baseline` and `pod-security.kubernetes.io/warn: baseline` labels.
#### Operator Workload Support: OLM Auto Labeler Controller
Operators distributed and installed by OLM need to continue working in OpenShift 4.12 without author intervention (i.e. without updating their resources for the PSA changes). The [Pod Security Admission autolabeling controller](https://github.com/openshift/enhancements/blob/master/enhancements/authentication/pod-security-admission-autolabeling.md) will monitor namespaces whose name is *not* prefixed with `openshift-`, as well as namespaces carrying the `security.openshift.io/scc.podSecurityLabelSync=true` label, to determine the right PSA profile to allow the workloads to be admitted. This is a problem for OLM operators, as many reside in the shared `openshift-operators` namespace or have a suggested namespace prefixed with `openshift-`.
To bridge this gap, OLM will ship with an additional controller that will label `openshift-*` namespaces that contain operators (i.e. ClusterServiceVersion resources) with the `security.openshift.io/scc.podSecurityLabelSync=true` label *unless* the namespace has the `security.openshift.io/scc.podSecurityLabelSync=false` label.
The controller will watch `Namespace` and `ClusterServiceVersion` resources on the cluster and act on `Namespace` change, `ClusterServiceVersion` created, and `ClusterServiceVersion` deleted events. The controller cache will only watch `ClusterServiceVersion` resources that do *not* carry the `olm.copiedFrom` label (i.e. `oc get csv --all-namespaces -l "!olm.copiedFrom"`) and will filter out namespaces whose name is not prefixed with `openshift-`.
##### Business Logic
The controller must keep track of the namespaces it manages, since an admin could manually set the `security.openshift.io/scc.podSecurityLabelSync` label on particular `openshift-*` namespaces, and the controller should not override the admin. The controller must label unmanaged `openshift-*` namespaces with `security.openshift.io/scc.podSecurityLabelSync=true` and reset the label to `true` if it is manually changed. To indicate that a namespace is managed by the OLM Auto Labeler, the annotation `olm.auto-labeler.managed=true` will be applied to the namespace. The admin can opt out of management by the OLM Auto Labeler by either removing the annotation or setting it to `false`. To summarize, the OLM Auto Labeler will manage namespaces prefixed with `openshift-` that are either annotated with `olm.auto-labeler.managed=true` or do not carry the `security.openshift.io/scc.podSecurityLabelSync` label at all (regardless of value) - always ensuring these namespaces carry the `security.openshift.io/scc.podSecurityLabelSync=true` label.
The following pseudo-code illustrates the controller logic:
```
switch event:
  case csv-created:
    # label and annotate an unmanaged ns if it is not already labeled
    if(!hasLabel(ns, "scc.podSecurityLabelSync") && !hasAnnotation(ns, "olm.auto-labeler.managed", false)):
      labelNs(ns, "scc.podSecurityLabelSync=true")
      annotateNs(ns, "olm.auto-labeler.managed=true")
  case csv-deleted:
    # remove the label and annotation from a managed ns that no longer has csvs
    if(hasAnnotation(ns, "olm.auto-labeler.managed", true) && countCSVs(ns) == 0):
      removeLabel(ns, "scc.podSecurityLabelSync")
      removeAnnotation(ns, "olm.auto-labeler.managed")
  case ns-changed:
    # correct the label on a ns managed by this controller, if necessary
    if(hasAnnotation(ns, "olm.auto-labeler.managed", true) && !hasLabelWithValue(ns, "scc.podSecurityLabelSync", "true")):
      labelNs(ns, "scc.podSecurityLabelSync=true")
  default:
    return
```
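For example, an `openshift-*` namespace under the Auto Labeler's management would end up with metadata like the following (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-my-operator
  labels:
    # applied and enforced by the OLM Auto Labeler
    security.openshift.io/scc.podSecurityLabelSync: "true"
  annotations:
    # marks the namespace as managed by the OLM Auto Labeler
    olm.auto-labeler.managed: "true"
```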
##### RBAC requirements
The controller must be able to:
1. List all `Namespace` resources
2. Edit all `Namespace` resources
3. Get `Namespace` resources
4. List all `ClusterServiceVersion` resources
5. Get `ClusterServiceVersion` resources
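These requirements could be expressed as a `ClusterRole` along these lines (a sketch; the role name is hypothetical, and `watch` is included since the controller watches both resource types):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: olm-auto-labeler
rules:
# get, list, watch, and edit Namespace resources
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "list", "watch", "update", "patch"]
# get, list, and watch ClusterServiceVersion resources
- apiGroups: ["operators.coreos.com"]
  resources: ["clusterserviceversions"]
  verbs: ["get", "list", "watch"]
```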
### User Stories [optional]
Detail the things that people will be able to do if this is implemented.
Include as much detail as possible so that people can understand the "how" of
the system. The goal here is to make this feel real for users without getting
bogged down.
#### Story 1
#### Story 2
### Implementation Details/Notes/Constraints [optional]
#### SCC and PSA
Openshift's [Security Context Constraints](https://docs.openshift.com/container-platform/3.11/admin_guide/manage_scc.html) (SCC) mutates pods to conform to the most permissive SCC profile available to the service account creating the pod - more information can be found [here](https://docs.openshift.com/container-platform/4.10/authentication/managing-security-context-constraints.html#admission_configuring-internal-oauth).
#### SQLite-based Catalog Source Issues
What are the caveats to the implementation? What are some important details that
didn't come across above. Go in to as much detail as necessary here. This might
be a good place to talk about core concepts and how they relate.
### Risks and Mitigations
What are the risks of this proposal and how do we mitigate. Think broadly. For
example, consider both security and how this will impact the larger Operator Framework
ecosystem.
How will security be reviewed and by whom? How will UX be reviewed and by whom?
Consider including folks that also work outside your immediate sub-project.
## Design Details
### Test Plan
**Note:** *Section not required until targeted at a release.*
Consider the following in developing a test plan for this enhancement:
- Will there be e2e and integration tests, in addition to unit tests?
- How will it be tested in isolation vs with other components?
No need to outline all of the test cases, just the general strategy. Anything
that would count as tricky in the implementation and anything particularly
challenging to test should be called out.
All code is expected to have adequate tests (eventually with coverage
expectations).
### Graduation Criteria
**Note:** *Section not required until targeted at a release.*
Define graduation milestones.
These may be defined in terms of API maturity, or as something else. Initial proposal
should keep this high-level with a focus on what signals will be looked at to
determine graduation.
Consider the following in developing the graduation criteria for this
enhancement:
- Maturity levels - `Dev Preview`, `Tech Preview`, `GA`
- Deprecation
Clearly define what graduation means.
#### Examples
These are generalized examples to consider, in addition to the aforementioned
[maturity levels][maturity-levels].
##### Dev Preview -> Tech Preview
- Ability to utilize the enhancement end to end
- End user documentation, relative API stability
- Sufficient test coverage
- Gather feedback from users rather than just developers
##### Tech Preview -> GA
- More testing (upgrade, downgrade, scale)
- Sufficient time for feedback
- Available by default
**For non-optional features moving to GA, the graduation criteria must include
end to end tests.**
##### Removing a deprecated feature
- Announce deprecation and support policy of the existing feature
- Deprecate the feature
### Upgrade / Downgrade Strategy
If applicable, how will the component be upgraded and downgraded? Make sure this
is in the test plan.
Consider the following in developing an upgrade/downgrade strategy for this
enhancement:
- What changes (in invocations, configurations, API use, etc.) is an existing
cluster required to make on upgrade in order to keep previous behavior?
- What changes (in invocations, configurations, API use, etc.) is an existing
cluster required to make on upgrade in order to make use of the enhancement?
### Version Skew Strategy
How will the component handle version skew with other components?
What are the guarantees? Make sure this is in the test plan.
Consider the following in developing a version skew strategy for this
enhancement:
- During an upgrade, we will always have skew among components, how will this impact your work?
- Does this enhancement involve coordinating behavior in the control plane and
in the kubelet? How does an n-2 kubelet without this feature available behave
when this feature is used?
- Will any other components on the node change? For example, changes to CSI, CRI
or CNI may require updating that component before the kubelet.
## Implementation History
Major milestones in the life cycle of a proposal should be tracked in `Implementation
History`.
## Drawbacks
The idea is to find the best form of an argument why this enhancement should _not_ be implemented.
## Alternatives
Similar to the `Drawbacks` section the `Alternatives` section is used to
highlight and record other possible approaches to delivering the value proposed
by an enhancement.
## Infrastructure Needed [optional]
Use this section if you need things from the project. Examples include a new
subproject, repos requested, github details, and/or testing infrastructure.
Listing these here allows the community to get the process for these resources
started right away.