---
title: Descoping Plan
authors:
  - "@ecordell"
reviewers:
  - TBD
approvers:
  - TBD
creation-date: 2022-07-28
last-updated: 2022-07-28
status: provisional
tags: Descoped Operators
---

# Descoping Plan

1. Background / why this is necessary
2. A brief description of the end state
3. Per-release transition plan
4. Operator Patterns to replace "scoping"

## Definitions

**Descoped Operator**: An operator in a cluster that is expected to be the sole owner of an API in a cluster.

**Scoped Operator**: An operator that has been installed under a previous version of OLM and includes scoping assumptions (including metadata like `installModes`).

## Background

### History

When OLM was first written, CRDs defined only the existence of a `GVK` in a cluster. Operators developed for OLM could only be installed in a namespace, watching that namespace - this delivered on the self-service, operational-encoding story of operators. The same operator could be installed in every namespace of a cluster.

Privilege escalation became a concern - since operators run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.

At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had many ways to interfere with one another if they shared the same CRD. OLM also expanded to support `APIService`s in addition to operators based on CRDs, and so required a notion of cluster-wide operators.

To address these concerns, a notion of `scoping` operators was introduced via the `OperatorGroup` object. An `OperatorGroup` specifies a set of namespaces within a cluster in which all installed operators share the same scope. OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.

### Problem

But `OperatorGroups` do not alter the fundamental problem: APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user that wishes to see them. Even operators that agree on a particular GVK may have differences of opinion in how those objects should be admitted to a cluster, or how conversion between API versions should happen.

With Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they're in the same cluster or not. But re-using operators compounds the scoping problems within a cluster - it increases the likelihood that more than one "opinion" about an API exists in the cluster.

For these reasons we are looking to entirely remove the notion of scoping from OLM.

### What does this mean?

It means that (in the near future) for any operator installed via OLM, we expect that:

- If an operator provides an API, it provides it for the entire cluster.
- At a minimum, it should write into the `status` of that API (if the API has a `status` section, as most do).
- Only one operator will provide that API in a cluster, unless an administrator explicitly allows shared ownership (e.g. ingress controllers).
It does **not** mean that:

- Every operator needs to have permission to do its job in every namespace
- Every user in a cluster needs to have permission to use the operator's APIs
- Only one controller pod needs to run for that API in a cluster
- Only one controller can be installed to manage an API (e.g. ingress-style)

If you are an operator author and the above statements are concerning, please review the [Operator Patterns](#operator-patterns) section for suggestions on how to achieve your goals in a descoped world.

## End State

For the most part, the final state for de-scoping will be a transition away from namespace-scoped APIs like `Subscription`, `InstallPlan`, and `ClusterServiceVersion` to a different set of cluster-scoped APIs like `Operator` and `Install`. These newer APIs avoid much of the complexity introduced by scoping, and are already in progress. The `Operator` API is available as a read-only API in 4.6.

The document fully describing this end state is the [Simplify OLM APIs](https://github.com/operator-framework/enhancements/blob/master/enhancements/simplify-olm-apis.md) enhancement. **Note: the enhancement is currently pending an update to call out the scoping issues.** The only change resulting from the decision to de-scope is that the new APIs will always assume operators are descoped.

At some point, it is likely that OLM will introduce some namespaced APIs again for the installation of non-operator content. But this will be accompanied by its own enhancement.

## Transition Plan

### RBAC

Operators may still request the cluster- and namespace-scoped permissions they need to run within their installation namespace at install time (i.e. today, via `clusterPermissions` and `permissions`). But unlike with scoped operators, the namespace-scoped permissions do not get copied to a pre-defined set of namespaces, and no bindings are created by default.

De-scoped operators need to be concerned about RBAC in two general areas:

#### 1 - Access to do work in a particular namespace

A descoped operator will not be allowed to install any bindings - this is the job of an administrator. Operators will provide a set of `ClusterRoles` for the work that they need to do.

For example, an operator with service account `foo-sa` might provide this `ClusterRole`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  generateName: foo-sa-required-perms
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
```

An administrator can bind this with a `ClusterRoleBinding` if they want the operator to be able to perform its tasks in all namespaces. Or, they may create a `RoleBinding` in each desired namespace that binds the operator's `ServiceAccount` to this role.
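To make the second option concrete, here is a minimal sketch of such a `RoleBinding`, assuming the operator's `ServiceAccount` lives in a hypothetical `foo-operator` namespace and the generated `ClusterRole` name resolved to `foo-sa-required-perms` (with `generateName`, the real name would carry a generated suffix):

```yaml
# Sketch: grant foo-sa its required permissions in the "team-a" namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foo-sa-required-perms
  namespace: team-a                 # the namespace being opened up to the operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole                 # referencing a ClusterRole from a RoleBinding grants it only within team-a
  name: foo-sa-required-perms       # hypothetical resolved name of the generated ClusterRole
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: foo-operator           # the operator's installation namespace (assumed)
```

Deleting this `RoleBinding` is then an auditable way to revoke the operator's access to `team-a` without touching the operator itself.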
An operator author may provide more or less granular `ClusterRoles`. Most operators likely just need one per service account, but others may wish to provide granular, feature-based `ClusterRoles` so that an administrator can enable/disable portions of the operator's functionality.

A convention will be established: a single `ClusterRole` for a `ServiceAccount` will be assumed to be required for the operator to operate. Multiple `ClusterRoles` will be considered optional, such that creating/deleting them will enable/disable certain aspects of the operator.

One exception: De-scoped operators are always expected to have `read,update /status` permission on the APIs they own in all namespaces. This is to ensure that the operator has a communication channel with users (to communicate, for example, that they do not have the proper permission to do work in a particular namespace). OLM will raise alerts when there are CRs in a cluster with no controller capable of updating their status.

Note: This is a degradation of the install experience, see below for the interim solution.

#### 2 - Granting a user permission to use the operator's provided APIs

For scoped operators, OLM automatically generates ClusterRoles and automatically aggregates them to the default `admin`, `edit`, and `view` ClusterRoles, based on the availability of an operator in a particular namespace.

For de-scoped operators, OLM will leave this to the administrator.

Note: This is a degradation of the install experience, see below for the interim solution.
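Until then, an administrator (or external tooling) can accomplish the same thing with standard Kubernetes role aggregation. A minimal sketch, assuming the operator provides a hypothetical `foos.example.com` API:

```yaml
# Sketch: let every user who holds the default "edit" role (and, via aggregation,
# "admin") work with the operator's hypothetical foos.example.com API.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-edit
  labels:
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
- apiGroups: ["example.com"]
  resources: ["foos"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

A read-only counterpart labeled `rbac.authorization.k8s.io/aggregate-to-view: "true"` would do the same for `view`.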
#### OperatorGroups

The above changes for de-scoped operators, without any additional tooling, degrade the operator installation experience. There is no declarative way to indicate that an operator should be permitted to work in a set of namespaces (it becomes a two-step process: install, bind).

For now, we will repurpose `OperatorGroups` for RBAC management. An `OperatorGroup` will not be required for the installation of a de-scoped operator (as they are today for scoped operators). If a de-scoped operator is installed in an `OperatorGroup`, the namespace list on the `OperatorGroup` is used to determine which namespaces will get automatic bindings for the operator's service accounts, and OLM will generate API access RBAC roles and aggregate them to `view`, `edit`, and `admin`.

This should be considered an interim solution. A more configurable and more supported API may look something like [RBACManager](https://github.com/FairwindsOps/rbac-manager#dynamic-namespaces-and-labels).
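Under this interim scheme, opting an operator into two namespaces might look like the following sketch (all names hypothetical):

```yaml
# Sketch: the namespace list drives automatic bindings for the operator's
# service accounts and the generated view/edit/admin aggregation roles.
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: foo-group
  namespace: foo-operator    # the operator's installation namespace (assumed)
spec:
  targetNamespaces:
  - team-a
  - team-b
```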
### Determining Scope

Since CSV-less bundles are not yet available, we will indicate that an operator is descoped by marking `installModes` as optional. Any CSV with an empty `installModes` block will be treated as a de-scoped operator.

Any CSV with an `installModes` block that supports AllNamespace mode **only** will be treated as a de-scoped operator that provides APIs to all users by default (i.e. OLM will generate an AllNamespace OperatorGroup for it).
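As a sketch of the first convention, the relevant fragment of a de-scoped operator's CSV would look like this (name hypothetical, unrelated fields omitted):

```yaml
# Sketch: an empty installModes block marks this CSV as a de-scoped operator.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: foo-operator.v1.0.0
spec:
  installModes: []
```

A CSV whose `installModes` list marks only the AllNamespace mode as supported would instead fall into the second case and receive the auto-generated AllNamespace `OperatorGroup`.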
### Operator Visibility

The `Operator` API is a cluster-scoped API to reflect the cluster-scoped nature of API extensions. Users of the provided APIs can tell that they exist via discovery, but admins may be hesitant to grant `read` on the `Operator` API itself to learn more about the operator and its services.

Discovery of available operators for UIs will flow through API discovery:

1. Query to discover which APIs are readable/writable by the current user.
1. Query discovery for those APIs.
1. Read the schema for each API. The top-level description of the API resource should include: the name of the owning operator, a description of the owning operator, along with links, descriptors, and other metadata.
1. The UI groups APIs based on the operator identity in their schema descriptions.

TODO: this will not work well for APIs shared between operators, and makes it easier to hit etcd key limits.

### Dependency Resolution / Updates

Dependency resolution will continue to take place at the namespace scope for operators installed via `Subscriptions`. Any operators installed this way will have an `Operator` object created automatically for visibility, but the `Operator` may begin to emit warnings about the scoped nature of the installation.

Resolution will also take place among de-scoped operators at the cluster scope. Note that resolution or updates of de-scoped operators can be blocked by issues with scoped operators.

### Updating from a scoped operator to a descoped operator

TODO

### Installation-specific migration steps

#### Single Install of an Operator

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

- Remove the namespace suffix from the existing generated `Operator` object (i.e. operatorname-ns-foo -> operatorname).
- Emit a warning message on the `Operator` objects that the operator is not de-scoped.
- If the CSV supports MultiNamespace mode, suggest migrating to a single instance of the MultiNamespace operator - which will then be picked up by the single-instance transition.
- An incoming update may de-scope this operator, or users can manually migrate to a de-scoped alternative.

#### AllNamespace Operators

Transition:

- De-namespace the `Operator` object (i.e. operator-ns-foo -> operator).
- Ensure the default bindings are available in all namespaces.

#### Multiple Installs of the Same Version

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

- Emit a warning message on the `Operator` objects that the operator is not de-scoped.
- If the CSV supports MultiNamespace mode, suggest migrating to a single instance of the MultiNamespace operator - which will then be picked up by the single-instance transition.
- Otherwise, requires manual cleanup.

#### Multiple Installs at Different Versions

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

- Emit a warning message on the `Operator` objects that the operator is not de-scoped and requires manual clean-up (selection of new alternative operators).

## Operator Patterns

Many of the use-cases for scoped operators are better addressed as features within the operator itself.

### Limit the Blast Radius of Operators

One of the primary reasons for scoping operators is to reason about their blast radius - i.e. to limit the worst-case scenario for the cluster in the case of a bug or malicious control.

Limiting blast radius for de-scoped operators is generally simpler to reason about, because it relies heavily on auditable RBAC policy.

In this example, the de-scoped operator is installed with a ClusterRole only, and an administrator must explicitly bind each namespace that the operator should be allowed to use via a RoleBinding:

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            operator("operator")
            cr("ClusterRole: Pod R/W")
            sa("ServiceAccount")
            operator -->|uses| sa
        end
        subgraph User Namespace
            rb("RoleBinding")
            wk("Workload Pod")
            rb -->|uses| cr
            rb -->|bound to| sa
        end
        operator -->|allowed| wk
        subgraph kube-system
            sp("Sensitive Pod")
            no("No RoleBinding<br />No Access")
        end
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
```

### Internal APIs

Operators scoped to a single namespace are often used to provide internal APIs that should not be available anywhere else in the cluster. These may be config APIs that control cluster operation, or APIs that are otherwise sensitive.

This differs from the example above of limiting an operator's blast radius - the operator author doesn't want it to be possible for an administrator to expose the APIs to other users and namespaces, or wants a guarantee that the operator is not given permission outside of its installation namespace.

In this example, an operator is granted restricted permissions for a single namespace, and provides a cluster-scoped API. This ensures that anyone with access to write the API has been vetted, and that the operator itself can only perform operations within its own namespace.

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            operator("operator")
            cr("Role: Pod R/W")
            rb("RoleBinding")
            wk("Workload Pod")
            sa("ServiceAccount")
            operator -->|uses| sa
            rb -->|uses| cr
            rb -->|bound to| sa
            operator -->|creates| wk
        end
        api("InternalOperatorAPI<br />(Cluster Scoped)")
        operator -->|watches| api
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
```

### Isolate Tenants

Scoping is often desired in order to deal with tenant isolation and prevent noisy-neighbor effects. Without multiple clusters or a first-class tenancy effort within Kubernetes, this will never be truly possible (i.e. scoped operators may provide isolation for their APIs, but not for underlying Kubernetes APIs, etcd access, or physical cluster resources).

For operators that still wish to isolate tenants, however, this is possible by having a single parent operator spin up multiple control loops or even operator pods. These architectures may also be a strategy to scale operators horizontally.

#### Controller Per Tenant

Spin up controllers per tenant within a running process.

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            subgraph Operator Pod
                operator("operator")
                operator -->|starts| child1("Controller for 1")
                operator -->|starts| child2("Controller for 2")
            end
        end
        subgraph User Namespace 1
            wk("Workload Pod")
        end
        subgraph User Namespace 2
            wk2("Workload Pod")
        end
        child1 -->|manages| wk
        child2 -->|manages| wk2
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
    style child1 fill:#8addf2,stroke:#333,stroke-width:4px
    style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

#### Child Operator Pod Per Tenant

This is similar to the above, but spins up controllers in their own pods. This may be valuable to leverage the cluster scheduler, or to have specific tenants managed with an (auditable) reduction in permission scope.

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            subgraph Operator Pod
                operator("operator")
            end
            subgraph Child Pod 1
                child1("Controller for 1")
            end
            subgraph Child Pod 2
                child2("Controller for 2")
            end
        end
        subgraph User Namespace 1
            wk("Workload Pod")
        end
        subgraph User Namespace 2
            wk2("Workload Pod")
        end
        operator -->|starts| child1
        operator -->|starts| child2
        child1 -->|manages| wk
        child2 -->|manages| wk2
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
    style child1 fill:#8addf2,stroke:#333,stroke-width:4px
    style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

#### Sidecar Operator Pods

This is a similar pattern, but places the pods near the workloads they manage. This has different visibility and permission implications that may be desirable depending on the tasks being performed.

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            subgraph Operator Pod
                operator("operator")
            end
        end
        subgraph User Namespace 1
            subgraph Child Pod 1
                child1("Controller for 1")
            end
            wk("Workload Pod 1")
        end
        subgraph User Namespace 2
            subgraph Child Pod 2
                child2("Controller for 2")
            end
            wk2("Workload Pod 2")
        end
        operator -->|starts| child1
        operator -->|starts| child2
        child1 -->|manages| wk
        child2 -->|manages| wk2
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
    style child1 fill:#8addf2,stroke:#333,stroke-width:4px
    style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

### Canary Rollouts

Scoping has also been considered as a solution for canary rollouts of new operator versions: a new version may be released to manage one namespace, while the previous release manages the rest of the cluster's namespaces.

During an upgrade, there will be two operators running at one time. This provides an opportunity for both operators to transfer ownership, hand off locks, or otherwise coordinate the rollout of operator resources. Combined with a strategy for [limiting the blast radius](#limit-the-blast-radius-of-operators) and use of [operator conditions](https://olm.operatorframework.io/docs/advanced-tasks/communicating-operator-conditions-to-olm/#upgradeable), effective, domain-specific canary (or other) rollout strategies can be implemented.

```mermaid
graph TD
    subgraph Cluster
        subgraph Operator Namespace
            subgraph Previous Operator Pod
                operator("operator")
            end
            subgraph New Operator Pod
                new("operator<br />Upgradeable: false")
            end
        end
        subgraph User Namespace 1
            wk("Workload Pod 1")
        end
        subgraph User Namespace 2
            wk2("Workload Pod 2")
        end
        subgraph User Namespace 3
            wk3("Workload Pod 3")
        end
        operator -->|manages| wk
        operator -->|manages| wk2
        new -->|manages| wk3
    end
    style operator fill:#8addf2,stroke:#333,stroke-width:4px
    style new fill:#8addf2,stroke:#333,stroke-width:4px
```

The precise mechanism for determining when and how the new version should take control can be determined by the operator (or a separate or external tool), which can allow automated rollout based on domain-specific metrics. For example, an operator might use owner references or labels on CRs to indicate which version is currently managing that instance, with ownership flipped manually. Additional automation can be added to automatically flip management based on metrics / percent rollout, i.e. "for every 2 hours that metrics still look healthy, increase rollout to management by the new operator by 10%".

This is a strategy for limiting the impact of rolling out the operator itself, but similar domain-specific rollout strategies may be defined for any operands (rules about which versions can be upgraded to which others, etc).

The only part of this that OLM needs to be informed of is that the rollout is taking place - it is important to set the `Upgradeable=false` `OperatorCondition` so that OLM does not attempt to interfere with the handoff.
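As a concrete sketch of the label-based handoff described above (the label key, values, and `Foo` API are hypothetical illustrations, not an OLM convention):

```yaml
# Sketch: a label on each CR records which operator version currently manages it.
# Flipping the value - manually or via rollout automation - hands this instance
# to the new operator.
apiVersion: example.com/v1
kind: Foo
metadata:
  name: my-foo
  namespace: team-a
  labels:
    example.com/managed-by: foo-operator-v2
```

Each operator version then filters its watch to CRs labeled with its own identity, so instances migrate one at a time and can be flipped back if metrics regress.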