# Descoping Plan
**NOTE: A [copy](https://hackmd.io/DTVukSyGSkSLwD9WgZsdbw) has been made available in the operator-framework hackmd organization.**
1. Background / why this is necessary
2. A brief description of the end state
3. Per-release transition plan
4. Operator Patterns to replace "scoping"
## Definitions
**Descoped Operator**: An operator that is expected to be the sole owner of an API in a cluster.
**Scoped Operator**: An operator that has been installed under a previous version of OLM and includes scoping assumptions (including metadata like `installModes`).
## Background
### History
When OLM was first written, CRDs defined only the existence of a `GVK` in a cluster. Operators developed for OLM could only install in a namespace, watching that namespace - this delivered on the self-service, operational-encoding story of operators. The same operator could be installed in every namespace of a cluster.
Privilege escalation became a concern - since operators are run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.
At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had lots of options to interfere with one another if they shared the same CRD. OLM also expanded to support `APIService`s in addition to operators based on CRDs, and so required a notion of cluster-wide operators.
To address these concerns, a notion of `scoping` operators was introduced via the `OperatorGroup` object. An `OperatorGroup` would specify a set of namespaces within a cluster in which all operators installed would share the same scope. OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.
### Problem
But `OperatorGroups` do not alter the fundamental problem: APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user who wishes to see them. Even operators that agree on a particular GVK may differ on how those objects should be admitted to a cluster, or on how conversion between API versions should happen.
With Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they're in the same cluster or not. But re-using operators compounds the scoping problems within a cluster - it increases the likelihood that more than one "opinion" about an API exists in the cluster.
For these reasons we are looking to entirely remove the notion of scoping from OLM.
### What does this mean?
It means that (in the near future) for any operator installed via OLM, we expect that:
- If an operator provides an API, it provides it for the entire cluster.
- At a minimum, it should write into the `status` of that API (if the API has a `status` section, as most do)
- Only one operator will provide that API in a cluster, unless an administrator explicitly allows shared ownership (e.g. ingress controllers).
It does **not** mean that:
- Every operator needs to have permission to do its job in every namespace
- Every user in a cluster needs to have permission to use the operator's APIs
- Only one controller pod may run for that API in a cluster
- Only one controller can be installed to manage an API (e.g. ingress-style)
If you are an operator author and the above statements are concerning, please review the [Operator Patterns](#operator-patterns) section for suggestions on how to achieve your goals in a descoped world.
## End State
For the most part, the final state for de-scoping will be a transition away from namespace-scoped APIs like `Subscription`, `InstallPlan`, and `ClusterServiceVersion` to a different set of cluster-scoped APIs like `Operator` and `Install`.
These newer APIs avoid much of the complexity introduced by scoping, and are already in progress. The `Operator` API is available as a read-only API in 4.6.
The document fully describing this end state is the [Simplify OLM APIs](https://github.com/operator-framework/enhancements/blob/master/enhancements/simplify-olm-apis.md) enhancement. **Note: the enhancement is currently pending an update to call out the scoping issues.** The only change as a result of the decision to de-scope is to always assume that operators are descoped in the new APIs.
At some point, it is likely that OLM will introduce some namespaced APIs again for the installation of non-operator content. But this will be accompanied by its own enhancement.
## Transition Plan
### RBAC
Operators may still request the cluster- and namespace-scoped permissions they need to run within their installation namespace at install time (today, via `clusterPermissions` and `permissions`). Unlike with scoped operators, however, the namespace-scoped permissions are not copied to a pre-defined set of namespaces, and no bindings are created by default.
De-scoped operators need to be concerned about RBAC in two general areas:
#### 1 - Access to do work in a particular namespace
A descoped operator will not be allowed to install any bindings - this is the job of an administrator.
Operators will provide a set of `ClusterRoles` for the work that they need to do.
For example, an operator with serviceaccount `foo-sa` might provide this `ClusterRole`:
```yaml=
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  generateName: foo-sa-required-perms
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
```
An administrator can bind this with a `ClusterRoleBinding` if they want the operator to be able to perform its tasks in all namespaces. Alternatively, they may create a `RoleBinding` in each desired namespace that binds this role to the operator's `ServiceAccount`.
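As a sketch of these two options, the manifests below bind the example `ClusterRole` either in one namespace or across the whole cluster. The namespaces (`team-a`, `operators`), the binding names, and the resolved role name `foo-sa-required-perms-x7k2p` are assumptions for illustration only; because the role is created with `generateName`, the real name would have to be read back from the cluster.

```yaml
# Option A: allow the operator to work in a single namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foo-sa-required-perms          # hypothetical binding name
  namespace: team-a                    # hypothetical target namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-sa-required-perms-x7k2p    # assumed generated name
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: operators                 # hypothetical install namespace
---
# Option B: allow the operator to work in every namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: foo-sa-required-perms          # hypothetical binding name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-sa-required-perms-x7k2p    # assumed generated name
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: operators                 # hypothetical install namespace
```

Option A would be repeated once per namespace; deleting a binding revokes the operator's access to that namespace without touching the operator itself.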
An operator author may provide more or less granular `ClusterRoles`. Most operators likely just need one per serviceaccount, but others may wish to provide granular feature-based ClusterRoles so that an administrator can enable/disable portions of the operator's functionality.
A convention will be established: a single ClusterRole for a ServiceAccount will be assumed to be required for the operator to operate. Multiple ClusterRoles will be considered optional, such that creating/deleting them will enable/disable certain aspects of the operator.
One exception: de-scoped operators are always expected to have `read,update /status` permission on the APIs they own in all namespaces. This ensures that the operator has a communication channel with users (for example, to report that it does not have the proper permission to do work in a particular namespace). OLM will raise alerts when there are CRs in a cluster with no controller capable of updating their status.
Note: This is a degradation of the install experience; see the OperatorGroups section below for the interim solution.
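One possible reading of that baseline permission, expressed as a `ClusterRole`, is sketched below. The API group `example.com`, the resource `foos`, and the role name are hypothetical, and the exact verb set OLM would expect is an assumption rather than something this plan specifies.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-operator-status            # hypothetical name
rules:
# Read the owned CRs so the operator can see what users have created.
- apiGroups: ["example.com"]           # hypothetical API group
  resources: ["foos"]                  # hypothetical resource
  verbs: ["get", "list", "watch"]
# Write only to the status subresource, never to spec.
- apiGroups: ["example.com"]
  resources: ["foos/status"]
  verbs: ["get", "update", "patch"]
```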
#### 2 - Granting a user permission to use the operator's provided APIs
For scoped operators, OLM automatically generates ClusterRoles and automatically aggregates them to the default `admin`, `edit`, and `view` ClusterRoles, based on the availability of an operator in a particular namespace.
For de-scoped operators, OLM will leave this to the administrator.
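For reference, an administrator could reproduce the aggregation by hand using the standard Kubernetes aggregation labels. This is a minimal sketch: the API group `example.com`, the resource `foos`, and the role name are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foos-edit                      # hypothetical name
  labels:
    # Rules in this role are merged into the default edit and admin roles.
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["example.com"]           # hypothetical API group
  resources: ["foos"]                  # hypothetical resource
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

Note that aggregating into the default roles grants access in every namespace where a user already holds `edit` or `admin`, which is broader than the per-namespace behavior described for scoped operators.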
Note: This is a degradation of the install experience; see the OperatorGroups section below for the interim solution.
#### OperatorGroups
The above changes for de-scoped operators, without any additional tooling, degrade the operator installation experience. There is no declarative way to indicate that an operator should be permitted to work in a set of namespaces (it becomes a two-step process: install, then bind).
For now, we will repurpose `OperatorGroups` for RBAC management. An `OperatorGroup` will not be required for the installation of a de-scoped operator (as they are today for scoped operators).
If a de-scoped operator is installed in an `OperatorGroup`, the namespace list on the `OperatorGroup` is used to determine which namespaces will get automatic bindings for the operator's serviceaccounts. OLM will also generate API access RBAC roles and aggregate them to `view`, `edit`, and `admin`.
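As an illustration, an `OperatorGroup` used this way might look like the following. The object name and the namespaces are hypothetical; `spec.targetNamespaces` is the existing namespace list field.

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: foo-operators                  # hypothetical name
  namespace: operators                 # hypothetical install namespace
spec:
  # Namespaces that would receive automatic bindings for the
  # operator's serviceaccounts.
  targetNamespaces:
  - team-a
  - team-b
```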
This should be considered an interim solution. A more configurable and more supported API may look something like [RBACManager](https://github.com/FairwindsOps/rbac-manager#dynamic-namespaces-and-labels).
### Determining Scope
Since CSVless bundles are not yet available, we will indicate that an operator is descoped by marking `installModes` as optional.
Any CSV with an empty `installModes` block will be treated as a de-scoped operator.
Any CSV with an `installModes` block that supports `AllNamespaces` mode **only** will be treated as a de-scoped operator that provides APIs to all users by default (i.e. OLM will generate an `AllNamespaces` `OperatorGroup` for it).
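A CSV fragment for the second case could look like the sketch below. The CSV name is hypothetical and all other required CSV fields are omitted. Whether "only" is detected from explicit `supported: false` entries, as shown here, or from the other modes being absent is an assumption this plan does not settle.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: foo-operator.v1.0.0            # hypothetical name
spec:
  installModes:
  - type: OwnNamespace
    supported: false
  - type: SingleNamespace
    supported: false
  - type: MultiNamespace
    supported: false
  - type: AllNamespaces                # the only supported mode
    supported: true
```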
### Operator Visibility
The `Operator` API is a cluster-scoped API to reflect the cluster-scoped nature of api extensions.
Users of the provided APIs can tell that they exist via discovery, but admins may be hesitant to grant `read` on the Operator API itself to learn more about the operator and its services.
Discovery of available operators for UI will flow through discovery:
1. Query to discover which APIs are readable/writable by the current user
1. Query discovery for those APIs
1. Read the schema for the API. The top level description of the API resource should include: the name of the owning operator, a description of the owning operator, along with links, descriptors, and other metadata.
1. UI groups apis based on the operator identity in their schema descriptions.
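For the first step, one existing Kubernetes mechanism a UI could use is a `SelfSubjectRulesReview`, sketched below. Using it here is an assumption, not something this plan prescribes, and the namespace is hypothetical.

```yaml
apiVersion: authorization.k8s.io/v1
kind: SelfSubjectRulesReview
spec:
  namespace: team-a                    # hypothetical namespace to evaluate
```

Creating this object (for example with `kubectl create -f <file> -o yaml`) returns the current user's rules for that namespace in `status.resourceRules`. The API server documents that this list may be incomplete, so a UI would likely treat it as a hint rather than an authoritative answer.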
TODO: this will not work well for apis shared between operators, and makes it easier to hit etcd key limits.
### Dependency Resolution / Updates
Dependency resolution will continue to take place at the namespace scope for operators installed via `Subscriptions`. Any operators installed this way will have an `Operator` object created automatically for visibility, but the `Operator` may begin to emit warnings about the scoped nature of the installation.
Resolution will also take place among de-scoped operators at the cluster scope. Note that resolution or updates of de-scoped operators can be blocked by issues with scoped operators.
### Updating from a scoped operator to a descoped operator
TODO
### Installation-specific migration steps
#### Single Install of an Operator
Affects: OwnNamespace, SingleNamespace, MultiNamespace
Transition:
- Remove the namespace suffix from the existing generated Operator object (e.g. `operatorname-ns-foo` -> `operatorname`)
- Emit a warning message on the Operator objects that the operator is not de-scoped.
- If CSV supports multinamespace mode, suggest migrating to a single instance of the multinamespace operator - which will then be picked up by the single instance transition.
- An incoming update may de-scope this operator, or users can manually migrate to a de-scoped alternative.
#### AllNamespace Operators
Transition:
- De-namespace the Operator object (e.g. `operator-ns-foo` -> `operator`)
- Ensure the default bindings are available in all namespaces.
#### Multiple Installs of the Same Version
Affects: OwnNamespace, SingleNamespace, MultiNamespace
Transition:
- Emit a warning message on the Operator objects that the operator is not de-scoped.
- If CSV supports multinamespace mode, suggest migrating to a single instance of the multinamespace operator - which will then be picked up by the single instance transition.
- Otherwise, requires manual cleanup.
#### Multiple Installs at Different Versions
Affects: OwnNamespace, SingleNamespace, MultiNamespace
- Emit a warning message on the Operator objects that the operator is not de-scoped and requires manual clean-up (selection of new alternative operators).
## Operator Patterns
Many of the use-cases for scoped operators are better suited as features within the operator itself.
### Limit the Blast Radius of Operators
One of the primary reasons for scoping operators is to reason about their Blast Radius - i.e. in the case of a bug or malicious control, limit the worst-case scenario for the cluster.
Limiting blast radius for de-scoped operators is generally simpler to reason about, because it relies heavily on auditable RBAC policy.
In this example, the de-scoped operator is installed with a ClusterRole only, and an administrator must explicitly bind each namespace that the operator should be allowed to use via a RoleBinding:
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
operator("operator")
cr("ClusterRole: Pod R/W")
sa("ServiceAccount")
operator -->|uses| sa
end
subgraph User Namespace
rb("RoleBinding")
wk("Workload Pod")
rb -->|uses| cr
rb -->|bound to| sa
end
operator --> |allowed|wk
subgraph kube-system
sp("Sensitive Pod")
no("No RoleBinding<br />No Access")
end
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
```
### Internal APIs
Operators scoped to a single namespace are often used to provide internal APIs that should not be available anywhere else in the cluster. These may be config apis that configure cluster operation or APIs that are otherwise sensitive.
This differs from the example above of limiting an operator's blast radius: the operator author doesn't want it to be possible for an administrator to expose the APIs to other users and namespaces, or wants a guarantee that the operator is not given permission outside of its installation namespace.
In this example, an operator is granted restricted permissions for a single namespace, and provides a Cluster-Scoped API. This ensures that anyone with access to write the API has been vetted, and that the operator itself can only perform operations within its own namespace.
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
operator("operator")
cr("Role: Pod R/W")
rb("RoleBinding")
wk("Workload Pod")
sa("ServiceAccount")
operator -->|uses| sa
rb -->|uses| cr
rb -->|bound to| sa
operator --> |creates|wk
end
api("InternalOperatorAPI<br />(Cluster Scoped)")
operator -->|watches|api
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
```
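The RBAC half of the diagram above can be sketched as a namespaced `Role` and `RoleBinding` that live entirely inside the operator's own namespace. All names here (`internal-operator`, `internal-operator-sa`, `pod-rw`) are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-rw                         # hypothetical name
  namespace: internal-operator         # hypothetical operator namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-rw                         # hypothetical name
  namespace: internal-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role                           # a Role cannot be bound outside its namespace
  name: pod-rw
subjects:
- kind: ServiceAccount
  name: internal-operator-sa           # hypothetical serviceaccount
  namespace: internal-operator
```

Because a `Role` is namespaced, no binding created elsewhere can reference it, so pod access cannot be widened through this role. The operator would still need separate read access to its cluster-scoped API, which is not shown.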
### Isolate Tenants
Scoping is often desired in order to deal with tenant isolation and prevent noisy-neighbor effects.
Without multiple clusters or a first-class tenancy effort within Kubernetes, this will never be truly possible (i.e. scoped operators may provide isolation for their APIs, but not for underlying Kubernetes APIs, etcd access, or physical cluster resources). For operators that still wish to isolate tenants, however, this is possible by having a single parent operator spin up multiple control loops or even operator pods.
These architectures may also be a strategy to scale operators horizontally.
#### Controller Per Tenant
Spin up controllers per tenant within a running process.
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
subgraph Operator Pod
operator("operator")
operator -->|starts| child1("Controller for 1")
operator -->|starts| child2("Controller for 2")
end
end
subgraph User Namespace 1
wk("Workload Pod")
end
subgraph User Namespace 2
wk2("Workload Pod")
end
child1 --> |manages|wk
child2 --> |manages|wk2
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
style child1 fill:#8addf2,stroke:#333,stroke-width:4px
style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```
#### Child Operator Pod Per Tenant
This is similar to the above, but spins up controllers in their own pods. This may be valuable to leverage the cluster scheduler, or to have specific tenants managed with an (auditable) reduction in permission scope.
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
subgraph Operator Pod
operator("operator")
end
subgraph Child Pod 1
child1("Controller for 1")
end
subgraph Child Pod 2
child2("Controller for 2")
end
end
subgraph User Namespace 1
wk("Workload Pod")
end
subgraph User Namespace 2
wk2("Workload Pod")
end
operator -->|starts| child1("Controller for 1")
operator -->|starts| child2("Controller for 2")
child1 --> |manages|wk
child2 --> |manages|wk2
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
style child1 fill:#8addf2,stroke:#333,stroke-width:4px
style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```
#### Sidecar Operator Pods
This is a similar pattern, but places the pods near to the workloads they manage. This has different visibility and permission implications that may be desirable depending on the tasks being performed.
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
subgraph Operator Pod
operator("operator")
end
end
subgraph User Namespace 1
subgraph Child Pod 1
child1("Controller for 1")
end
wk("Workload Pod 1")
end
subgraph User Namespace 2
subgraph Child Pod 2
child2("Controller for 2")
end
wk2("Workload Pod 2")
end
operator -->|starts| child1("Controller for 1")
operator -->|starts| child2("Controller for 2")
child1 --> |manages|wk
child2 --> |manages|wk2
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
style child1 fill:#8addf2,stroke:#333,stroke-width:4px
style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```
### Canary Rollouts
Scoping has also been considered as a solution for canary rollouts of new operator versions. A new version may be released to manage one namespace, while the previous release manages the rest of the cluster's namespaces.
During an upgrade, there will be two operators running at one time. This gives both operators an opportunity to transfer ownership, hand off locks, or otherwise coordinate the rollout of operator resources. Combined with a strategy for [limiting the blast radius](#limit-the-blast-radius-of-operators) and use of [operator conditions](https://olm.operatorframework.io/docs/advanced-tasks/communicating-operator-conditions-to-olm/#upgradeable), effective and domain-specific canary or other rollout strategies can be implemented.
```mermaid
graph TD
subgraph Cluster
subgraph Operator Namespace
subgraph Previous Operator Pod
operator("operator")
end
subgraph New Operator Pod
new("operator<br />Upgradeable: false")
end
end
subgraph User Namespace 1
wk("Workload Pod 1")
end
subgraph User Namespace 2
wk2("Workload Pod 2")
end
subgraph User Namespace 3
wk3("Workload Pod 3")
end
operator --> |manages|wk
operator --> |manages|wk2
new --> |manages|wk3
end
style operator fill:#8addf2,stroke:#333,stroke-width:4px
style new fill:#8addf2,stroke:#333,stroke-width:4px
```
The precise mechanism for determining when and how the new version should take control can be determined by the operator (or a separate or external tool), which can allow automated rollout based on domain-specific metrics. For example, an operator might use owner references or labels on CRs to indicate which version is currently managing that instance, with ownership flipped manually. Additional automation can then flip management based on metrics or percent rollout, e.g. "for every 2 hours that metrics still look healthy, increase rollout to management by the new operator by 10%".
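A label-based ownership marker might look like the sketch below. The API group, kind, and label key are all hypothetical; each operator version would reconcile only the CRs whose label matches its own version.

```yaml
apiVersion: example.com/v1             # hypothetical API
kind: Foo                              # hypothetical kind
metadata:
  name: my-foo
  namespace: team-c                    # hypothetical namespace
  labels:
    # Hypothetical label: the operator version currently managing this CR.
    # Changing the value hands the instance to the other version.
    example.com/managed-by-version: "v2.0.0"
spec: {}
```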
This is a strategy for limiting the impact of rolling out the operator itself, but similarly domain-specific rollout strategies may be defined for any operands (rules about which versions can be upgraded to which others, etc).
The only part of this that OLM needs to be informed of is that the rollout is taking place - it is important to set the `Upgradeable=false OperatorCondition` so that OLM does not attempt to interfere with the handoff.
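For illustration, the condition might be reported as below. This sketch assumes the `operators.coreos.com/v2` `OperatorCondition` API, where the operator writes its conditions under `spec.conditions`; the object name, namespace, reason, message, and timestamp are hypothetical. In practice the operator would usually set this programmatically rather than by applying a manifest.

```yaml
apiVersion: operators.coreos.com/v2
kind: OperatorCondition
metadata:
  name: foo-operator.v2.0.0            # hypothetical; OLM creates this object
  namespace: operators                 # hypothetical install namespace
spec:
  conditions:
  - type: Upgradeable
    status: "False"
    reason: CanaryRollout              # hypothetical reason
    message: "Handoff from the previous version is in progress."
    lastTransitionTime: "2020-01-01T00:00:00Z"   # placeholder timestamp
```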