Descoping Plan

  1. Background / why this is necessary
  2. A brief description of the end state
  3. Per-release transition plan
  4. Operator Patterns to replace "scoping"

Definitions

Descoped Operator: An operator that is expected to be the sole owner of an API in a cluster.

Scoped Operator: An operator that has been installed under a previous version of OLM and includes scoping assumptions (including metadata like installMode).

Background

History

When OLM was first written, CRDs defined only the existence of a GVK in a cluster. Operators developed for OLM could only install in a namespace, watching that namespace - this delivered on the self-service, operational-encoding story of operators. The same operator could be installed in every namespace of a cluster.

Privilege escalation became a concern - since operators are run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.

At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had lots of options to interfere with one another if they shared the same CRD. OLM also expanded to support APIServices in addition to operators based on CRDs, and so required a notion of cluster-wide operators.

To address these concerns, a notion of scoping operators was introduced via the OperatorGroup object. An OperatorGroup would specify a set of namespaces within a cluster in which all operators installed would share the same scope. OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.

Problem

But OperatorGroups do not alter the fundamental problem: APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user who wishes to see them. Even operators that agree on a particular GVK may have differing opinions about how those objects should be admitted to the cluster, or how conversion between API versions should happen.

With Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they're in the same cluster or not. But re-using operators compounds the scoping problems within a cluster - it increases the likelihood that more than one "opinion" about an API exists in the cluster.

For these reasons we are looking to entirely remove the notion of scoping from OLM.

What does this mean?

It means that (in the near future) for any operator installed via OLM, we expect that:

  • If an operator provides an API, it provides it for the entire cluster.
  • At a minimum, it should write into the status of that API (if the API has a status section, as most do)
  • Only one operator will provide that API in a cluster, unless an administrator explicitly allows shared ownership (e.g. ingress controllers).

It does not mean that:

  • Every operator needs to have permission to do its job in every namespace
  • Every user in a cluster needs to have permission to use the operator's APIs
  • Only one controller pod needs to run for that API in a cluster
  • Only one controller can be installed to manage an API (e.g. ingress-style)

If you are an operator author and the above statements are concerning, please review the Operator Patterns section for suggestions on how to achieve your goals in a descoped world.

End State

For the most part, the final state for de-scoping will be a transition away from namespace-scoped APIs like Subscription, InstallPlan, and ClusterServiceVersion to a different set of cluster-scoped APIs like Operator and Install.

These newer APIs avoid much of the complexity introduced by scoping, and are already in progress. The Operator API is available as a read-only API in 4.6.

The document fully describing this end state is the Simplify OLM APIs enhancement. Note: the enhancement is currently pending an update to call out the scoping issues. The only change as a result of the decision to de-scope is that the new APIs always assume operators are descoped.

At some point, it is likely that OLM will introduce some namespaced APIs again for the installation of non-operator content. But this will be accompanied by its own enhancement.

Transition Plan

RBAC

Operators may still request the cluster- and namespace-scoped permissions they need to run within their installation namespace at install time (i.e. today, via clusterPermissions and permissions). But unlike with scoped operators, the namespace-scoped permissions are not copied to a pre-defined set of namespaces, and no bindings are created by default.
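
For reference, a minimal sketch of how a CSV requests these permissions today; the ServiceAccount name, rules, and resources below are placeholders, and the deployments section is omitted:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: foo-operator.v1.0.0
spec:
  install:
    strategy: deployment
    spec:
      # deployments omitted for brevity
      permissions:               # namespace-scoped rules for the install namespace
      - serviceAccountName: foo-sa
        rules:
        - apiGroups: [""]
          resources: ["configmaps"]
          verbs: ["get", "list", "watch"]
      clusterPermissions:        # cluster-scoped rules
      - serviceAccountName: foo-sa
        rules:
        - apiGroups: ["apiextensions.k8s.io"]
          resources: ["customresourcedefinitions"]
          verbs: ["get", "list", "watch"]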

De-scoped operators need to be concerned about RBAC in two general areas:

1 - Access to do work in a particular namespace

A descoped operator will not be allowed to install any bindings - this is the job of an administrator.

Operators will provide a set of ClusterRoles for the work that they need to do.

For example, an operator with serviceaccount foo-sa might provide this ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  generateName: foo-sa-required-perms
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

An administrator can bind this with a ClusterRoleBinding if they want the operator to be able to perform its tasks in all namespaces. Or, they may create a RoleBinding for this role, binding the operator's ServiceAccount in each desired namespace.
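
A minimal sketch of the per-namespace option, assuming the operator is installed in a namespace called foo-operator and should be allowed to work in user-namespace-1 (both names, and the resolved ClusterRole name, are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foo-sa-required-perms
  namespace: user-namespace-1          # repeat per namespace the operator may work in
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: foo-operator              # the operator's installation namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-sa-required-perms-abc12    # generateName above produces a suffixed name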

An operator author may provide more or less granular ClusterRoles. Most operators likely just need one per serviceaccount, but others may wish to provide granular feature-based ClusterRoles so that an administrator can enable/disable portions of the operator's functionality.

A convention will be established: a single ClusterRole for a ServiceAccount will be assumed to be required for the operator to operate. Multiple ClusterRoles will be considered optional, such that creating/deleting them will enable/disable certain aspects of the operator.

One exception: de-scoped operators are always expected to have read and update permission on the /status subresource of the APIs they own, in all namespaces. This is to ensure that the operator has a communication channel with users (to communicate, for example, that it does not have the proper permission to do work in a particular namespace). OLM will raise alerts when there are CRs in a cluster with no controller capable of updating their status.
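
A sketch of what such a grant might look like for a hypothetical foos.example.com API owned by the operator (the exact split of rules is illustrative); bound with a ClusterRoleBinding, it applies in all namespaces:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-operator-status
rules:
- apiGroups: ["example.com"]
  resources: ["foos"]                  # read the owned API everywhere
  verbs: ["get", "list", "watch"]
- apiGroups: ["example.com"]
  resources: ["foos/status"]           # report back to users via status
  verbs: ["get", "update"]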

Note: This is a degradation of the install experience, see below for the interim solution.

2 - Granting a user permission to use the operator's provided APIs

For scoped operators, OLM automatically generates ClusterRoles and aggregates them to the default admin, edit, and view ClusterRoles, based on the availability of an operator in a particular namespace.

For de-scoped operators, OLM will leave this to the administrator.
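
As a sketch of what an administrator might create by hand, a ClusterRole for a hypothetical foos.example.com API can be aggregated into the default edit role using the standard aggregation label:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-edit
  labels:
    rbac.authorization.k8s.io/aggregate-to-edit: "true"   # rolled up into the default edit ClusterRole
rules:
- apiGroups: ["example.com"]
  resources: ["foos"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]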

Note: This is a degradation of the install experience, see below for the interim solution.

OperatorGroups

The above changes for de-scoped operators, without any additional tooling, degrade the operator installation experience. There is no declarative way to indicate that an operator should be permitted to work in a set of namespaces (it becomes a two-step process: install, then bind).

For now, we will repurpose OperatorGroups for RBAC management. An OperatorGroup will not be required for the installation of a de-scoped operator (as one is today for scoped operators).

If a de-scoped operator is installed in an OperatorGroup, the namespace list on the OperatorGroup is used to determine which namespaces get automatic bindings for the operator's ServiceAccounts, and OLM will generate API access roles and aggregate them to view, edit, and admin.
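
For example, an OperatorGroup in the operator's installation namespace might look like this (the namespace names are illustrative); the targetNamespaces list drives both the ServiceAccount bindings and the generated API-access roles:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: foo-operatorgroup
  namespace: foo-operator          # the operator's installation namespace
spec:
  targetNamespaces:                # namespaces that receive the automatic bindings
  - team-a
  - team-b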

This should be considered an interim solution. A more configurable and more supported API may look something like RBACManager.

Determining Scope

Since CSV-less bundles are not yet available, we will indicate that an operator is descoped by marking installModes as optional.

Any CSV with an empty installModes block will be treated as a de-scoped operator.

Any CSV with an installModes block that supports only AllNamespaces mode will be treated as a de-scoped operator that provides APIs to all users by default (i.e. OLM will generate an AllNamespaces OperatorGroup for it).
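
For illustration, the installModes fragment of a CSV spec that would fall into this second category:

# fragment of a ClusterServiceVersion spec
spec:
  installModes:
  - type: OwnNamespace
    supported: false
  - type: SingleNamespace
    supported: false
  - type: MultiNamespace
    supported: false
  - type: AllNamespaces
    supported: true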

Operator Visibility

The Operator API is a cluster-scoped API, reflecting the cluster-scoped nature of API extensions.

Users of the provided APIs can tell that those APIs exist via discovery, but admins may be hesitant to grant read access on the Operator API itself, which users would otherwise need in order to learn more about the operator and its services.

Discovery of available operators for UIs will instead flow through API discovery:

  1. Query to discover which APIs are readable/writable by the current user
  2. Query discovery for those APIs
  3. Read the schema for the API. The top-level description of the API resource should include the name of the owning operator, a description of the owning operator, and links, descriptors, and other metadata (see the sketch after this list).
  4. The UI groups APIs based on the operator identity in their schema descriptions.
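
As a sketch of step 3, the owning operator's identity might be carried in the CRD's top-level schema description (how that metadata is encoded in the description is illustrative here, not a settled format):

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: foos.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: foos
    kind: Foo
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        # top-level description carries the owning operator's identity and metadata
        description: >-
          Foo configures example workloads. Provided by foo-operator,
          which manages Foo workloads; links, descriptors, and other
          operator metadata would also be carried here.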

TODO: this will not work well for APIs shared between operators, and makes it easier to hit etcd key limits.

Dependency Resolution / Updates

Dependency resolution will continue to take place at the namespace scope for operators installed via Subscriptions. Any operators installed this way will have an Operator object created automatically for visibility, but the Operator may begin to emit warnings about the scoped nature of the installation.

Resolution will also take place among de-scoped operators at the cluster scope. Note that resolution or updates of de-scoped operators can be blocked by issues with scoped operators.

Updating from a scoped operator to a descoped operator

TODO

Installation-specific migration steps

Single Install of an Operator

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

  • Remove the namespace suffix from the existing generated Operator object (i.e. operatorname-ns-foo -> operatorname)
  • Emit a warning message on the Operator objects that the operator is not de-scoped.
  • If the CSV supports MultiNamespace mode, suggest migrating to a single instance of the MultiNamespace operator - which will then be picked up by the single-instance transition.
  • An incoming update may de-scope this operator, or users can manually migrate to a de-scoped alternative.

AllNamespace Operators

Transition:

  • De-namespace the Operator object (i.e. operator-ns-foo -> operator)
  • Ensure the default bindings are available in all namespaces.

Multiple Installs of the Same Version

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

  • Emit a warning message on the Operator objects that the operator is not de-scoped.
  • If the CSV supports MultiNamespace mode, suggest migrating to a single instance of the MultiNamespace operator - which will then be picked up by the single-instance transition.
  • Otherwise, requires manual cleanup.

Multiple Installs at Different Versions

Affects: OwnNamespace, SingleNamespace, MultiNamespace

  • Emit a warning message on the Operator objects that the operator is not de-scoped and requires manual clean-up (selection of new alternative operators).

Operator Patterns

Many of the use-cases for scoped operators are better suited as features within the operator itself.

Limit the Blast Radius of Operators

One of the primary reasons for scoping operators is to reason about their blast radius - i.e. to limit the worst-case scenario for the cluster in the case of a bug or malicious control.

Limiting blast radius for de-scoped operators is generally simpler to reason about, because it relies heavily on auditable RBAC policy.

In this example, the de-scoped operator is installed with a ClusterRole only, and an administrator must explicitly bind each namespace that the operator should be allowed to use via a RoleBinding:

[Diagram: the operator in the Operator Namespace uses a ServiceAccount that is bound, via a RoleBinding in the User Namespace, to a ClusterRole granting Pod R/W, so it is allowed to manage the Workload Pod there; kube-system has no RoleBinding, so the operator has no access to its Sensitive Pod.]

Operators scoped to a single namespace are often used to provide internal APIs that should not be available anywhere else in the cluster. These may be config APIs that configure cluster operation, or APIs that are otherwise sensitive.

This differs from the example above of limiting an operator's blast radius: the operator author doesn't want it to be possible for an administrator to expose the APIs to other users and namespaces, or wants a guarantee that the operator is not given permission outside of its installation namespace.

In this example, an operator is granted restricted permissions for a single namespace, and provides a Cluster-Scoped API. This ensures that anyone with access to write the API has been vetted, and that the operator itself can only perform operations within its own namespace.

[Diagram: the operator uses a ServiceAccount bound via a RoleBinding to a Role granting Pod R/W in the Operator Namespace only; it creates Workload Pods there and watches a cluster-scoped InternalOperatorAPI.]
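
A sketch of the namespace-scoped grant from the diagram above; because the operator's ServiceAccount is bound only to a Role in its own namespace, it is guaranteed to have no permissions elsewhere (all names are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: foo-operator-pods
  namespace: foo-operator            # grants nothing outside this namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foo-operator-pods
  namespace: foo-operator
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: foo-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: foo-operator-pods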

Isolate Tenants

Scoping is often desired in order to deal with tenant isolation and prevent noisy-neighbor effects.

Without multiple clusters or a first-class tenancy effort within Kubernetes, this will never be truly possible (i.e. scoped operators may provide isolation for their APIs, but not for underlying Kubernetes APIs, etcd access, or physical cluster resources). For operators that still wish to isolate tenants, however, this is possible by having a single parent operator spin up multiple control loops or even multiple operator pods.

These architectures may also be a strategy to scale operators horizontally.

Controller Per Tenant

Spin up controllers per tenant within a running process.

[Diagram: the Operator Pod in the Operator Namespace starts Controller for 1 and Controller for 2 in-process; each controller manages the Workload Pod in its own user namespace (User Namespace 1 and User Namespace 2).]

Child Operator Pod Per Tenant

This is similar to the pattern above, but spins up the controllers in their own pods. This may be valuable to leverage the cluster scheduler, or to have specific tenants managed with an (auditable) reduction in permission scope.

[Diagram: the Operator Pod starts Child Pod 1 and Child Pod 2 in the Operator Namespace; each child runs the controller for its tenant and manages the Workload Pod in that tenant's user namespace.]

Sidecar Operator Pods

This is a similar pattern, but places the pods near to the workloads they manage. This has different visibility and permission implications that may be desirable depending on the tasks being performed.

[Diagram: the Operator Pod starts Child Pod 1 and Child Pod 2 inside User Namespace 1 and User Namespace 2; each child controller manages the Workload Pod in its own namespace.]

Canary Rollouts

Scoping has also been considered as a solution for canary rollouts of new operator versions. A new version may be released to manage one namespace, while the previous release manages the rest of the cluster's namespaces.

During an upgrade, there will be two operators running at one time. This provides an opportunity for both operators to transfer ownership, hand off locks, or otherwise coordinate the rollout of operator resources. Combined with a strategy for limiting the blast radius and the use of operator conditions, effective and domain-specific canary (or other) rollout strategies can be implemented.

[Diagram: the Previous Operator Pod (reporting Upgradeable: false) and the New Operator Pod run side by side in the Operator Namespace; the previous version still manages Workload Pods 1 and 2 in User Namespaces 1 and 2, while the new version manages Workload Pod 3 in User Namespace 3.]

The precise mechanism for determining when and how the new version should take control can be determined by the operator (or a separate or external tool), which can allow automated rollout based on domain-specific metrics. For example, an operator might use owner references or labels on CRs to indicate which version is currently managing each instance, with ownership flipped manually. Additional automation can be added to flip management based on metrics / percent rollout, e.g. "for every 2 hours that metrics still look healthy, increase the portion managed by the new operator by 10%".
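
As a sketch of the label-based approach (the label key, values, and Foo kind are purely illustrative), each CR could record which operator version currently manages it:

apiVersion: example.com/v1
kind: Foo
metadata:
  name: my-foo
  namespace: team-a
  labels:
    foo.example.com/managed-by: foo-operator-v2   # flipped from foo-operator-v1 as the rollout progresses
spec: {}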

This is a strategy for limiting the impact of rolling out the operator itself; similar domain-specific rollout strategies may be defined for any operands (rules about which versions can be upgraded to which others, etc.).

The only part of this that OLM needs to be informed of is that the rollout is taking place - it is important to set the Upgradeable=false OperatorCondition so that OLM does not attempt to interfere with the handoff.