Evan Cordell
# Descoping Plan

**NOTE: A [copy](https://hackmd.io/DTVukSyGSkSLwD9WgZsdbw) has been made available in the operator-framework hackmd organization.**

1. Background / why this is necessary
2. A brief description of the end state
3. Per-release transition plan
4. Operator Patterns to replace "scoping"

## Definitions

**Descoped Operator**: An operator that is expected to be the sole owner of an API in a cluster.

**Scoped Operator**: An operator that has been installed under a previous version of OLM and includes scoping assumptions (including metadata like `installModes`).

## Background

### History

When OLM was first written, CRDs defined only the existence of a `GVK` in a cluster. Operators developed for OLM could only install in a namespace, watching that namespace - this delivered on the self-service, operational-encoding story of operators. The same operator could be installed in every namespace of a cluster.

Privilege escalation became a concern - since operators are run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.

At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had lots of options to interfere with one another if they shared the same CRD.

OLM also expanded to support `APIService`s in addition to operators based on CRDs, and so required a notion of cluster-wide operators.

To address these concerns, a notion of `scoping` operators was introduced via the `OperatorGroup` object. An `OperatorGroup` would specify a set of namespaces within a cluster in which all operators installed would share the same scope.
OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.

### Problem

But `OperatorGroups` do not alter the fundamental problem: APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user that wishes to see them. Even operators that agree on a particular GVK may have differences of opinion in how those objects should be admitted to a cluster, or how conversion between API versions should happen.

With Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they're in the same cluster or not. But re-using operators compounds the scoping problems within a cluster - it increases the likelihood that more than one "opinion" about an API exists in the cluster.

For these reasons we are looking to entirely remove the notion of scoping from OLM.

### What does this mean?

It means that (in the near future) for any operator installed via OLM, we expect that:

- If an operator provides an API, it provides it for the entire cluster.
  - At a minimum, it should write into the `status` of that API (if the API has a `status` section, as most do)
- Only one operator will provide that API in a cluster, without an administrator explicitly allowing shared ownership (e.g. ingress controllers).

It does **not** mean that:

- Every operator needs to have permission to do its job in every namespace
- Every user in a cluster needs to have permission to use the operator's APIs
- Only one controller pod needs to run for that API in a cluster
- Only one controller can be installed to manage an API (e.g. ingress-style)

If you are an operator author and the above statements are concerning, please review the [Operator Patterns](#operator-patterns) section for suggestions on how to achieve your goals in a descoped world.
## End State

For the most part, the final state for de-scoping will be a transition away from namespace-scoped APIs like `Subscription`, `InstallPlan`, and `ClusterServiceVersion` to a different set of cluster-scoped APIs like `Operator` and `Install`.

These newer APIs avoid much of the complexity introduced by scoping, and are already in progress. The `Operator` API is available as a read-only API in 4.6.

The document fully describing this end state is the [Simplify OLM APIs](https://github.com/operator-framework/enhancements/blob/master/enhancements/simplify-olm-apis.md) enhancement. **Note: the enhancement is currently pending an update to call out the scoping issues**.

The only change as a result of the decision to de-scope is to always assume that operators are descoped in the new APIs.

At some point, it is likely that OLM will introduce some namespaced APIs again for the installation of non-operator content. But this will be accompanied by its own enhancement.

## Transition Plan

### RBAC

Operators may still request the cluster- and namespace-scoped permissions they need to run within their installation namespace at install time (i.e. today, via `clusterPermissions` and `permissions`). But unlike with scoped operators, the namespace-scoped permissions do not get copied to a pre-defined set of namespaces and no bindings are created by default.

De-scoped operators need to be concerned about RBAC in two general areas:

#### 1 - Access to do work in a particular namespace

A descoped operator will not be allowed to install any bindings - this is the job of an administrator.

Operators will provide a set of `ClusterRoles` for the work that they need to do.
For example, an operator with serviceaccount `foo-sa` might provide this `ClusterRole`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  generateName: foo-sa-required-perms
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
```

An administrator can bind this with a `ClusterRoleBinding` if they want the operator to be able to perform its tasks in all namespaces. Or, they may drop a `RoleBinding` for this role, binding the operator's `ServiceAccount`, in each desired namespace.

An operator author may provide more or less granular `ClusterRoles`. Most operators likely just need one per serviceaccount, but others may wish to provide granular feature-based ClusterRoles so that an administrator can enable/disable portions of the operator's functionality.

A convention will be established: a single ClusterRole for a ServiceAccount will be assumed to be required for the operator to operate. Multiple ClusterRoles will be considered optional, such that creating/deleting them will enable/disable certain aspects of the operator.

One exception: De-scoped operators are always expected to have `read,update /status` permission on the APIs they own in all namespaces. This is to ensure that the operator has a communication channel with users (to communicate, for example, that they do not have the proper permission to do work in a particular namespace). OLM will raise alerts when there are CRs in a cluster with no controller capable of updating their status.

Note: This is a degradation of the install experience, see below for the interim solution.

#### 2 - Granting a user permission to use the operator's provided APIs

For scoped operators, OLM automatically generates ClusterRoles and automatically aggregates them to the default `admin`, `edit`, and `view` ClusterRoles, based on the availability of an operator in a particular namespace.

For de-scoped operators, OLM will leave this to the administrator.
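
To make the two RBAC areas concrete, here is a minimal sketch of the objects an administrator might create by hand for the `foo-sa` example. Every name not given in the text above is an assumption for illustration only: the resolved ClusterRole name (`generateName` produces a suffix that is not known in advance), the namespaces `foo-operator` and `team-a`, and the API group `foo.example.com` with resource `foos`.

```yaml
# Area 1, option A: let the operator work in every namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: foo-operator-all-namespaces     # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-sa-required-perms-x7k2p     # hypothetical generated name
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: foo-operator               # hypothetical install namespace
---
# Area 1, option B: let the operator work in one namespace only.
# A RoleBinding may reference a ClusterRole; the grant is then limited
# to the RoleBinding's own namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foo-operator
  namespace: team-a                     # hypothetical tenant namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-sa-required-perms-x7k2p     # hypothetical generated name
subjects:
- kind: ServiceAccount
  name: foo-sa
  namespace: foo-operator
---
# Area 2: give users with the default `edit` role access to the
# operator's API by aggregation.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-api-edit                    # hypothetical name
  labels:
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
- apiGroups: ["foo.example.com"]        # hypothetical API group
  resources: ["foos"]                   # hypothetical resource
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

Options A and B are alternatives; an administrator would normally pick one per operator. The aggregation label makes the rule part of `edit` wherever `edit` is bound, so it does not by itself restrict the API to particular namespaces.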
Note: This is a degradation of the install experience, see below for the interim solution.

#### OperatorGroups

The above changes for de-scoped operators, without any additional tooling, degrade the operator installation experience. There is no declarative way to indicate that an operator should be permitted to work in a set of namespaces (it becomes a two step process: install, bind).

For now, we will repurpose `OperatorGroups` for RBAC management.

An `OperatorGroup` will not be required for the installation of a de-scoped operator (as they are today for scoped operators). If a de-scoped operator is installed in an `OperatorGroup`, the namespace list on the `OperatorGroup` is used to determine which namespaces will get automatic bindings for the operator's serviceaccounts, and OLM will generate and aggregate API access RBAC roles to `view`, `edit`, and `admin`.

This should be considered an interim solution. A more configurable and more supported API may look something like [RBACManager](https://github.com/FairwindsOps/rbac-manager#dynamic-namespaces-and-labels).

### Determining Scope

Since CSVless bundles are not yet available, we will indicate that an operator is descoped by marking `installModes` as optional.

Any CSV with an empty `installModes` block will be treated as a de-scoped operator.

Any CSV with an `installModes` block that supports AllNamespace-mode **only** will be treated as a de-scoped operator that provides APIs to all users by default (i.e. OLM will generate an AllNamespace OperatorGroup for it).

### Operator Visibility

The `Operator` API is a cluster-scoped API to reflect the cluster-scoped nature of API extensions. Users of the provided APIs can tell that they exist via discovery, but admins may be hesitant to grant `read` on the Operator API itself to learn more about the operator and its services.

Discovery of available operators for UI will flow through discovery:

1. Query to discover which APIs are readable/writable by the current user
1. Query discovery for those APIs
1. Read the schema for the API. The top level description of the API resource should include: the name of the owning operator, a description of the owning operator, along with links, descriptors, and other metadata.
1. UI groups APIs based on the operator identity in their schema descriptions.

TODO: this will not work well for APIs shared between operators, and makes it easier to hit etcd key limits.

### Dependency Resolution / Updates

Dependency resolution will continue to take place at the namespace scope for operators installed via `Subscriptions`. Any operators installed this way will have an `Operator` object created automatically for visibility, but the `Operator` may begin to emit warnings about the scoped nature of the installation.

Resolution will also take place among de-scoped operators at the cluster scope. Note that resolution or updates of de-scoped operators can be blocked by issues with scoped operators.

### Updating from a scoped operator to a descoped operator

TODO

### Installation-specific migration steps

#### Single Install of an Operator

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

- Remove the namespace suffix from the existing generated Operator object (i.e. operatorname-ns-foo -> operatorname)
- Emit a warning message on the Operator objects that the operator is not de-scoped.
- If CSV supports multinamespace mode, suggest migrating to a single instance of the multinamespace operator - which will then be picked up by the single instance transition.
- An incoming update may de-scope this operator, or users can manually migrate to a de-scoped alternative.

#### AllNamespace Operators

Transition:

- De-namespace the Operator object (i.e. operator-ns-foo -> operator)
- Ensure the default bindings are available in all namespaces.
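
For reference, the AllNamespace-only case described under Determining Scope could be pictured with the two fragments below. This is a sketch under assumptions: the object names and namespace are hypothetical, and the OperatorGroup is written out only to show what "an AllNamespace OperatorGroup" looks like today (an OperatorGroup without `targetNamespaces` or a selector targets all namespaces); the text says OLM would generate it.

```yaml
# CSV fragment: only AllNamespaces is supported, so under this plan the
# operator would be treated as de-scoped with APIs available to all users.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: foo-operator.v1.0.0           # hypothetical name
  namespace: foo-operator             # hypothetical namespace
spec:
  installModes:
  - type: OwnNamespace
    supported: false
  - type: SingleNamespace
    supported: false
  - type: MultiNamespace
    supported: false
  - type: AllNamespaces
    supported: true
---
# OperatorGroup with no target namespaces: it selects every namespace.
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: global-operators              # hypothetical name
  namespace: foo-operator
spec: {}
```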
#### Multiple Installs of the Same Version

Affects: OwnNamespace, SingleNamespace, MultiNamespace

Transition:

- Emit a warning message on the Operator objects that the operator is not de-scoped.
- If CSV supports multinamespace mode, suggest migrating to a single instance of the multinamespace operator - which will then be picked up by the single instance transition.
- Otherwise, requires manual cleanup.

#### Multiple Installs at Different Versions

Affects: OwnNamespace, SingleNamespace, MultiNamespace

- Emit a warning message on the Operator objects that the operator is not de-scoped and requires manual clean-up (selection of new alternative operators).

## Operator Patterns

Many of the use-cases for scoped operators are better suited as features within the operator itself.

### Limit the Blast Radius of Operators

One of the primary reasons for scoping operators is to reason about their Blast Radius - i.e. in the case of a bug or malicious control, limit the worst-case scenario for the cluster.

Limiting blast radius for de-scoped operators is generally simpler to reason about, because it relies heavily on auditable RBAC policy.

In this example, the de-scoped operator is installed with a ClusterRole only, and an administrator must explicitly bind each namespace that the operator should be allowed to use via a RoleBinding:

```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      operator("operator")
      cr("ClusterRole: Pod R/W")
      sa("ServiceAccount")
      operator -->|uses| sa
    end
    subgraph User Namespace
      rb("RoleBinding")
      wk("Workload Pod")
      rb -->|uses| cr
      rb -->|bound to| sa
    end
    operator --> |allowed|wk
    subgraph kube-system
      sp("Sensitive Pod")
      no("No RoleBinding<br />No Access")
    end
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
```

### Internal APIs

Operators scoped to a single namespace are often used to provide internal APIs that should not be available anywhere else in the cluster.
These may be config APIs that configure cluster operation or APIs that are otherwise sensitive.

This differs from the example above of limiting an operator's blast radius - the operator author doesn't want it to be possible for an administrator to expose the APIs to other users and namespaces, or wants a guarantee that the operator is not given permission outside of its installation namespace.

In this example, an operator is granted restricted permissions for a single namespace, and provides a Cluster-Scoped API. This ensures that anyone with access to write the API has been vetted, and that the operator itself can only perform operations within its own namespace.

```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      operator("operator")
      cr("Role: Pod R/W")
      rb("RoleBinding")
      wk("Workload Pod")
      sa("ServiceAccount")
      operator -->|uses| sa
      rb -->|uses| cr
      rb -->|bound to| sa
      operator --> |creates|wk
    end
    api("InternalOperatorAPI<br />(Cluster Scoped)")
    operator -->|watches|api
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
```

### Isolate Tenants

Scoping is often desired in order to deal with tenant isolation and prevent noisy-neighbor effects. Without multiple clusters or a first-class tenancy effort within Kubernetes, this will never be truly possible (i.e. scoped operators may provide isolation for their APIs, but not for underlying Kubernetes APIs, etcd access, or physical cluster resources).

For operators that still wish to isolate tenants, however, this is possible by having a single parent operator spin up multiple control loops or even operator pods. These architectures may also be a strategy to scale operators horizontally.

#### Controller Per Tenant

Spin up controllers per tenant within a running process.
```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      subgraph Operator Pod
        operator("operator")
        operator -->|starts| child1("Controller for 1")
        operator -->|starts| child2("Controller for 2")
      end
    end
    subgraph User Namespace 1
      wk("Workload Pod")
    end
    subgraph User Namespace 2
      wk2("Workload Pod")
    end
    child1 --> |manages|wk
    child2 --> |manages|wk2
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
  style child1 fill:#8addf2,stroke:#333,stroke-width:4px
  style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

#### Child Operator Pod Per Tenant

This is similar to the above, but spins up controllers in their own pods. This may be valuable to leverage the cluster scheduler, or to have specific tenants managed with an (auditable) reduction in permission scope.

```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      subgraph Operator Pod
        operator("operator")
      end
      subgraph Child Pod 1
        child1("Controller for 1")
      end
      subgraph Child Pod 2
        child2("Controller for 2")
      end
    end
    subgraph User Namespace 1
      wk("Workload Pod")
    end
    subgraph User Namespace 2
      wk2("Workload Pod")
    end
    operator -->|starts| child1("Controller for 1")
    operator -->|starts| child2("Controller for 2")
    child1 --> |manages|wk
    child2 --> |manages|wk2
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
  style child1 fill:#8addf2,stroke:#333,stroke-width:4px
  style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

#### Sidecar Operator Pods

This is a similar pattern, but places the pods near to the workloads they manage. This has different visibility and permission implications that may be desirable depending on the tasks being performed.
```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      subgraph Operator Pod
        operator("operator")
      end
    end
    subgraph User Namespace 1
      subgraph Child Pod 1
        child1("Controller for 1")
      end
      wk("Workload Pod 1")
    end
    subgraph User Namespace 2
      subgraph Child Pod 2
        child2("Controller for 2")
      end
      wk2("Workload Pod 2")
    end
    operator -->|starts| child1("Controller for 1")
    operator -->|starts| child2("Controller for 2")
    child1 --> |manages|wk
    child2 --> |manages|wk2
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
  style child1 fill:#8addf2,stroke:#333,stroke-width:4px
  style child2 fill:#8addf2,stroke:#333,stroke-width:4px
```

### Canary Rollouts

Scoping has also been considered as a solution for canary rollouts of new operator versions. A new version may be released to manage one namespace, while the previous release manages the rest of the cluster's namespaces.

During an upgrade, there will be two operators running at one time. This provides an opportunity for both operators to transfer ownership, hand off locks, or otherwise coordinate the rollout of operator resources - combined with a strategy for [limiting the blast radius](#limit-the-blast-radius-of-operators) and use of [operator conditions](https://olm.operatorframework.io/docs/advanced-tasks/communicating-operator-conditions-to-olm/#upgradeable), effective and domain-specific canary or other rollout strategies can be implemented.
```mermaid
graph TD
  subgraph Cluster
    subgraph Operator Namespace
      subgraph Previous Operator Pod
        operator("operator")
      end
      subgraph New Operator Pod
        new("operator<br />Upgradeable: false")
      end
    end
    subgraph User Namespace 1
      wk("Workload Pod 1")
    end
    subgraph User Namespace 2
      wk2("Workload Pod 2")
    end
    subgraph User Namespace 3
      wk3("Workload Pod 3")
    end
    operator --> |manages|wk
    operator --> |manages|wk2
    new --> |manages|wk3
  end
  style operator fill:#8addf2,stroke:#333,stroke-width:4px
  style new fill:#8addf2,stroke:#333,stroke-width:4px
```

The precise mechanism for determining when and how the new version should take control can be determined by the operator (or a separate or external tool), which can allow automated rollout based on domain-specific metrics.

For example, an operator might use owner references or labels on CRs to indicate which version is currently managing that instance, with ownership flipped manually. Additional automation can be added to automatically flip management based on metrics / percent rollout, e.g. "for every 2 hours that metrics still look healthy, increase rollout to management by the new operator by 10%".

This is a strategy for limiting the impact of rolling out the operator itself, but similarly domain-specific rollout strategies may be defined for any operands (rules about which versions can be upgraded to which others, etc).

The only part of this that OLM needs to be informed of is that the rollout is taking place - it is important to set the `Upgradeable=false` `OperatorCondition` so that OLM does not attempt to interfere with the handoff.
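
As a rough illustration of that last point, the condition that tells OLM not to upgrade during a handoff could look like the object below. This is a sketch under stated assumptions: the object name, namespace, `reason`, `message`, and timestamp are made up for illustration, and the API version and the field that holds the conditions (`spec.conditions` here) depend on the OLM release in use. Operators typically set this condition through a client library rather than by applying YAML.

```yaml
apiVersion: operators.coreos.com/v2     # assumption: varies by OLM release
kind: OperatorCondition
metadata:
  name: foo-operator.v1.1.0             # hypothetical name
  namespace: foo-operator               # hypothetical namespace
spec:
  conditions:
  - type: Upgradeable
    status: "False"                     # blocks OLM upgrades while set
    reason: CanaryRolloutInProgress     # hypothetical reason
    message: "New version manages a subset of namespaces; handoff not finished."
    lastTransitionTime: "2021-01-01T00:00:00Z"   # placeholder timestamp
```

Once the new version manages every namespace, the operator would set the status back to `"True"` (or remove the condition) so that OLM can resume upgrades.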
