---
toc:
maxLevel: 4
---
# Cloud Pak OLM Channel Taxonomy Specification for On-Line
OBSOLETE. See: https://github.ibm.com/cdjohnson/specs/blob/master/ibm-channel-taxonomy.md
### Disposition
- [x] **Draft**
- [ ] **Under Review**
    - links
- [ ] **Approved** (date)
    - Approval Record:
    - Notes:
- [ ] **In Playbook**
    - Page:
---
[toc]
## Overview
Operator Lifecycle Manager (OLM) currently uses a *Subscription* model with *Channels* to provision an operator. Each Operator Package has one or more channels. Each channel has a Directed Acyclic Graph (DAG) with a single Head, where the Head represents the Current Version of that channel. Customers choose the Operator Package (e.g. `ibm-cp-integration`) and the Channel to subscribe to (e.g. `v1.1`).
Packages also define a _default_ channel, which the OpenShift UI uses to recommend a channel and which is used to select channels for dependent Operators that are provisioned automatically when an operator defines a static dependency.
Through the use of Channels, OLM will automatically transition/upgrade operators to the Head version through the upgrade graph edges. OLM will NOT, however, automatically switch a channel to satisfy a dependency. This can result in upgrade paths that are not obvious to the customer.
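As a rough illustration of the "single Head" idea, the sketch below treats the Head as the one entry that no other entry replaces or skips. The entry fields mirror the `name`, `replaces` and `skips` keys of a channel entry; `skipRange` is ignored, and the function name and sample bundle names are hypothetical.

```python
from typing import Dict, List


def find_head(entries: List[Dict[str, object]]) -> str:
    """Return the name of the only entry that no other entry replaces or skips."""
    names = {entry["name"] for entry in entries}
    superseded = set()
    for entry in entries:
        if entry.get("replaces"):
            superseded.add(entry["replaces"])
        superseded.update(entry.get("skips", []))
    heads = sorted(names - superseded)
    if len(heads) != 1:
        raise ValueError(f"expected exactly one head, found {heads}")
    return heads[0]


# Hypothetical channel with a linear replaces chain plus one skip edge.
channel_v1_1 = [
    {"name": "foo.v1.1.0"},
    {"name": "foo.v1.1.1", "replaces": "foo.v1.1.0"},
    {"name": "foo.v1.1.2", "replaces": "foo.v1.1.1", "skips": ["foo.v1.1.0"]},
]
print(find_head(channel_v1_1))
```

A channel whose entries leave two candidates unreplaced would raise an error here, which is the situation the single-Head rule is meant to exclude.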
This specification defines:
1. An update to the channel naming conventions to be used by Cloud Pak container software to satisfy additional customer use cases, allowing for a consistent user experience.
2. A specification to model the DAG to satisfy the channel intent.
3. A specification for defining operator versions using semantic versioning (semver).
4. Instructions on how to guide customers to provision and upgrade their operators successfully when there are interdependencies.
Prerequisites:
1. This specification requires the use of File Based Catalogs.
### Current Specification and History
Most IBM Cloud Paks currently use an Operator Lifecycle Manager (OLM) channel taxonomy of `vX.Y` or `vMAJOR.MINOR` which coincides with the version of the operator in the `ClusterServiceVersion`. This gives customers the ability to PIN an operator to a specific minor version, controlling the risk of larger features breaking their deployments, but continuing to support patch updates which include bug and security fixes.
We purposefully did not adopt stability-based (alpha, beta, production) or frequency-based (dev, rc, candidate, fast, stable) channel names because of the limitations that the OLM tooling had with regard to maintaining Bundles and Catalog Index Images. The Channel definition was wired into the Bundle, which made it very difficult to promote or move a bundle between channels. It was also very difficult to have different upgrade rules for different channels.
Typical Channels:
* `v1.0`
* `v1.1`
* `v1.2-eus`
* `v2.0`
We also established a CSV versioning taxonomy to enable a common set of use cases:
* All operator bundles use semver.
* All operator bundles (versions) use `replaces` for all versions in the channel except for the origin version.
* All operator bundles specify `olm.skipRange` to allow upgrading from one or more previous versions.
This enabled the following use cases:
* Customers can install any old version.
* Customers can upgrade to any version from any version.
The side effects include:
* Many more images must be mirrored.
* Difficult to remove old operator versions.
* Upgrading from any to any is not always possible.
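The effect of combining `replaces` with `olm.skipRange` can be sketched as follows. This is a simplified model, not OLM's resolver: it handles only plain `MAJOR.MINOR.PATCH` versions and space-separated comparators, and the dictionary keys `replaces` and `skip_range` are hypothetical stand-ins for the bundle metadata.

```python
import re
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse MAJOR.MINOR.PATCH, tolerating a leading 'v'; pre-release tags are not handled."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))


def in_range(version: str, skip_range: str) -> bool:
    """Check a version against comparators such as '>=1.0.0 <1.1.21' (all must hold)."""
    current = parse_version(version)
    for comparator in skip_range.split():
        match = re.fullmatch(r"(<=|>=|<|>|=)(.+)", comparator)
        if match is None:
            raise ValueError(f"unsupported comparator: {comparator!r}")
        bound = parse_version(match.group(2))
        satisfied = {
            "<": current < bound,
            "<=": current <= bound,
            ">": current > bound,
            ">=": current >= bound,
            "=": current == bound,
        }[match.group(1)]
        if not satisfied:
            return False
    return True


def can_upgrade(installed: str, target: dict) -> bool:
    """True if the target names `installed` in `replaces` or covers it with `skip_range`."""
    replaces = target.get("replaces")
    if replaces is not None and parse_version(replaces) == parse_version(installed):
        return True
    skip_range = target.get("skip_range")
    return skip_range is not None and in_range(installed, skip_range)


# Hypothetical head bundle of a channel.
head = {"version": "1.1.21", "replaces": "1.1.20", "skip_range": ">=1.0.0 <1.1.21"}
print(can_upgrade("1.0.3", head), can_upgrade("0.9.5", head))
```

A wide skip range is what makes "any version to any version" upgrades appear possible in the graph, whether or not the operator itself can handle the jump.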
### File Based Catalog Overview
File Based Catalogs (FBC) are a declarative method of describing a Catalog Index. The previous approach was imperative and edge-based: `opm registry add` added a bundle to an existing catalog, and the bundle was the source of truth. With FBC, the index is described declaratively and is itself the source of truth for the packages, channels and bundles in the catalog. This gives us much more flexibility for describing and changing the taxonomy of a catalog without needing to rebuild and re-publish bundle images.
This also enables us to create DIFFERENT DAGs by Channel rather than one DAG across all channels.
### Customer Use Cases
Customers are any consumer of products in the IBM Public Catalog.
#### In Scope
1. I want the latest and greatest version automatically installed on my cluster as soon as IBM releases new versions without disruption (continuous delivery)
    - Continuous Delivery
2. I want the latest security patches and bug fixes installed on my cluster as soon as IBM releases new versions without disruption (major.minor)
    - Automatic z-build
3. I want to clearly understand what version(s) and intent of the operators a channel represents.
    - Version Scope (1.x, 1.0.x)
    - Maturity (dev, beta, production, ltsr)
    - Frequency (candidate, fast, stable)
4. I want to understand how to upgrade my operators regardless of which channel I currently have selected.
#### Out of Scope
The following use cases are handled by other specifications.
1. Software Promotion scenarios
2. Curated Catalog scenarios
3. Offline Scenarios (Airgap 2.0)
4. Contextual Default Channel selection (production begets production)
5. Defining dependency relationships (static and dynamic dependencies)
6. Detecting which Channels and Operators are Deprecated (requires OLM and OCP function)
7. Defining the Operand version
### Workload Use Cases
Workloads are operator producers such as Cloud Pak or Certified Container Software products and components.
#### In Scope
1. I want to be able to interoperate with my existing channel taxonomy
2. I want to interoperate with other dependent operators
3. I want to be compatible with strategic initiatives such as AirGap 2.0, Software Promotion and File Based Catalogs.
4. I want to avoid maintaining obsolete versions (e.g. versions with security problems)
5. I want to test and validate my catalogs before publishing to customers to make sure new/upgrade paths work properly.
#### Out of Scope
### CP CICD Use Cases
#### In Scope
1. I want to support the current Bundle Add process in parallel to the Sparse FBC process allowing workloads to adopt when ready.
#### Out of Scope
### Channel Strategies
There are three different channel naming strategies employed by products today (showing current usage in the Red Hat catalogs):
- **Version Scope** (latest, 1.x, 1.0.x)
    - redhat-operator: Used by most operators in an inconsistent way: `4.9`, `8.2.x`, `1.24-stable`, `release-2.3`, `stable-4.9`, `alpha-0.11`, `latest`
    - redhat-marketplace: `v5` (crunchy)
    - certified-operators: `v5`, `v2.2`, `v1.1.0`, `release-v1.20`
- **Maturity** (dev, alpha, beta, stable, ltsr)
    - redhat-operator: `alpha`, `alpha-0.11`
    - redhat-marketplace: `alpha`, `beta`, `stable`, `lts` (`lts` is used by zabbix)
        - Some use `stable` for production.
    - certified-operators: `alpha`, `beta`, `beta2` (openliberty), `lts`
- **Frequency** (candidate, preview, fast, stable)
    - redhat-operator: `candidate`, `preview`, `preview-1.0`, `stable`, `fast`
    - redhat-marketplace: `candidate`, `fast`, `stable` (eamli is the only operator using these three)
    - certified-operators: `stable`

The `stable` name is used in both the Maturity and Frequency strategies to signify a production intent.
There are examples of mixing Version with Maturity or Version with Frequency, but not Maturity with Frequency.
OpenShift itself uses a Version + Frequency strategy.
#### Version Strategy
The version strategy gives customers a way to choose a `Major` or a `Major.Minor` version of the operator (also known as `X` and `X.Y`). Operators do not move from channel to channel with this model unless combined with other strategies. Customers can also choose the `latest` channel to allow transitioning between Major releases.
When using this naming strategy, the operators within the channel must match the version of the channel. i.e. channel `v1` should not have `operator-2.1.4`.
**Format:**
```
version-channel = latest | ("v" (major | major-minor))
latest = "latest" # Latest CD release
major-minor = major "." minor
major = DIGIT # 0 implies pre-release
minor = DIGIT # 0 indicates the initial version of a major
```
**Example:**
Each channel has a separate graph with its own head (in **bold**):
- `latest`
    - 1.0.0
    - 1.0.1
    - 1.1.1
    - **2.0.2** (head)
- `v1`
    - 1.0.0
    - 1.0.1
    - 1.1.0
    - **1.1.1**
- `v1.0`
    - 1.0.0
    - **1.0.1**
- `v1.1`
    - 1.1.0
    - **1.1.1**
- `v2`
    - 2.0.0
    - 2.0.1
    - **2.0.2**
- `v2.0`
    - 2.0.0
    - 2.0.1
    - **2.0.2**
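The rule that a channel only contains operator versions matching its name can be checked mechanically. The sketch below is one possible reading: `latest` accepts every version, `vX` accepts any `X.*.*`, and `vX.Y` accepts any `X.Y.*`. It assumes that multi-digit numbers are allowed and that the head is simply the highest accepted version, whereas in a real catalog the head is determined by the channel's graph edges. The function names are hypothetical.

```python
import re
from typing import Iterable, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain MAJOR.MINOR.PATCH string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def channel_accepts(channel: str, version: str) -> bool:
    """True if `version` may appear in a Version-strategy channel named `channel`."""
    if channel == "latest":
        return True
    match = re.fullmatch(r"v(\d+)(?:\.(\d+))?", channel)
    if match is None:
        raise ValueError(f"not a version channel: {channel!r}")
    major, minor, _patch = parse_version(version)
    if int(match.group(1)) != major:
        return False
    return match.group(2) is None or int(match.group(2)) == minor


def channel_head(channel: str, versions: Iterable[str]) -> str:
    """Assumed head rule: the highest version the channel accepts."""
    members = [v for v in versions if channel_accepts(channel, v)]
    if not members:
        raise ValueError(f"no versions match channel {channel!r}")
    return max(members, key=parse_version)


released = ["1.0.0", "1.0.1", "1.1.0", "1.1.1", "2.0.0", "2.0.1", "2.0.2"]
for name in ["latest", "v1", "v1.0", "v1.1", "v2", "v2.0"]:
    print(name, channel_head(name, released))
```

With the versions from the example above, this yields the same heads as the bold entries in the list.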
#### Frequency Strategy
The frequency strategy allows products to promote their operator versions from channel to channel once the operator version has had more testing and feedback from quality and customers. This strategy is used in conjunction with a Version strategy and optionally with a maturity strategy, and can be used for those operators who have an efficient delivery pipeline or an LTSR release.
When this strategy is used in combination with the Version strategy, the `stable` frequency is implied when absent.
**Format:**
```
frequency = "candidate" | "fast" | "stable" | "ltsr"
```
#### Maturity Strategy (OPTIONAL)
The maturity strategy allows products to introduce various levels of support and intent for an operator version. This allows development pipelines to see and use pre-release versions. This strategy is used in conjunction with a Version strategy and optionally with a frequency strategy.
**Format:**
```
maturity = "dev" | "beta" | "stable"
```
**dev**: An unsupported, development channel used for testing early releases. May be unstable or not work at all.
**beta**: An unsupported, beta-level development channel used for testing early releases. The level of support is determined by the product and may or may not be part of an official beta program.
**stable**: Fully supported production-level channel (default)
When this strategy is used in combination with the Version strategy, the `stable` maturity is implied when absent.
#### Version + Frequency Strategy
This strategy is modeled after [OpenShift's versioning strategy](https://docs.openshift.com/container-platform/4.9/updating/understanding-upgrade-channels-release.html), but inverts the order, placing Version before Frequency, which makes more sense from a grouping and sorting perspective. For example, OpenShift uses `fast-4.9`, whereas this specification uses `v4.9-fast`.
To decrease complexity, ease readability and maintain backward compatibility, the `stable` frequency is implied and is not included in the name. The version by itself implies a stable, production channel.
```
channel-name = version-channel ["-" nonstable-frequency]
nonstable-frequency = "candidate" | "fast" | "ltsr"
```
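A small parser makes the implied `stable` frequency explicit. The following is a sketch of the grammar above using a regular expression. It assumes multi-digit major and minor numbers are acceptable, and the `ChannelName` type and `parse_channel` function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ChannelName(NamedTuple):
    major: Optional[int]  # None for the "latest" channel
    minor: Optional[int]  # None for "latest" and for major-only channels
    frequency: str        # "stable" when the name carries no suffix


_CHANNEL = re.compile(
    r"(?:latest|v(?P<major>\d+)(?:\.(?P<minor>\d+))?)"
    r"(?:-(?P<freq>candidate|fast|ltsr))?"
)


def parse_channel(name: str) -> ChannelName:
    """Split a channel name into version scope and frequency."""
    match = _CHANNEL.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid channel name: {name!r}")
    major = int(match["major"]) if match["major"] is not None else None
    minor = int(match["minor"]) if match["minor"] is not None else None
    return ChannelName(major, minor, match["freq"] or "stable")


print(parse_channel("v4.9-fast"))
print(parse_channel("v1"))
```

Names in the OpenShift order, such as `fast-4.9`, and names with an explicit `-stable` suffix are rejected, since the grammar only allows the non-stable frequencies as suffixes.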
Example 1 (three frequencies for agile pipelines):
- v1.0-candidate
    - v1.0.0
    - v1.0.1
    - **v1.0.2**
- v1.0-fast
    - v1.0.1
    - **v1.0.2**
- v1.0
    - **v1.0.1**
- v1.1-candidate
    - v1.1.0
    - v1.1.1
    - **v1.1.2**
- v1.1-fast
    - v1.1.0
    - **v1.1.2**
- v1.1
    - **v1.1.0**

Example 2 (two frequencies for non-agile pipelines, with an LTSR):
- v1.0
    - **v1.0.1**
- v1.1
    - v1.1.0
    - v1.1.1
    - **v1.1.2**
- v1.1-ltsr
    - **v1.1.1**
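Promotion between frequency channels can be pictured as adding an already-released version to the next channel in the order candidate, fast, stable. The sketch below models only channel membership, not upgrade edges. The `promote` helper and the dictionary layout are hypothetical, and `ltsr` is left out because the text does not describe how a version reaches that channel.

```python
from typing import Dict, List

FREQUENCY_ORDER = ["candidate", "fast", "stable"]


def channel_for(version_channel: str, frequency: str) -> str:
    """Build the channel name; the stable frequency has no suffix."""
    if frequency == "stable":
        return version_channel
    return f"{version_channel}-{frequency}"


def promote(catalog: Dict[str, List[str]], version_channel: str, version: str) -> str:
    """Add `version` to the next frequency channel it is not yet in and return that name."""
    for current, target in zip(FREQUENCY_ORDER, FREQUENCY_ORDER[1:]):
        source_name = channel_for(version_channel, current)
        target_name = channel_for(version_channel, target)
        in_source = version in catalog.get(source_name, [])
        in_target = version in catalog.get(target_name, [])
        if in_source and not in_target:
            catalog.setdefault(target_name, []).append(version)
            return target_name
    raise ValueError(f"{version} cannot be promoted any further in {version_channel}")


# Membership taken from Example 1 for the v1.0 stream.
catalog = {
    "v1.0-candidate": ["1.0.0", "1.0.1", "1.0.2"],
    "v1.0-fast": ["1.0.1", "1.0.2"],
    "v1.0": ["1.0.1"],
}
print(promote(catalog, "v1.0", "1.0.2"))
```

Here `1.0.2` is already in the fast channel, so the next promotion places it in the stable `v1.0` channel.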
### Frankenstein Example with all taxonomies
This example shows all combinations of version + maturity + frequency. This is a bit busy and confusing for customers to consume:
- `candidate`
- `fast`
- `stable` (used for latest continuous delivery, stable is implied)
- `alpha`
- `beta`
- `v1-dev-candidate`
- `v1-dev-fast`
- `v1-dev` (stable is implied)
- `v1-beta`
- `v1-beta-candidate`
- `v1-beta-fast`
- `v1-beta` (stable is implied)
- `v1` (production and stable are implied)
- `v1-candidate` (production is implied)
- `v1-fast`
- `v1.0-dev-candidate`
- `v1.0-dev-fast`
- `v1.0-dev` (stable is implied)
- `v1.0-beta`
- `v1.0-beta-candidate`
- `v1.0-beta-fast`
- `v1.0-beta`
- `v1.0`
- `v1.0-candidate`
- `v1.0-fast`
### Minimal Taxonomy (Recommended)
Support Continuous Delivery for **latest**, **latest major** and **latest minor**. This is an extension of what we have today, adding in `vMAJOR` and `latest`.
Example:
- `latest` (continuous delivery production)
- `v1` (production, continuous delivery for v1.x stream)
- `v1.0` (production, automatic z-build for version 1.0.x)
- `v1.1-ltsr` (production, automatic z-build for version 1.1.x, an LTS release)
### IBM Public Catalog Integration (OBSOLETE)
This section is no longer valid. We are NOT going to merge catalogs in our pipeline. Instead, we will accept full catalogs as inputs and require product teams to pre-merge all of their streams into a single, cumulative product catalog. The IBM Operator Catalog will then _replace_ the included packages within the centralized catalog.
#### Overview
All Cloud Paks and Certified Container Software are included in the IBM Operator Catalog. This catalog is the central location for customers to discover and install OLM-enabled products and is the source of truth for these products.
Today (2021), we use the metadata within the CASE Bundle and OLM Bundle to update the common catalog using the imperative `opm registry add` commands and other custom scripts to handle some use cases that `opm` does not handle. This process is referred to here as the *Bundle Add* process.
This specification introduces a new method using FBC to allow:
1. Updated channel taxonomies presented in this specification
2. Deprecation and removal of operator versions and channels
3. Operator promotion
4. Setting the default channel
5. Supporting multiple version streams.
#### Sparse File Based Catalog
To support these use cases, workloads will supply a Sparse File Based Catalog (Sparse FBC) in the CASE. This FBC will contain the authoritative set of Package, Channel and Bundle definitions required for the public catalog. This catalog will conditionally overlay/replace all previous operators in the catalog that it specifies.
The Sparse FBC is a fully-functional catalog representing the current head of the stream, in the context of a CASE version delivery stream as it works today.
Multi-Stream Example, removing old bundle versions:
```
ibm-foo-case-v1-bundle
  case
    ibm-foo
      inventory
        fooOperatorSetup
          files
            ibm-public-catalog.yaml
ibm-foo-case-v2-bundle
  case
    ibm-foo
      inventory
        fooOperatorSetup
          files
            ibm-public-catalog.yaml
```
##### How to handle multiple streams with merge conflicts
Since the CP CICD process expects independent delivery streams for each continuous delivery version, there are two cases where the sparse catalog will collide: The `olm.package` and overlapping `olm.channel` definitions (e.g. `latest`).
To resolve this, we use a *Replace by Version* strategy. We will add metadata to the catalog (a version) that conditionally replaces the entire object only if the new object has a greater version. This allows products to choose which stream is the authority for the contents of the object.
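A minimal sketch of the *Replace by Version* rule, assuming the version is carried in an `ibmMetadata.version` field on each object and that a missing version compares as 0 (the text does not specify this case). The `replace_by_version` function and the in-memory catalog keyed by schema, package and name are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, str, str]  # (schema, package, name)


def object_key(obj: dict) -> Key:
    """Identify a catalog object; olm.package objects have no `package` field."""
    return (obj["schema"], obj.get("package", ""), obj["name"])


def metadata_version(obj: dict) -> int:
    """Assumption: an object without ibmMetadata.version compares as version 0."""
    return int(obj.get("ibmMetadata", {}).get("version", 0))


def replace_by_version(catalog: Dict[Key, dict], incoming: dict) -> bool:
    """Store `incoming` if it is new or strictly newer; report whether it was stored."""
    key = object_key(incoming)
    existing = catalog.get(key)
    if existing is None or metadata_version(incoming) > metadata_version(existing):
        catalog[key] = incoming
        return True
    return False


catalog: Dict[Key, dict] = {}
latest_v1 = {"schema": "olm.channel", "package": "foo", "name": "latest",
             "entries": [{"name": "foo.v1.1.21"}], "ibmMetadata": {"version": 1}}
latest_v2 = {"schema": "olm.channel", "package": "foo", "name": "latest",
             "entries": [{"name": "foo.v2.0.1"}], "ibmMetadata": {"version": 2}}

print(replace_by_version(catalog, latest_v1))  # first object for this key
print(replace_by_version(catalog, latest_v2))  # newer stream wins
print(replace_by_version(catalog, latest_v1))  # older stream no longer overrides
```

Because the comparison is strict, the order in which the streams are processed does not change which stream ends up owning the `latest` channel.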
In the OCP 4.10 timeframe, there is only one way to add metadata to the catalog, and that is by adding new "meta" objects. These are Blobs in the catalog that the OPM tooling will blindly persist and propagate. In the future, we will be able to add `properties` to all of the existing objects. See [operator-registry issue 898](https://github.com/operator-framework/operator-registry/issues/895).
##### Example Multi-Stream Sparse Catalogs
**Stream v1 ibm-public-catalog.yaml**
Includes the `foo` package (versioned), channels and bundles for this stream.
```yaml
---
schema: olm.package
name: foo
description: This operator does 1,2 and 3....
defaultChannel: v1
---
schema: ibm.package
name: foo
version: 1
---
schema: olm.channel
package: foo
name: latest
entries:
  - name: foo.v1.1.21
    skipRange: <v1.1.21
ibmMetadata:
  version: 1
---
schema: ibm.channel
package: foo
name: latest
version: 1
---
schema: olm.channel
package: foo
name: v1
entries:
  - name: foo.v1.1.21
    skipRange: <v1.1.21
---
schema: olm.channel
package: foo
name: v1.0
entries:
  - name: foo.v1.0.7
    skipRange: <v1.0.7
---
schema: olm.channel
package: foo
name: v1.1
entries:
  - name: foo.v1.1.21
    skipRange: <v1.1.21
---
schema: olm.bundle
name: foo.v1.0.7
package: foo
---
schema: olm.bundle
name: foo.v1.1.21
package: foo
...
```
**Stream v2 ibm-public-catalog.yaml**
Includes the `foo` package (versioned), channels and bundles for this stream.
```yaml
---
schema: olm.package
name: foo
description: This operator does 1,2,3 and now 4....
defaultChannel: v2
ibmMetadata:
  version: 2 # If this version is greater than the existing package, overlay it.
---
schema: olm.channel
package: foo
name: latest
entries:
  - name: foo.v2.0.1
    skipRange: <v2.0.1
ibmMetadata:
  version: 2 # If this version is greater than the existing channel, overlay
---
schema: olm.channel
package: foo
name: v2
entries:
  - name: foo.v2.0.0
    skipRange: <v2.0.0
  - name: foo.v2.0.1
    skipRange: <v2.0.1
---
schema: olm.channel
package: foo
name: v2.0
entries:
  - name: foo.v2.0.1
    skipRange: <v2.0.1
---
schema: olm.bundle
name: foo.v2.0.1
package: foo
...
```
The `resources.yaml` CASE specification will be updated to include metadata that identifies to the CICD process which file(s) are to be used for the index:
```
resources:
  resourceDefs:
    files:
      - ref: ibm-public-catalog.yaml
        mediaType: application/vnd.case.resource.olm.catalog.v1+yaml
        metadata:
          operators_operatorframework_io:
            catalog:
              mediaType: catalog+v2
          cpcicd_ibm_com:
            targetCatalogs:
              catalogRefs:
                - ibm-public:
                    versions:
                      - "4.6"
                      - "4.8"
```
#### Sparse FBC Merge Flow
Here's the flow from CICD's perspective:
1. Process the next version of the CASE
2. If there are Sparse FBC Catalogs to merge:
    1. Extract public catalog directories to `localcatalog`
        1. TODO: How:
            - opm render? custom option? Need to preserve/extract custom metadata.
            - There are some handy script/jq samples from the veneer markdown.
            - What parts of opm and the outputs can be relied upon?
3. For each CASE Inventory Item:
    1. For each CASE File Based Catalog that is marked to be merged:
        1. Verify that `opm validate <filename>` passes without error.
        2. For each Package (CASE FBC package)
            1. Fetch the Package by name from the IBM Public Catalog (public FBC package)
            2. If the package name exists:
                1. If the CASE FBC `olm.package` `ibmMetadata.version` is greater than the public package `ibmMetadata.version`, replace the `olm.package`
                    1. This allows overriding the description, default channel, icon etc. from select streams.
            3. For each CASE FBC channel:
                1. If the CASE FBC `olm.channel` `ibmMetadata.version` is greater than the public channel `ibmMetadata.version`, replace the `olm.channel`
                    1. This allows updating the channel graph from selected streams.
            4. For each CASE FBC bundle:
                1. PUT (add/replace) each `olm.bundle`
                2. Omitting bundles MAY remove them in the subsequent prune step if no other channels are referencing them.
4. For each Public FBC bundle:
    1. If the bundle is not part of a channel, delete it (prune orphans)
5. `opm validate`
6. Build the new IBM Public Operator Image and push it.
![Flow Diagram](https://i.imgur.com/nzkSSED.png)
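The merge steps can be sketched in memory as follows. This is an illustration under several assumptions, not the CICD implementation: catalogs are already loaded as lists of dictionaries, a missing `ibmMetadata.version` compares as 0 (so an unversioned channel is added once but never replaced, a case the flow does not spell out), and validation and image building are omitted. The `merge_sparse_catalog` function is hypothetical.

```python
from typing import Dict, List, Tuple


def meta_version(obj: dict) -> int:
    """Assumed rule: a missing ibmMetadata.version compares as 0."""
    return int(obj.get("ibmMetadata", {}).get("version", 0))


def merge_sparse_catalog(public: List[dict], sparse: List[dict]) -> List[dict]:
    """Overlay a sparse catalog onto the public one, then prune orphaned bundles."""
    packages: Dict[str, dict] = {}
    channels: Dict[Tuple[str, str], dict] = {}
    bundles: Dict[Tuple[str, str], dict] = {}
    for obj in public:
        if obj["schema"] == "olm.package":
            packages[obj["name"]] = obj
        elif obj["schema"] == "olm.channel":
            channels[(obj["package"], obj["name"])] = obj
        elif obj["schema"] == "olm.bundle":
            bundles[(obj["package"], obj["name"])] = obj

    for obj in sparse:
        if obj["schema"] == "olm.package":
            current = packages.get(obj["name"])
            if current is None or meta_version(obj) > meta_version(current):
                packages[obj["name"]] = obj
        elif obj["schema"] == "olm.channel":
            key = (obj["package"], obj["name"])
            current = channels.get(key)
            if current is None or meta_version(obj) > meta_version(current):
                channels[key] = obj
        elif obj["schema"] == "olm.bundle":
            bundles[(obj["package"], obj["name"])] = obj  # PUT: add or replace

    # Prune orphans: keep only bundles that some channel entry still references.
    referenced = {
        (channel["package"], entry["name"])
        for channel in channels.values()
        for entry in channel["entries"]
    }
    kept = [bundle for key, bundle in bundles.items() if key in referenced]
    return list(packages.values()) + list(channels.values()) + kept


public = [
    {"schema": "olm.package", "name": "foo", "defaultChannel": "v1", "ibmMetadata": {"version": 1}},
    {"schema": "olm.channel", "package": "foo", "name": "latest",
     "entries": [{"name": "foo.v1.1.21"}], "ibmMetadata": {"version": 1}},
    {"schema": "olm.channel", "package": "foo", "name": "v1.0", "entries": [{"name": "foo.v1.0.7"}]},
    {"schema": "olm.bundle", "package": "foo", "name": "foo.v1.0.7"},
    {"schema": "olm.bundle", "package": "foo", "name": "foo.v1.1.21"},
]
sparse = [
    {"schema": "olm.package", "name": "foo", "defaultChannel": "v2", "ibmMetadata": {"version": 2}},
    {"schema": "olm.channel", "package": "foo", "name": "latest",
     "entries": [{"name": "foo.v2.0.1"}], "ibmMetadata": {"version": 2}},
    {"schema": "olm.channel", "package": "foo", "name": "v2.0", "entries": [{"name": "foo.v2.0.1"}]},
    {"schema": "olm.bundle", "package": "foo", "name": "foo.v2.0.1"},
]
merged = merge_sparse_catalog(public, sparse)
print(sorted(obj["name"] for obj in merged if obj["schema"] == "olm.bundle"))
```

In this sample, `foo.v1.1.21` was referenced only by the old `latest` channel, so it is pruned once the v2 stream takes over that channel, while `foo.v1.0.7` survives because `v1.0` still references it.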
---
## Specifications TODO
- Which taxonomies do I choose?
- What is the DAG model I need to use to satisfy each taxonomy?
- How do I provide the channel information to the CP CICD?
    - catalog.json file
        - Defines bundles, channels and packages.
        - Merge with existing? Replace?
        - Stream management?
            - case-1.0.x
        - package
            - Which one to use?
            - Version the package metadata?
                - Use the case version?
            - Default channel?
        - channels
            - Only include the channels for this stream
            - Replace
            - Reference packages
            - Need a version to allow selective replace.
            - Delete using a `deleted` marker
                - Packages don't have channel references, so there is no garbage collection that can happen.
                - The problem with this, is that the sparse catalog can't be used directly since the Channel would fail the validator since there are no
        - bundles
            - PUT only (add/replace).
            - Orphans are garbage collected
            - To delete a bundle, stop referencing it from all channels.
            - References packages but not channels
    - CASE annotations in resources.yaml for catalog.json
        - name of catalog.json doesn't matter
        - Adding metadata to the resources.yaml file is what triggers consumption.
- How do I remove a bundle I no longer need?
    - Add a deleted marker to the bundle
- How can I use the new FBC Veneer APIs to help here?
---
## References
* IBM Cloud Pak Playbook: [OLM Versioning](https://playbook.cloudpaklab.ibm.com/cm/olmversioning/)
* OLM Upstream: [Channel Naming Best Practices](https://github.com/operator-framework/olm-docs/blob/master/content/en/docs/best-practices/channel-naming.md)
* Upstream: [Creating an upgrade graph (DAG)](https://github.com/operator-framework/olm-docs/blob/5145dbe53f9f925cfeb7fe328285cee029f298e2/content/en/docs/Concepts/olm-architecture/operator-catalog/creating-an-update-graph.md)
* OLM Upstream: [FBC Docs](https://olm.operatorframework.io/docs/reference/file-based-catalogs/)
* OCP 4.9: [FBC Docs](https://docs.openshift.com/container-platform/4.9/operators/understanding/olm-packaging-format.html#olm-file-based-catalogs_olm-packaging-format)
* Red Hat: [FBC Veneer Thoughts](https://hackmd.io/ixIuRBNURV-7K4C1aGkzxQ)