---
title: Platform Operators - Lingering Questions
authors:
- "@tylerslaton"
reviewers:
- TBD
approvers:
- TBD
creation-date: 2022-05-18
last-updated: 2022-05-18
status: provisional
tags: platform-operators
---
# Platform Operators - Q/A
## `ResolveSet` bundle integration
### How do we define the `ResolveSet` bundle format?
#### Answer
##### Proposed Design
The format of a `ResolveSet` bundle will be an aggregation of `BundleDeployments` derived from a `Resolution` status. Under the hood, this aggregation will be stored as a `ConfigMap` on the cluster, which will in turn serve as the source for a wrapping `BundleDeployment`. In essence, we are creating a `BundleDeployment` that contains multiple `BundleDeployment`s. This makes for an effective user experience when it comes to tracking and deleting resolutions.
This `ConfigMap` is created as an optional output of a `Resolution` completing in `Deppy`. The `ResolveSet` provisioner becomes aware of a new resolution (likely via watched annotations or some other mechanism) and creates a new `BundleDeployment` that points to that resolution's `ConfigMap` as a source. From there it iterates through its list of bundle images and installs them. The result is that **each `ResolveSet` bundle is a grouping of multiple Bundles being installed**. When a `ResolveSet` gets updated, it updates its grouping. When it gets deleted, its relevant resources also get deleted.
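To make this concrete, here is a minimal sketch of the backing `ConfigMap` and the wrapping `BundleDeployment`, assuming the proposed `ConfigMap` source shape on the Rukpak side (not yet merged) and hypothetical object names and annotations; the provisioner class names assume the existing `plain` and `registry` provisioners.

```yaml
# Hypothetical sketch: a ConfigMap produced from a Resolution that stores the
# child BundleDeployment manifests, and the wrapping BundleDeployment that
# consumes it. The configMaps source shape is an assumption, since that source
# type is still a proposal on the Rukpak side.
apiVersion: v1
kind: ConfigMap
metadata:
  name: resolution-foo-1              # hypothetical naming scheme
  annotations:
    deppy.io/resolution: foo          # hypothetical marker annotation
data:
  bundledeployments.yaml: |
    apiVersion: core.rukpak.io/v1alpha1
    kind: BundleDeployment
    metadata:
      name: foo-child-a
    spec:
      provisionerClassName: core-rukpak-io-registry    # registry+v1 child for phase 0
      template:
        spec:
          provisionerClassName: core-rukpak-io-registry
          source:
            type: image
            image:
              ref: quay.io/example/operator-bundle:v0.1.0   # placeholder image
---
apiVersion: core.rukpak.io/v1alpha1
kind: BundleDeployment
metadata:
  name: foo                           # the BundleDeployment of BundleDeployments
spec:
  provisionerClassName: core-rukpak-io-plain
  template:
    spec:
      provisionerClassName: core-rukpak-io-plain
      source:
        type: configMaps              # proposed source type; exact shape TBD
        configMaps:
          - configMap:
              name: resolution-foo-1
```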
> **Open Question:** We will likely want to have a history of decisions made by resolution and thus Deppy. Should the `ConfigMaps` created here serve as a tracker for that?
##### Workflow
The main intent behind this implementation is to keep Deppy and Rukpak as loosely coupled as possible while also allowing for logical cascade deletion.
1. User creates a `Resolution` object named `foo`.
2. The controller for the `Resolution` API detects this new object, processes it, and writes its result to `foo`'s status. Along with this, a `ConfigMap` gets created that stores the resolution decision and has an annotation indicating it is a resolution (a sketch of such a `Resolution` follows this list).
3. The `ResolveSet` provisioner is watching for `ConfigMaps` with this annotation and acts when a new one is detected. Its action is to interpret/unpack the `ConfigMap` and install whatever `Bundles` it defines.
4. The `ConfigMap` created from the `Resolution` is the only one it writes to; multiple are not created. As a result, editing a `Resolution` object re-triggers resources being provisioned or deprovisioned.
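For illustration, here is what steps 1 and 2 could produce, assuming an entirely hypothetical shape for the `Resolution` API (group, version, and field names are not yet defined) and the same hypothetical marker annotation used above:

```yaml
# Hypothetical sketch only: the Resolution API has not been defined yet, so the
# group/version, spec, and status fields below are illustrative assumptions.
apiVersion: core.deppy.io/v1alpha1
kind: Resolution
metadata:
  name: foo
spec:
  packages:
    - name: example-operator          # package the user asks to have resolved
status:
  phase: Resolved
  bundles:                            # assumed shape for the resolution result
    - package: example-operator
      image: quay.io/example/operator-bundle:v0.1.0
  configMapRef:
    name: resolution-foo-1            # ConfigMap generated in step 2
```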
#### Notes
- This requires [the `ConfigMap` source work](https://hackmd.io/pChFoobdQNOW911zRK6L6Q) to be completed on the Rukpak side. It also requires that the `Resolution Activator` component exists.
- For phase 0 we are going to focus on the underlying `Bundles` being in the `registry+v1` format. After phase 0 we will iterate on this.
- Dependencies will be handled in Deppy and at a later phase of the project.
- Deppy is going to output a resolution, which will be a set of Inputs that we need to create `BundleDeployments` out of, pack into a `ConfigMap`, and finally wrap in a `BundleDeployment` that uses the `ConfigMap` as a source.
- We are okay with some light coupling of Rukpak and Deppy to start but want to keep it as minimal as possible over time.
- One-to-one for `Resolution` to `BundleInstance`? Or one-to-N for each resolved `Input`?
### How does pivoting work between `ResolveSet` bundles?
#### Answer
This will work the way a `BundleDeployment` works today. When `Deppy` produces a `Resolution`, it writes the result to the `Resolution`'s status. From there, the `Resolution Activator` component will read the result, create a `ConfigMap` of `BundleDeployments`, and then provide `Rukpak` with a `BundleDeployment` using the `ConfigMap` as a source.
When the `Resolution` object gets edited, it will trigger a new generation of the `ConfigMap` from the `Resolution Activator`. This triggers a pivot in the same way that the other two `Rukpak` sources do today in that it will delete any unneeded resources and install the new ones.
#### Notes
- In order to keep track of historical resolutions while still updating a single resource, we can update the resolution's original `ConfigMap` and keep copies of the old ones, up to `n` (configurable?) revisions (see the sketch below).
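A minimal sketch of what that history could look like, assuming a hypothetical revision annotation and copy-naming scheme (none of this is defined yet):

```yaml
# Hypothetical sketch: the current ConfigMap is updated in place while older
# revisions are kept as copies, up to a configurable limit. Names and
# annotations are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: resolution-foo-1              # current revision, referenced by the BundleDeployment
  annotations:
    deppy.io/resolution: foo
    deppy.io/revision: "3"
data: {}                              # child BundleDeployment manifests omitted
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: resolution-foo-1-rev-2        # retained historical copy
  annotations:
    deppy.io/resolution: foo
    deppy.io/revision: "2"
data: {}                              # child BundleDeployment manifests omitted
```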
### What are the UX gaps during failed rollouts?
#### Answer
- In a world where we are aggregating `Bundles` via `ConfigMaps` created from the `Resolution` API, being able to manage the installation state of all the resources gets tricky. What if a single resource fails? All of them?
- How/should we watch the actual health of a resource and emit a failure status to the `BundleDeployment`?
- This could be done via specified health endpoints, potentially.
- Hard to determine the health of non-executable resources.
- How do we surface what caused a pivot to fail?
- Main issues seem to be around determining how to manage the resources from a `ResolveSet` in a way that feels natural.
### How do we garbage collect stale resources during pivots?
#### Answer
When the `Resolution Activator` creates the `BundleDeployment` for a `Resolution`, we can have the relevant provisioner controller watch for changes to this `BundleDeployment`'s source. When a change occurs, we delete any resources that are no longer provisioned by the `ResolveSet` and install any new ones that were missing.
## Operator API (a.k.a - Resolution Activator)
> **Note:** This is basically another repo entirely: it is aware of both Deppy and Rukpak, but neither is aware of this component.
> **Note:** We don't yet have a clear picture of the major components and the interfaces between them. We will want a very concrete definition of the `Resolution` and `BundleDeployment` APIs before making progress here.
### What are the responsibilities for this component?
The goal of the `Operator API` is to do 5 things:
1. (Still unclear) Should we create a source for `Deppy`? Should we create all potential `Inputs`? Or do we just create a `Resolution`?
2. Create a `Resolution` in `Deppy` for the desired resources.
3. Read the `Resolution` status and parse the results into a `BundleDeployment` of `BundleDeployment`s (one for each resolved `Input` image) backed by a `ConfigMap`/`PersistentVolume` source. The containing `BundleDeployment` will use the `plain` provisioner while the children will use `registry` for phase 0.
4. Take the `BundleDeployment` of `BundleDeployment`s and pass it over to Rukpak for installation.
5. Surface the status of the installation process to the user in an intuitive way. [Declarative content health probing](https://hackmd.io/lUnrQHaKTsCLZ6j3Q52hSQ#) will be important to make sure this status is comprehensive (a sketch of such a status follows this list).
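As a rough illustration of item 5, the status the `Operator API` reads back from the wrapping `BundleDeployment` and surfaces upward might look something like the sketch below; the condition types are assumed to resemble Rukpak's existing ones, and the message is illustrative.

```yaml
# Hypothetical sketch of the status the Operator API would aggregate and
# surface to the user. Condition types and messages are assumptions.
status:
  conditions:
    - type: HasValidBundle
      status: "True"
      reason: UnpackSuccessful
    - type: Installed
      status: "False"
      reason: InstallFailed
      message: 'child BundleDeployment "foo-child-a" failed to install'   # illustrative
```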
For phase 0, this will live inside of the Platform Operator's repository. However, we should design this logic in such a way that we can pull it out into its own unique component at a later date.
### How is this different from the Platform Operators API?
First, it is important to establish that the Operator API's job is to ask for resolutions from `Deppy`, parse them, pass them to `Rukpak`, and then report back status. The `Platform Operators` API is distinct from this in that it will utilize the `Operator API` to surface status back to `OpenShift` components. In other words, the `Operator API` will be the glue between `Deppy` and `Rukpak`, while `Platform Operators` will be the glue between the `Operator API` and `OpenShift`.
### What are the inputs and outputs of this component?
Input will be received from the `PlatformOperator` API, which will specify which packages `Deppy` needs to resolve. The resulting resolution will then be fed by the `Operator API` into `Rukpak` in the form of `BundleDeployments` in order to install the content.
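For illustration only, the input side could look like the following sketch, assuming a hypothetical `PlatformOperator` shape (group, version, and fields are not finalized):

```yaml
# Hypothetical sketch of the input: a PlatformOperator naming a package for
# Deppy to resolve. Group/version and field names are assumptions.
apiVersion: platform.openshift.io/v1alpha1
kind: PlatformOperator
metadata:
  name: example-operator
spec:
  package:
    name: example-operator            # package Deppy is asked to resolve
```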
### How does it integrate with other components?
It is useful to look at this from the context of its three partner components.
#### Deppy
The `Operator API` will be interacting with `Deppy` solely through its `Resolution` API. In essence, it will be creating `Resolution`s for packages that a client will define. These `Resolution`s will then be read and passed along to `Rukpak` for installation. For this reason **it is essential that we establish a concrete definition of what the `Resolution` API will look like to ensure asynchronous development is possible across repositories**.
#### Rukpak
Installation of packages resolved from `Deppy` will be handled by Rukpak. The status of this installation is handled by `Rukpak` and ultimately passed back to the `Operator API`. This allows for a clear view of installing a package that the `Operator API` can then surface back to a user. The status provided to the user will be clearer and simpler once we implement custom bundle content probes.
#### Platform Operators
Input will be received from `Platform Operators` into the `Operator API`. As previously stated, the `Operator API` is responsible for interacting with both `Deppy` and `Rukpak`, which is why clean and accurate reports of status are so important. These status updates will then be fed into `OpenShift` in the relevant capacities via the `Platform Operators` API.
### Do we need a lookup service?
There will be a need for reading the status from a `Resolution`, which will likely just be a collection of IDs, and parsing the necessary data out of them to build a `BundleDeployment` (source, reference, name, etc.). As a result, there will need to be some component that can perform this parsing, which could take the form of a lookup service.
### What are the missing gaps in the overall system?
The main gap will be around how we determine what type of provisioner to create the `BundleDeployments` with. There are a few ideas in this area but we'll ultimately need to choose one to move forward past phase 0.
### How do we handle surfacing resolved content into rukpak resources?
Deppy will provide us a `ResolveSet` that the `Operator API` will be responsible for reading and turning into `BundleDeployments`. As an initial phase, we will assume that all `BundleDeployments` created this way are to be handled by the `registry` provisioner. However, in the long term, we will need a way for `Bundles` to self-identify which provisioner they need so that this component can accurately pass them to Rukpak (see the sketch below).
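One possible shape for that self-identification, purely as a hypothetical sketch (nothing like this exists in Rukpak or Deppy today), is a media-type hint on the bundle that the `Operator API` maps to a provisioner class:

```yaml
# Hypothetical sketch: a child BundleDeployment whose provisioner class is
# chosen from a self-identified media type. The annotation and the mapping are
# assumptions, not an existing feature.
apiVersion: core.rukpak.io/v1alpha1
kind: BundleDeployment
metadata:
  name: foo-child-a
  annotations:
    bundle.mediatype: registry+v1     # hypothetical self-identified format
spec:
  provisionerClassName: core-rukpak-io-registry   # mapped from the media type
  template:
    spec:
      provisionerClassName: core-rukpak-io-registry
      source:
        type: image
        image:
          ref: quay.io/example/operator-bundle:v0.1.0
```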
## Handling upgrade graphs *(not addressing this sprint)*
- How do we encode upgrade graph semantics with deppy in the picture?
- Should we model the skips/replaces/skipRange semantics as constraints?
- Do we need to surface upgrade graph semantics as properties?
- How do we account for existing resources during resolution?
## Tech preview guidelines *(not addressing this sprint)*
- What do we need to do to ensure we're all set for 4.12 TP?
---
## Other questions
### What's the use case for legacy OLM when the marketplace component is disabled?
Users provide their own catalog content.
### What's the use case for legacy OLM to be installed through this PO mechanism?
Better support for resource-constrained environments, e.g. edge devices, cluster variants like MicroShift, etc.
### What's the use case for removing a PO in day 2 operations?
Having no support for removing an operator after cluster installation results in a poor UX for cluster administrators, as they would need to spin down the cluster and re-install using a new openshift-installer configuration.
### Can we introduce "internal" APIs into the core payload?
https://github.com/openshift/openshift-docs/pull/41018#issuecomment-1027327520
If there is something we want to be an internal API, we make it v1alpha1 and never plan to promote it. However, we still need to avoid breaking people who upgrade their clusters and to handle migration as we evolve the API.
### What happens if another operator in the catalog has a dependency on a platform operator?
We expect OLM to be aware of all the operators installed, whether or not they are installed/managed as `PlatformOperators`; ultimately they are still OLM operators that are installed and running on the cluster. To do this, OLM is going to need to grow a central registry of which operators (potential dependencies) are installed on a cluster.