## Boxcutter
### Overview
https://github.com/package-operator/boxcutter
Boxcutter is a Go library for use in controllers. It does not define its own CRDs or implement any controllers.
Boxcutter installs a set of resources defined by a Revision.
A Revision has:
* A name
* An object (arbitrary, of any type) which owns it
* A monotonically increasing revision number
* A list of phases
  * Each with a list of objects
Objects created or adopted by a Revision are owned by the Revision's object.
A new Revision with a higher revision number can take ownership of objects owned by a previous Revision.
You can Teardown an old revision, which will remove any objects which were not adopted by a newer revision.
Other notable Boxcutter features:
* Pause (reports status without making modifications)
* Object validation (does a dry run of all objects in the phase before installing anything)
* 'Probes', which cause a phase to wait until all Probes pass, e.g.
  * Deployment has status replicas matching spec
  * CRD is admitted
### Usage details
We will initially use v0.8.0.
We will use the Annotation owner strategy.
When creating a boxcutter revision engine we will do:
```go
re, err := boxcutter.NewRevisionEngine(boxcutter.RevisionEngineOptions{
	Scheme:          ...,
	FieldOwner:      <CAPI Operator prefix>,
	SystemPrefix:    <CAPI Operator prefix>,
	DiscoveryClient: ...,
	RestMapper:      ...,
	Writer:          ...,
	Reader:          ...,
	OwnerStrategy:   ownerhandling.NewAnnotation(..., <annotation with CAPI Operator Prefix>),
})
```
We will need to keep a handle to the annotation owner strategy so we can use its Event mapper in SetupWithManager.
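A rough wiring sketch of this, assuming controller-runtime. The method used to obtain the event handler from the owner strategy (`GetEventHandler` below) is an assumption, not a confirmed boxcutter API; use whatever the ownerhandling package actually exposes.
```go
// Sketch only: the owner strategy passed to RevisionEngineOptions is kept on the
// reconciler (r.ownerStrategy) so the same instance can supply the event mapper
// here. GetEventHandler is a hypothetical accessor; see the note above.
func (r *InstallerReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&corev1.ConfigMap{}). // revision config maps are the reconciliation target
		Watches(
			&appsv1.Deployment{},                 // repeat per managed object type
			r.ownerStrategy.GetEventHandler(...), // hypothetical accessor for the event mapper
		).
		Complete(r)
}
```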
### Potential enhancements
The `boxcutter-managed` label does not appear to use a prefix. We should add a prefix.
We could remove the requirement for an owner object with an enhancement to OwnerStrategy. This would be somewhat invasive, as it would also require removing the hardcoded 'Owner' from the Revision struct. We may also have to revisit anything which returns an EventHandler, since without an Owner object we might want to reconcile a non-default type in controller-runtime. We are unlikely to do this in advance of release.
## Open questions
What prefix should the CAPI Operator use? In the first instance, look for other cases in CAPI Operator where we're setting:
* a label
* an annotation
* a field owner
* a finalizer
These should all use the same prefix.
## Use in Cluster API Operator
### Extension to Cluster API Operator Config API
```yaml
spec:
  installerMode: (Managed|Paused)
  unmanagedCRDs:
  - clusters.x-k8s.whatever
status:
  currentVersion: 4.22.2-foo-bar-5
  desiredVersion: 4.22.3-foo-bar-6
  versions:
  - name: 4.22.3-foo-bar-<revision number>
    revision: 6
    contentID: <derived from contentIDs of every manifest bundle>
    unmanagedCRDs:
    - <copy of unmanagedCRDs at revision creation time>
    components:
    - configMaps:
      - cluster-api-4.22.3-foo-bar-00
    - configMaps:
      - cluster-api-provider-aws-4.22.3-foo-bar-00
      - cluster-api-provider-aws-4.22.3-foo-bar-01
  - name: 4.22.2-foo-bar-5
    revision: 5
    contentID: ...
    unmanagedCRDs:
    - ...
    components:
    - configMaps:
      - cluster-api-4.22.2-foo-bar-00
    - configMaps:
      - cluster-api-provider-aws-4.22.2-foo-bar-00
      - cluster-api-provider-aws-4.22.2-foo-bar-01
```
### Controllers
We define a new entrypoint for the Cluster API Operator with its own (extensive!) RBAC and 2 controllers:
* Version controller: watches operator Config spec and CVO-created manifest bundles and creates versions in status.
* Installer controller: watches operator Config and reconciles new versions in the status.
Each controller will own a `<Controller>Progressing` and `<Controller>Degraded` condition on the cluster operator object, which will be aggregated into the `Progressing` and `Degraded` conditions.
### Version controller
The version controller reconciles the CAPI Operator config object.
The version controller verifies that it can see at least 1 complete manifest bundle for each target component type matching the current cluster version. Initially there will be 2 of these:
* Core CAPI
* Infrastructure component for the current platform
It reads the content ID of all included manifest bundles and computes a combined content ID.
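One possible way to compute the combined ID is sketched below; the exact hashing scheme is an implementation detail and the function is illustrative only.
```go
// Illustrative only: hash the sorted per-bundle content IDs so the combined ID is
// independent of bundle ordering and changes whenever any bundle changes.
// Assumes "crypto/sha256", "encoding/hex" and "sort" are imported.
func combinedContentID(bundleContentIDs []string) string {
	ids := append([]string(nil), bundleContentIDs...)
	sort.Strings(ids)
	h := sha256.New()
	for _, id := range ids {
		h.Write([]byte(id))
		h.Write([]byte{0}) // separator avoids ambiguity between concatenated IDs
	}
	return hex.EncodeToString(h.Sum(nil))
}
```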
It creates a new version if either of the following does not match the `currentVersion`:
* the combined contentID
* the list of unmanaged CRDs in the Config spec
Each `component` lists all slices of a single manifest bundle.
For each new version it creates a 'revision config map' which has the same name as the revision and a 'revision' label (with the appropriate prefix) with the same value as its name. Revision config maps are owned by the config object. These will be used as the Owner of a boxcutter revision, and will also be the reconciliation target of the installer controller.
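A sketch of creating one of these is below; the `prefix` constant is hypothetical, pending the open question above.
```go
// Sketch only: a revision config map named after the version, carrying the
// prefixed revision label and owned by the CAPI Operator config object.
cm := &corev1.ConfigMap{
	ObjectMeta: metav1.ObjectMeta{
		Name:      version.Name, // e.g. "4.22.3-foo-bar-6"
		Namespace: "openshift-cluster-api",
		Labels: map[string]string{
			prefix + "/revision": version.Name, // label value matches the name
		},
	},
}
if err := controllerutil.SetControllerReference(config, cm, r.Scheme); err != nil {
	return err
}
if err := r.Create(ctx, cm); err != nil && !apierrors.IsAlreadyExists(err) {
	return err
}
```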
If currentVersion matches desiredVersion it deletes all entries in `versions` except currentVersion after first deleting their revision config maps.
### Installer controller
The installer controller reconciles revision configmaps.
The installer controller internally constructs a boxcutter Revision corresponding to the reconciled ConfigMap. The metadata for this revision is taken from status.versions of the config object.
The revision contains:
* A pre-flight phase which creates a CompatibilityRequirement for the unmanaged CRDs in any component.
* One phase for each component, in order, with unmanaged CRDs filtered out.
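A conceptual sketch of that ordering and filtering follows; it uses plain types, not boxcutter's actual Revision/Phase structs.
```go
// Conceptual only: a pre-flight phase first, then one phase per component in
// bundle order, with unmanaged CRDs filtered out of the component phases.
type phase struct {
	name    string
	objects []*unstructured.Unstructured
}

func buildPhases(preflight []*unstructured.Unstructured, components [][]*unstructured.Unstructured, unmanagedCRDs sets.Set[string]) []phase {
	phases := []phase{{name: "pre-flight", objects: preflight}}
	for i, objs := range components {
		var kept []*unstructured.Unstructured
		for _, o := range objs {
			if o.GetKind() == "CustomResourceDefinition" && unmanagedCRDs.Has(o.GetName()) {
				continue // unmanaged CRDs are not installed by this revision
			}
			kept = append(kept, o)
		}
		phases = append(phases, phase{name: fmt.Sprintf("component-%02d", i), objects: kept})
	}
	return phases
}
```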
#### Create/update
For create/update, we Reconcile() the Revision.
We use the following probes:
* Deployment -> Status matches Spec replicas
* CRDs -> admitted
* CompatibilityRequirement -> Admitted and Compatible
In all cases, probe failure prevents installation from progressing to the next phase.
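As an illustration of the first probe, the readiness check it might wrap could look like the following; how it plugs into boxcutter's probe interface is not shown.
```go
// Illustration of the Deployment check only: the phase should not complete until
// status replicas match spec replicas for the current generation.
func deploymentReplicasMatch(d *appsv1.Deployment) (bool, string) {
	want := int32(1)
	if d.Spec.Replicas != nil {
		want = *d.Spec.Replicas
	}
	if d.Status.ObservedGeneration < d.Generation {
		return false, "status has not observed the current generation yet"
	}
	if d.Status.UpdatedReplicas != want || d.Status.AvailableReplicas != want {
		return false, fmt.Sprintf("want %d replicas, have %d updated / %d available",
			want, d.Status.UpdatedReplicas, d.Status.AvailableReplicas)
	}
	return true, ""
}
```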
If a CompatibilityRequirement probe observes Compatible=False, we need to set the InstallerDegraded condition on the cluster operator in addition to failing the pre-flight phase. Boxcutter does not currently give us a good way to communicate that a failure is terminal (Unknown means it was unable to check; False implies the condition is not yet met and should be retried), so we may need to communicate this via a side effect of the probe, for example a closure which sets a closed-over variable (assuming we can ensure only a single thread runs the probes).
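A sketch of that closure idea is below. `getStatusCondition` is a hypothetical helper that extracts a metav1.Condition from status.conditions of an unstructured object, and the sketch assumes probes for a revision run on a single goroutine.
```go
// Sketch of the side-channel: the probe closes over terminalIncompatibility, and
// the reconciler checks it after Reconcile() to decide whether to set
// InstallerDegraded.
var terminalIncompatibility bool

compatibilityProbe := func(obj *unstructured.Unstructured) (bool, string) {
	compatible := getStatusCondition(obj, "Compatible") // hypothetical helper
	admitted := getStatusCondition(obj, "Admitted")
	if compatible != nil && compatible.Status == metav1.ConditionFalse {
		terminalIncompatibility = true // terminal failure: surface via InstallerDegraded
		return false, compatible.Message
	}
	if admitted == nil || admitted.Status != metav1.ConditionTrue ||
		compatible == nil || compatible.Status != metav1.ConditionTrue {
		return false, "CompatibilityRequirement is not yet Admitted and Compatible"
	}
	return true, ""
}
```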
#### Delete
For delete, we Teardown() the Revision.
When calling Teardown(), we must pass WithObjectTeardownOptions(obj, WithOrphan()) for every unmanaged CRD.
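Roughly as follows; only the WithObjectTeardownOptions/WithOrphan names above come from boxcutter, and the Teardown signature and option slice type are assumptions.
```go
// Sketch only: orphan every unmanaged CRD on teardown.
var opts []boxcutter.RevisionTeardownOption // hypothetical option slice type
for _, crd := range unmanagedCRDObjects {
	opts = append(opts, boxcutter.WithObjectTeardownOptions(crd, boxcutter.WithOrphan()))
}
result, err := re.Teardown(ctx, revision, opts...)
```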
### Fetching manifests directly from payload images
We like the images approach and feel it is strategic. However, to reduce the risk of further slippage we will stick with the current transport config maps approach in the short term. We will add a card to come back to this, hopefully in the current release cycle. We will mitigate the identified potential for privilege escalation by writing a ValidatingAdmissionPolicy (VAP) which matches transport config maps in the openshift-cluster-api namespace. The VAP will ensure that they can only be created or updated by the CVO.
We discussed how we would like to implement the image-based manifest solution. The cluster-capi-operator image will:
* Know its own release version (it already knows this via an environment variable)
* Have an images.json containing the substituted payload image for every provider operand
Provider images will contain a new /capi-manifests directory. This directory will contain 2 files:
* metadata.json, containing the following metadata:
  * providerName e.g. 'cluster-api-provider-aws'
  * providerType e.g. 'infrastructure'
  * providerVersion e.g. 'v2.10.0'
  * ocpPlatform e.g. 'AWS'
  * contentID: <SHA256 of manifests>
  * imageName e.g. registry.ci.openshift.org/openshift:aws-cluster-api-controllers
* manifests, containing the provider's KRM as a single file
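For reference, metadata.json could be unmarshalled into a struct like the one below; the struct is our own sketch, and only the JSON field names come from the list above.
```go
// Sketch of a Go struct matching the metadata.json fields listed above.
type providerManifestMetadata struct {
	ProviderName    string `json:"providerName"`    // e.g. "cluster-api-provider-aws"
	ProviderType    string `json:"providerType"`    // e.g. "infrastructure"
	ProviderVersion string `json:"providerVersion"` // e.g. "v2.10.0"
	OCPPlatform     string `json:"ocpPlatform"`     // e.g. "AWS"
	ContentID       string `json:"contentID"`       // SHA256 of the manifests file
	ImageName       string `json:"imageName"`       // unsubstituted operand image reference
}
```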
When CAPI Operator scans for new manifests, it fetches /capi-manifests from every payload image in images.json (which contains only operand images). It selects images for inclusion based on metadata.json from each image. It constructs the following metadata for an operand set:
```yaml
name: <release version>-<revision number>
contentID: <derived from contentIDs of every operand's manifests>
components:
- image: <full, substituted payload image of operand by sha>
  # And some other fields not relevant here
revision:
unmanagedCRDs:
```
The current proposed design has 'configMaps' in place of 'image' in the above, so a simple 'oneOf' in the Cluster API Operator Config API will support both during a transition period.
Because we are no longer obtaining the manifests via the release image they will no longer have image substitutions, so we will need to do that at runtime. When we fetch the manifests from a specific payload image, we will substitute the value of 'imageName' with the fully qualified image we fetched it from.
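A minimal sketch of that substitution, assuming the manifests are plain text and the imageName from metadata.json appears in them verbatim:
```go
// Sketch only: replace the unsubstituted imageName from metadata.json with the
// fully qualified (by-digest) image the manifests were actually fetched from.
func substituteOperandImage(manifests []byte, imageName, resolvedImage string) []byte {
	return bytes.ReplaceAll(manifests, []byte(imageName), []byte(resolvedImage))
}
```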
To enable testing, development, and potentially hotfixes, we will also need to add an image override to the Cluster API Operator Config API.