Implementation of App Platform to Flux migration

date: 2023.03.30
author: piontec
important notes: https://github.com/giantswarm/roadmap/issues/2361

References

This doc presents some implementation ideas for the App Platform to Flux migration, which is also discussed in this write-up.

How to replace our current HTTP helm repo functionality with ORAS

Tested with

  • flux 0.41.1
  • oras 1.0

Reference:

Catalog content discovery

Note on compatibility

All the tests below were performed on ACR, GHCR and dockerhub. The summary of what works looks like this:

registry  | discovery with oras repo ls                      | helm charts | arbitrary objects (metadata)
ACR       | y                                                | y           | y
GHCR      | n (you can list all public repos and all images) | y           | y
dockerhub | n                                                | y           | n
google    | n                                                | y           | y
aliyun    | n                                                | y           | y

Flux based

Flux offers some (limited) degree of content discovery by providing ImageRepository that can discover tags for a single specific artifact:

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: upgrade-bundle
  namespace: test
spec:
  exclusionList:
  - ^.*\.sig$
  image: giantswarmpublic.azurecr.io/test/hello-world-bundle
  interval: 5m
  provider: generic
status:
  canonicalImageName: giantswarmpublic.azurecr.io/test/hello-world-bundle
  conditions:
  - lastTransitionTime: "2023-03-28T15:12:10Z"
    message: 'successful scan: found 7 tags'
    observedGeneration: 1
    reason: Succeeded
    status: "True"
    type: Ready
  lastScanResult:
    latestTags:
    - 0.0.7
    - 0.0.6
    - 0.0.5
    - 0.0.4
    - 0.0.3
    - 0.0.2
    - 0.0.1
    scanTime: "2023-03-29T07:57:50Z"
    tagCount: 7

Querying OCI repo for discovery metadata

How to get a list of available artifacts

$ oras repo ls giantswarmpublic.azurecr.io | head
cluster-catalog/capa-internal-proxy-stack
cluster-catalog/cluster-aws
cluster-catalog/cluster-azure
cluster-catalog/cluster-cloud-director
cluster-catalog/cluster-gcp
cluster-catalog/cluster-openstack
cluster-catalog/cluster-shared
cluster-catalog/cluster-vsphere
cluster-catalog/default-apps-aws
cluster-catalog/default-apps-azure

How to get info about a helm chart

$ oras repo tags giantswarmpublic.azurecr.io/control-plane-catalog/crossplane
0.1.0
0.2.0
0.2.1
0.3.0
0.3.1
0.4.0
0.4.1
0.4.2
0.4.3
1.0.0
1.1.0
2.0.0
2.1.0
$ oras manifest fetch giantswarmpublic.azurecr.io/control-plane-catalog/crossplane:2.1.0 | jq
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.cncf.helm.config.v1+json",
    "digest": "sha256:22aa9eeae15a5908fe5a4c4a9d108ab53a82db92884f3975b55de5d4d5a03503",
    "size": 1458
  },
  "layers": [
    {
      "mediaType": "application/vnd.cncf.helm.chart.content.v1.tar+gzip",
      "digest": "sha256:3d969d8c0b7660b2fb20a4b3eb0d20b6e956d16a3c944e56285b9285408b1856",
      "size": 60849
    }
  ]
}
$ oras manifest fetch-config giantswarmpublic.azurecr.io/control-plane-catalog/crossplane:2.1.0 | jq
{
  "name": "crossplane",
  "home": "https://crossplane.io",
  "sources": [
    "https://github.com/crossplane/crossplane",
    "https://github.com/crossplane/crossplane-runtime",
    "https://github.com/crossplane/provider-aws",
    "https://github.com/crossplane/provider-azure",
    "https://github.com/crossplane/provider-gcp",
    "https://github.com/crossplane/provider-rook"
  ],
  "version": "2.1.0",
  "description": "Crossplane is an open source Kubernetes add-on that enables platform teams to assemble infrastructure from multiple vendors, and expose higher level self-service APIs for application teams to consume.",
  "keywords": [
    "cloud",
    ...  
  ],
  "maintainers": [
    {
      "name": "Crossplane Maintainers",
      "email": "info@crossplane.io"
    }
  ],
  "icon": "https://docs.crossplane.io/android-chrome-192x192.png",
  "apiVersion": "v1",
  "appVersion": "1.11.0",
  "annotations": {
    "application.giantswarm.io/metadata": "https://giantswarm.github.io/control-plane-catalog/crossplane-2.1.0.tgz-meta/main.yaml",
    "application.giantswarm.io/readme": "https://giantswarm.github.io/control-plane-catalog/crossplane-2.1.0.tgz-meta/README.md",
    "application.giantswarm.io/team": "honeybadger",
    "application.giantswarm.io/values-schema": "https://giantswarm.github.io/control-plane-catalog/crossplane-2.1.0.tgz-meta/values.schema.json",
    "config.giantswarm.io/version": "1.x.x"
  }
}

Pushing helm chart's meta-data and friends to ORAS

Signing and pushing the chart (currently cosign uploads the signature as a separate artifact under a well-known tag format, instead of attaching it to the existing artifact)

$ helm push hello-world-app-1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb.tgz oci://giantswarmpublic.azurecr.io/test
Pushed: giantswarmpublic.azurecr.io/test/hello-world-app:1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb
Digest: sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426
$ cosign sign giantswarmpublic.azurecr.io/test/hello-world-app@sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426
Generating ephemeral keys...
Retrieving signed certificate...

	Note that there may be personally identifiable information associated with this signed artifact.
	This may include the email address associated with the account with which you authenticate.
	This information will be used for signing this artifact and will be stored in public transparency logs and cannot be removed later.

By typing 'y', you attest that you grant (or have permission to grant) and agree to have this information stored permanently in transparency logs.
Are you sure you would like to continue? [y/N] y
Your browser will now be opened to:
https://oauth2.sigstore.dev/auth/auth?....
Successfully verified SCT...
tlog entry created with index: 16041808
Pushing signature to: giantswarmpublic.azurecr.io/test/hello-world-app
$ oras manifest fetch giantswarmpublic.azurecr.io/test/hello-world-app:sha256-da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426.sig     
{"schemaVersion":2,"mediaType"
...
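
The well-known tag used for the signature can be derived from the digest of the signed artifact, as the `oras manifest fetch` call above suggests: the `:` in the digest is replaced with `-` and `.sig` is appended. A minimal sketch of that mapping, where `cosign_sig_tag` is a hypothetical helper name and not part of cosign or oras:

```shell
#!/bin/sh
# cosign_sig_tag is a hypothetical helper (not part of cosign or oras).
# It turns an artifact digest into the tag under which cosign stores
# the matching signature: "sha256:<hex>" becomes "sha256-<hex>.sig".
cosign_sig_tag() {
  printf '%s.sig\n' "$(printf '%s' "$1" | tr ':' '-')"
}

# Digest of the chart pushed in the example above.
digest="sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426"
echo "giantswarmpublic.azurecr.io/test/hello-world-app:$(cosign_sig_tag "$digest")"
```

The printed reference can be passed to `oras manifest fetch` to inspect the signature object.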

Attaching arbitrary meta-data files (helm/example-chart-meta is an arbitrary string):

$ cp -a hello-world-app-1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb.tgz-meta/ meta
$ oras attach giantswarmpublic.azurecr.io/test/hello-world-app:1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb --artifact-type 'helm/example-chart-meta' ./meta
Uploading 467731c54adf meta
Uploaded  467731c54adf meta
Attached to [registry] giantswarmpublic.azurecr.io/test/hello-world-app@sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426
Digest: sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5
$ oras discover -o tree giantswarmpublic.azurecr.io/test/hello-world-app:1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb
giantswarmpublic.azurecr.io/test/hello-world-app@sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426
└── helm/example-chart-meta
    └── sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5

Checking and downloading metadata:

$ oras discover -o json --artifact-type 'helm/example-chart-meta' giantswarmpublic.azurecr.io/test/hello-world-app:1.3.0-32a97bcc47456fbb5c718f454e884c5629fed7cb
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5",
      "size": 817,
      "annotations": {
        "org.opencontainers.image.created": "2023-03-22T14:20:20Z"
      },
      "artifactType": "helm/example-chart-meta"
    }
  ]
}
$ oras manifest fetch giantswarmpublic.azurecr.io/test/hello-world-app@sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5 | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "helm/example-chart-meta",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:467731c54adfa0ac43c81a4964051fba6b562d9e5659922d22c0eac26ac453da",
      "size": 1016,
      "annotations": {
        "io.deis.oras.content.digest": "sha256:d79b1a93714f3465bd35d7acafa8429eb26a887d79c627ddfa52f1c2e5a87a6d",
        "io.deis.oras.content.unpack": "true",
        "org.opencontainers.image.title": "meta"
      }
    }
  ],
  "subject": {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "sha256:da27820091d73861454be71a30e3baf4d20555d297e9d9be8272c2cad21c2426",
    "size": 354
  },
  "annotations": {
    "org.opencontainers.image.created": "2023-03-22T14:20:20Z"
  }
}
$ mkdir meta-out
$ oras pull -o ./meta-out giantswarmpublic.azurecr.io/test/hello-world-app@sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5
Downloading 467731c54adf meta
Downloaded  467731c54adf meta
Pulled [registry] giantswarmpublic.azurecr.io/test/hello-world-app@sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5
Digest: sha256:48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5

$ ll meta-out/meta 
.rw-r--r--   912 piontec 22 mar 10:00 main.yaml
.rw-rw-r--   466 piontec 22 mar 10:00 README.md
.rw-rw-r-- 1,2Ki piontec 28 cze  2022 values.schema.json

$ oras manifest fetch giantswarmpublic.azurecr.io/test/hello-world-app:sha256-48eb00ecf2769850f917f6498db50af940dd45c2f2905040a819ee062df4c1f5.sig                    
{"schemaVersion":2,"mediaType":
...

Replacing app bundles with native Flux+Helm capabilities

Direct App CR to HelmRelease CR replacement

We could use the same pattern as we did with App Bundles, just with HelmRelease CRs. However, that would also carry over all the issues we had with it, so we would like to avoid the pattern of an intermediary wrapping Helm chart.

Using OCI artifacts as bundles

Flux can now use any OCI artifact as a source, the same way it can use an S3 bucket or a git repo.

We can use this new functionality in the following way:

  • to release: pack all the resources that should be in a bundle, mainly HelmReleases, into a single OCI artifact and upload it to an OCI registry (optionally signing it with cosign)
  • to install: use the artifact as the source of a Kustomization, patch in the necessary configs and the target cluster, and let Flux do the rest

Example

In this example, we'll prepare a bundle consisting of two apps, deployed with 1 shared and 1 individual config.

Preparing the bundle - source

We prepare 3 files: 2 HelmReleases with app definitions and a HelmRepository they can be downloaded from.

  • hello-world-helmrepo.yaml
    apiVersion: source.toolkit.fluxcd.io/v1beta2
    kind: HelmRepository
    metadata:
      name: gspublic
    spec:
      interval: 10m
      type: oci
      url: oci://giantswarmpublic.azurecr.io/test

  • hello-world-1-release.yaml
    apiVersion: helm.toolkit.fluxcd.io/v2beta1
    kind: HelmRelease
    metadata:
      name: hello-1
      labels:
        app: hello-world
        test1: val1
    spec:
      interval: 10m
      targetNamespace: test # <- on the target cluster
      chart:
        spec:
          chart: podinfo
          version: 6.3.5 # <- can be also semver range
          sourceRef:
            kind: HelmRepository
            name: gspublic # <- must exist
          verify:
            provider: cosign # <- optional verification
      valuesFrom: [] # <- important for easier patching
  • hello-world-2-release.yaml
    This file is exactly the same as the one above except for the name
    apiVersion: helm.toolkit.fluxcd.io/v2beta1
    kind: HelmRelease
    metadata:
      name: hello-2
    ...
    

Please note a few important settings in the manifest hello-world-1-release.yaml, in particular:

  • version: can be set to a semver range (like 6.x or >=6.5.0)
  • valuesFrom:, even if not used, should be set to an empty list []; this allows patching it later by appending to the array, without first checking whether it exists

Next, we're ready to pack, tag, and upload the content of our bundle to an OCI registry:

$ flux push artifact oci://giantswarmpublic.azurecr.io/test/hello-world-bundle:0.0.8 --path="." --source="the-repo-URL-goes-here" --revision="0.0.8"
► pushing artifact to giantswarmpublic.azurecr.io/test/hello-world-bundle:0.0.8
✔ artifact successfully pushed to giantswarmpublic.azurecr.io/test/hello-world-bundle@sha256:7336d4e4452817260a39d78b270b1239a2fbb06ed88217323e3f33312c95680d

Optionally, we can already sign it with cosign:

$ cosign sign giantswarmpublic.azurecr.io/test/hello-world-bundle@sha256:7336d4e4452817260a39d78b270b1239a2fbb06ed88217323e3f33312c95680d
Generating ephemeral keys...
Retrieving signed certificate...

	Note that there may be personally identifiable information associated with this signed artifact.
	This may include the email address associated with the account with which you authenticate.
	This information will be used for signing this artifact and will be stored in public transparency logs and cannot be removed later.

By typing 'y', you attest that you grant (or have permission to grant) and agree to have this information stored permanently in transparency logs.
Are you sure you would like to continue? [y/N] y
Your browser will now be opened to:
https://oauth2.sigstore.dev/auth/auth?access_type=online&client_id=sigstore&code_challenge=XZwN99sMKUioOFSk6Er6s0ZzTgdsqyhzP-1MLlXwzfk&code_challenge_method=S256&nonce=2NjQ2tlirnpVHvzX26WIyFBUHTW&redirect_uri=http%3A%2F%2Flocalhost%3A38965%2Fauth%2Fcallback&response_type=code&scope=openid+email&state=2NjQ2uhTHTsshYZzyUz7zZ2L6ID
Successfully verified SCT...
tlog entry created with index: 16675980
Pushing signature to: giantswarmpublic.azurecr.io/test/hello-world-bundle

Deploying the bundle

We can now use the bundle as a Kustomization source. To do that, we have to prepare the following Flux objects.

  • bundle-repo.yaml - includes the info about where to get the bundle and in which version; optionally, it can check the bundle signature
    apiVersion: source.toolkit.fluxcd.io/v1beta2
    kind: OCIRepository
    metadata:
      name: gs-azure
      namespace: test
    spec:
      interval: 10m
      url: oci://giantswarmpublic.azurecr.io/test/hello-world-bundle
      ref:
        tag: 0.0.5
      verify:  # <- optional, enables cosign verification
        provider: cosign
    
  • optional ConfigMaps that are needed to configure the bundle; for testing we use a file cm.yaml
    ---
    apiVersion: v1
    data:
      values.yaml: |
        a: b
        b: 2
    kind: ConfigMap
    metadata:
      name: bundle-config
      namespace: test
    ---
    apiVersion: v1
    data:
      values.yaml: |
        app1: b
        app2: 2
    kind: ConfigMap
    metadata:
      name: bundle-config-app1
      namespace: test
    
  • actual Kustomization that deploys the bundle and applies the patches needed to set a destination cluster and configuration (bundle.yaml)
    apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
    kind: Kustomization
    metadata:
      name: test-bundle
      namespace: test
      labels:
        gs.app.kustomization-type: bundle
    spec:
      interval: 1m
      targetNamespace: test
      prune: true
      sourceRef:
        kind: OCIRepository
        name: gs-azure
      path: ./
      patches:
        - patch: |-
            apiVersion: helm.toolkit.fluxcd.io/v2beta1
            kind: HelmRelease
            metadata:
              name: not-used
              labels:
                part-of-bundle.name: test-bundle
                part-of-bundle.version: 0.0.5
            spec:
              kubeConfig:
                secretRef:
                  name: wc-kubeconfig-cm-name
          target: #  <- must select all HelmReleases in the bundle to set the target cluster
            kind: HelmRelease
            labelSelector: "app=hello-world"
        - patch: |
            - op: add
              path: "/spec/valuesFrom/-" #  <- this appends to the end of existing list
              value:
                kind: ConfigMap
                name: bundle-config
          target:
            kind: HelmRelease
            labelSelector: "app=hello-world" #  <- we can target multiple HelmReleases
        - patch: |
            - op: add
              path: /spec/valuesFrom/-
              value:
                kind: ConfigMap
                name: bundle-config-app1
          target:
            kind: HelmRelease
            name: "hello-1" #  <- we can target a single HelmRelease
    

After deploying this Kustomization, an OCI artifact will be downloaded, verified with cosign (if used in OCIRepository), extracted, patched and then deployed to the local k8s server.

Automatic upgrades - simple semver range

This approach automatically detects new versions of the bundle in the OCI registry and deploys them to the cluster. However, the version in use is not recorded in the spec: of any object, so it cannot be saved to the git repo or turned into a PR. Still, it's very simple and requires just one minor change in the OCIRepository object:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: gs-azure
  namespace: test
spec:
  interval: 10m
  url: oci://giantswarmpublic.azurecr.io/test/hello-world-bundle
  ref:
    semver: 0.0.x  # <- instead of simple 'tag:'
  verify:
    provider: cosign

Automatic upgrades using flux image update automation

This requires more configuration, but it uses the templating mechanism that makes it possible to set a version in the spec: section of the OCIRepository, then commit it to the repo or create a PR.

Note: during my testing I failed to make it work, but it's supported and listed in the official docs.

We need to change the OCIRepository as follows:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: gs-azure
  namespace: test
spec:
  interval: 10m
  url: oci://giantswarmpublic.azurecr.io/test/hello-world-bundle
  ref:
    tag: 0.0.5 # {"$imagepolicy": "test:upgrade-bundle-policy:tag"}
  verify:
    provider: cosign

Then add additional objects that flux uses for discovery and pushing back to git:

---
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: upgrade-bundle-automation
  namespace: test
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
    namespace: flux-system
  git:
    push:
      branch: test-gs-flux 
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
      messageTemplate: '{{range .Updated.Images}}{{println .}}{{end}}'
  update:
    path: tmp/
    strategy: Setters
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: upgrade-bundle-repo
  namespace: test
spec:
  image: giantswarmpublic.azurecr.io/test/hello-world-bundle
  interval: 5m
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: upgrade-bundle-policy
  namespace: test
spec:
  imageRepositoryRef:
    name: upgrade-bundle-repo
  policy:
    semver:
      range: ">=0.0.5"
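
Conceptually, the Setters strategy looks for the `$imagepolicy` marker comments in the files under `update.path` and rewrites the value on the marked line with the tag selected by the ImagePolicy. The sketch below mimics this for plain text lines, so the role of the marker is easier to see. It is an illustration only: `set_policy_tag` is a hypothetical helper, and the real controller works on parsed YAML rather than on text.

```shell
#!/bin/sh
# set_policy_tag POLICY NEW_TAG: reads YAML text from stdin and, on lines
# carrying the marker {"$imagepolicy": "POLICY:tag"}, replaces the value
# of "tag:" with NEW_TAG. All other lines pass through unchanged.
# Hypothetical helper; a text-level imitation of the Setters strategy.
set_policy_tag() {
  policy="$1"
  new_tag="$2"
  awk -v marker="{\"\$imagepolicy\": \"${policy}:tag\"}" -v tag="$new_tag" '
    index($0, marker) > 0 && match($0, /tag:[ ]*[^ ]+/) {
      $0 = substr($0, 1, RSTART - 1) "tag: " tag substr($0, RSTART + RLENGTH)
    }
    { print }
  '
}

# The marked line from the OCIRepository above, bumped to a newer tag.
printf '%s\n' '    tag: 0.0.5 # {"$imagepolicy": "test:upgrade-bundle-policy:tag"}' \
  | set_policy_tag "test:upgrade-bundle-policy" "0.0.7"
```

Because the marker stays in place after the rewrite, later updates can find the same line again.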

Analysis

Pros

  • uses native Flux objects without the need for intermediary wrapping Helm Chart
    • as a result, allows configuration ConfigMaps to be passed directly to the objects inside the OCI artifact, without the need to go through values.yaml of the wrapping Helm Chart
  • natively supports cosign for delivery chain security
  • no extra controller/operator needed
  • it's deployed using a Kustomization API object, not a HelmRelease, which makes it clear that a bundle is not a regular chart release and doesn't work like one
  • ownership of resources included in the bundle just works, everything is cleaned up once the bundle Kustomization is deleted
  • patches required in the Kustomization object are very repetitive and should be easy to standardize and template using external tools
  • upgrade automation is delivered by Flux and can be configured in 2 ways, one of which integrates nicely with GitOps
  • we can set up upgrade automation for HelmReleases within the bundle; this means that components in the bundle can get, for example, patch releases without a new release of the overall bundle (this is also a disadvantage, as it breaks the immutability of a specific bundle version).
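
As a rough illustration of the point about templating, the repetitive `patches:` entries could be generated from a few parameters. The sketch below is one possible approach using only a shell function; `bundle_patches` and its arguments are hypothetical, and the output mirrors the patches from the example Kustomization above.

```shell
#!/bin/sh
# bundle_patches NAME VERSION KUBECONFIG_SECRET SELECTOR CONFIGMAP
# Hypothetical helper: prints the "patches:" section of a bundle
# Kustomization, indented so it can sit under "spec:".
bundle_patches() {
  cat <<EOF
  patches:
    - patch: |-
        apiVersion: helm.toolkit.fluxcd.io/v2beta1
        kind: HelmRelease
        metadata:
          name: not-used
          labels:
            part-of-bundle.name: $1
            part-of-bundle.version: $2
        spec:
          kubeConfig:
            secretRef:
              name: $3
      target:
        kind: HelmRelease
        labelSelector: "$4"
    - patch: |
        - op: add
          path: /spec/valuesFrom/-
          value:
            kind: ConfigMap
            name: $5
      target:
        kind: HelmRelease
        labelSelector: "$4"
EOF
}

# Values taken from the example Kustomization.
bundle_patches test-bundle 0.0.5 wc-kubeconfig-cm-name "app=hello-world" bundle-config
```

Any templating tool would do here; the point is only that the inputs boil down to the bundle name and version, the target cluster kubeconfig, a selector, and the config objects.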

Cons

  • it's deployed using a Kustomization API object, not a HelmRelease, which means UIs need to support both of them
  • Kustomizations can also be used for many different purposes, so we need a way to automatically tell which Kustomization is a bundle - but this can be easily done with a label like gs.app.kustomization-type: bundle
  • requires excessive patching in the Kustomization object to set configuration, target workload cluster and also probably some labels on included resources to denote that they were installed by a bundle

Discarding app bundles and using helm charts with sub-charts

Helm allows managing dependencies directly, using the helm dep command. This pattern is used by Bitnami to produce virtually all of their charts. It works by just including all the manifests of the dependent charts into the umbrella chart (it's even implemented as downloading and extracting dependent charts).

Note: this approach might seem to be the same as the app-of-apps approach above, but there's a very important difference: there's just 1 Helm chart, composed by "gluing" together the sub-charts.

Note: our group that was working on CAPI packaging also analyzed the idea, but rejected it in favor of the app-of-apps approach (at that time, without the experience we have now).

Pros

  • it's a single helm chart
    • as such, information about the target cluster needs to be given just once, for the umbrella chart itself
  • simple to manage: Helm allows defining semver ranges for dependencies; running helm dep update then checks for newer versions, verifies that they match the semver range and saves the result in Chart.lock
  • we can mark it with some Chart.yaml extra attributes to easily differentiate from normal "single" charts
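
A minimal sketch of such an umbrella chart definition follows. The chart name `hello-world-umbrella`, the `podinfo` dependency, the `6.x` range, the annotation key, and the use of the test registry from the earlier examples are all assumptions made for illustration; `write_umbrella_chart` is a hypothetical helper that only writes the `Chart.yaml`.

```shell
#!/bin/sh
# write_umbrella_chart DIR: hypothetical helper that creates a minimal
# umbrella chart definition with one dependency given as a semver range.
write_umbrella_chart() {
  mkdir -p "$1"
  cat > "$1/Chart.yaml" <<'EOF'
apiVersion: v2            # v2 charts declare dependencies in Chart.yaml
name: hello-world-umbrella
version: 0.1.0
annotations:
  example.giantswarm.io/chart-type: umbrella  # hypothetical marker key
dependencies:
  - name: podinfo
    version: "6.x"        # semver range, resolved by helm dep update
    repository: oci://giantswarmpublic.azurecr.io/test
EOF
}

write_umbrella_chart ./hello-world-umbrella
# With Helm installed, the range is then resolved and pinned in Chart.lock:
#   helm dep update ./hello-world-umbrella
```

The annotation shows one way to tell umbrella charts apart from normal "single" charts; the actual key would have to be agreed on.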

Cons

  • helm chart releases are saved in a Secret and they are limited in size to 1 MB; we already crossed that with a single chart a few times and that was not an easy problem; including multiple sub-charts in a single umbrella chart is going to make this problem bigger
  • it requires managing and updating an extra chart every time any of the sub-charts needs to be updated
  • it requires the umbrella chart to have its own values.yaml that proxies the values to the necessary sub-chart
  • there's no easy way to capture dependencies (like "is it ready?" or "did it fail?") between chart components
  • it's very hard (although at least to some extent possible using the lookup function) to manage deployment of the chart based on what is already deployed in the cluster (no direct support for inter-chart dependencies).

Recommendation

There's a big difference in complexity and functionality between the two approaches above. Fortunately, both can happily coexist at the same time:

  • For simple scenarios (no big charts, no intra-chart dependencies, no need to patch the nested charts) we can use Helm sub-charts. Combined with a Renovate action that automatically detects and upgrades sub-charts, managing the release process should be easy. Then, we combine it with Flux using just a single HelmRelease. All of this was also tested with OCI registries and works fine.
  • For more complex scenarios, we can use OCI bundles, where we can add to the mix whatever we want, patch using kustomize and use all the features Flux offers.