---
title: AMI Resolution with Flavors
authors:
- "@sedefsavas"
- "@randomvariable"
- "@imikushin"
reviewers:
- "@randomvariable"
creation-date: 2021-06-28
last-updated: 2021-06-28
status: provisional
---
# AMI Resolution with Flavors
## Table of Contents
## Glossary
### AMI
Amazon Machine Image
### AMI Flavors
AMIs can be categorized by many attributes, such as OS type and CPU architecture. An AMI flavor is a particular combination of those attributes.
## Summary
This proposal redesigns how AMIs are resolved during management and workload cluster creation.
We propose a new AMI object that has a 1-1 mapping with the AMIs of interest in AWS accounts.
With this new object, AMI flavors will be expanded, and advanced AMI resolution based on various attributes will become possible.
During AMI resolution, the source of truth becomes the AMI objects in Kubernetes rather than the AMIs in AWS accounts.
## Motivation
Currently, in cluster-api-provider-aws, very limited support exists for different AMI flavors (OS/ARCH); the only supported variants are OS types and Kubernetes versions.
Having an AMI CRD will help us resolve the following issues:
- There is no mechanism in place to detect when new AMIs are published for a certain flavor and Kubernetes version. Hence, when a new AMI is published for security reasons, the rollout process is manual.
- Auto-resolved images are not captured in any CAPA field, so the information about which images are used for AWSMachines is lost.
- Since images are resolved at different points in time for different AWSMachines, they may end up using different AMIs. We need a way to ensure deterministic AMI resolution for consistency within clusters.
### Goals
- Support various AMI flavors (OS variants, bootmode etc.)
- Same AMI for a particular K8s version for all Machines (deterministic AMI resolution)
  - During cluster upgrade: deterministic rollout of a machine image
  - During cluster creation: all machines should have the same AMI
- A mechanism that ensures that, when a user issues a command or update to roll over the machines in a cluster, all of the machines are launched with the same new AMI.
- The ability to do a deterministic rollout of machines in a cluster with the same Kubernetes version but a newer OS image.
  - e.g. a kernel CVE, with a requirement from security to update all machines within 48 hours.
  - The org takes an image from cloud security, runs it through image builder to add Kubernetes components, then updates the cluster with the new AMI.
  - Same Kubernetes version rollout to new AMIs
- Air-gapped environments should be supported
- Support both public and custom images
- Each AWSMachineTemplate will have a particular AMI set per Kubernetes version
### Future Work
- A controller will automatically create AMI resources when new AMIs are created.
- Auto-rollout will be supported when a new AMI for the same Kubernetes version becomes available.
- Creating an AMI in a single region and copying encrypted AMIs to the user's AWS account, into the region the cluster is in.
## Proposal
This proposal enhances the way AMIs are resolved during EC2 instance creation by introducing a new AMI resource.
### User Stories
#### Story 1 - Having AMIs as Kubernetes objects
As an end user, I would like to list the available AMIs for various flavors (Kubernetes version, OS, Arch etc.) by using Kubernetes constructs.
#### Story 2 - Easier UX for AMI upgrades
As an end user, I would like to know when there are new AMIs published for a Kubernetes version
so that I can trigger a rollout when needed.
#### Story 3 - Deterministic AMI resolution
As an end user, I would like all machines that belong to the same group (MachineDeployment/KCP) to use the same AMI.
### Implementation Details/Notes/Constraints
Main changes are as follows:
- A new CRD for AMIs is created (AWSImage).
- AWSMachineTemplate is populated with (Kubernetes version, AMI) labels for each Kubernetes version that the template is used with.
- A new AWSMachineTemplate controller checks whether each version-AMI tuple is the latest.
- A new AMILookupConfig CRD keeps track of the AWS accounts that will be used to automate AWSImage creation.
#### New API types
##### AWSImage
AWSImage is a cluster-scoped CR that has a 1-1 mapping with an AMI in the AWS account where the AMI is published.
**Example AWSImage resource**
```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AWSImage
metadata:
  name: ami-0f2e5eec7ae0a1986
spec:
  kubernetesVersion: 1.21.2
  os: ubuntu-20.04
  image:
    ref:
      id: ami-0f2e5eec7ae0a1986
      region: ap-northeast-1
```
An AWSImage controller will reconcile the objects based on the image ref and add all available filter values to the object as labels.
```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AWSImage
metadata:
  name: ami-0f2e5eec7ae0a1986
  labels:
    latest: "true" # label values must be strings
    os-name: ubuntu-20.04
    architecture: amd64
    bootMode: uefi
    # ...
    # all available filters are added here for this particular image: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeImages.html
    region: ap-northeast-1
    kubernetes-version: 1.21.2
spec:
  kubernetesVersion: 1.21.2
  os: ubuntu-20.04
  image:
    ref:
      id: ami-0f2e5eec7ae0a1986
      region: ap-northeast-1
```
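Once images carry these labels, listing the AMIs for a flavor (Story 1) reduces to an equality match on labels. The following is a minimal sketch of that matching, not the proposed implementation: the `AWSImage` struct is a trimmed-down stand-in for the CR, `matchesAll` and `filterImages` are hypothetical helpers, and the image names are placeholders.

```go
package main

import "fmt"

// AWSImage is a hypothetical, trimmed-down stand-in for the proposed CR;
// only the fields needed for label matching are modeled.
type AWSImage struct {
	Name   string
	Labels map[string]string
}

// matchesAll reports whether every key/value pair in selector is present in labels.
func matchesAll(labels, selector map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// filterImages returns the names of the images whose labels satisfy selector.
func filterImages(images []AWSImage, selector map[string]string) []string {
	var names []string
	for _, img := range images {
		if matchesAll(img.Labels, selector) {
			names = append(names, img.Name)
		}
	}
	return names
}

func main() {
	// Placeholder images; the label keys mirror the example resource above.
	images := []AWSImage{
		{Name: "image-a", Labels: map[string]string{"os-name": "ubuntu-20.04", "architecture": "amd64", "kubernetes-version": "1.21.2"}},
		{Name: "image-b", Labels: map[string]string{"os-name": "ubuntu-20.04", "architecture": "arm64", "kubernetes-version": "1.21.2"}},
	}
	selector := map[string]string{"architecture": "amd64", "kubernetes-version": "1.21.2"}
	fmt.Println(filterImages(images, selector)) // prints [image-a]
}
```

In a real cluster the same query would likely be expressed as a label selector against the API server rather than an in-memory loop.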
##### AMILookupConfig
We define a new CRD for automating AWSImage creation from the AMIs published in AWS accounts.
This may be implemented in the future, since it is not a breaking change,
but including it in this proposal shows the direction we would like to go.
A cluster-scoped AMILookupConfig CR keeps all the configuration needed to look up AMIs in an AWS account where AMIs are published.
**Example AMILookupConfig resource**
```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AMILookupConfig
metadata:
  name: "lookup-dev"
spec:
  identityRef: <blah> # defaults to AWSClusterControllerIdentity
  imagePrefix: "capa-ami"
  lookupFormat: "-{{.OS}}-{{.ARCH}}-?{{.K8sVersion}}-*"
  os:
    - "ubuntu-18.04"
    - "centos7"
  arch:
    - "amd64"
    - "arm64"
  customFields:
    - name: "bootType"
      validValues:
        - "UEFI"
        - "BIOS"
  # Maybe support something like Gemfile? >=1.18
  kubernetesVersions: "~1.18" # equal to >=1.18 && <2.0 (there should be a library for this)
  imageLookupOrg: 728748729
status:
  lastReconcile: <date>
```
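The `lookupFormat` value uses Go template syntax, so an image name pattern can be produced with the standard `text/template` package. The sketch below assumes that `imagePrefix` is simply prepended to the rendered format (inferred from the format starting with a hyphen); `lookupParams` and `renderLookupName` are hypothetical names, and the Kubernetes version is an illustrative value.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// lookupParams holds the values substituted into lookupFormat.
// The field names match the placeholders used in the example resource.
type lookupParams struct {
	OS         string
	ARCH       string
	K8sVersion string
}

// renderLookupName renders lookupFormat with p and prepends imagePrefix.
// Prepending the prefix is an assumption, not something the proposal specifies.
func renderLookupName(imagePrefix, lookupFormat string, p lookupParams) (string, error) {
	tmpl, err := template.New("lookup").Parse(lookupFormat)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, p); err != nil {
		return "", err
	}
	return imagePrefix + buf.String(), nil
}

func main() {
	name, err := renderLookupName(
		"capa-ami",
		"-{{.OS}}-{{.ARCH}}-?{{.K8sVersion}}-*",
		lookupParams{OS: "ubuntu-18.04", ARCH: "amd64", K8sVersion: "1.18.2"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints capa-ami-ubuntu-18.04-amd64-?1.18.2-*
}
```

The rendered string still contains the `?` and `*` wildcards, which would be passed through as a name filter when querying images.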
Along with the CRD, we propose a feature-gated controller that automatically creates AWSImage objects. It does the following:
- Periodically checks the available images in the `imageLookupOrg` and pulls all images created after `lastReconcile` that match `lookupFormat`.
- Generates new AMI resources when new versions are detected.
- If it detects that a new version is available for an existing AMI resource, it creates a new AMI resource with the new AMI ID, sets the `latest` label to false on the existing resources, and sets it to true on the new resource.
- Cleans up AWSImage resources when the AMI is deleted.
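The `latest` label handover described in the list above can be sketched as follows. This is an illustration under stated assumptions: the proposal does not define which labels identify "the same flavor", so `flavorKeys` is a hypothetical choice, and `awsImage`, `sameFlavor`, `setLabel`, and `promoteLatest` are hypothetical names operating on in-memory objects rather than on the API server.

```go
package main

import "fmt"

// awsImage is a hypothetical stand-in for the AWSImage CR.
type awsImage struct {
	Name   string
	Labels map[string]string
}

// flavorKeys lists the labels assumed to identify a flavor in this sketch.
var flavorKeys = []string{"os-name", "architecture", "kubernetes-version"}

// sameFlavor reports whether a and b agree on every flavor label.
func sameFlavor(a, b *awsImage) bool {
	for _, k := range flavorKeys {
		if a.Labels[k] != b.Labels[k] {
			return false
		}
	}
	return true
}

// setLabel sets a label, allocating the map if needed.
func setLabel(img *awsImage, key, value string) {
	if img.Labels == nil {
		img.Labels = map[string]string{}
	}
	img.Labels[key] = value
}

// promoteLatest marks newest as latest, demotes existing images of the
// same flavor, and returns the combined list.
func promoteLatest(existing []*awsImage, newest *awsImage) []*awsImage {
	for _, img := range existing {
		if sameFlavor(img, newest) {
			setLabel(img, "latest", "false")
		}
	}
	setLabel(newest, "latest", "true")
	return append(existing, newest)
}

func main() {
	flavor := func(arch string) map[string]string {
		return map[string]string{"os-name": "ubuntu-20.04", "architecture": arch, "kubernetes-version": "1.21.2", "latest": "true"}
	}
	oldAmd := &awsImage{Name: "image-old-amd64", Labels: flavor("amd64")}
	arm := &awsImage{Name: "image-arm64", Labels: flavor("arm64")}
	newAmd := &awsImage{Name: "image-new-amd64", Labels: flavor("amd64")}

	for _, img := range promoteLatest([]*awsImage{oldAmd, arm}, newAmd) {
		fmt.Println(img.Name, img.Labels["latest"])
	}
}
```

Only the older amd64 image is demoted here; the arm64 image keeps `latest` set to true because it belongs to a different flavor.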
#### Changes in existing API types
##### AWSMachineTemplate
The `imageLookupOrg` and `lookupFormat` fields are removed.
Version-AMI labels and a status are added.
```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: "machine-template-1"
  labels:
    cluster-x-k8s.io/version-ami-1.20.4: ami-0f2e5eec7ae0a1986
    cluster-x-k8s.io/version-ami-1.21.2: ami-0f2e5eec7ae0a1442
spec:
  template:
    spec:
      os: ubuntu
      arch: amd64
      additionalFilters: # []v1alpha4.Filter
      instanceType: "g4dn.8xlarge"
status:
  AWSImages:
    - kubernetesVersion: 1.20.4
      ami: ami-0f2e5eec7ae0a1986
      UpgradeAvailable: false
    - kubernetesVersion: 1.21.2
      ami: ami-0f2e5eec7ae0a1442 # matches the version-ami-1.21.2 label above
      UpgradeAvailable: true
```
Resolved AMIs are added as labels on the AWSMachineTemplate resource; the label name carries the Kubernetes version and the value is the name of the AMI object to use.
All objects that reference a particular AWSMachineTemplate will use the same AMIs.
The first time a version is resolved by an AWSMachine that is cloned from an AWSMachineTemplate (core cluster-api controllers record the source template in the `cluster.x-k8s.io/cloned-from-name` annotation during cloning):
- The AWSMachine controller checks whether the AWSMachineTemplate has a label for the version of interest.
- If no label exists for the version, the AWSMachine controller resolves the AMI and updates the AWSMachineTemplate labels.
- During resolution, the AWSMachine controller sorts the available AMI resources by creation timestamp and uses the latest.
- All other AWSMachines will use the AMI recorded for that version.
Eventually, an AWSMachineTemplate object will have labels for all Kubernetes versions that its referencing objects use.
If multiple objects reference the same AWSMachineTemplate, such as multiple MachineDeployment/KubeadmControlPlane objects, all of the machines they create will use the first resolved AMI that was added as a label.
We introduce a new experimental controller that periodically reconciles AWSMachineTemplates
and checks whether there is a newer AMI CR for the versions in that template. If there is, it sets `UpgradeAvailable` to true in the template's `Status`.
This is only informational for now, but in the future the `UpgradeAvailable` field can be used to automate rollouts when a new AMI is available for a particular Kubernetes version.
### Security Model
If users have the ability to create AWSMachineTemplates, they can launch arbitrary AMIs into their environments. Some organisations may wish to lock that down.
### Risks and Mitigations
## Alternatives
## Upgrade Strategy
## Additional Details
### Test Plan [optional]
### Graduation Criteria [optional]
### Version Skew Strategy [optional]
## Implementation History
## Open Questions
- The same AMI flavor may exist in 2 different accounts, and different clusters that use different roles can only reach one of them. Do we want to support this workflow?
  - We would need to add an org field to AWSImage and AWSMachineTemplate. Only clusters that have access to that org should use those.
- Which Kubernetes versions should be created automatically when the controller first starts? AMIs created for all 1.18 versions?
  - Create AWSImage objects only when used?
- When an image is deleted from the account, the AWSImage object should be cleaned up as well.