# Labels and annotations
## Goal
- Keep labels and annotations consistent across the whole chain, down to the nodes
Note:
- labels != label selectors
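A tiny Go illustration of the distinction (label keys/values are made up): labels are plain metadata on an object, while a label selector is a query that must keep matching those labels, which is why label propagation has to be careful not to break existing selectors.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Labels as they might appear on a Machine (hypothetical values).
	machineLabels := labels.Set{"cluster.x-k8s.io/deployment-name": "md-1", "env": "prod"}
	// A selector, e.g. from a MachineSet, querying over those labels.
	selector := labels.SelectorFromSet(labels.Set{"cluster.x-k8s.io/deployment-name": "md-1"})

	fmt.Println(selector.Matches(machineLabels))             // true: extra labels are fine
	fmt.Println(selector.Matches(labels.Set{"env": "prod"})) // false: the selected label was removed
}
```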
## Current propagation rules
https://cluster-api.sigs.k8s.io/developer/architecture/controllers/metadata-propagation.html?highlight=propagation#metadata-propagation
https://cluster-api.sigs.k8s.io/tasks/experimental-features/cluster-class/change-clusterclass.html#reference
## Main Asks
1. Set node labels
- https://github.com/kubernetes-sigs/cluster-api/issues/493 (issue)
- https://github.com/kubernetes-sigs/cluster-api/pull/6255 (Proposal)
- https://github.com/kubernetes-sigs/cluster-api/pull/7173 (PR)
This is apparently the simplest part of the problem; how labels/annotations flow down the chain is the fun part
Current status: https://docs.google.com/presentation/d/1kotJg9_ql3znlDirAodQYr_CysPSiZOKqknHqXomDyw/edit#slide=id.p
2. Labels from CC/Topology to KCP, MD, MS (templates)
- https://github.com/kubernetes-sigs/cluster-api/issues/7006 (issue)
- https://github.com/kubernetes-sigs/cluster-api/pull/7088 (PR)
## Related asks
1. Metadata/Fields propagation without rollout
- In-place field propagation
- https://github.com/kubernetes-sigs/cluster-api/issues/5880
2. Rollout after everywhere
- Rollout after for MachineDeployments
- https://github.com/kubernetes-sigs/cluster-api/issues/4536 (Issue)
- https://github.com/kubernetes-sigs/cluster-api/pull/7053 (PR)
- Rollout after for ClusterClass
- https://github.com/kubernetes-sigs/cluster-api/issues/5218 (Issue)
## Questions
- Should we stop propagating labels from CC templates to MD templates, given that they do not flow further down to the templates used by machines?
- MD hash / rollouts
- Should label propagation be authoritative, or should manually added labels be allowed down the chain (and how do we manage this / SSA)?
- Should label propagation apply to templates?
- (Adoption??)
### Notes
#### Why MD.spec.template.labels are propagated to MS but annotations are not
This is for historical reasons: it mirrors how Kubernetes Deployments work
#### KCP
TL;DR: a change is required to avoid rollouts in KCP due to label/annotation changes
TODO: nodeDrainTimeout and nodeDeletionTimeout changes are not propagated to machines. Is this a bug? I think yes
------ Rollout rules (current)
A CP machine requires a rollout if:
- !already-being-deleted && (should-rollout-after || !machine-spec-match)
Should Rollout After
- machine.CreationTimestamp.Before(rolloutAfter) && rolloutAfter.Before(reconciliationTime)
Machine Spec Match
- match-template-metadata && match-kubernetes-version && match-KubeadmBootstrapConfig && match-Template-Cloned-From
Match Template Metadata
- machine has all the labels and the annotations in spec.machineTemplate.Metadata
=> it can have more
=> it triggers rollout on add/change, not on delete
Match Kubernetes Version
- machines' version equals the KCP version
Match KubeadmBootstrapConfig
- the machine's controlplane.cluster.x-k8s.io/kubeadm-cluster-configuration annotation is equal to kcp.Spec.KubeadmConfigSpec.ClusterConfiguration
=> TODO check if this will trigger rollout on API version bump
- machines' bootstrap config has all the labels and the annotations in spec.machineTemplate.Metadata
=> it can have more
=> it triggers rollout on add/change, not on delete
- machines' bootstrap config spec is equal to kcp.Spec.KubeadmConfigSpec
Match Template Cloned From
- machines' infrastructure cloned-from annotation equals kcp.spec.machineTemplate.infrastructureRef
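A minimal Go sketch of the predicate above, with simplified signatures and hypothetical helper names (the real checks live in the KCP controller and are spread across several funcs):

```go
package kcpnotes

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// needsRollout mirrors: !already-being-deleted && (should-rollout-after || !machine-spec-match).
// specMatches stands in for the full "Machine Spec Match" chain described above.
func needsRollout(deletionTimestamp *metav1.Time, creationTimestamp metav1.Time,
	rolloutAfter *metav1.Time, reconciliationTime time.Time, specMatches bool) bool {
	if deletionTimestamp != nil { // already being deleted
		return false
	}
	return shouldRolloutAfter(creationTimestamp, rolloutAfter, reconciliationTime) || !specMatches
}

// shouldRolloutAfter mirrors:
// machine.CreationTimestamp.Before(rolloutAfter) && rolloutAfter.Before(reconciliationTime).
func shouldRolloutAfter(creationTimestamp metav1.Time, rolloutAfter *metav1.Time, reconciliationTime time.Time) bool {
	if rolloutAfter == nil {
		return false
	}
	return creationTimestamp.Before(rolloutAfter) && rolloutAfter.Time.Before(reconciliationTime)
}

// hasAllMetadata mirrors the "machine has all the labels and the annotations in
// spec.machineTemplate.Metadata" check: a one-way (superset) comparison, which is
// why adding/changing a key in the template triggers a rollout while deleting one does not.
func hasAllMetadata(template, actual map[string]string) bool {
	for k, v := range template {
		if actual[k] != v {
			return false
		}
	}
	return true
}
```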
#### Machine Deployments
------ MachineSet matching rules ~ Rollout rules (current)
Machine's template from the MD and machine's template from the MS are compared using apiequality.Semantic.DeepEqual, excluding the hash
=> it includes machine.spec.template labels (minus the hash label) and annotations, as well as nodeDrainTimeout/nodeDeletionTimeout
=> TODO: check changes to other things that flow down to the MS, like minReadySeconds, deletePolicy, selector, etc.
minReadySeconds and deletePolicy changes flow down to the MachineSet, but not to machines. Is this a bug? I think no (machines do not have those fields); also, the interesting part is that this change already happens "in-place"
What about selector changes?
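A minimal sketch of this comparison, assuming the clusterv1 v1beta1 types; the "machine-template-hash" label key is an assumption here, mirroring the pod-template-hash convention:

```go
package mdutilnotes

import (
	apiequality "k8s.io/apimachinery/pkg/api/equality"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// machineTemplateHashLabel is an assumed key name, mirroring pod-template-hash.
const machineTemplateHashLabel = "machine-template-hash"

// equalMachineTemplate compares the MD template with the MS template via
// semantic deep equality, after dropping the hash label from both sides.
// Everything else in the template (labels, annotations, nodeDrainTimeout, ...)
// takes part in the comparison, so changing any of it triggers a rollout.
func equalMachineTemplate(t1, t2 *clusterv1.MachineTemplateSpec) bool {
	t1Copy := t1.DeepCopy()
	t2Copy := t2.DeepCopy()
	delete(t1Copy.ObjectMeta.Labels, machineTemplateHashLabel)
	delete(t2Copy.ObjectMeta.Labels, machineTemplateHashLabel)
	return apiequality.Semantic.DeepEqual(t1Copy, t2Copy)
}
```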
------ MachineSet naming rules (current)
It is based on a hash derived from MD.spec.template
=> it includes machine.spec.template labels and annotations, as well as nodeDrainTimeout/nodeDeletionTimeout
=> it is used to identify the new MachineSet (create or reuse)
=> it will change if there is an API change
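A sketch of the hash derivation, modeled on the Kubernetes Deployment approach (an fnv hash over a deterministic deep-print of the template); the exact CAPI implementation may differ in detail:

```go
package mdutilnotes

import (
	"fmt"
	"hash/fnv"

	"github.com/davecgh/go-spew/spew"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// computeHash hashes the whole MD.spec.template; since labels, annotations,
// nodeDrainTimeout, etc. are part of the struct, changing any of them yields a
// new hash and therefore a new MachineSet name.
func computeHash(template *clusterv1.MachineTemplateSpec) uint32 {
	hasher := fnv.New32a()
	printer := spew.ConfigState{Indent: " ", SortKeys: true, DisableMethods: true, SpewKeys: true}
	printer.Fprintf(hasher, "%#v", template)
	return hasher.Sum32()
}

// newMSName derives the MachineSet name from the MD name plus the hash, so the
// same template deterministically maps to the same MS name; this is the guard
// against creating the same MS twice on a slow cache or a fast resync.
func newMSName(mdName string, template *clusterv1.MachineTemplateSpec) string {
	return fmt.Sprintf("%s-%d", mdName, computeHash(template))
}
```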
------ MachineSet selectors (current)
It uses the hash derived from MD.spec.template, which is also added to the machine labels
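A sketch of the selector side, under the same assumed "machine-template-hash" label key: the hash pins the MS selector, and the labels of the machines it creates, to one template revision.

```go
package mdutilnotes

import (
	"strconv"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newMSSelector copies the MD selector and pins it to one template hash, so the
// MS only selects machines stamped with the same (assumed) hash label.
func newMSSelector(mdSelector metav1.LabelSelector, hash uint32) metav1.LabelSelector {
	s := *mdSelector.DeepCopy()
	if s.MatchLabels == nil {
		s.MatchLabels = map[string]string{}
	}
	s.MatchLabels["machine-template-hash"] = strconv.FormatUint(uint64(hash), 10)
	return s
}
```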
## Next steps
- KCP
- [ ] experiment on removing labels from rollout rules
- [ ] experiment on in-place mutations, including fix for nodeDrainTimeout and nodeDeletionTimeout changes propagation
- MD
- [ ] try to disentangle naming, machine selectors, matching rules
- TL;DR
- current state
- hash is used for MachineSet names as a way to avoid the risk of creating the same MS twice due to a slow cache
or a fast resync of the MachineDeployment.
- hash is used in machine selectors/labels for convenience (it seems there is no strong reason behind this choice)
- hash is excluded from the func deciding whether a rollout is required (EqualMachineTemplate), but this de facto
rolls back the change made when adding the hash to machine selectors/labels
- hash and EqualMachineTemplate must be consistent because they are two constructs implementing the idea of
"same spec/same MS"
Note: Rollout is implemented by creating a new MS and marking the other MSs as old.
- option 1: we can exclude labels, annotations, and in-place mutable fields from hash/EqualMachineTemplate; this should
not create breaking changes, given that hash computation happens only when creating a new MachineSet (see the sketch at the end of this section).
- TODO: check what happens if the template moves back and forth between values with the same hash (the old MS would be
picked up again), given that excluding fields increases the chance of this happening
- option 2: we can shift away from the hash and use a UID, but we have to find a way to prevent creating the same MS twice
- [ ] experiment on in-place mutations
- What about co-authoring? Should we use SSA?
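As referenced from option 1 above, a minimal sketch of excluding labels, annotations, and in-place mutable fields from hash/EqualMachineTemplate; which fields count as in-place mutable is an assumption here:

```go
package mdutilnotes

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// stripInPlaceMutable returns a copy of the template without the fields we
// want to mutate in place, so that neither the naming hash nor the rollout
// check reacts to changes in them.
func stripInPlaceMutable(template *clusterv1.MachineTemplateSpec) *clusterv1.MachineTemplateSpec {
	t := template.DeepCopy()
	t.ObjectMeta.Labels = nil
	t.ObjectMeta.Annotations = nil
	t.Spec.NodeDrainTimeout = nil    // assumed in-place mutable
	t.Spec.NodeDeletionTimeout = nil // assumed in-place mutable
	return t
}
```

Feeding the stripped copy to both the naming hash and EqualMachineTemplate would keep the two constructs consistent with the "same spec/same MS" idea.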