# Design changes based on review
## Design key points
### From review decision
- CAPI is responsible for orchestrating machine updates
- Split changes and allow multiple external updaters to collaborate
- An InPlaceUpdatePlan CR is introduced to capture the update blueprint and then execute it
- Leverage existing CRs (KubeadmControlPlane/MachineDeployment/MachineSet/Machine) to handle in-place updates
### From my idea
- CAPI leverages the updaterUpdateTask contract to exchange status
  - Implementing webhooks is complex; a CR contract is a widely used technique for communication between the CAPI core and CAPI providers.
- Define an `aborting` updatePhase
  - For in-place updates, aborts must be handled carefully to avoid leaving a machine in a chaotic state.
  - When a new change arrives while an in-place update is in progress, we shall `abort` the ongoing in-place update and wait until the abort completes.
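As a rough sketch of the abort rule above (the helper and constant names are hypothetical; the real transition would live in the InPlaceUpdatePlan controller):

```go
package main

import "fmt"

// Phases and statuses follow the CRD sketch later in this document:
// spec.updatePhase drives the reconcile target, status.updateStatus reports
// the result. The helper itself is illustrative only.
const (
	PhaseUpdating = "updating"
	PhaseAborting = "aborting"

	StatusUpdating = "updating"
	StatusUpdated  = "updated"
	StatusAborting = "aborting"
	StatusAborted  = "aborted"
)

// nextPhase decides what to do when a new change arrives while a plan is in
// flight: an ongoing in-place update must first be aborted, and the new plan
// may only start once the abort has fully completed.
func nextPhase(currentStatus string, newChangePending bool) string {
	switch {
	case newChangePending && currentStatus == StatusUpdating:
		return PhaseAborting // ask updaters to abort the ongoing update
	case newChangePending && currentStatus == StatusAborting:
		return PhaseAborting // abort still in progress; keep waiting
	case newChangePending && (currentStatus == StatusAborted || currentStatus == StatusUpdated):
		return PhaseUpdating // safe to start the new plan
	default:
		return "" // no transition needed
	}
}

func main() {
	fmt.Println(nextPhase(StatusUpdating, true)) // aborting
	fmt.Println(nextPhase(StatusAborted, true))  // updating
}
```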
## Challenges
- How to identify the changeSet in different situations?
  - Starting an update on a cluster where all machines are in the target state
  - Starting an update on a cluster in an inconsistent state, where some machines are updated and some are not
- How to arrange the sequence of updaters?
- Do all current infra providers assume the Machine spec is immutable?
- How to create a generic Kubernetes component upgrade updater?
- How to expose a shell interface from the infra provider?
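For the updater-sequencing question, one straightforward answer is the `sequence` field already present in the CRD sketch later in this document: sort updaters by it and run them one at a time. A minimal illustration (the `Updater` struct and helper are hypothetical, not a proposed API):

```go
package main

import (
	"fmt"
	"sort"
)

// Updater mirrors the `updaters` entries in the InPlaceUpdatePlan sketch:
// each updater declares a `sequence` number, and the Machine controller runs
// them one at a time in ascending order.
type Updater struct {
	Name     string
	Sequence int
}

// orderUpdaters returns updaters sorted by their declared sequence. A stable
// sort keeps the original relative order on ties, since the design does not
// define tie-breaking yet.
func orderUpdaters(updaters []Updater) []Updater {
	out := append([]Updater(nil), updaters...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Sequence < out[j].Sequence })
	return out
}

func main() {
	ordered := orderUpdaters([]Updater{
		{Name: "versionUpdater", Sequence: 3},
		{Name: "osUpdater", Sequence: 1},
		{Name: "kubeadmNtpUpdater", Sequence: 2},
	})
	for _, u := range ordered {
		fmt.Println(u.Name)
	}
}
```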
## Sequence diagram
```plantuml
@startuml
actor User as user
participant "Topology\ncontroller" as top
participant "InPlaceUpdatePlan\ncontroller" as clustert
participant "ControlPlane\ncontroller" as cpt
participant "MachineDeployment\ncontroller" as mdt
participant "Machine\ncontroller" as mt
box external
collections updaters
end box
activate top
activate cpt
activate mdt
activate mt
user -> top: request update
loop iterate updaters
top -> updaters: call Lifecycle Runtime hook\n**ExternalUpdateRequest**
updaters --> top: accept with `updaterName` `responsibilities` and `updateTaskTemplate`
end
alt#white #pink not all change picked
top -> top: fallback to rollout\nor mark update failed
end
top -> clustert: create InPlaceUpdatePlan
activate clustert
alt no ongoing update
clustert -> cpt: set `update-task-name` label
clustert -> mdt: set `update-task-name` label
else #pink has ongoing update
clustert -> clustert: wait and reenqueue
end
clustert -> cpt: set updatePhase to start\nsequence 1 update: [cluster1-cp]
loop iterate machines
cpt -> cpt: preflight check
cpt -> mt: pick 1 machine, set `update-task-name` label\nand updatePhase
loop sort by sequence and iterate updater
mt -> updaters: create updaterUpdateTask
activate updaters
updaters -> updaters: operate on\nmachine
updaters --> mt: report .status.updateStatus
deactivate updaters
end
mt --> cpt: report .status.updateStatus
end
cpt --> clustert: report .status.updateStatus
clustert -> mdt: set updatePhase to start\nsequence 2 update: [cluster1-md1, cluster1-md2]
mdt -> mdt: create MachineSet with new spec
loop iterate machines
mdt -> mdt: preflight check
mdt -> mt: pick 1 machine, set `update-task-name` label\nand updatePhase
loop sort by sequence and iterate updater
mt -> updaters: create updaterUpdateTask\nand set owner
activate updaters
updaters -> updaters: operate on\nmachine
updaters --> mt: report .status.updateStatus
deactivate updaters
end
mt --> mdt: report .status.updateStatus
mdt -> mt: move machine from old MachineSet to\nnew MachineSet
end
mdt --> clustert: report .status.updateStatus
clustert -> clustert: update .status.updateStatus
deactivate clustert
@enduml
```
## Webhook definitions
```
controlPlaneUpdateRequest {
clusterName
controlplaneName
changeSet
}
machineDeploymentUpdateRequest {
clusterName
machineDeploymentName
changeSet
}
response {
status: accept|reject|error
message
responsibilities
updaterName
updateTaskTemplateRef
}
```
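These payloads could map to Go types roughly as follows; the coverage check mirrors the "not all change picked" branch in the sequence diagram. All type and function names are illustrative, not the final API:

```go
package main

import "fmt"

// ControlPlaneUpdateRequest mirrors the webhook payload above. Field names
// come from the sketch; the serialized form is still open.
type ControlPlaneUpdateRequest struct {
	ClusterName      string
	ControlPlaneName string
	ChangeSet        []string // e.g. JSON paths of the fields that changed
}

// UpdateResponse mirrors the common response payload.
type UpdateResponse struct {
	Status                string // accept|reject|error
	Message               string
	Responsibilities      []string // subset of the changeSet this updater picks up
	UpdaterName           string
	UpdateTaskTemplateRef string
}

// unpickedChanges implements the "not all change picked" check: any change
// that no accepting updater took responsibility for forces a fallback to
// rollout (or marks the update failed).
func unpickedChanges(changeSet []string, responses []UpdateResponse) []string {
	picked := map[string]bool{}
	for _, r := range responses {
		if r.Status != "accept" {
			continue
		}
		for _, p := range r.Responsibilities {
			picked[p] = true
		}
	}
	var missing []string
	for _, c := range changeSet {
		if !picked[c] {
			missing = append(missing, c)
		}
	}
	return missing
}

func main() {
	missing := unpickedChanges(
		[]string{"/spec/version", "/spec/image"},
		[]UpdateResponse{{Status: "accept", UpdaterName: "versionUpdater", Responsibilities: []string{"/spec/version"}}},
	)
	fmt.Println(missing) // [/spec/image]
}
```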
## CRD definitions
```yaml
---
# The InPlaceUpdatePlanController is responsible for reconciling in-place updates at the cluster level
# reconcile:
# - if updatePhase == `updating`
#   - set the cluster.x-k8s.io/update-task-name label on each CP/MD to be updated
#   - aggregate CP/MD update status
#   - if all CP/MDs in the current update sequence have finished, start the CP/MD updates with the next sequence number
# - if updatePhase == `aborting`
#   - set updatePhase to `aborting` on every CP/MD that is in the `updating` phase
#   - remove the cluster.x-k8s.io/update-task-name label
InPlaceUpdatePlan
spec:
updatePhase: updating|aborting
controlPlane:
name: cluster-cp1
sequence: 1
updaters:
- name: osUpdater
updaterUpdateTemplateRef:
apiVersion: capzupdate.cluster.x-k8s.io/v1beta1
kind: capzVmOsUpdateTaskTemplate
name: capzVmOsUpdateTask
sequence: 1
responsibilities:
- path: /spec/image
selectResource: InfraMachine
value: azurelinux3.0-240602-k8s
- name: versionUpdater
updaterUpdateTemplateRef:
apiVersion: update.cluster.x-k8s.io/v1beta1
kind: nodeVersionUpdateTaskTemplate
name: nodeVersionUpdateTask-control-plane
sequence: 3
responsibilities:
- path: /spec/version
selectResource: Machine
value: v1.30.1
- name: kubeadmNtpUpdater
updaterUpdateTemplateRef:
apiVersion: kubeadmupdate.cluster.x-k8s.io/v1beta1
kind: kubeadmNtpUpdateTaskTemplate
name: kubeadmNtpUpdateTask
sequence: 2
responsibilities:
- path: /spec/NTP
selectResource: Bootstrap
value: |
server:
- time-a-g.nist.gov
enabled: true
machineDeployments:
- name: cluster1-md1
sequence: 2
updaters: [...] # ignored
- name: cluster1-md2
sequence: 2
updaters: [...] # ignored
status:
updateStatus: updating|updated|aborting|aborted
conditions:
- type: controlPlaneUpdate
- type: machineDeploymentUpdate
---
# The KubeadmControlPlaneController is responsible for reconciling in-place updates on the KubeadmControlPlane
# reconcile:
# - preflight check
# - aggregate machine in-place update status
# - pick a machine and do the in-place update
# - if updatePhase == `aborting`, set updatePhase to `aborting` on every machine that is in the `updating` phase
KubeadmControlPlane
metadata:
labels:
cluster.x-k8s.io/update-task-name: updatetask1
spec:
updatePhase: updating|aborting
status:
updateStatus: updating|updated|aborting|aborted
conditions:
- type: InPlaceUpdate # aggregate from machine InPlaceUpdate condition
---
# The MachineDeploymentController is responsible for reconciling in-place updates on the MachineDeployment
# reconcile:
# - create a MachineSet with the new spec
# - move updated machines from the old MachineSet to the new MachineSet
# - preflight check
# - aggregate machine in-place update status
# - pick a machine and do the in-place update
# - if updatePhase == `aborting`, set updatePhase to `aborting` on every machine that is in the `updating` phase
MachineDeployment
metadata:
labels:
cluster.x-k8s.io/update-task-name: updatetask1
spec:
updatePhase: updating|aborting
status:
updateStatus: updating|updated|aborting|aborted
conditions:
- type: InPlaceUpdate # aggregate from machine InPlaceUpdate condition
---
# The MachineInPlaceUpdateController is responsible for reconciling in-place updates at the machine level
# reconcile:
# - orchestrate updaters and create updaterUpdateTasks
# - when updatePhase == `updating`, aggregate updaterUpdateTask status, then update the `updateStatus` field and the `InPlaceUpdate` condition
# - when updatePhase == `aborting`, abort the updaterUpdateTasks, aggregate their status, then update the `updateStatus` field and the `InPlaceUpdate` condition (a timeout may be needed)
Machine:
metadata:
labels:
cluster.x-k8s.io/update-task-name: updatetask1
spec:
updatePhase: updating|aborting
status:
updateStatus: updating|updated|aborting|aborted
conditions:
- type: InPlaceUpdate # aggregate from updaterUpdateTasks
---
# The contract for UpdaterUpdateTask.
# It MUST have spec.updatePhase to indicate the reconcile target and status.updateStatus to indicate the reconcile result.
# It SHALL have conditions to report errors and their detailed reasons.
MachineUpdateContract:
spec:
updatePhase: updating|aborting
status:
updateStatus: updating|updated|aborting|aborted
conditions:
---
# An updaterUpdateTask is responsible for part of the update work on the target machine.
# Its responsibility is decided by the responsibilities field of the ControlPlaneInPlaceUpdateTask/MachineDeploymentInPlaceUpdateTask.
# reconcile:
# - when updatePhase == `updating`, update the machine based on the responsibility, then update the `updateStatus` field and the `ready` condition
# - when updatePhase == `aborting`, abort the update operation if one is in progress, then update the `updateStatus` field and the `ready` condition
CapzVmOsUpdateTask
metadata:
labels:
cluster.x-k8s.io/cluster-name: cluster1
cluster.x-k8s.io/control-plane-name: cluster1-cp
cluster.x-k8s.io/update-task-name: updatetask1
cluster.x-k8s.io/machine-name: cluster1-cp-1
cluster.x-k8s.io/updater-name: versionUpdater
spec:
updatePhase: updating
status:
updateStatus: updated
conditions:
- type:
```
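The status roll-up that the MachineInPlaceUpdateController comments describe could be sketched as follows; the precedence order (aborting over updating over aborted over updated) is an assumption, not part of the contract yet:

```go
package main

import "fmt"

// aggregateUpdateStatus sketches how the machine-level controller could roll
// up per-updater task statuses into the machine's `updateStatus` field: any
// aborting task makes the machine aborting, otherwise any in-flight task
// keeps it updating, and only when every task is done does the machine
// settle into a terminal status.
func aggregateUpdateStatus(taskStatuses []string) string {
	counts := map[string]int{}
	for _, s := range taskStatuses {
		counts[s]++
	}
	switch {
	case counts["aborting"] > 0:
		return "aborting"
	case counts["updating"] > 0:
		return "updating"
	case counts["aborted"] > 0:
		return "aborted" // at least one task was aborted; the plan did not complete
	case len(taskStatuses) > 0 && counts["updated"] == len(taskStatuses):
		return "updated"
	default:
		return "updating" // no tasks created yet
	}
}

func main() {
	fmt.Println(aggregateUpdateStatus([]string{"updated", "updating", "updated"})) // updating
	fmt.Println(aggregateUpdateStatus([]string{"updated", "updated"}))             // updated
}
```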
### Remediation snippet
One thing to note: after using the **external remediator** to recover a machine, the machine may still be running the old spec, so the in-place update plan must be rerun on the remediated machine.
In addition, Machine Health Check rules shall be extended so that they can monitor machine in-place update progress.
#### Machine Health Check
- There shall be a way to identify whether a machine is under in-place update, and when the update started.
- An in-place update timeout setting shall be added to the MHC spec. When a machine stays in the in-place update stage for too long and exceeds the timeout threshold, MHC shall set the machine's `HealthCheckSucceeded` condition to false.
- A contract with the external updaters is needed to determine whether an update operation has failed and is in a terminal state. When a machine falls into this state, MHC shall set the machine's `HealthCheckSucceeded` condition to false. [TODO: need to consider this in the contract design]
- When calculating the allowed remediation count, updating machines shall be treated as `unhealthy`, because an updating machine is expected to possibly go offline during the update: `RemediationCount = maxAllowedUnhealthy - unhealthyMachineCount - inUpdateHealthyMachineCount`
- Corner case 1: for a new machine that is not yet initialized and is in the updating stage, the timeout shall be `initializeTimeout + inPlaceUpdateTimeout`