# CAPI Updates for Airship PTG (April 2021)
## Roadmap
Release of v0.4 (v1alpha4) is likely sometime in June; the initial goal was Q1 2021.
* [Weekly Meeting Notes](https://docs.google.com/document/d/1LdooNTbb9PZMFWy3_F-XAsl7Og5F2lvG3tCgQvoB5e4/edit)
* [Roadmap doc](https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/roadmap.md) (not very up-to-date)
* [Provider side changes from v1alpha3 to v1alpha4](https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/developer/providers/v1alpha3-to-v1alpha4.md)
## Upgrade to v0.4 (aka v1alpha4)
### Upgrading CAPI and Provider Components (e.g. v0.3.16 --> v0.4.0)
The `clusterctl upgrade` command can be used to upgrade the version of the Cluster API providers (CRDs, controllers) installed into a management cluster. See [CAPI upgrade docs](https://cluster-api.sigs.k8s.io/clusterctl/commands/upgrade.html).
### Upgrading API object (e.g. v1alpha3 --> v1alpha4)
clusterctl does not upgrade Cluster API objects (Clusters, MachineDeployments, Machines, etc.); upgrading such objects is the responsibility of the provider's controllers.
Controllers like CAPM3 have conversion functions built in, so conversion *should* be seamless.
### Clusterctl Library
airshipctl consumes clusterctl as a library. As of April 16th, we are importing v0.3.13.
* Unless we plan to adopt the Provider Operator (see below), airshipctl code changes should be minimal with respect to clusterctl.
### Kubernetes Version Upgrades
* CAPI / KCP v1alpha4 won't be able to manage Kubernetes clusters < v1.18
* In general, supported versions will be limited. See [thread here](https://github.com/kubernetes-sigs/cluster-api/issues/4444).
* The requirement comes primarily from the dependence on kubeadm; the kubeadm bootstrapper currently imports kubeadm API types.
* CAPI/kubeadm do not handle users' API objects (e.g. Deployments, ConfigMaps, etc.). For example, if an API is deprecated, the user must use kubectl or helm to update their manifests.
## Features breakdown
### clusterctl
#### CAPI Management Cluster Operator
[Proposal Doc](https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/proposals/20201020-capi-provider-operator.md)
The goal is to provide a declarative alternative to clusterctl’s imperative design via an operator that handles the lifecycle of providers within the management cluster.
* Imperative flow (today):
* One or more `Provider` objects are created as a byproduct of the `clusterctl init` command. Each contains information such as the type (core, bootstrap, infra), name (e.g. capbk, metal3) and version (e.g. 0.3.7).
* Clusterctl fetches provider artifacts (e.g. components yaml).
* Applies image overrides (if any).
* Replaces variables via env-var or config file (e.g. credentials).
* Applies the resulting yaml to the cluster.
* For things like upgrades and deletion of providers, the user issues the appropriate CLI commands (e.g. `clusterctl upgrade/delete`).
* Declarative flow (sketched below):
* The user defines the desired provider(s) in their manifests using new CRs: `CoreProvider`, `BootstrapProvider`, `ControlPlaneProvider` and `InfrastructureProvider`.
* The `ProviderSpec` in each of the above CRs contains info such as the name and version (similar to the `Provider` CR in the imperative flow).
* Controls for how to deploy the provider (e.g. replicas), manager flag overrides (e.g. enable debug mode) and where to fetch the components.yaml are also part of the `ProviderSpec`.
* Upgrading a provider can be accomplished by patching the `Version` field in the `ProviderSpec`.
* Secret reference is used instead of env-vars/config-file.
* Deletion of a provider is accomplished by deleting the provider object.
NOTE: the `move` operation will not be driven by the operator but will remain within the CLI for now.
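A minimal sketch of a declarative provider definition, loosely following the examples in the proposal doc; the API group, kinds and field values below (including the metal3 provider name, namespace, version and URL) are assumptions and may differ from what the operator finally ships:
```
# Sketch only -- apiVersion, namespace, version and URL are placeholders based on the proposal.
apiVersion: operator.cluster.x-k8s.io/v1alpha1
kind: InfrastructureProvider
metadata:
  name: metal3
  namespace: capm3-system
spec:
  version: v0.4.0                 # patch this field to upgrade the provider
  secretName: metal3-variables    # variables come from a Secret instead of env vars / config file
  deployment:
    replicas: 1                   # controls for how the provider manager is deployed
  fetchConfig:
    url: https://github.com/metal3-io/cluster-api-provider-metal3/releases
```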
#### Re-define scope of clusterctl move
`clusterctl move` is currently scoped to a single namespace: resources in other namespaces cannot be moved as part of a single move operation, and globally scoped resources are not moved at all.
Discussions are [still ongoing](https://github.com/kubernetes-sigs/cluster-api/issues/3354).
#### Multitenancy
The issue is around using the management cluster to create workload clusters for different teams/orgs, each with their own credentials. Up until v1alpha4, the need to support multiple credentials was addressed by running multiple instances of the same provider, each with its own set of credentials and watching a different namespace.
The goal in v0.4.x is to get to a single manager (core and infra provider) that can create/manage workload clusters for a varied set of teams/orgs -- in other words, the infrastructure provider should manage different credentials, each corresponding to an infrastructure tenant (see the hypothetical sketch below).
[Issue #3042](https://github.com/kubernetes-sigs/cluster-api/issues/3042).
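For illustration only (all kinds and fields below are hypothetical, not taken from any specific provider), the v1alpha4 direction is for the per-cluster infra object to reference a per-tenant identity/credential object instead of the provider deployment holding a single global credential:
```
# Hypothetical example -- kinds and fields are made up to illustrate the pattern;
# each infra provider defines its own identity/credential CRs in v1alpha4.
kind: ExampleInfraCluster
metadata:
  name: team-a-cluster
  namespace: team-a
spec:
  identityRef:                # per-tenant credentials referenced by the cluster object
    kind: ExampleClusterIdentity
    name: team-a-identity
---
kind: ExampleClusterIdentity
metadata:
  name: team-a-identity
spec:
  secretRef:
    name: team-a-credentials  # one set of credentials per infrastructure tenant
```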
#### UX and Troubleshooting
* Conditions [Issue #3005](https://github.com/kubernetes-sigs/cluster-api/issues/3005)
* It's hard to understand when a Machine is fully healthy. For example, KCP performs kubeadm-specific health checks (inspecting the etcd member, static pods, and so on). The status of these individual checks can be exposed by writing specific conditions on Machines (a sketch follows at the end of this section).
* Similarly, it's hard to understand the state of the entire workload Cluster or the state of a specific infra provider object (e.g. Metal3Machine) -- the goal of conditions is to provide a uniform, in-depth view of the health of the various objects in the cluster.
* Providers need to add support for conditions.
* This support is currently lacking in metal3.
* `clusterctl rollout` [Issue #3439](https://github.com/kubernetes-sigs/cluster-api/issues/3439)
* CLI commands to rollout updates to control-plane and worker nodes (`restart`), inspect a rollout as it occurs (`status`), rollback changes if needed (`undo`) and view the rollout `history`.
* Currently, only MachineDeployments are supported; KCP support is planned for the future.
* Bootstrapping Failures [Issue #3716](https://github.com/kubernetes-sigs/cluster-api/issues/3716)
* The basic problem is that with CABPK, when cloud-init runs on the node during bootstrap, the underlying commands such as kubeadm may fail; however, there is no way for providers to report this, and even if bootstrap failed, CAPI may continue onwards.
* A simple check for successful bootstrapping has been added: CABPK writes a sentinel file to signal successful bootstrapping. The existence of this file indicates that kubeadm init/join succeeded on the node.
* It is left to the infra providers to check whether this file was created before moving forward. Additionally, infra providers that support Conditions can bubble this info up to the user via conditions on the infra Machine object -- for example, today CAPD sets the BootstrapExecSucceededCondition upon successful bootstrap of the node.
* Support for Conditions is lacking in metal3.
* `clusterctl describe cluster` [Issue #3802](https://github.com/kubernetes-sigs/cluster-api/issues/3802)
* Provides an "at a glance" view of Conditions in the cluster for quickly understanding whether there are problems and where.
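For reference, a rough sketch of what conditions might look like in a Machine's status; the condition types shown are examples only and the exact set depends on the controllers involved:
```
# Illustrative only -- actual condition types and semantics depend on the
# controllers involved (CAPI core, KCP, MachineHealthCheck, the infra provider).
status:
  conditions:
  - type: Ready
    status: "True"
    lastTransitionTime: "2021-04-16T12:00:00Z"
  - type: EtcdMemberHealthy        # example of a kubeadm-specific check surfaced by KCP
    status: "True"
  - type: APIServerPodHealthy      # example of a static-pod health check
    status: "True"
```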
## CAPI Core
### KCP should be fully mutable
We have come across use cases where we want to reconfigure our control plane nodes (e.g. change an API server flag or a kubelet parameter). The KubeadmConfig is fully mutable now, and any change will trigger a rolling upgrade of all the control plane nodes (see the sketch below).
* https://github.com/kubernetes-sigs/cluster-api/issues/2083
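A rough sketch of the kind of change involved (only the relevant KubeadmControlPlane fragment is shown; the exact spec layout differs between v1alpha3 and v1alpha4):
```
# Editing fields under kubeadmConfigSpec (e.g. an API server flag) now triggers
# a rolling upgrade of the control plane Machines.
kind: KubeadmControlPlane
spec:
  replicas: 3
  version: v1.19.1
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          audit-log-maxage: "30"   # adding or changing a flag here rolls the control plane
```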
#### Node Hardware Attestation using TPMs
* When joining a kubeadm initialized cluster, as is the case with CAPBK, we need to establish bidirectional trust -- having the Node trust the Kubernetes Control Plane and having the Kubernetes Control Plane trust the Node.
* For the latter, bootstrap tokens with a limited lifespan (e.g. 24 hours) are used. Note that the tokens are not scoped down to a specific node, so in theory any arbitrary node can join using the token. If the token were compromised, an attacker could potentially add a rogue node to the cluster.
* Further complicating matters, if the bootstrap token is specified in the KubeadmConfigTemplate, then the template will need to be changed every time the token expires. The workaround is to use long-lived tokens, but that introduces security concerns.
* The [proposal](https://github.com/kubernetes-sigs/cluster-api/pull/4219) is to add support for hardware based attestation for node identity. It would bypass the kubeadm token approach but requires hardware support.
#### HTTP Proxy for Egress Traffic
Production environments can deny direct access to the Internet and instead provide an HTTP or HTTPS proxy. The [following approach](https://github.com/kubernetes-sigs/cluster-api/issues/3751) should allow the HTTP proxy settings to be defined at the KubeadmConfig level and be propagated to each Node via cloud-init (a workaround sketch using today's API follows).
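Until that lands, a common workaround (sketched below; the proxy URLs and NO_PROXY list are placeholders) is to push the proxy settings onto each node through the existing `files` and `preKubeadmCommands` fields of the bootstrap config:
```
# Workaround sketch using existing KubeadmConfig fields; proxy addresses are placeholders.
kind: KubeadmConfigTemplate
spec:
  template:
    spec:
      files:
      - path: /etc/systemd/system/containerd.service.d/http-proxy.conf
        owner: root:root
        permissions: "0644"
        content: |
          [Service]
          Environment="HTTP_PROXY=http://proxy.example.com:3128"
          Environment="HTTPS_PROXY=http://proxy.example.com:3128"
          Environment="NO_PROXY=localhost,127.0.0.1,10.96.0.0/12"
      preKubeadmCommands:
      - systemctl daemon-reload
      - systemctl restart containerd
```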
#### Load Balancer (Support in 0.4.x seems unlikely at this time)
* Support provisioning of a load balancer to sit in front of the Kubernetes control plane instances in order to provide a stable endpoint.
* Support provisioning of a load balancer to front all workload services.
* Will require support from the providers.
* Unlikely to be supported in 0.4.x. Community needs help here.
Proposal: [github](https://github.com/kubernetes-sigs/cluster-api/blob/21656ca2d1a3b69550db48a5f568c86c0625e5b5/docs/proposals/20210335-load-balancer-provider.md) / [google doc](https://docs.google.com/document/d/1wJrtd3hgVrUnZsdHDXQLXmZE3cbXVB5KChqmNusBGpE/edit#)
#### Cluster Class
Provide a super simple day-0 experience ([Issue #4430](https://github.com/kubernetes-sigs/cluster-api/issues/4430)). Users only need to specify a `Cluster` and a `ClusterClass`, where the specific `ClusterClass` types may be provided by the infra provider.
```
kind: Cluster
spec:
  class: "dev-azure-simple-small"
  version: v1.19.1
  managed:
    controlPlane:
      replicas: 3
    workers:
    - role: Worker1
      replicas: 10
---
kind: ClusterClass
metadata:
  name: dev-azure-simple-small
spec:
  infrastructure:
    workerNodeTypes:
    - type: worker-set-1
      infrastructureRef: worker-template-1
    controlPlaneRef:
      infrastructureRef: my-cp-template
```
### Features not currently scoped for v0.4
These are features that may be of interest to us.
#### Label Sync between MachineDeployment <--> Node(s)
- [Issue #493](https://github.com/kubernetes-sigs/cluster-api/issues/493)
- We can push this proposal further if it's useful to the airship community.
- Note that we recently added BMH <--> Node [label sync mechanism in metal3](https://github.com/metal3-io/metal3-docs/blob/master/design/sync-labels-bmh-to-node.md).
#### Kubelet Configuration
* kubeadm allows central management of kubelet configuration -- a single KubeletConfiguration can be used to customize the kubelet parameters for your cluster; unfortunately, CABPK uses a default KubeletConfiguration.
* Users wishing to customize the kubelet must use kubeletExtraArgs via nodeRegistration or patch the kubelet config via pre-kubeadm hooks (see the sketch after this list).
* [Issue #4464](https://github.com/kubernetes-sigs/cluster-api/issues/4464) seeks to open the KubeletConfiguration to the user.
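For reference, a sketch of the current workaround via `nodeRegistration`; the flags shown are examples only:
```
# Current workaround: pass individual kubelet flags through nodeRegistration.
kind: KubeadmConfigTemplate
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            max-pods: "150"                          # example flag
            eviction-hard: "memory.available<500Mi"  # example flag
```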