# Spike: Understand CoreDNS upgrades for an existing cluster
# Link for Story
https://itrack.web.att.com/browse/CPVYGR-491
# Description
To get a better understanding of the implications of upgrading the CoreDNS component in a brownfield (existing cluster) scenario.
# Understanding CoreDNS Deployment and Upgrade Scenario
Airship 2 uses Cluster API (CAPI), which enables users to manage fleets of clusters across multiple infrastructure providers. In summary, CAPI is a Kubernetes project that brings declarative, Kubernetes-style APIs to cluster creation, configuration, and management.
For bare-metal infrastructure, CAPI uses kubeadm to deploy the Kubernetes control plane components, which also includes the CoreDNS pods and service.
```
ubuntu@airship2:~$ sudo -E kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster get po -n kube-system
NAME READY STATUS
coredns-66bff467f8-5b77g 1/1 Running
coredns-66bff467f8-dlc2x 1/1 Running
ubuntu@airship2:~$ sudo -E kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
kube-dns ClusterIP 10.0.0.10 <none> 53/UDP,53/TCP,9153/TCP
```
The Kubernetes resources for CoreDNS are:
1. a ServiceAccount named coredns
2. ClusterRoles named coredns and kube-dns
3. ClusterRoleBindings named coredns and kube-dns
4. a Deployment named coredns
5. a ConfigMap named coredns
6. a Service named kube-dns
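These objects can be checked directly on the target cluster, for example (a sketch reusing the kubeconfig/context from the commands above; exact ClusterRole/ClusterRoleBinding names can vary between kubeadm versions, hence the grep):
```
# List the namespaced CoreDNS objects in kube-system (names as listed above)
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get serviceaccount/coredns deployment/coredns configmap/coredns service/kube-dns

# ClusterRoles/ClusterRoleBindings are cluster-scoped; grep avoids guessing the exact names
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster \
  get clusterroles,clusterrolebindings | grep -iE 'coredns|kube-dns'
```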
## CoreDNS Upgrade Ease (Is a CoreDNS upgrade disruptive, i.e. no-op vs. cluster reboot?)
### CoreDNS Deployment and Upgrade Process
CoreDNS is deployed in a Kubernetes cluster as a Deployment with a default replica count of 2. The same applies to installation via CAPI or kubeadm.
Features of the CoreDNS deployment created as part of a kubeadm/CAPI Kubernetes deployment:
* Replicas:
  * 2 (default)
* StrategyType:
  * RollingUpdate (the default strategy for Kubernetes Deployments)

Since CoreDNS is deployed with 2 replicas and the RollingUpdate strategy, an upgrade or downgrade of CoreDNS does not cause an outage, because at least one replica is functioning at all times.
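This can be confirmed from the Deployment spec and watched during a rollout (a sketch using the same kubeconfig/context as the commands above):
```
# Confirm the replica count and update strategy of the CoreDNS Deployment
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get deployment coredns -o jsonpath='{.spec.replicas}{" "}{.spec.strategy.type}{"\n"}'

# During an upgrade, watch the rolling update; at least one replica stays Ready throughout
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  rollout status deployment/coredns
```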
* There is no reboot required as part of a CoreDNS upgrade; however, after the upgrade the CoreDNS Corefile may become outdated. In that case the user has to manually update the Corefile for the release being deployed.
To identify possible backward incompatibilities, one needs to review the CoreDNS release notes.
CoreDNS release notes are published on the CoreDNS blog (https://coredns.io/blog/). The CoreDNS deprecation policy is to introduce backward incompatibilities only in x.x.0 and x.x.1 releases. So, for example, when upgrading from 1.1.5 to 1.3.1, one should check the release notes for 1.2.0, 1.2.1, 1.3.0, and 1.3.1 for any deprecation/backward-incompatibility notices.
If there are any backward-incompatibility notices, one should review the Corefile to assess the impact.
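To know which release notes to review, first check the currently deployed CoreDNS version (a sketch):
```
# Print the CoreDNS image (and therefore version) currently running in the cluster
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get deployment coredns -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```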
The Corefile is part of the ConfigMap named “coredns” used by the CoreDNS pods.
```
ubuntu@airship2:~$ sudo -E kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster describe cm/coredns -n kube-system
Name: coredns
Namespace: kube-system
Labels: <none>
Annotations: <none>
Data
====
Corefile:
----
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
Events: <none>
```
`health`, `errors`, etc. are plugins whose deprecation status needs to be validated against the newer release.
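To review the Corefile offline against the release notes, it can be dumped from the ConfigMap (a sketch):
```
# Extract just the Corefile from the coredns ConfigMap for review
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get configmap coredns -o jsonpath='{.data.Corefile}' > Corefile
```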
## Handling CoreDNS upgrades
### Upgrading CoreDNS with Kubeadm
Kubeadm updates CoreDNS as part of a Kubernetes cluster upgrade. In doing so, it replaces/resets any custom changes to the CoreDNS Deployment. For example, if the number of replicas in the CoreDNS Deployment has been increased, it is reset back to the default (2) after an upgrade. Kubeadm will not, however, change the CoreDNS ConfigMap; if the Corefile contains any backward-incompatible configuration, it has to be fixed manually before upgrading.
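A reasonable precaution before a kubeadm-driven upgrade is to back up the ConfigMap and review the upgrade plan (a sketch, run on a control-plane node; in recent kubeadm releases the plan output also lists the target CoreDNS version):
```
# Back up the current Corefile/ConfigMap before kubeadm touches the cluster
kubectl -n kube-system get configmap coredns -o yaml > coredns-configmap-backup.yaml

# Review the planned component versions (including CoreDNS) before applying the upgrade
sudo kubeadm upgrade plan
```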
Note: To manage CoreDNS outside kubeadm, kubeadm can skip the CoreDNS installation using ```--skip-phases=addon/coredns```, which allows the end user to manage their own DNS.
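For example (a sketch; the config file name is illustrative):
```
# Initialize a control plane without the CoreDNS addon, leaving DNS to be managed separately
sudo kubeadm init --config kubeadm-config.yaml --skip-phases=addon/coredns
```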
### Upgrading CoreDNS with CAPI
* Using the Kubernetes version in the KCP (KubeadmControlPlane) config for the upgrade
CAPI internally uses CABPK (Cluster API Bootstrap Provider Kubeadm) to handle Kubernetes upgrades; for bare-metal servers it uses kubeadm.
CAPI handles CoreDNS installation/upgrade depending on the Kubernetes version specified in the KCP config.
To skip the CoreDNS upgrade as part of a Kubernetes upgrade, there is a KCP annotation, ```controlplane.cluster.x-k8s.io/skip-coredns```. However, this annotation only applies to upgrade scenarios.
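If needed, the annotation can be added to the KCP object before bumping the Kubernetes version (a sketch; the empty value is an assumption — the text above only calls out the annotation key, and the controller appears to key off its presence):
```
# Tell the KCP controller to skip the CoreDNS upgrade step during reconciliation
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n target-infra \
  annotate kubeadmcontrolplane cluster-controlplane controlplane.cluster.x-k8s.io/skip-coredns=""
```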
Since CAPI automatically handles CoreDNS upgrades, it also updates the Corefile using the CoreDNS corefile-migration library (https://github.com/coredns/corefile-migration), which is invoked internally by the KCP controller.
The corefile-migration library migrates the Corefile to match the target CoreDNS release, unlike kubeadm where this is a manual process.
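The same library also ships a small CLI (corefile-tool) in that repo, which can be used to dry-run a Corefile migration outside the cluster. This is a sketch; the build layout and flags are taken from the repo's README and should be verified against the version in use:
```
# Build the corefile-tool CLI from the corefile-migration repository (directory layout may differ between releases)
git clone https://github.com/coredns/corefile-migration
cd corefile-migration && go build -o corefile-tool ./corefile-tool

# Dump the in-cluster Corefile and migrate it from the running version to the target version
kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' > /tmp/Corefile
./corefile-tool migrate --from 1.6.7 --to 1.8.4 --corefile /tmp/Corefile --deprecations true
```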
* Upgrading just CoreDNS
As per recent PRs in CAPI, there is now a feature whereby the user can specify the CoreDNS image details in the KCP spec to upgrade CoreDNS to a specific version. Link for the patchset: https://github.com/kubernetes-sigs/cluster-api/pull/2574
If the image is changed in the KubeadmControlPlane spec, it triggers automatic reconciliation; this was tested in the lab environment.
#### Deployment with Current Airshipctl repo:
Kubernetes Version: 1.18.6
CoreDNS version: 1.6.7
Deployed KubeadmControlPlane configuration (partial):
```
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"controlplane.cluster.x-k8s.io/v1alpha3","kind":"KubeadmControlPlane","metadata":{"annotations":{},"name":"cluster-controlplane","namespace":"target-infra"},"spec":{"infrastructureTemplate":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha4","kind":"Metal3MachineTemplate","name":"cluster-controlplane"},"kubeadmConfigSpec":{"clusterConfiguration":{"apiServer":{"certSANs":["dex.utility.local"],"extraArgs":{"allow-privileged":"true","authorization-mode":"Node,RBAC","enable-admission-plugins":"NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds,NodeRestriction","feature-gates":"PodShareProcessNamespace=true","kubelet-preferred-address-types":"InternalIP,ExternalIP,Hostname","oidc-ca-file":"/etc/kubernetes/certs/dex-cert","oidc-client-id":"utility-kubernetes","oidc-groups-claim":"groups","oidc-issuer-url":"https://dex.utility.local:30556/dex","oidc-username-claim":"email","requestheader-allowed-names":"front-proxy-client","requestheader-group-headers":"X-Remote-Group","requestheader-username-headers":"X-Remote-User","service-cluster-ip-range":"10.0.0.0/20","service-node-port-range":"80-32767","tls-cipher-suites":"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA","tls-min-version":"VersionTLS12","v":"2"},"extraVolumes":[{"hostPath":"/etc/kubernetes/certs/dex-cert","mountPath":"/etc/kubernetes/certs/dex-cert","name":"dex-cert","readOnly":true}],"timeoutForControlPlane":"1000s"},"controllerManager":{"extraArgs":{"bind-address":"127.0.0.1","cluster-cidr":"192.168.16.0/20","configure-cloud-routes":"false","enable-hostpath-provisioner":"true","node-monitor-grace-period":"20s","node-monitor-period":"5s","pod-eviction-timeout":"60s","port":"0","terminated-pod-gc-threshold":"1000","use-service-account-credentials":"true","v":"2"}},"imageRepository":"k8s.gcr.io","networking":{"dnsDomain":"cluster.local","podSubnet":"192.168.16.0/20","serviceSubnet":"10.0.0.0/20"}},"files":[{"content":"[Service]\nEnvironment=\"HTTP_PROXY=http://pxyapp.proxy.att.com:8080\"\nEnvironment=\"HTTPS_PROXY=http://pxyapp.proxy.att.com:8080\"\nEnvironment=\"NO_PROXY=localhost,127.0.0.1,.att.com,10.23.0.0/16,10.96.0.0/12,10.23.24.0/24,10.23.25.0/24,10.23.25.101,10.23.25.102,10.23.25.103,10.23.24.101,10.23.24.102,10.23.24.103,dex.function.local,107.124.202.144,10.0.0.1/20\"\n","path":"/etc/systemd/system/containerd.service.d/http-proxy.conf"},{"contentFrom":{"secret":{"key":"tls.crt","name":"dex-apiserver-secret"}},"owner":"root:root","path":"/etc/kubernetes/certs/dex-cert","permissions":"0644"},{"content":"! 
Configuration File for keepalived\nglobal_defs {\n}\nvrrp_instance KUBERNETES {\n state BACKUP\n interface bond.41\n virtual_router_id 101\n priority 101\n advert_int 1\n virtual_ipaddress {\n 10.23.25.103\n }\n}\nvrrp_instance INGRESS {\n state BACKUP\n interface bond.41\n virtual_router_id 102\n priority 102\n advert_int 1\n virtual_ipaddress {\n 10.23.25.104\n }\n}\n","path":"/etc/keepalived/keepalived.conf"}],"initConfiguration":{"nodeRegistration":{"criSocket":"unix:///run/containerd/containerd.sock","kubeletExtraArgs":{"cgroup-driver":"systemd","container-runtime":"remote","node-labels":"metal3.io/uuid={{ ds.meta_data.uuid }},node-type=controlplane"},"name":"{{ ds.meta_data.local_hostname }}"}},"joinConfiguration":{"controlPlane":{},"nodeRegistration":{"criSocket":"unix:///run/containerd/containerd.sock","kubeletExtraArgs":{"cgroup-driver":"systemd","container-runtime":"remote","node-labels":"metal3.io/uuid={{ ds.meta_data.uuid }},node-type=controlplane"},"name":"{{ ds.meta_data.local_hostname }}"}},"ntp":{"servers":["0.pool.ntp.org","1.pool.ntp.org","2.pool.ntp.org","3.pool.ntp.org"]},"preKubeadmCommands":["export HOME=/root","mkdir -p /etc/containerd","containerd config default | sed -r -e '/\\[plugins.\"io.containerd.grpc.v1.cri\".containerd.runtimes.runc\\]$/a\\ SystemdCgroup = true' | tee /etc/containerd/config.toml","systemctl daemon-reload","systemctl restart containerd","echo '10.23.25.102 dex.utility.local' | tee -a /etc/hosts","systemctl enable --now keepalived","systemctl restart keepalived"],"users":[{"name":"deployer","sshAuthorizedKeys":["ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDU6ve+JuhuSOEcInFrpZVbiqPNBhMLSBfzcm9pcWnouGSKHA0UnlTb8RvlyOQC4Oy+DkV4jf++nfQZg3586EbTpUnZSYnRHmM+60r1JQ/aAp7+cRNLIllJobbjeu/7f1/9alcG8eBw9eIXaKtlrlpAEtHxph4cFyLJHM1BKcRjeoy9SwkObqfYjfyqmTdcYefgewLTBqYfVWeAGC4v2fcJc2+Ct3So4GIxrZ6eJ7PIQqWjh8/W//jMyPYxOLrt7Nuwk9meUzp5HQ3mk47cPjzrq+ohUnnUENf0Bm+LtiClZr3lnQA7pqVPVKgzdY4OA9lD+vmjvDb4ZuB1IspTILrxQOwBdugeRhckblxrkVUCfvCU7445P/N6cQxiYLsCPn02LXxVKnYdlawpy+24zkujXxtu9KMr4j758uBcRoDN9/T569ibanZQUSPDI5buCc2kVUN3zSQ9tP/sjxcZo6in3CWqrMSpoG7Rx8gDZjX/JFdDgrdLYY+pjJZxl/AIea13Q6NmqiGUP5lbUvLmW8Xl4CcQ92qYHyy6/Rl4ZZtri69MuCt5BWOdkayvUHdTTHLR0ucrqL9Ktecz9WKiv54dHyxzD1Gr8sahZiawM3MXNTkstYwA8KPekYNX9dbBvTxjW12z0W0usrJMVzR9MCkH7jJ79nos8SW7bdPTkBz1FQ==\n"],"sudo":"ALL=(ALL) NOPASSWD:ALL"}]},"replicas":1,"version":"v1.18.6"}}
  creationTimestamp: "2021-08-19T06:36:27Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  labels:
    cluster.x-k8s.io/cluster-name: target-cluster
  name: cluster-controlplane
  namespace: target-infra
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: Metal3MachineTemplate
    name: cluster-controlplane
    namespace: target-infra
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - dex.utility.local
        extraArgs:
          allow-privileged: "true"
          authorization-mode: Node,RBAC
          enable-admission-plugins: NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds,NodeRestriction
          feature-gates: PodShareProcessNamespace=true
          kubelet-preferred-address-types: InternalIP,ExternalIP,Hostname
          oidc-ca-file: /etc/kubernetes/certs/dex-cert
          oidc-client-id: utility-kubernetes
          oidc-groups-claim: groups
          oidc-issuer-url: https://dex.utility.local:30556/dex
          oidc-username-claim: email
          requestheader-allowed-names: front-proxy-client
          requestheader-group-headers: X-Remote-Group
          requestheader-username-headers: X-Remote-User
          service-cluster-ip-range: 10.0.0.0/20
          service-node-port-range: 80-32767
          tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA
          tls-min-version: VersionTLS12
          v: "2"
        extraVolumes:
        - hostPath: /etc/kubernetes/certs/dex-cert
          mountPath: /etc/kubernetes/certs/dex-cert
          name: dex-cert
          readOnly: true
        timeoutForControlPlane: 16m40s
      controllerManager:
        extraArgs:
          bind-address: 127.0.0.1
          cluster-cidr: 192.168.16.0/20
          configure-cloud-routes: "false"
          enable-hostpath-provisioner: "true"
          node-monitor-grace-period: 20s
          node-monitor-period: 5s
          pod-eviction-timeout: 60s
          port: "0"
          terminated-pod-gc-threshold: "1000"
          use-service-account-credentials: "true"
          v: "2"
      dns: {}
      etcd: {}
      imageRepository: k8s.gcr.io
```
#### Changing the Image version to 1.8.4 causes reconciliation
```
dns:
  imageRepository: k8s.gcr.io/coredns/coredns
  imageTag: v1.8.4
```
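One way to apply such an override directly against the KCP object is a JSON merge patch (a sketch; in practice the change may instead flow through the airshipctl document set):
```
# Patch the KCP dns section; the KCP controller then reconciles CoreDNS to the requested image
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n target-infra \
  patch kubeadmcontrolplane cluster-controlplane --type merge \
  -p '{"spec":{"kubeadmConfigSpec":{"clusterConfiguration":{"dns":{"imageRepository":"k8s.gcr.io/coredns/coredns","imageTag":"v1.8.4"}}}}}'
```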
Updated KubeadmControlPlane configuration (partial)
```
ubuntu@nc:~$ sudo -E kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster describe kubeadmcontrolplane/cluster-controlplane -n target-infra
Name:         cluster-controlplane
Namespace:    target-infra
Labels:       cluster.x-k8s.io/cluster-name=target-cluster
Annotations:
API Version:  controlplane.cluster.x-k8s.io/v1alpha3
Kind:         KubeadmControlPlane
Metadata:
  Creation Timestamp:  2021-08-19T06:36:27Z
  Finalizers:
    kubeadm.controlplane.cluster.x-k8s.io
  labels:
    cluster.x-k8s.io/cluster-name:  target-cluster
  name:       cluster-controlplane
  namespace:  target-infra
Spec:
  Infrastructure Template:
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
    Kind:         Metal3MachineTemplate
    Name:         cluster-controlplane
    Namespace:    target-infra
  Kubeadm Config Spec:
    Cluster Configuration:
      API Server:
        Cert SA Ns:
          dex.utility.local
        Extra Args:
          Allow - Privileged:                     true
          Authorization - Mode:                   Node,RBAC
          Enable - Admission - Plugins:           NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds,NodeRestriction
          Feature - Gates:                        PodShareProcessNamespace=true
          Kubelet - Preferred - Address - Types:  InternalIP,ExternalIP,Hostname
          Oidc - Ca - File:                       /etc/kubernetes/certs/dex-cert
          Oidc - Client - Id:                     utility-kubernetes
          Oidc - Groups - Claim:                  groups
          Oidc - Issuer - URL:                    https://dex.utility.local:30556/dex
          Oidc - Username - Claim:                email
          Requestheader - Allowed - Names:        front-proxy-client
          Requestheader - Group - Headers:        X-Remote-Group
          Requestheader - Username - Headers:     X-Remote-User
          Service - Cluster - Ip - Range:         10.0.0.0/20
          Service - Node - Port - Range:          80-32767
          Tls - Cipher - Suites:                  TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA
          Tls - Min - Version:                    VersionTLS12
          V:                                      2
        Extra Volumes:
          Host Path:   /etc/kubernetes/certs/dex-cert
          Mount Path:  /etc/kubernetes/certs/dex-cert
          Name:        dex-cert
          Read Only:   true
        Timeout For Control Plane:  16m40s
      Controller Manager:
        Extra Args:
          Bind - Address:                         127.0.0.1
          Cluster - Cidr:                         192.168.16.0/20
          Configure - Cloud - Routes:             false
          Enable - Hostpath - Provisioner:        true
          Node - Monitor - Grace - Period:        20s
          Node - Monitor - Period:                5s
          Pod - Eviction - Timeout:               60s
          Port:                                   0
          Terminated - Pod - Gc - Threshold:      1000
          Use - Service - Account - Credentials:  true
          V:                                      2
      Dns:
        Image Repository:  k8s.gcr.io/coredns/coredns
        Image Tag:         v1.8.4
Selector:              cluster.x-k8s.io/cluster-name=target-cluster,cluster.x-k8s.io/control-plane
Unavailable Replicas:  1
Updated Replicas:      1
Events:
  Type     Reason                 Age                   From                              Message
  ----     ------                 ----                  ----                              -------
  Warning  ControlPlaneUnhealthy  31s (x26 over 7m47s)  kubeadm-control-plane-controller  Waiting for control plane to pass control plane health check to continue reconciliation: control plane machine target-infra/cluster-controlplane-w9ns2 has no status.nodeRef
```
```
ubuntu@nc:~$ sudo -E kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster get machines -A
NAMESPACE NAME PROVIDERID PHASE
target-infra cluster-controlplane-6x4zg metal3://5d69da60-2156-49a5-aefe-e9b2dfb56c1d Running
target-infra cluster-controlplane-w9ns2 Provisioning
target-infra worker-1-5bc87ff868-t22q9 metal3://6b89b122-a650-4a5f-9e4b-af495dd8f4c5 Running
```
However, in the lab environment there appeared to be an issue with Metal3, and the new node did not reach the Provisioned state.
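Once the replacement control-plane machine joins and reconciliation completes, the effective CoreDNS version in the workload cluster can be confirmed with something like:
```
# Check which CoreDNS image the Deployment is actually running after reconciliation
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get deployment coredns -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'

# CoreDNS pods carry the legacy k8s-app=kube-dns label
kubectl --kubeconfig /home/ubuntu/.airship/kubeconfig --context target-cluster -n kube-system \
  get pods -l k8s-app=kube-dns -o wide
```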
## Other Important Aspects
* As mentioned above, since CAPI automatically handles Corefile migrations whenever there is a new CoreDNS version, the corefile-migration library has to be kept up to date in CAPI.
A user may upgrade CoreDNS via KCP unaware that the corresponding corefile-migration update is missing from CAPI, which could break the cluster.
A solution to this issue is discussed in the (currently open) issue https://github.com/kubernetes-sigs/cluster-api/issues/2599
* There are multiple CoreDNS-related fixes that have been merged, so ideally CAPI should be updated from 0.3.7 (the current version in Airship) to 0.4.x for upgrading CoreDNS to 1.8.4 with Kubernetes version 1.22. One such change is highlighted below:
https://github.com/kubernetes-sigs/cluster-api/pull/4957
* CoreDNS installation is currently part of the CAPI cluster-creation process (via CABPK), and the user cannot manage or remove the CoreDNS installation through CAPI. However, with the merge of PR https://github.com/kubernetes-sigs/cluster-api/pull/2574 the user can use CAPI to upgrade CoreDNS.
There is still an open feature request (lifecycle frozen, though) to allow disabling the CoreDNS installation during cluster creation itself: https://github.com/kubernetes-sigs/cluster-api/issues/3698
* Can you deploy an explicit version vs. the Kubernetes default?
This works for the upgrade scenario, as mentioned in the point above. However, the user currently cannot deploy an explicit version of CoreDNS as part of the initial CAPI deployment. This was tested in the lab environment by deploying the KubeadmControlPlane with the desired image tag, which had no effect (the Kubernetes version was 1.18.6 and the deployed CoreDNS version was 1.6.7, whereas the CoreDNS image tag specified in the KCP was 1.8.4).
# Outcome of Internal Design discussion
Considering the complexities involved in upgrading only CoreDNS in a CAPI cluster, it was discussed and concluded that upgrading the Kubernetes version itself via the KCP spec is the most suitable approach for upgrading CoreDNS as well.
That way, the unnecessary complexity of a standalone CoreDNS upgrade is avoided, and the cluster uses the CoreDNS version that CAPI suggests for the chosen Kubernetes components version.