# Upgrade Rook for a brownfield site [TM#194](https://github.com/airshipit/treasuremap/issues/194)
[TOC]
## Dependencies and constraints for a Rook upgrade in brownfield deployment
This POC will determine the proposed plan to upgrade an existing Rook operator from v1.6.11 to v1.7.11.
It will also identify the required sequence, dependencies and constraints, as well as any impacts to cluster availability and performance during the upgrade.
Assuming that the original cluster was deployed using the rook-ceph operator, the brownfield scenario can be broken down into two independent steps:
* Operator upgrade
* Ceph upgrade
Both steps can be performed in either order by following the set of rules listed below:
1. Before the upgrade, the Ceph cluster should be in a healthy state (a health-check sketch is given after this list). It is possible (but not recommended) to perform an upgrade on a cluster that has some warning alarms, but in this case the person responsible for maintenance has to make the decision. Below are some examples of warnings with which we can still proceed with the upgrade:
* some OSDs are permanently out/down because of drive errors
* some PGs are in a peering/waiting state because of scrubbing or deep scrubbing
* some PGs have not been scrubbed in time
* there are large omap objects
However, warnings like
* OSDs almost full
* OSDs flapping, and similar
should be considered a red flag for the brownfield upgrade. To summarize the warnings listed above: a human decision must be made about the severity of each warning before proceeding.
2. The upgrade should span no more than one major release at a time, e.g. Rook 1.6 -> 1.7 and/or Ceph 15.x -> 16.x. It is recommended to upgrade Ceph to the latest minor release before performing a major release upgrade.
3. When planning a Ceph upgrade to the next major release, it is recommended to perform the operator upgrade first. The Rook operator usually supports three Ceph major releases (N-1, N and N+1), e.g. Rook 1.7 supports Nautilus, Octopus and Pacific.
4. It is possible to perform a downgrade as well. For Ceph, the downgrade was tested between minor releases. When performing a downgrade, attention should be paid to the Ceph release notes: we can downgrade between bug-fix releases, but feature releases should not be downgraded under any circumstances. For example, the latest Octopus should not be downgraded to previous minor versions because of database schema changes.
5. Different upgrade scenarios performed in the local lab confirm that there are no significant performance or availability impacts. The operator upgrade does not affect Ceph functionality; according to the Rook documentation, the Ceph cluster remains fully functional with only minimal limitations. The performance impact during the Ceph upgrade is comparable to the impact of regular maintenance such as an OSD node reboot or hard drive replacement. This level of impact is expected and well documented.
To summarize the statements above, both brownfield operations are safe for the cluster when these rules are followed.
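As a pre-check for rule 1, the cluster state can be inspected from the Rook toolbox pod before starting either step. A minimal sketch, assuming the toolbox is deployed with the usual `app=rook-ceph-tools` label in the `rook-ceph` namespace:
```
# Resolve the toolbox pod name once.
TOOLS_POD=$(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}')
# Overall health; anything other than HEALTH_OK needs a human review of the listed warnings.
kubectl -n rook-ceph exec "$TOOLS_POD" -- ceph health detail
# All OSDs should be up/in and all PGs active+clean before proceeding.
kubectl -n rook-ceph exec "$TOOLS_POD" -- ceph osd stat
kubectl -n rook-ceph exec "$TOOLS_POD" -- ceph status
```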
## Rook Operator Upgrade Process
In this scenario we will upgrade the Rook operator from v1.6.11 to v1.7.11.
### Pre-requisites and health status
Initial status of the rook operator and cephcluster:
```
airship@d105:~/shon/rook/cluster/examples/kubernetes/ceph$ kubectl get cephclusters.ceph.rook.io -n rook-ceph
NAME DATADIRHOSTPATH MONCOUNT AGE PHASE MESSAGE HEALTH EXTERNAL
rook-ceph /var/lib/rook 3 80m Ready Cluster created successfully HEALTH_OK
```
```
airship@d105:~/shon/rook/cluster/examples/kubernetes/ceph$ kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-84jjc 3/3 Running 0 29m
csi-cephfsplugin-provisioner-775dcbbc86-6kw67 6/6 Running 0 29m
csi-cephfsplugin-provisioner-775dcbbc86-bhqr4 6/6 Running 0 29m
csi-cephfsplugin-qvqxx 3/3 Running 0 29m
csi-cephfsplugin-zs42r 3/3 Running 0 29m
csi-rbdplugin-chnkn 3/3 Running 0 29m
csi-rbdplugin-mnmb7 3/3 Running 0 29m
csi-rbdplugin-provisioner-5868bd8b55-6c4jw 6/6 Running 0 29m
csi-rbdplugin-provisioner-5868bd8b55-z7xd5 6/6 Running 0 29m
csi-rbdplugin-vklx8 3/3 Running 0 29m
rook-ceph-crashcollector-node03-678646c48c-gs4gr 1/1 Running 0 26m
rook-ceph-crashcollector-node04-6f688cdd56-phzdv 1/1 Running 0 26m
rook-ceph-crashcollector-node05-cc865d54f-bpwf2 1/1 Running 0 26m
rook-ceph-mgr-a-696fb58d75-h5pw4 1/1 Running 0 26m
rook-ceph-mon-a-5cb4fbdf47-wgh2w 1/1 Running 0 29m
rook-ceph-mon-b-88d5c7db6-7n9kc 1/1 Running 0 27m
rook-ceph-mon-c-cdf7b8bc-zx5wt 1/1 Running 0 27m
rook-ceph-operator-bfdc879fd-24xpg 1/1 Running 0 32m
rook-ceph-osd-0-85648cfd7-frj48 1/1 Running 0 26m
rook-ceph-osd-1-5748896bc-pm4nd 1/1 Running 0 26m
rook-ceph-osd-2-744c4b4c9d-clqbb 1/1 Running 0 26m
rook-ceph-osd-prepare-node03-hbv9c 0/1 Completed 0 26m
rook-ceph-osd-prepare-node04-m7vfj 0/1 Completed 0 26m
rook-ceph-osd-prepare-node05-5945n 0/1 Completed 0 26m
```
Operator/Ceph versions:
```
airship@d105:~/shon/rook/cluster/examples/kubernetes/ceph$ kubectl get deployments.apps -n rook-ceph -o=custom-columns="NAME:.metadata.name,IMAGE:.spec.template.spec.containers[*].image"
NAME IMAGE
csi-cephfsplugin-provisioner k8s.gcr.io/sig-storage/csi-attacher:v3.2.1,k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1,k8s.gcr.io/sig-storage/csi-resizer:v1.2.0,k8s.gcr.io/sig-storage/csi-provisioner:v2.2.2,quay.io/cephcsi/cephcsi:v3.3.1,quay.io/cephcsi/cephcsi:v3.3.1
csi-rbdplugin-provisioner k8s.gcr.io/sig-storage/csi-provisioner:v2.2.2,k8s.gcr.io/sig-storage/csi-resizer:v1.2.0,k8s.gcr.io/sig-storage/csi-attacher:v3.2.1,k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1,quay.io/cephcsi/cephcsi:v3.3.1,quay.io/cephcsi/cephcsi:v3.3.1
rook-ceph-crashcollector-node03 ceph/ceph:v15.2.13
rook-ceph-crashcollector-node04 ceph/ceph:v15.2.13
rook-ceph-crashcollector-node05 ceph/ceph:v15.2.13
rook-ceph-mgr-a ceph/ceph:v15.2.13
rook-ceph-mon-a ceph/ceph:v15.2.13
rook-ceph-mon-b ceph/ceph:v15.2.13
rook-ceph-mon-c ceph/ceph:v15.2.13
rook-ceph-operator rook/ceph:v1.6.11
rook-ceph-osd-0 ceph/ceph:v15.2.13
rook-ceph-osd-1 ceph/ceph:v15.2.13
rook-ceph-osd-2 ceph/ceph:v15.2.13
```
Health status:
```
airship@d105:~/shon/rook/cluster/examples/kubernetes/ceph$ kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 77m)
mgr: a(active, since 76m)
osd: 3 osds: 3 up (since 76m), 3 in (since 76m)
data:
pools: 1 pools, 1 pgs
objects: 1 objects, 0 B
usage: 3.0 GiB used, 15 GiB / 18 GiB avail
pgs: 1 active+clean
```
```
airship@d105:~/shon/upgrade/rook/cluster/examples/kubernetes$ kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/ironic-pv-volume 10Gi RWO Retain Bound metal3/ironic-pv-claim default 5d4h
persistentvolume/pvc-51ff734a-db0e-4044-a999-07da1f5e7d98 2Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 13s
persistentvolume/pvc-9738a6b2-03d4-41a9-9fa9-d344429aef9f 2Gi RWO Delete Bound default/wp-pv-claim rook-ceph-block 7s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mysql-pv-claim Bound pvc-51ff734a-db0e-4044-a999-07da1f5e7d98 2Gi RWO rook-ceph-block 13s
persistentvolumeclaim/wp-pv-claim Bound pvc-9738a6b2-03d4-41a9-9fa9-d344429aef9f 2Gi RWO rook-ceph-block 8s
```
### Steps for the Rook operator upgrade
#### 1) Update common resources and CRDs
First, get the common resources manifests for the target release, which contain the latest changes.
```
git clone --single-branch --depth=1 --branch v1.7.11 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
```
Then apply the latest changes.
```
kubectl apply -f common.yaml -f crds.yaml
```
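As a quick sanity check (not part of the official procedure), the Ceph CRDs can be listed to confirm they are registered after applying `crds.yaml`:
```
# The ceph.rook.io CRDs should all be present after the apply.
kubectl get crd | grep ceph.rook.io
```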
#### 2) Update the Rook Operator
The largest portion of the upgrade is triggered when the operator’s image is updated to v1.7.x. When the operator is updated, it will proceed to update all of the Ceph daemons.
```
kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=rook/ceph:v1.7.11
```
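Before watching the daemon updates, it can be useful to confirm that the operator deployment itself has rolled out. A small sketch, assuming the default `rook-ceph` namespace:
```
# Wait for the new operator pod to become ready.
kubectl -n rook-ceph rollout status deploy/rook-ceph-operator --timeout=300s
# Double-check the image the operator is now running.
kubectl -n rook-ceph get deploy/rook-ceph-operator -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```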
#### 3) Wait for the upgrade to complete
Watch now in amazement as the Ceph mons, mgrs, OSDs, rbd-mirrors, MDSes and RGWs are terminated and replaced with updated versions in sequence. The cluster may be offline very briefly as mons update, and the Ceph Filesystem may fall offline a few times while the MDSes are upgrading. This is normal.
The versions of the components can be viewed as they are updated:
```
watch --exec kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{.metadata.name}{" \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{" \trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
```
During the upgrade:
```
Every 2.0s: kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath={range .items[*]}{.metadata.nam... d105: Mon Jan 24 11:29:44 2022
rook-ceph-crashcollector-node03 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-crashcollector-node04 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-crashcollector-node05 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-mgr-a req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-mon-a req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-mon-b req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-mon-c req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-osd-0 req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-osd-1 req/upd/avl: 1/1/1 rook-version=v1.6.11
rook-ceph-osd-2 req/upd/avl: 1/1/1 rook-version=v1.6.11
```
#### 4) Verify the updated cluster
```
# kubectl -n $ROOK_CLUSTER_NAMESPACE get deployment -l rook_cluster=$ROOK_CLUSTER_NAMESPACE -o jsonpath='{range .items[*]}{"rook-version="}{.metadata.labels.rook-version}{"\n"}{end}' | sort | uniq
This cluster is not yet finished:
rook-version=v1.6.11
rook-version=v1.7.11
This cluster is finished:
rook-version=v1.7.11
```
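In the same way, the Ceph version reported by each daemon deployment can be checked via the `ceph-version` label (taken from the Rook upgrade documentation); after an operator-only upgrade a single, unchanged Ceph version is expected:
```
kubectl -n rook-ceph get deployment -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{"ceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' | sort | uniq
```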
After the upgrade completed:
```
Every 2.0s: kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath={range .items[*]}{.metadata.nam... d105: Mon Jan 24 11:32:43 2022
rook-ceph-crashcollector-node03 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-crashcollector-node04 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-crashcollector-node05 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-mgr-a req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-mon-a req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-mon-b req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-mon-c req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-osd-0 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-osd-1 req/upd/avl: 1/1/1 rook-version=v1.7.11
rook-ceph-osd-2 req/upd/avl: 1/1/1 rook-version=v1.7.11
```
```
airship@d105:~/shon/upgrade/rook/cluster/examples/kubernetes/ceph$ kubectl get deployments.apps -n rook-ceph -o=custom-columns="NAME:.metadata.name,IMAGE:.spec.template.spec.containers[*].image"
NAME IMAGE
csi-cephfsplugin-provisioner k8s.gcr.io/sig-storage/csi-attacher:v3.3.0,k8s.gcr.io/sig-storage/csi-snapshotter:v4.2.0,k8s.gcr.io/sig-storage/csi-resizer:v1.3.0,k8s.gcr.io/sig-storage/csi-provisioner:v3.0.0,quay.io/cephcsi/cephcsi:v3.4.0,quay.io/cephcsi/cephcsi:v3.4.0
csi-rbdplugin-provisioner k8s.gcr.io/sig-storage/csi-provisioner:v3.0.0,k8s.gcr.io/sig-storage/csi-resizer:v1.3.0,k8s.gcr.io/sig-storage/csi-attacher:v3.3.0,k8s.gcr.io/sig-storage/csi-snapshotter:v4.2.0,quay.io/cephcsi/cephcsi:v3.4.0,quay.io/cephcsi/cephcsi:v3.4.0
rook-ceph-crashcollector-node03 ceph/ceph:v15.2.13
rook-ceph-crashcollector-node04 ceph/ceph:v15.2.13
rook-ceph-crashcollector-node05 ceph/ceph:v15.2.13
rook-ceph-mgr-a ceph/ceph:v15.2.13
rook-ceph-mon-a ceph/ceph:v15.2.13
rook-ceph-mon-b ceph/ceph:v15.2.13
rook-ceph-mon-c ceph/ceph:v15.2.13
rook-ceph-operator rook/ceph:v1.7.11
rook-ceph-osd-0 ceph/ceph:v15.2.13
rook-ceph-osd-1 ceph/ceph:v15.2.13
rook-ceph-osd-2 ceph/ceph:v15.2.13
```
```
airship@d105:~/shon/upgrade/rook/cluster/examples/kubernetes/ceph$ kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/ironic-pv-volume 10Gi RWO Retain Bound metal3/ironic-pv-claim default 5d4h
persistentvolume/pvc-51ff734a-db0e-4044-a999-07da1f5e7d98 2Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 17m
persistentvolume/pvc-9738a6b2-03d4-41a9-9fa9-d344429aef9f 2Gi RWO Delete Bound default/wp-pv-claim rook-ceph-block 17m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mysql-pv-claim Bound pvc-51ff734a-db0e-4044-a999-07da1f5e7d98 2Gi RWO rook-ceph-block 17m
persistentvolumeclaim/wp-pv-claim Bound pvc-9738a6b2-03d4-41a9-9fa9-d344429aef9f 2Gi RWO rook-ceph-block 17m
```
## Observations:
1) When we perform the Rook operator upgrade, the Rook-managed components (rook-ceph-mgr, rook-ceph-mon, rook-ceph-osd) are upgraded to the new Rook version.
2) The OSDs went down one after another, one node at a time out of the 3 nodes, so there were always 2 of the 3 OSDs up and data remained available.
3) The Ceph cluster health went to HEALTH_WARN while an OSD was down. Once the upgraded OSD came back up, the cluster health returned to HEALTH_OK.
4) A node reboot is not required after upgrading the Rook operator.
5) The Rook operator did not upgrade the Ceph version; the Ceph upgrade has to be performed as a separate step/process (see the sketch after this list).
6) The upgrade scenarios performed in the local lab confirm that there are no significant performance or availability impacts; the impact observed during the upgrade is comparable to regular maintenance such as an OSD node reboot or hard drive replacement.
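As noted in observation 5, the Ceph version itself is upgraded by changing the image in the CephCluster CR, after which the operator performs a rolling update of the daemons. A minimal sketch, assuming the CephCluster resource is named `rook-ceph` and using `ceph/ceph:v16.2.7` as a hypothetical target Pacific image:
```
# Point the CephCluster CR at the desired Ceph image; the operator rolls the daemons one by one.
kubectl -n rook-ceph patch CephCluster rook-ceph --type=merge -p '{"spec": {"cephVersion": {"image": "ceph/ceph:v16.2.7"}}}'
# Watch the ceph-version label converge across all daemon deployments.
kubectl -n rook-ceph get deployment -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{"ceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' | sort | uniq
```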
#### Performance impact during the upgrade process:
```
Initial ceph status:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:30:01 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 110m)
mgr: a(active, since 109m)
osd: 3 osds: 3 up (since 109m), 3 in (since 109m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.4 GiB used, 15 GiB / 18 GiB avail
pgs: 33 active+clean
1st OSD went down from node03
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:05 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_WARN
1 osds down
1 host (1 osds) down
Degraded data redundancy: 64/192 objects degraded (33.333%), 28 pgs degraded
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 14s)
osd: 3 osds: 2 up (since 5s), 3 in (since 110m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 64/192 objects degraded (33.333%)
28 active+undersized+degraded
5 active+undersized
After OSD upgraded to latest version:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:16 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 25s)
osd: 3 osds: 3 up (since 5s), 3 in (since 110m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 25 active+clean
8 active+clean+wait
2nd OSD went down from node04:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:21 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_WARN
1 osds down
1 host (1 osds) down
Degraded data redundancy: 31/192 objects degraded (16.146%), 12 pgs degraded
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 31s)
osd: 3 osds: 2 up (since 4s), 3 in (since 110m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 31/192 objects degraded (16.146%)
12 active+undersized+degraded
11 active+clean
8 stale+active+clean
2 active+undersized
Post OSD upgrade on node04:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:27 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_WARN
1 osds down
1 host (1 osds) down
Degraded data redundancy: 64/192 objects degraded (33.333%), 28 pgs degraded
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 36s)
osd: 3 osds: 2 up (since 10s), 3 in (since 111m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 64/192 objects degraded (33.333%)
28 active+undersized+degraded
5 active+undersized
OSD upgrade on node05:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:38 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_WARN
Degraded data redundancy: 64/192 objects degraded (33.333%), 28 pgs degraded
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 47s)
osd: 3 osds: 3 up (since 0.858719s), 3 in (since 111m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 64/192 objects degraded (33.333%)
28 active+undersized+degraded
5 active+undersized
Post upgrade of OSD on node05:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:31:45 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_WARN
1 osds down
1 host (1 osds) down
Reduced data availability: 2 pgs peering
services:
mon: 3 daemons, quorum a,b,c (age 111m)
mgr: a(active, since 55s)
osd: 3 osds: 2 up (since 1.42293s), 3 in (since 111m)
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 33.333% pgs not active
14 active+clean+wait
11 peering
8 stale+active+clean
Final Ceph cluster status:
Every 2.0s: kubectl exec -n rook-ceph rook-ceph-tools-65c94d77bb-6czmn -- ceph status d105: Mon Jan 24 11:32:07 2022
cluster:
id: 0b59ebfb-2e36-45aa-af62-02e1d41cc2e6
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 112m)
mgr: a(active, since 76s)
osd: 3 osds: 3 up (since 4s), 3 in (since 111m)
task status:
data:
pools: 2 pools, 33 pgs
objects: 64 objects, 158 MiB
usage: 3.5 GiB used, 15 GiB / 18 GiB avail
pgs: 22 active+clean
11 active+clean+wait
```