## ETCD RECOVERY PROCEDURE

### Step 1 Save the etcd signer certs + keys

`grep -A55 -a "openshift-config/etcd-signer" /var/lib/etcd/member/snap/db | sed -n '/-----BEGIN/,/-----END/ p' | sed 's/^.*-----BEGIN/-----BEGIN/g'`

Save the signer certificate to *etcd-signer.crt* and the key to *etcd-signer.key*.

_for each signer: etcd-serving, etcd-peer, etcd-serving-metrics_

`grep -A55 -a "openshift-config/etcd-metric-signer" /var/lib/etcd/member/snap/db | sed -n '/-----BEGIN/,/-----END/ p' | sed 's/^.*-----BEGIN/-----BEGIN/g'`

Save the metric signer certificate to *etcd-metric-signer.crt* and the key to *etcd-metric-signer.key*.

### Step 2 Stop control plane (with backup)

~~~
mkdir -v /etc/kubernetes/manifests-backup/
mv /etc/kubernetes/manifests/* /etc/kubernetes/manifests-backup/
crictl ps | grep -e "etcd\|kube-apiserver\|kube-controller\|kube-scheduler"
crictl stop $(crictl ps | grep -e "etcd\|kube-apiserver\|kube-controller\|kube-scheduler" | awk '{print $1}')
~~~

### Step 3 Create etcd cert + key

Copy down the results of:

~~~
openssl x509 -noout -ext "subjectAltName" -in /etc/kubernetes/static-pod-resources/etcd-certs/secrets/etcd-all-certs/etcd-serving-master-1.example.com.crt | grep -v X509v3 | sed 's/^ *//;s/ Address//g'
~~~

```bash
#!/bin/bash

create() {
  # create the private key and the CSR
  openssl genrsa -out "$TARGET.key" 2048
  OPENSSL_CNF=/etc/pki/tls/openssl.cnf
  openssl req -new -sha256 \
    -key "$TARGET.key" \
    -subj "/O=$O/CN=$CN" \
    -reqexts SAN \
    -config <(cat "${OPENSSL_CNF}" \
      <(printf "\n[SAN]\nsubjectAltName=${SAN}\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment\nextendedKeyUsage=serverAuth, clientAuth")) \
    -out "$TARGET.csr"

  # sign the CSR with the signer CA
  openssl x509 \
    -req \
    -sha256 \
    -extfile <(printf "subjectAltName=${SAN}\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment\nextendedKeyUsage=serverAuth, clientAuth") \
    -days 2000 \
    -in "$TARGET.csr" \
    -CA "$CACRT" \
    -CAkey "$CAKEY" \
    -CAcreateserial -out "$TARGET.crt"
}

# add masters here
```

### Step 4 Tarball!

Save the certs + keys into an archive:

    tar -cvzf etcd-all-certs.tar.gz ./*{.crt,.key}

Save to bastion however you like, then copy to each master:

    scp etcd-all-certs.tar.gz core@masterN:/tmp

### Step 5 Back up each master's etcd keys:

~~~
mkdir -v /etc/kubernetes/etcd-certs-backup-$(date +%Y%m%d)
cp -rvf /etc/kubernetes/static-pod-resources/etcd-certs/secrets/* /etc/kubernetes/etcd-certs-backup-$(date +%Y%m%d)/
find /etc/kubernetes/etcd-certs-backup-$(date +%Y%m%d)/
~~~

### Step 6 Replace etcd keys!

~~~
cd /etc/kubernetes/static-pod-resources/etcd-certs/secrets/etcd-all-certs
rm ./*.{crt,key}
tar -xvf /tmp/etcd-all-certs.tar.gz -C /etc/kubernetes/static-pod-resources/etcd-certs/secrets/etcd-all-certs/
cd /etc/kubernetes/static-pod-resources/etcd-certs/secrets/etcd-all-certs/
find . -maxdepth 1 -iname 'etcd-peer-*' -exec cp -v {} ../etcd-all-peer \;
find . -maxdepth 1 -iname 'etcd-serving-*' ! -iname "etcd-serving-metrics*" -exec cp -v {} ../etcd-all-serving \;
find . -maxdepth 1 -iname 'etcd-serving-metrics*' -exec cp -v {} ../etcd-all-serving-metrics \;
~~~

### Step 7 Start etcd again:

~~~
mv /etc/kubernetes/manifests-backup/etcd-pod.yaml /etc/kubernetes/manifests/
watch 'crictl ps --name etcd'
# Control+C to exit when containers are running
crictl exec $(crictl ps --name etcdctl -q) etcdctl endpoint status -w table
# This should print the status of all etcd peers
~~~

### Step 8 Start kube apiserver, kube controller, etc...

~~~
mv /etc/kubernetes/manifests-backup/* /etc/kubernetes/manifests/
watch 'crictl ps | grep -e "kube-apiserver\|kube-controller\|kube-scheduler"'
~~~

### Step 9 Renew certs

_`oc` login still failed at this point, so we logged in from a master node via localhost and renewed the certs through the cluster operator (CO)._

Small verification:

`for each in $(oc get secret -n openshift-etcd | grep "kubernetes.io/tls" | grep -e "etcd-peer\|etcd-serving" | awk '{print $1}'); do oc get secret "$each" -n openshift-etcd -o jsonpath="{.data.tls\.crt}" | base64 -d | openssl x509 -noout -dates; echo '-----'; done`

Posterity backup:

`oc get secret -o yaml -n openshift-etcd $(oc get -n openshift-etcd secret -o jsonpath='{range .items[?(.type=="kubernetes.io/tls")]}{.metadata.name}{"\n"}{end}' | grep -e "etcd-peer\|etcd-serving") >etcd-certs-backup.yaml`

Actual cert rotation:

`oc delete secret -n openshift-etcd $(oc get -n openshift-etcd secret -o jsonpath='{range .items[?(.type=="kubernetes.io/tls")]}{.metadata.name}{"\n"}{end}' | grep -e "etcd-peer\|etcd-serving")`

Check for new secrets:

`oc get secret -n openshift-etcd $(oc get -n openshift-etcd secret -o jsonpath='{range .items[?(.type=="kubernetes.io/tls")]}{.metadata.name}{"\n"}{end}' | grep -e "etcd-peer\|etcd-serving")`

New cert check:

`for each in $(oc get secret -n openshift-etcd | grep "kubernetes.io/tls" | grep -e "etcd-peer\|etcd-serving" | awk '{print $1}'); do oc get secret "$each" -n openshift-etcd -o jsonpath="{.data.tls\.crt}" | base64 -d | openssl x509 -noout -dates; echo '-----'; done`

Cert rotation is complete once the etcd operator has finished rolling out:

`watch 'oc get co etcd; oc get pods -n openshift-etcd'`

### COMPLETE!
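
#### Appendix: sketch for the `# add masters here` placeholder

The certificate script leaves `# add masters here` open. The sketch below shows one way that section might be planned. It only prints, for each master, the `TARGET` base name and the signer files to use, and it never calls `openssl`. The hostnames in `MASTERS` and the helper names `cert_targets`, `signer_for`, and `print_plan` are hypothetical. The file-name prefixes come from the `find` patterns in Step 6. The mapping of `etcd-serving-metrics` certs to *etcd-metric-signer* (and the other two to *etcd-signer*) is an assumption to confirm against the issuer of the existing certs.

```bash
#!/bin/bash
# Sketch only: prints a signing plan; it does not create or sign anything.

# Hypothetical master hostnames; replace with the real node names.
MASTERS="master-0.example.com master-1.example.com master-2.example.com"

# Print the three certificate base names expected for one master,
# using the prefixes matched by the find commands in Step 6.
cert_targets() {
    for kind in etcd-peer etcd-serving etcd-serving-metrics; do
        printf '%s-%s\n' "$kind" "$1"
    done
}

# Pick the signer base name for a target.
# Assumption: metrics certs use the metric signer, all others the etcd signer.
signer_for() {
    case "$1" in
        etcd-serving-metrics-*) echo "etcd-metric-signer" ;;
        *) echo "etcd-signer" ;;
    esac
}

# Print one line per certificate with the variables create() reads.
print_plan() {
    for master in "$@"; do
        for target in $(cert_targets "$master"); do
            signer=$(signer_for "$target")
            printf 'TARGET=%s CACRT=%s.crt CAKEY=%s.key\n' "$target" "$signer" "$signer"
        done
    done
}

# Word splitting of $MASTERS is intended here.
print_plan $MASTERS
```

Each printed line could then be turned into a call to `create` once `O`, `CN`, and `SAN` are set. `SAN` would come from the `subjectAltName` output copied in the cert-creation step, and `O` and `CN` from the subject of the existing certificate (for example via `openssl x509 -noout -subject`), rather than from guessed values.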