# kubeadm k8s Backup and Restore
## Create a test pod
```
$ kubectl run test --image=nginx
$ kubectl get po
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 27m
```
## Back up k8s
* Install the etcdutl, etcd, and etcdctl commands
```
$ ETCD_RELEASE=$(curl -s https://api.github.com/repos/etcd-io/etcd/releases/latest|grep tag_name | cut -d '"' -f 4)
$ wget https://github.com/etcd-io/etcd/releases/download/${ETCD_RELEASE}/etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
$ tar zxvf etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
$ sudo cp -rp etcd-${ETCD_RELEASE}-linux-amd64/etc* /usr/local/bin
$ rm -f etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
$ etcd --version
etcd Version: 3.6.1
Git SHA: a4708be
Go Version: go1.23.10
Go OS/Arch: linux/amd64
$ etcdctl version
etcdctl version: 3.6.1
API version: 3.6
$ etcdutl version
etcdutl version: 3.6.1
API version: 3.6
```
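The download above hard-codes `linux-amd64`. A minimal sketch of picking the architecture from `uname -m` follows; the function names `etcd_arch` and `etcd_tarball_url` are made up for this example, and only the amd64 and arm64 assets are handled.

```shell
# Map `uname -m` output (or an explicit argument) to the architecture
# names used in etcd release asset file names. Hypothetical helper.
etcd_arch() {
  local machine="${1:-$(uname -m)}"
  case "$machine" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $machine" >&2; return 1 ;;
  esac
}

# Build the release tarball URL for a tag (e.g. v3.6.1) and an architecture,
# following the same URL pattern as the wget command above.
etcd_tarball_url() {
  local tag="$1" arch="$2"
  echo "https://github.com/etcd-io/etcd/releases/download/${tag}/etcd-${tag}-linux-${arch}.tar.gz"
}

# Usage (not executed here):
#   wget "$(etcd_tarball_url "$ETCD_RELEASE" "$(etcd_arch)")"
```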
* Configure and test etcdctl; fill in your own etcd IP
```
$ alias etcdctl="ETCDCTL_API=3 sudo /usr/local/bin/etcdctl \
--endpoints=127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/apiserver-etcd-client.crt \
--key=/etc/kubernetes/pki/apiserver-etcd-client.key"
$ etcdctl member list
768bf4f42edfb9b3, started, m1, https://172.20.7.80:2380, https://172.20.7.80:2379, false
e044dfaaca4a3cf4, started, m2, https://172.20.7.81:2380, https://172.20.7.81:2379, false
ff78f71ad246a1cd, started, m3, https://172.20.7.82:2380, https://172.20.7.82:2379, false
$ etcdctl endpoint status -w table
+----------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+--------------------------+-------------------+
| ENDPOINT | ID | VERSION | STORAGE VERSION | DB SIZE | IN USE | PERCENTAGE NOT IN USE | QUOTA | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | DOWNGRADE TARGET VERSION | DOWNGRADE ENABLED |
+----------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+--------------------------+-------------------+
| 127.0.0.1:2379 | 5718db9e2058ee89 | 3.6.4 | 3.6.0 | 8.1 MB | 3.5 MB | 58% | 0 B | false | false | 17 | 8896038 | 8896038 | | | false |
+----------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+--------------------------+-------------------+
```
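The alias above only works in an interactive shell, because bash does not expand aliases in scripts by default. A minimal sketch of the same settings as a shell function is shown here; the name `etcdctl_k8s` and the variables `ETCD_ENDPOINT` and `ETCD_PKI_DIR` are assumptions of this example, not part of etcd or kubeadm.

```shell
# Hypothetical wrapper around etcdctl with the kubeadm certificate paths
# used in the alias above. Override ETCD_ENDPOINT or ETCD_PKI_DIR if your
# cluster differs.
etcdctl_k8s() {
  local pki="${ETCD_PKI_DIR:-/etc/kubernetes/pki}"
  sudo /usr/local/bin/etcdctl \
    --endpoints="${ETCD_ENDPOINT:-127.0.0.1:2379}" \
    --cacert="${pki}/etcd/ca.crt" \
    --cert="${pki}/apiserver-etcd-client.crt" \
    --key="${pki}/apiserver-etcd-client.key" \
    "$@"
}

# Usage (not executed here):
#   etcdctl_k8s member list
#   ETCD_ENDPOINT=172.20.7.80:2379 etcdctl_k8s endpoint status -w table
```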
### Take an etcd snapshot
* Take the snapshot with `etcdctl`
```
$ mkdir ~/etcd
$ etcdctl snapshot save "$HOME"/etcd/etcd-snapshot.db
```
* Check the backup
```
$ ls -l "$HOME"/etcd
total 25556
-rw------- 1 root root 26165280 Jun 26 15:24 etcd-snapshot.db
```
* Check the snapshot status with `etcdutl`
```
$ sudo etcdutl --write-out=table snapshot status "$HOME"/etcd/etcd-snapshot.db
+----------+----------+------------+------------+---------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE | VERSION |
+----------+----------+------------+------------+---------+
| c7ee3a10 | 2761931 | 687 | 26 MB | |
+----------+----------+------------+------------+---------+
```
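The snapshot steps above always write to the same file name, so a second backup overwrites the first. As a rough sketch, a timestamped name plus a basic non-empty check could look like the following; `snapshot_path` and `snapshot_nonempty` are hypothetical helper names, and the check is no substitute for `etcdutl snapshot status`.

```shell
# Print a snapshot path such as <dir>/etcd-snapshot-20250626-152400.db.
snapshot_path() {
  local dir="$1"
  echo "${dir}/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
}

# Succeed only if the file exists and has a size greater than zero.
snapshot_nonempty() {
  [ -s "$1" ]
}

# Usage (not executed here), with the etcdctl alias defined earlier:
#   snap=$(snapshot_path "$HOME/etcd")
#   etcdctl snapshot save "$snap" && snapshot_nonempty "$snap"
```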
## Restore etcd
* Delete the pod
```
$ kubectl delete pod test
pod "test" deleted
```
* Create the restore directory
```
$ sudo mkdir /var/lib/etcd-restore
```
* Use `etcdutl` to restore the snapshot taken above into the `/var/lib/etcd-restore` directory
```
$ sudo etcdutl --data-dir="/var/lib/etcd-restore" snapshot restore "$HOME"/etcd/etcd-snapshot.db
```
```
$ sudo ls -l /var/lib/etcd-restore
total 4
drwx------ 4 root root 4096 Jun 26 15:47 member
```
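As far as I know, recent `etcdutl` releases refuse to restore into a data directory that already contains files, so it can help to check the target first. The sketch below assumes that behavior; `restore_target_ok` is a hypothetical helper name.

```shell
# Succeed if the restore target does not exist yet, or is an empty directory.
# Fail if it is a file, a non-empty directory, or cannot be listed.
restore_target_ok() {
  local dir="$1" entries
  [ ! -e "$dir" ] && return 0
  [ -d "$dir" ] || return 1
  entries=$(ls -A "$dir") || return 1
  [ -z "$entries" ]
}

# Usage (not executed here):
#   restore_target_ok /var/lib/etcd-restore && \
#     sudo etcdutl --data-dir=/var/lib/etcd-restore snapshot restore "$HOME"/etcd/etcd-snapshot.db
```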
* Point the etcd volume at the directory you just restored into
```
$ sudo nano /etc/kubernetes/manifests/etcd.yaml
......
  - hostPath:
      path: /var/lib/etcd-restore # change this line
      type: DirectoryOrCreate
    name: etcd-data
```
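The manifest edit above can also be scripted. This is a minimal sketch assuming GNU `sed` and a manifest whose hostPath line reads exactly `path: /var/lib/etcd`; `set_etcd_hostpath` is a made-up name. The backup path is an explicit argument so it can be kept outside `/etc/kubernetes/manifests`, since the kubelet may try to load extra files placed there.

```shell
# set_etcd_hostpath <manifest> <new data dir> <backup file>
# Rewrites only the line "path: /var/lib/etcd"; the --data-dir flag and the
# container mountPath are left untouched.
set_etcd_hostpath() {
  local manifest="$1" new_dir="$2" backup="$3"
  cp -p "$manifest" "$backup" || return 1
  sed -i "s|^\([[:space:]]*path: \)/var/lib/etcd\$|\1${new_dir}|" "$manifest"
}

# Usage (not executed here; run from a root shell because the manifest is root-owned):
#   set_etcd_hostpath /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-restore /root/etcd.yaml.bak
```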
* Restart kubelet
```
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
```
* After the update the etcd container is recreated; check its status with `crictl`
```
$ sudo crictl ps -a | grep etcd
2732de9b81398 2e96e5913fc06 31 seconds ago Running etcd 0 f7d4519040e0d etcd-m1
```
* Wait for k8s to recover, then confirm that the pod deleted earlier has been restored
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
m1 Ready control-plane 14d v1.30.13
w1 Ready worker 14d v1.30.13
w2 Ready worker 14d v1.30.13
$ kubectl get po
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 68m
```
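The wait for the control plane to come back can be scripted instead of polling by hand. A small generic sketch follows; `retry_until` is a hypothetical helper, and the timeout and interval values in the usage line are arbitrary.

```shell
# retry_until <timeout_seconds> <interval_seconds> <command...>
# Re-runs the command (output discarded) until it succeeds, or fails once the
# accumulated wait would exceed the timeout.
retry_until() {
  local timeout="$1" interval="$2" waited=0
  shift 2
  until "$@" >/dev/null 2>&1; do
    waited=$((waited + interval))
    if [ "$waited" -gt "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Usage (not executed here):
#   retry_until 300 5 kubectl get nodes && kubectl get po
```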
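### If you recover with the method above, you still need to restore once more into the `/var/lib/etcd` directory; otherwise the checks run during an upgrade will report problems
#### The following method restores directly into the `/var/lib/etcd` directory
* Stop kubelet first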
```
$ sudo systemctl stop kubelet
```
* Remove the kube-apiserver, kube-controller-manager, kube-scheduler, and etcd containers
```
$ sudo crictl ps -a --name 'kube-apiserver|kube-controller-manager|kube-scheduler|etcd' -q | xargs -r -n1 sudo crictl rm -f
```
* Move the existing etcd data out of the way
```
$ sudo mv /var/lib/etcd "$HOME"/etcd/
```
* Use `etcdutl` to restore the snapshot taken above into the `/var/lib/etcd` directory
```
$ sudo etcdutl --data-dir="/var/lib/etcd" snapshot restore "$HOME"/etcd/etcd-snapshot.db
```
* Restart kubelet
```
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
```
* After the update the etcd container is recreated; check its status with `crictl`
```
$ sudo crictl ps -a|grep etcd
0b5caf8d752ed 5f1f5298c888daa46c4409ff4cefe5ca9d16e479419f94cdb5f5d5563dac0115 About a minute ago Running etcd 0 6abd9326a3f19 etcd-m1 kube-syste
```
* Wait for k8s to recover, then confirm that the pod deleted earlier has been restored
```
$ kubectl get no
NAME STATUS ROLES AGE VERSION
m1 Ready control-plane 14d v1.30.13
w1 Ready worker 14d v1.30.13
w2 Ready worker 14d v1.30.13
$ kubectl get po
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 68m
```
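The direct restore into `/var/lib/etcd` can be collected into one function for rehearsal. This is only a sketch of the steps listed above, not a hardened script: `run`, `restore_in_place`, and the `DRY_RUN` variable are names invented here, and by default nothing is executed, each command is just printed.

```shell
# DRY_RUN=1 (the default in this sketch) prints each step instead of running it.
# Set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# restore_in_place <snapshot file> <destination for the old data directory>
# Stops at the first failing step.
restore_in_place() {
  local snapshot="$1" old_data_dest="$2"
  run sudo systemctl stop kubelet &&
  run sh -c "sudo crictl ps -a --name 'kube-apiserver|kube-controller-manager|kube-scheduler|etcd' -q | xargs -r -n1 sudo crictl rm -f" &&
  run sudo mv /var/lib/etcd "$old_data_dest" &&
  run sudo etcdutl --data-dir=/var/lib/etcd snapshot restore "$snapshot" &&
  run sudo systemctl daemon-reload &&
  run sudo systemctl restart kubelet
}

# Usage (dry run, prints six lines):
#   restore_in_place "$HOME/etcd/etcd-snapshot.db" "$HOME/etcd/"
```

Reviewing the printed commands before setting `DRY_RUN=0` gives a chance to catch a wrong snapshot path before the data directory is moved.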