# Backup & Recovery (BR)
> - **Objective:** Learn to backup & recover a TiDB cluster on AWS (with Kubernetes)
> - **Prerequisites:**
>   - Background knowledge of TiDB components
>   - Background knowledge of Kubernetes and TiDB Operator
>   - Background knowledge of [BR](https://pingcap.com/docs/stable/reference/tools/br/br/#command-line-description)
>   - AWS account
>   - TiDB cluster on AWS
> - **Optionality:** Required
> - **Estimated time:** 1 hour

This document describes how to perform physical backups of a TiDB cluster and how to use those backups to recover the cluster.
## Prepare
> **Optionality:** Required
### Prepare Data
> **Optionality:** You can skip this section if you already have data in the TiDB cluster.

Prepare data using sysbench; refer to [sysbench](https://hackmd.io/0RpTgviPTfShBTDoEBhPfw#Sysbench).
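If you just need a small dataset, a minimal sysbench invocation might look like the following sketch. The host, password, table count, and table size here are placeholders chosen to match the data shown later in this document; adjust them for your environment, and note that the `sbtest` database must exist before running `prepare` (`CREATE DATABASE sbtest;`):
```{.bash .copyable}
sysbench oltp_read_write \
  --mysql-host=${tidb_host} \
  --mysql-port=4000 \
  --mysql-user=root \
  --mysql-password=${password} \
  --mysql-db=sbtest \
  --tables=8 \
  --table-size=1000 \
  prepare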
### Grant AWS Account Permissions
> - **Optionality:** Required

Before you perform a backup, AWS account permissions need to be granted to the Backup Custom Resource (CR) object. There are three methods to grant AWS account permissions:

- [Grant permissions by importing AccessKey and SecretKey](https://pingcap.com/docs/tidb-in-kubernetes/stable/backup-to-aws-s3-using-br/#three-methods-to-grant-aws-account-permissions)
- [Grant permissions by associating IAM with Pod](https://pingcap.com/docs/tidb-in-kubernetes/stable/backup-to-aws-s3-using-br/#grant-permissions-by-associating-iam-with-pod)
- [Grant permissions by associating IAM with ServiceAccount](https://pingcap.com/docs/tidb-in-kubernetes/stable/backup-to-aws-s3-using-br/#grant-permissions-by-associating-iam-with-serviceaccount)

In this document, we grant AWS account permissions by importing AccessKey and SecretKey.
> **Note**
>
> Granting permissions by associating IAM with Pod, or by associating IAM with ServiceAccount, is recommended in production environments.
### Create S3 Bucket
> **Optionality:** You can skip this section if you already have an S3 bucket to store backups.

If you don't have an S3 bucket for backups, create one in the same AWS region as your EKS cluster.
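For example, you can create the bucket with the AWS CLI (the bucket name and region are placeholders for your own values):
```{.bash .copyable}
aws s3 mb s3://${bucket} --region ${region}
```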
### Install RBAC
Download [backup-rbac.yaml](https://github.com/pingcap/tidb-operator/blob/master/manifests/backup/backup-rbac.yaml), and execute the following command to create the role-based access control (RBAC) resources in your namespace:
```{.bash .copyable}
kubectl apply -f backup-rbac.yaml -n ${cluster_namespace}
```
```{.output}
serviceaccount/tidb-backup-manager created
rolebinding.rbac.authorization.k8s.io/tidb-backup-manager created
```
### Create Secrets
#### Create s3-secret
TiDB Operator needs to access S3 when performing backup operations. To do that, you can create the `s3-secret` secret, which stores the credentials used to access S3:
```{.bash .copyable}
kubectl create secret generic s3-secret --from-literal=access_key=${aws_access_key} --from-literal=secret_key=${aws_secret_key} --namespace=${cluster_namespace}
```
```{.output}
secret/s3-secret created
```
This `s3-secret` will be referenced by your Backup CR.
Verify that the secret is properly created:
```{.bash .copyable}
kubectl get secrets -n ${cluster_namespace}
```
#### Create tidb-secret
TiDB Operator needs to access TiDB when performing backup operations. To do that, you can create a secret that stores the password of the user account used to access the TiDB cluster:
```{.bash .copyable}
kubectl create secret generic tidb-secret --from-literal=password=${password} --namespace=${cluster_namespace}
```
```{.output}
secret/tidb-secret created
```
> **Note**
>
> If the user account has no password, set `${password}` to an empty string.
#### Verify Secrets
Verify that the secret is properly created:
```{.bash .copyable}
kubectl get secrets -n ${cluster_namespace}
```
## Ad-hoc Full Backup
> - **Optionality:** Required

This section describes how to perform a full backup. We use a Backup Custom Resource (CR) to describe an ad-hoc full backup. TiDB Operator performs the backup operation based on the specification in the Backup CR.
### Configure Backup CR
The following is an example Backup CR:
```
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: backup01
  namespace: ${cluster_namespace}
spec:
  backupType: full
  br:
    cluster: ${cluster_name}
    sendCredToTikv: true
  from:
    host: ${cluster_name}-tidb
    secretName: tidb-secret
  s3:
    provider: aws
    secretName: s3-secret
    region: ${region}
    bucket: ${bucket}
    prefix: ${prefix}
```
You should replace the `${}` placeholders with values specific to your environment and save the file as `backup-aws-s3.yaml`.
### Perform Backup
You can perform an ad-hoc full backup using the following command:
```{.bash .copyable}
kubectl apply -f backup-aws-s3.yaml
```
```{.output}
backup.pingcap.com/backup01 created
```
### Verify Backup
#### Check Backup Status
You can use the following command to check the backup status:
```{.bash .copyable}
kubectl get bk -n ${cluster_namespace} -o wide
```
```{.output}
NAME       BACKUPPATH                        BACKUPSIZE   COMMITTS             STARTED   COMPLETED   AGE
backup01   s3://${my_bucket}/${my_folder}/   1611872      416522333306486785   3m58s     3m55s       15m
```
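If you script this step, you can poll the Backup CR until it reports a completion time instead of re-running `kubectl get bk` by hand. This is a sketch that assumes the CR exposes a `.status.timeCompleted` field (as in recent TiDB Operator versions); verify the field name against your Operator version:
```{.bash .copyable}
until kubectl get bk backup01 -n ${cluster_namespace} \
    -o jsonpath='{.status.timeCompleted}' | grep -q .; do
  sleep 5
done
```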
#### Inspect Backup Log
You can use the following command to find the pod for the ad-hoc full backup:
```{.bash .copyable}
kubectl get po -n ${cluster_namespace}
```
You can then use the following command to check the backup progress:
```{.bash .copyable}
kubectl logs ${backup_pod} -n ${cluster_namespace}
```
```{.output}
...
I0508 02:56:48.029966 1 backup.go:92] [2020/05/08 02:56:48.029 +00:00] [INFO] [domain.go:607] ["domain closed"] ["take time"=2.586075ms]
I0508 02:56:48.030625 1 backup.go:92] [2020/05/08 02:56:48.030 +00:00] [INFO] [collector.go:203] ["Full backup Success summary: total backup ranges: 16, total success: 16, total failed: 0, total take(s): 1.98, total kv: 16000, total size(MB): 1.92, avg speed(MB/s): 0.97"] ["backup fast checksum"=1.561152ms] ["backup checksum"=21.194566ms] ["backup total regions"=16]
I0508 02:56:48.032898 1 backup.go:92]
I0508 02:56:48.033027 1 backup.go:107] Backup data for cluster anthony/backup01 successfully
I0508 02:56:48.040218 1 manager.go:207] reset cluster anthony/backup01 tikv_gc_life_time to 10m0s success
I0508 02:56:48.040241 1 manager.go:218] backup cluster anthony/backup01 data to s3://anthonybr/anthony_01/ success
I0508 02:56:48.136334 1 manager.go:232] Get size 1611872 for backup files in s3://anthonybr/anthony_01/ of cluster anthony/backup01 success
I0508 02:56:48.156090 1 manager.go:244] get cluster anthony/backup01 commitTs 416522333306486785 success
I0508 02:56:48.179889 1 backup_status_updater.go:66] Backup: [anthony/backup01] updated successfully
```
#### Check Backup Files
You can use the S3 console or `aws s3` command to check backup files:
```{.bash .copyable}
aws s3 ls s3://${my_bucket}/${my_folder}/
```
```{.output}
2020-05-17 17:06:03 0
2020-05-17 17:59:48 11598 1_100_25_3283d3a03adc06548749d97c17e127615132cbbf66b60da73a957847f19b62f7_write.sst
2020-05-17 17:59:48 187058 1_100_25_80992061af3e5194c3f28a5b79d486c5e9db2feda1afb3f84b4ca229ddce9932_write.sst
2020-05-17 17:59:48 187374 1_104_26_6d17617cf027713cbd7e33a2a124fe83bc2c35035a776d0c11e63551cbae7815_write.sst
2020-05-17 17:59:48 11487 1_104_26_a48ed62e9b0b2959f804d7850a940f2fc34e1382be1ab74d1518368e2512aba2_write.sst
2020-05-17 17:59:49 11544 1_108_27_2334a18b4455213018fb96ec3c92e951dcc4b10106e91896d882a5ee0d2dbea4_write.sst
2020-05-17 17:59:49 187315 1_108_27_4d4bf84bd30a3ae10b645ad9354c2c26cbc5ff2a95b338b93b820db95359617b_write.sst
2020-05-17 17:59:49 11551 1_116_29_96deb4607345137477d0eea19a6e7a200a37064f87c259d5fa8b08f3c52ed426_write.sst
2020-05-17 17:59:49 187206 1_116_29_9fb8afb3f7322e93f506f0d9d11e9b1569bc90b7c3779da9bb7e35137e8e6597_write.sst
2020-05-17 17:59:49 187222 1_2_29_393d3575060c0d616300c1199ef1c015784fdd3d13e950a6251007bbcbaf2c06_write.sst
2020-05-17 17:59:49 11640 1_2_29_e738a916dbc786a9aef2a69b15289ac934265a4fe851e1dc11140d2ac17b28e8_write.sst
2020-05-17 17:59:48 11556 1_92_23_c882bdf96118f3fcc64e32ac9040b3194ab67d910d1f5232b928416088211ba5_write.sst
2020-05-17 17:59:48 187312 1_92_23_f34f254bf742607feee767bcba259d92794f5f373e6fd415af086e4b42689491_write.sst
2020-05-17 17:59:48 187683 1_96_24_38e4059222579a22c8c937f6506dd8e2198a19c3d7fb4a245c0a4f804cd85adc_write.sst
2020-05-17 17:59:48 11487 1_96_24_675ff68620bc779c8c35ff210de4be67260ad904c9d18c17fed517a4fcf4226b_write.sst
2020-05-17 17:59:49 11564 5_112_28_20ea02d2b95115ebc0c5516aee14058da5e27a1f5bbc545fb831e6c3446fda82_write.sst
2020-05-17 17:59:49 187310 5_112_28_2550d727ffe3e408af41f541b134d5e064440c25988bbecfb665933d1195a45d_write.sst
2020-05-17 17:59:49 20770 backupmeta
```
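To see the aggregate size of the backup rather than the per-file listing, you can ask the AWS CLI to summarize it. The `Total Size` line should correspond to the BACKUPSIZE (in bytes) reported by `kubectl get bk`:
```{.bash .copyable}
aws s3 ls s3://${my_bucket}/${my_folder}/ --recursive --human-readable --summarize
```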
## Scheduled Full Backup
> - **Optionality:** Optional

You can set up a backup policy to perform scheduled backups of a TiDB cluster, and set a backup retention policy. A scheduled full backup is described by a BackupSchedule CR object.
### Configure BackupSchedule CR
The following is an example BackupSchedule CR:
```
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo-backup-schedule-s3
  namespace: ${cluster_namespace}
spec:
  #maxBackups: 5
  #pause: true
  maxReservedTime: "3h"
  schedule: "*/2 * * * *"
  backupTemplate:
    backupType: full
    br:
      cluster: ${cluster_name}
      clusterNamespace: ${cluster_namespace}
      sendCredToTikv: true
    from:
      host: ${cluster_name}-tidb
      secretName: tidb-secret
    s3:
      provider: aws
      secretName: s3-secret
      region: ${aws_region}
      bucket: ${my_bucket}
      prefix: ${my_folder}
```
In this example, `schedule: "*/2 * * * *"` is a standard cron expression that triggers a backup every 2 minutes, and `maxReservedTime: "3h"` prunes backups older than 3 hours. You should replace the `${}` placeholders with values specific to your environment and save the file as `backup-scheduler-aws-s3.yaml`.
### Perform Scheduled Backup
You can start the scheduled full backup using the following command:
```{.bash .copyable}
kubectl apply -f backup-scheduler-aws-s3.yaml
```
```{.output}
backupschedule.pingcap.com/demo-backup-schedule-s3 created
```
### Verify
#### Check Backup Status
You can use the following command to check the backup status:
```{.bash .copyable}
kubectl get bk -n ${cluster_namespace} -o wide
```
```{.output}
NAME                                          BACKUPPATH                                                                 BACKUPSIZE   COMMITTS             STARTED   COMPLETED   AGE
backup02                                      s3://${my_bucket}/anthony_02/                                              1611578      416550273190199297   5m1s      4m58s       5m12s
demo-backup-schedule-s3-2020-05-09t08-36-00   s3://anthonybr/anthony_sche/anthony-pd.anthony-2379-2020-05-09t08-36-00/   1611578      416550320389226497   2m1s      118s        2m2s
demo-backup-schedule-s3-2020-05-09t08-38-00                                                                              0                                                       2s
```
#### Inspect Backup Log
You can use the following command to find the pod for the scheduled full backup:
```{.bash .copyable}
kubectl get pod -n ${cluster_namespace}
```
You can then use the following command to check the backup progress:
```{.bash .copyable}
kubectl logs ${backup_pod} -n ${cluster_namespace}
```
```{.output}
...
I0509 08:36:10.257977 1 backup.go:92] [2020/05/09 08:36:10.257 +00:00] [INFO] [client.go:149] ["save backup meta"] [path=s3://anthonybr/anthony_sche/anthony-pd.anthony-2379-2020-05-09t08-36-00] [jobs=0]
I0509 08:36:10.258477 1 backup.go:92] [2020/05/09 08:36:10.258 +00:00] [INFO] [progress.go:102] [Checksum] [progress=100.00%]
I0509 08:36:10.296059 1 backup.go:92] [2020/05/09 08:36:10.295 +00:00] [INFO] [ddl.go:407] ["[ddl] DDL closed"] [ID=061caa48-f309-422b-a53e-11623f55e1a8] ["take time"=973.143µs]
I0509 08:36:10.296130 1 backup.go:92] [2020/05/09 08:36:10.296 +00:00] [INFO] [ddl.go:301] ["[ddl] stop DDL"] [ID=061caa48-f309-422b-a53e-11623f55e1a8]
I0509 08:36:10.297872 1 backup.go:92] [2020/05/09 08:36:10.297 +00:00] [INFO] [domain.go:607] ["domain closed"] ["take time"=2.820476ms]
I0509 08:36:10.298207 1 backup.go:92] [2020/05/09 08:36:10.298 +00:00] [INFO] [collector.go:203] ["Full backup Success summary: total backup ranges: 16, total success: 16, total failed: 0, total take(s): 2.00, total kv: 16000, total size(MB): 1.92, avg speed(MB/s): 0.96"] ["backup checksum"=16.581592ms] ["backup fast checksum"=1.633146ms] ["backup total regions"=16]
I0509 08:36:10.300690 1 backup.go:92]
I0509 08:36:10.300730 1 backup.go:107] Backup data for cluster anthony/demo-backup-schedule-s3-2020-05-09t08-36-00 successfully
I0509 08:36:10.328351 1 manager.go:207] reset cluster anthony/demo-backup-schedule-s3-2020-05-09t08-36-00 tikv_gc_life_time to 10m0s success
I0509 08:36:10.328368 1 manager.go:218] backup cluster anthony/demo-backup-schedule-s3-2020-05-09t08-36-00 data to s3://anthonybr/anthony_sche/anthony-pd.anthony-2379-2020-05-09t08-36-00/ success
I0509 08:36:10.491981 1 manager.go:232] Get size 1611578 for backup files in s3://anthonybr/anthony_sche/anthony-pd.anthony-2379-2020-05-09t08-36-00/ of cluster anthony/demo-backup-schedule-s3-2020-05-09t08-36-00 success
I0509 08:36:10.520015 1 manager.go:244] get cluster anthony/demo-backup-schedule-s3-2020-05-09t08-36-00 commitTs 416550320389226497 success
I0509 08:36:10.541470 1 backup_status_updater.go:66] Backup: [anthony/demo-backup-schedule-s3-2020-05-09t08-36-00] updated successfully
```
#### Check Backup Files
You can use the S3 console or `aws s3` command to check backup files:
```{.bash .copyable}
aws s3 ls s3://${my_bucket}/${my_folder}/
```
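To temporarily stop the schedule without deleting it, you can set the `pause` field shown (commented out) in the example CR. A sketch of doing this in place with `kubectl patch` (using the full `backupschedule` resource name); set it back to `false` to resume:
```{.bash .copyable}
kubectl patch backupschedule demo-backup-schedule-s3 -n ${cluster_namespace} \
  --type merge -p '{"spec":{"pause":true}}'
```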
## Restore
> - **Optionality:** Required

In this section, we demonstrate how to use the backup created in the previous sections to restore a database.
### Cleanup
Note that BR only supports restoring to an empty cluster. You first need to log in to the TiDB cluster and drop the database `sbtest`:
```
MySQL [(none)]> drop database sbtest;
Query OK, 0 rows affected (0.31 sec)
```
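If you don't already have a MySQL client session, one way to reach the cluster is to port-forward the TiDB service and connect locally. This assumes the service name follows the `${cluster_name}-tidb` convention used elsewhere in this document and that TiDB listens on the default port 4000:
```{.bash .copyable}
kubectl port-forward svc/${cluster_name}-tidb 4000:4000 -n ${cluster_namespace} &
mysql -h 127.0.0.1 -P 4000 -u root -p
```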
### Configure Restore CR
Similar to backup, we use a Restore Custom Resource (CR) to describe a restore operation. TiDB Operator performs the restore operation based on the specification in the Restore CR.
The following is an example Restore CR:
```
apiVersion: pingcap.com/v1alpha1
kind: Restore
metadata:
  name: demo-restore-s3
  namespace: ${cluster_namespace}
spec:
  br:
    cluster: ${cluster_name}
    clusterNamespace: ${cluster_namespace}
  to:
    host: ${cluster_name}-tidb
    secretName: tidb-secret
  s3:
    provider: aws
    secretName: s3-secret
    region: ${aws_region}
    bucket: ${my_bucket}
    prefix: ${my_folder}
```
You should replace the `${}` placeholders with values specific to your environment and save the file as `restore-aws-s3.yaml`.
### Perform Restore
You can perform the restore using the following command:
```{.bash .copyable}
kubectl apply -f restore-aws-s3.yaml
```
```{.output}
restore.pingcap.com/demo-restore-s3 created
```
### Verify
#### Check Restore Status
You can use the following command to check the restore status:
```{.bash .copyable}
kubectl get rt -n ${cluster_namespace} -o wide
```
```{.output}
NAME              STARTED   COMPLETED   AGE
demo-restore-s3   43s       25s         49s
```
#### Inspect Restore Log
You can use the following command to check the restore progress:
```{.bash .copyable}
kubectl logs ${restore_pod} -n ${cluster_namespace}
```
```{.output}
...
I0508 09:04:46.106204 1 restore.go:86] [2020/05/08 09:04:46.106 +00:00] [INFO] [collector.go:203] ["Full restore Success summary: total restore files: 16, total success: 16, total failed: 0, total take(s): 0.51, total kv: 16000, total size(MB): 1.92, avg speed(MB/s): 3.79"] ["split region"=110.963551ms] ["restore checksum"=19.994152ms] ["restore ranges"=16]
I0508 09:04:46.108742 1 restore.go:86]
I0508 09:04:46.108804 1 restore.go:101] Restore data for cluster anthony/demo-restore-s3 successfully
I0508 09:04:46.120192 1 manager.go:206] reset cluster anthony/demo-restore-s3 tikv_gc_life_time to 10m0s success
I0508 09:04:46.120219 1 manager.go:217] restore cluster anthony/demo-restore-s3 from succeed
I0508 09:04:46.143491 1 restore_status_updater.go:66] Restore: [anthony/demo-restore-s3] updated successfully
```
#### Check Database
After the restore completes successfully, you can check the data in the TiDB cluster:
```
MySQL [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA |
| PERFORMANCE_SCHEMA |
| mysql |
| sbtest |
| test |
+--------------------+
6 rows in set (0.00 sec)
MySQL [(none)]> use sbtest;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [sbtest]> show tables;
+------------------+
| Tables_in_sbtest |
+------------------+
| sbtest1 |
| sbtest2 |
| sbtest3 |
| sbtest4 |
| sbtest5 |
| sbtest6 |
| sbtest7 |
| sbtest8 |
+------------------+
8 rows in set (0.00 sec)
MySQL [sbtest]> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000 |
+----------+
1 row in set (0.01 sec)
```