# Binlog
> - **Objective:** Learn to use Binlog to sync data between two TiDB clusters on AWS (with Kubernetes)
> - **Prerequisites:**
> - Background knowledge of TiDB components
> - Background knowledge of Kubernetes and TiDB Operator
> - Background knowledge of [Binlog](https://pingcap.com/docs/stable/reference/tidb-binlog/overview/#tidb-binlog-cluster-overview)
> - AWS account
> - TiDB cluster on AWS
> - **Optionality:** Optional
> - **Estimated time:** 30 mins
## Deploy Downstream TiDB Cluster
> - **Optionality:** Optional
TODO: extract instructions on how to deploy a second TiDB cluster.
If you have a downstream cluster already deployed, you can skip this section.
## Provision Binlog Nodes
TODO: The provision script does not support this feature yet. We will modify provision script to provide this.
## Deploy Binlog
At this point we have two TiDB clusters in two namespaces. We need to deploy the Binlog components (Pump and Drainer) in the first (upstream) cluster.
### Deploy Pump
```
$ kubectl edit tc ${upstream} -n ${upstream_namespace}
```
Add the `pump` section under `spec`:
```
spec:
  pump:
    baseImage: pingcap/tidb-binlog
    replicas: 1
    storageClassName: ebs-gp2
    requests:
      storage: 30Gi
    schedulerName: default-scheduler
    config:
      addr: 0.0.0.0:8250
      gc: 7
      heartbeat-interval: 2
```
---

**NOTE**

The Pump version needs to be the same as the TiDB version.

---
Confirm that the Pump Pod is running:
```
$ kubectl get pod -n ${upstream_namespace}
```
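To avoid polling by hand, you can also wait until the Pump Pod becomes Ready; a sketch assuming the standard labels set by TiDB Operator:

```shell
# Block for up to 5 minutes until the Pump Pod of the upstream
# cluster reports the Ready condition.
kubectl -n ${upstream_namespace} wait --for=condition=Ready \
  pod -l app.kubernetes.io/component=pump --timeout=300s
```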
### Deploy Drainer
#### Configure Drainer
```
$ helm repo update
```
You can search for the available chart versions:
```
$ helm search tidb-drainer -l
pingcap/tidb-drainer v1.1.0 A Helm chart for TiDB Binlog drainer.
pingcap/tidb-drainer v1.0.6 A Helm chart for TiDB Binlog drainer.
pingcap/tidb-drainer v1.0.5 A Helm chart for TiDB Binlog drainer.
pingcap/tidb-drainer v1.0.4 A Helm chart for TiDB Binlog drainer.
pingcap/tidb-drainer latest A Helm chart for TiDB Binlog drainer.
pingcap/tidb-drainer dev A Helm chart for TiDB Binlog drainer.
```
```
$ helm inspect values pingcap/tidb-drainer --version=v1.1.0 > values-drainer.yaml
```
You need to configure at least the following values in `values-drainer.yaml` (`clusterName` must match the upstream cluster, and `[syncer.to].host` is the address of the downstream TiDB service):
```
clusterName: basic
clusterVersion: v4.0.0
storageClassName: local-storage
storage: 10Gi
config: |
detect-interval = 10
[syncer]
worker-count = 16
txn-batch = 20
disable-dispatch = false
ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql"
safe-mode = false
db-type = "tidb"
[syncer.to]
host = 10.106.173.83
user = "root"
password = ""
port = 4000
```
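For reference, if the downstream were a MySQL-compatible database rather than TiDB, only `db-type` and the `[syncer.to]` address would change; a sketch with a placeholder address (`192.0.2.10` is not from this setup):

```
config: |
  [syncer]
  db-type = "mysql"
  [syncer.to]
  # Placeholder downstream MySQL address; replace with your own.
  host = "192.0.2.10"
  port = 3306
  user = "root"
  password = ""
```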
#### Install Drainer
```
$ helm install pingcap/tidb-drainer --name=${upstream} --namespace=${upstream_namespace} --version=v1.1.0 -f values-drainer.yaml
```
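You can then confirm that the Drainer Pod is up; a quick check, assuming the default labels applied by the tidb-drainer chart:

```shell
# List Drainer Pods in the upstream namespace; the label selector
# assumes the chart's default app.kubernetes.io/component label.
kubectl get pod -n ${upstream_namespace} -l app.kubernetes.io/component=drainer
```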
### Run Sysbench
1. Login to the bastion machine:
```
$ ssh -i credentials/${eks_name}.pem centos@${bastion_ip}
```
> - **Note:** Use `terraform output` to get `bastion_ip`
2. Create a database for sysbench:
```
$ mysql -h ${first_cluster_tidb_ip} -P 4000 -u root
mysql> create database binlog;
```
3. Create a sysbench config file named `config`:
```
mysql-host=${first_cluster_tidb_ip}
mysql-port=4000
mysql-user=root
mysql-db=binlog
time=1200
threads=8
report-interval=10
db-driver=mysql
```
4. Prepare data:
```
$ sysbench --config-file=config oltp_point_select --tables=1 --table-size=1000 prepare
```
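After `prepare`, the benchmark itself is started with `run` (with `time=1200` in the config above, it lasts 20 minutes). Note that `oltp_point_select` is read-only, so the rows replicated downstream come from the `prepare` step:

```shell
# Start the read-only point-select workload against the upstream cluster.
sysbench --config-file=config oltp_point_select --tables=1 --table-size=1000 run
```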
### Use Admin Checksum
#### Check That the Data Is Synced
1. Login to the bastion machine:
```
$ ssh -i credentials/${eks_name}.pem centos@${bastion_ip}
```
> - **Note:** Use `terraform output` to get `bastion_ip`
2. Record the upstream checksum:
```
$ mysql -h ${first_cluster_tidb_ip} -P 4000 -u root
mysql> admin checksum table binlog.sbtest1;
```
3. Record the downstream checksum (note this connects to the second cluster, not the first):
```
$ mysql -h ${second_cluster_tidb_ip} -P 4000 -u root
mysql> admin checksum table binlog.sbtest1;
```
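The sync is consistent when the two `admin checksum` results are identical. A minimal sketch, with hypothetical checksum values pasted in from the two sessions above:

```shell
# Hypothetical values copied from the upstream and downstream
# "admin checksum table binlog.sbtest1" output; replace with your own.
upstream_checksum="1935340026"
downstream_checksum="1935340026"

if [ "$upstream_checksum" = "$downstream_checksum" ]; then
  echo "checksums match: binlog.sbtest1 is in sync"
else
  echo "checksum mismatch: replication has not caught up yet"
fi
```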