# Change Data Capture

> - **Objective:** Learn to deploy TiCDC in a TiDB cluster on AWS (with Kubernetes)
> - **Prerequisites:**
>   - Background knowledge of TiDB components
>   - Background knowledge of Kubernetes and TiDB Operator
>   - Background knowledge of [TiCDC](https://pingcap.com/docs/stable/ticdc/ticdc-overview/)
> - **Optionality:** Optional
> - **Estimated time:** 30 mins

## Deploy Downstream TiDB Cluster

> - **Optionality:** Optional

TODO: extract instructions on how to deploy a second TiDB cluster.

If you already have a downstream cluster deployed, you can skip this section.

## Provision TiCDC Nodes

To provision dedicated nodes for TiCDC, add the following variables to your Terraform configuration:

```
variable "create_cdc_node_pool" {
  description = "Whether to create a node pool for TiCDC"
  default     = true
}

variable "cluster_cdc_count" {
  default = 3
}

variable "cluster_cdc_instance_type" {
  default = "c5.2xlarge"
}
```

To apply the changes, run:

```
$ terraform apply
```

The process might take 10 minutes or more.

## Deploy TiCDC

To deploy TiCDC, edit the `TidbCluster` CR:

```
$ kubectl edit tc ${upstream} -n ${upstream_namespace}
```

In the editor, add the TiCDC specification under `spec`:

```
  ticdc:
    baseImage: pingcap/ticdc
    replicas: 3
```

Once you save the changes, TiDB Operator starts deploying TiCDC. You can use the following command to check the status of the TiCDC pods:

```
$ kubectl get pod -n ${upstream_namespace}
NAME                              READY   STATUS    RESTARTS   AGE
basic-discovery-6bb656bfd-sps8z   1/1     Running   0          4h7m
basic-pd-0                        1/1     Running   0          4h7m
basic-pd-1                        1/1     Running   0          4h7m
basic-pd-2                        1/1     Running   2          4h7m
basic-ticdc-0                     1/1     Running   0          3h15m
basic-ticdc-1                     1/1     Running   0          3h15m
basic-ticdc-2                     1/1     Running   0          3h15m
basic-tidb-0                      2/2     Running   0          4h6m
basic-tidb-1                      2/2     Running   0          4h6m
basic-tikv-0                      1/1     Running   0          4h7m
basic-tikv-1                      1/1     Running   0          4h7m
basic-tikv-2                      1/1     Running   0          4h7m
```

You can use the following command to check the status of the TiCDC service:

```
$ kubectl get svc -n ${upstream_namespace}
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
basic-discovery    ClusterIP   10.108.13.111    <none>        10261/TCP            3h37m
basic-pd           ClusterIP   10.103.226.105   <none>        2379/TCP             3h37m
basic-pd-peer      ClusterIP   None             <none>        2380/TCP             3h37m
basic-ticdc-peer   ClusterIP   None             <none>        8301/TCP             165m
basic-tidb         ClusterIP   10.108.186.92    <none>        4000/TCP,10080/TCP   3h35m
basic-tidb-peer    ClusterIP   None             <none>        10080/TCP            3h35m
basic-tikv-peer    ClusterIP   None             <none>        20160/TCP            3h36m
```

Take note of the `CLUSTER-IP` of `basic-pd` and `basic-tidb`; TiCDC uses them when creating a changefeed.

## Create Changefeed

To create a changefeed, first open a shell in one of the TiCDC pods:

```
$ kubectl exec -it basic-ticdc-0 -n ${upstream_namespace} -- sh
```

Inside the pod, you can first inspect the TiCDC cluster:

```
$ /cdc cli capture list --pd="http://${pd_CLUSTER-IP}:2379"
[
  {
    "id": "391d4695-a4fb-456a-b800-5a07fb1bc9d6",
    "is-owner": false,
    "address": "basic-ticdc-0.basic-ticdc-peer.demo.svc:8301"
  },
  {
    "id": "659b88a5-0656-47bf-997f-f47956ae9e1e",
    "is-owner": true,
    "address": "basic-ticdc-2.basic-ticdc-peer.demo.svc:8301"
  },
  {
    "id": "c83b6c55-8293-4613-9f49-73c6142abc75",
    "is-owner": false,
    "address": "basic-ticdc-1.basic-ticdc-peer.demo.svc:8301"
  }
]
```

To create a changefeed, execute the following command:

```
$ /cdc cli changefeed create --sink-uri="mysql://root:@${tidb_CLUSTER-IP}:4000/" --pd="http://${pd_CLUSTER-IP}:2379"
Create changefeed successfully!
ID: 145ee6dd-1220-43f2-8d0b-423ab175944f
Info: {"sink-uri":"mysql://root:@10.104.118.45:4000/","opts":{},"create-time":"2020-05-30T19:34:11.4398499Z","start-ts":417036304749166593,"target-ts":0,"admin-job-type":0,"sort-engine":"memory","sort-dir":".","config":{"case-sensitive":true,"filter":{"ignore-txn-start-ts":null,"ddl-white-list":null},"mounter":{"worker-num":16},"sink":{"dispatch-rules":null},"cyclic-replication":{"enable":false,"replica-id":0,"filter-replica-ids":null,"id-buckets":0,"sync-ddl":false}}}
```
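You can also confirm the changefeed state before checking the processors. The following is a minimal sketch using the `changefeed list` and `changefeed query` sub-commands of `cdc cli`; the changefeed ID is the one returned by the `create` command above:

```
$ /cdc cli changefeed list --pd="http://${pd_CLUSTER-IP}:2379"
$ /cdc cli changefeed query --changefeed-id=145ee6dd-1220-43f2-8d0b-423ab175944f --pd="http://${pd_CLUSTER-IP}:2379"
```

`changefeed list` prints the IDs of all changefeeds in the cluster, and `changefeed query` shows the detailed state of a single changefeed, including its current checkpoint.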
You can check the processors that are currently running:

```
$ /cdc cli processor list --pd="http://${pd_CLUSTER-IP}:2379"
[
  {
    "changefeed-id": "145ee6dd-1220-43f2-8d0b-423ab175944f",
    "capture-id": "e2692613-9aaf-408e-8718-3d710fd2117e"
  }
]
```

## Run Sysbench

It is recommended to explore TiCDC with an empty database. Create a sysbench configuration file named `config`:

```
mysql-host=${upstream_tidb_EXTERNAL-IP}
mysql-port=4000
mysql-user=root
mysql-db=cdc
time=1200
threads=8
report-interval=10
db-driver=mysql
```

To prepare data, run the following command:

```
$ sysbench --config-file=config oltp_point_select --tables=1 --table-size=1000 prepare
```

Note that `oltp_point_select` is a read-only workload: the `prepare` step creates and populates `cdc.sbtest1`, and those writes are what TiCDC replicates downstream.

## Verify Data

### Verify Data is Synced

You can compare the checksum of the `cdc.sbtest1` table in the upstream and downstream TiDB clusters:

```
$ mysql -h ${upstream_tidb_EXTERNAL-IP} -P 4000 -u root
```

```
mysql> admin checksum table cdc.sbtest1;
```

```
$ mysql -h ${downstream_tidb_EXTERNAL-IP} -P 4000 -u root
```

```
mysql> admin checksum table cdc.sbtest1;
```

The checksum values should match. You can run SQL queries for further verification.

## Cleanup

### Remove Changefeed

```
$ kubectl exec -it basic-ticdc-0 -n ${upstream_namespace} -- sh
```

```
$ /cdc cli changefeed remove --changefeed-id=145ee6dd-1220-43f2-8d0b-423ab175944f --pd="http://${pd_CLUSTER-IP}:2379"
```

Verify that the changefeed is removed:

```
$ /cdc cli processor list --pd="http://${pd_CLUSTER-IP}:2379"
[]
```

### Remove TiCDC in TidbCluster CR

You can remove TiCDC from the `TidbCluster` CR by deleting the `ticdc` section you added earlier:

```
$ kubectl edit tc ${upstream} -n ${upstream_namespace}
```

### Delete TiCDC StatefulSet

After that, you can delete the TiCDC StatefulSet:

```
$ kubectl get sts -n ${upstream_namespace}
NAME          READY   AGE
basic-pd      0/3     2d12h
basic-ticdc   0/3     2d11h
basic-tidb    0/2     2d12h
basic-tikv    0/3     2d12h
```

```
$ kubectl delete sts basic-ticdc -n ${upstream_namespace}
statefulset.apps "basic-ticdc" deleted
```

You can verify that the StatefulSet is successfully deleted:

```
$ kubectl get pod -n ${upstream_namespace}
```

#### Troubleshooting

If the TiCDC pods are stuck in the `Terminating` state, you can force delete them:

```
$ kubectl delete pod basic-ticdc-2 -n ${upstream_namespace} --force --grace-period=0
```
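After the force deletion, you can confirm that no TiCDC pods remain. The label selector below assumes the standard `app.kubernetes.io/component` label that TiDB Operator applies to the pods it manages:

```
$ kubectl get pod -n ${upstream_namespace} -l app.kubernetes.io/component=ticdc
```

If the command returns no resources, the TiCDC cleanup is complete.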