# Connect to TiDB
> - **Objective:** Learn options for connecting to TiDB
> - **Prerequisites:**
>     - Background knowledge of TiDB components
>     - Background knowledge of Kubernetes
>     - An AWS account
>     - A TiDB cluster running on AWS
>     - Shell variables set
> - **Optionality:** Required
> - **Estimated time:** 15 minutes
This document focuses on connecting to TiDB using a MySQL-compatible client or client library.
It assumes you have set shell variables for your cluster name, namespace, and Kubernetes config file:
```{.bash .copyable}
cluster_name=my-cluster
namespace=poc
KUBECONFIG=credentials/kubeconfig
export KUBECONFIG
```
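To confirm that the variables and kubeconfig are set correctly, you can list the pods in your TiDB namespace; you should see pods whose names begin with your cluster name:
```{.bash .copyable}
kubectl get pods -n "$namespace"
```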
The "endpoint" (address) you use to connect to the TiDB in a Kubernetes cluster depends on where you're connecting from.
If you deploy an application or client in the same Kubernetes cluster as your TiDB cluster, you can use internal Kubernetes DNS to connect to the load balancer for the TiDB service.
If you deploy outside of the Kubernetes cluster where TiDB cluster is deployed, but in the same VPC, you can use the URL of the load balancer, which is publicly-resolvable to a private IP.
If you want to connect using a client program running locally on your desktop, or connect from an application you're developing or testing locally, you can use `kubectl` port forwarding to do so. This involves traffic going over the internet, so it's the slowest and least secure approach.
You can use any client you like, including the command-line clients included with MySQL and MariaDB, and the full variety of [connectors and APIs](https://pingcap.com/docs/dev/connectors-and-apis/) compatible with MySQL.
> **Note:**
>
> + To connect to TiDB using a MySQL client from MySQL 8.0, you must explicitly specify `--default-auth=mysql_native_password`, because `mysql_native_password` is [no longer the default plugin](https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-caching-sha2-password).
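For example, a connection from a MySQL 8.0 command-line client might look like this sketch, assuming `$host` holds your TiDB endpoint (finding the endpoint is covered in the sections below):
```{.bash .copyable}
mysql --default-auth=mysql_native_password -h "$host" -P 4000 -u root
```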
## Using the external load balancer
The "external" load balancer associated with the TiDB service in Kubernetes is usable from within the same VPC where the Kubernetes cluster is deployed.
The Terraform deployment of TiDB Operator creates a "bastion" host that exists inside the same AWS VPC where the Kubernetes cluster is deployed, but outside the Kubernetes cluster itself. If you intend to deploy your application outside of Kubernetes, you should try to deploy it in the same VPC as your TiDB cluster to increase efficiency and decrease costs.
### Get load balancer hostname
Identify the hostname of the load balancer with this command:
```{.bash .copyable}
kubectl get svc -n "$namespace" -l app.kubernetes.io/component=tidb
```
```{.output}
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-cluster-tidb LoadBalancer 172.20.3.7 a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com 4000:31907/TCP,10080:30064/TCP 45m
my-cluster-tidb-peer ClusterIP None <none> 10080/TCP 45m
```
Copy the value from the `EXTERNAL-IP` column for the service ending in `-tidb`, in this case `a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com`.
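Alternatively, you can capture the hostname in a shell variable with a `jsonpath` query instead of copying it by hand; this sketch assumes the service follows the `<cluster_name>-tidb` naming shown above:
```{.bash .copyable}
# Extract the load balancer hostname from the service's status.
host=$(kubectl get svc -n "$namespace" "${cluster_name}-tidb" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "$host"
```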
You can use the `dig` command (or `host` or `nslookup` or similar) to see what IP addresses this resolves to:
```{.bash .copyable}
dig +short a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com
```
```{.output}
10.0.17.221
10.0.45.122
```
Note that although the hostname can be resolved using public DNS, the IP addresses it resolves to are private addresses that are addressable only inside the VPC where the Kubernetes cluster runs.
It's possible your load balancer was created with the `load_balancing.cross_zone.enabled` attribute set to `false`. If so, connections from an EC2 instance in the same availability zone as a node running a TiDB pod might not balance across all TiDB pods. You can use the AWS console or the `aws` command-line tool to modify that attribute. For example, substitute the ARN of your load balancer in a command like this one:
```{.bash .copyable}
aws elbv2 modify-load-balancer-attributes \
    --attributes Key="load_balancing.cross_zone.enabled",Value=true \
    --load-balancer-arn "arn:aws:elasticloadbalancing:us-west-2:385595570414:loadbalancer/net/acb2749b8cb434736a759ad4d9beb79b/8b1661d153853fdd"
```
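To look up your load balancer's ARN, and to confirm afterwards that the attribute change took effect, you can use the `describe` subcommands; this is a sketch, and `$lb_arn` is a placeholder for your load balancer's ARN:
```{.bash .copyable}
# List load balancer names and ARNs in the current region.
aws elbv2 describe-load-balancers \
    --query 'LoadBalancers[].{Name:LoadBalancerName,Arn:LoadBalancerArn}'

# Confirm the cross-zone attribute ($lb_arn is a placeholder).
aws elbv2 describe-load-balancer-attributes \
    --load-balancer-arn "$lb_arn" \
    --query 'Attributes[?Key==`load_balancing.cross_zone.enabled`]'
```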
### Connect to intermediate host (bastion)
The IP address of the bastion EC2 instance is shown in Terraform's output:
```{.bash .copyable}
terraform output
```
```{.output}
bastion_ip = [
"34.220.67.194",
]
eks_endpoint = https://D82C6E3B0A44482ADCE3BA444005A89A.sk1.us-west-2.eks.amazonaws.com
eks_version = 1.15
kubeconfig_filename = credentials/kubeconfig_poc-cluster
region = us-west-2
```
Copy the IP value from the `bastion_ip` array in that output, or get it directly with this command:
```{.bash .copyable}
terraform output -json | jq -r '.bastion_ip.value[0]'
```
```{.output}
34.220.67.194
```
The SSH key for the bastion host is in the `credentials/` subdirectory, named using the `eks_name` variable from the Terraform deployment followed by a `.pem` suffix.
The bastion host runs CentOS, and the OS login name is `centos`.
```{.bash .copyable}
bastion_ip=34.220.67.194
ssh -i credentials/poc-cluster.pem -l centos "$bastion_ip"
```
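The steps below assume the `mysql` client is available on the bastion host. If it isn't, you can typically install it from the default CentOS repositories (package names can vary by CentOS version):
```{.bash .copyable}
sudo yum install -y mysql
```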
### Connect to TiDB
Note that the TiDB service listens on port 4000 rather than MySQL's default port 3306. Substitute your own endpoint hostname:
```{.bash .copyable}
host=a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com
mysql -P 4000 -u root -h "$host"
```
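Once connected, you can confirm that you're talking to TiDB rather than stock MySQL by querying the TiDB version:
```{.bash .copyable}
mysql -h "$host" -P 4000 -u root -e 'SELECT tidb_version()\G'
```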
To confirm that load balancing is working, you can use a simple shell loop like this:
```{.bash .copyable}
for ((i=0; i<20; i++))
do
mysql -h "$host" -P 4000 -u root -BNe 'select @@hostname'
sleep .5
done
```
```{.output}
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-0
```
## Using Kubernetes DNS
If you are going to deploy your application in the same Kubernetes cluster where TiDB is deployed, you can use internal Kubernetes name resolution even if you deploy into a different namespace.
### Get hostname of TiDB service
Kubernetes service DNS names are composed of the name of the service followed by the name of the namespace where the service exists. See [DNS for Services and Pods](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/) for more details about DNS in Kubernetes.
```{.bash .copyable}
kubectl get svc -n "$namespace" -l app.kubernetes.io/component=tidb
```
```{.output}
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-cluster-tidb LoadBalancer 172.20.3.7 a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com 4000:31907/TCP,10080:30064/TCP 116m
my-cluster-tidb-peer ClusterIP None <none> 10080/TCP 116m
```
In this case, the service is named `my-cluster-tidb` and the namespace is `poc`, so the DNS name for the service is `my-cluster-tidb.poc`. You can optionally append `.svc`, and even the cluster domain, to create a fully-qualified name like `my-cluster-tidb.poc.svc.cluster.local`; the details are beyond the scope of this document, but [DNS for Services and Pods](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/) covers them.
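To check name resolution from inside the cluster without deploying anything permanent, you can run `nslookup` in a throwaway pod; `busybox:1.28` is a common image choice for DNS debugging:
```{.bash .copyable}
kubectl run dns-test -ti --rm --restart=Never --image=busybox:1.28 -- \
    nslookup my-cluster-tidb.poc
```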
### Create a new pod
If you were deploying an application in Kubernetes, this is where you'd do that. It must be managed by the same Kubernetes cluster as TiDB, but it can (and likely should) be in a different namespace. This document simulates that by creating a namespace and a temporary pod in which to run the MySQL client.
Create a new namespace and get a shell in a new pod:
```{.bash .copyable}
kubectl create namespace tidb-client
kubectl run mysql-client --namespace=tidb-client -ti --rm --image=mysql -- /bin/bash
```
```{.output}
namespace/tidb-client created
If you don't see a command prompt, try pressing enter.
root@mysql-client:/#
```
### Connect to TiDB
From there, you can execute the `mysql` client as usual to connect to the TiDB service:
```{.bash .copyable}
mysql -h my-cluster-tidb.poc -P 4000 -u root
```
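The fully-qualified form also works, and it doesn't depend on the pod's DNS search path:
```{.bash .copyable}
mysql -h my-cluster-tidb.poc.svc.cluster.local -P 4000 -u root
```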
If you want to confirm that load balancing via the service endpoint works as expected, you can execute something like this:
```{.bash .copyable}
for ((i=0; i<20; i++))
do
    mysql -h my-cluster-tidb.poc \
        -P 4000 -u root -BNe 'select @@hostname'
    sleep .5
done
```
```{.output}
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-0
my-cluster-tidb-1
my-cluster-tidb-0
```
## Using `kubectl` port forwarding
Port forwarding using `kubectl` is a convenient way to connect to the TiDB service from your desktop for quick troubleshooting, using GUI programs, or testing an app that you're developing locally.
First, get the name of the TiDB service:
```{.bash .copyable}
kubectl get svc -n "$namespace" -l app.kubernetes.io/component=tidb
```
```{.output}
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-cluster-tidb LoadBalancer 172.20.3.7 a706a407a021a4517b855a26b2b6679a-a40528719845c02f.elb.us-west-2.amazonaws.com 4000:31907/TCP,10080:30064/TCP 167m
my-cluster-tidb-peer ClusterIP None <none> 10080/TCP 167m
```
Now, forward local port 4000 to port 4000 of the TiDB service in the cluster. Note that `svc/` is prefixed to the service name so `kubectl` knows that you want to forward to a _service_ instead of a _pod_.
```{.bash .copyable}
kubectl port-forward -n "$namespace" svc/my-cluster-tidb 4000
```
```{.output}
Forwarding from 127.0.0.1:4000 -> 4000
Forwarding from [::1]:4000 -> 4000
```
This will run in the foreground of the shell, so use another terminal window to execute the `mysql` client. (If you don't want to or can't easily open a new window, send `kubectl` to the background by stopping it with Ctrl-Z and resuming it by executing `bg` in the shell.)
```{.bash .copyable}
mysql -h 127.0.0.1 -P 4000 -u root
```
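If local port 4000 is already in use, `kubectl port-forward` accepts a `LOCAL:REMOTE` pair, so you can map any free local port to the service's port 4000:
```{.bash .copyable}
kubectl port-forward -n "$namespace" svc/my-cluster-tidb 14000:4000
```
Then connect with `mysql -h 127.0.0.1 -P 14000 -u root` from another terminal.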
Note that load balancing does not occur when port forwarding to a Kubernetes service: a single backing pod is selected, and all traffic goes to that pod for as long as `kubectl` forwards the port. You can verify this with the same kind of loop as before:
```{.bash .copyable}
for ((i=0; i<10; i++)); do
    mysql -h 127.0.0.1 \
        -P 4000 -u root -BNe 'select @@hostname'
    sleep .5
done
```
```{.output}
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
my-cluster-tidb-1
```
This also means that if the backing pod goes away for any reason, the port forwarding session ends. For example, delete `my-cluster-tidb-1`:
```{.bash .copyable}
kubectl delete pod my-cluster-tidb-1 -n "$namespace" && \
mysql -h 127.0.0.1 -P 4000 -u root
```
```{.output}
pod "poc-cluster-tidb-1" deleted
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0
```
And in the terminal window where `kubectl port-forward` is running, you should see something like this:
```{.output}
E0519 11:54:10.982504 48792 portforward.go:400] an error occurred forwarding 4000 -> 4000: error forwarding port 4000 to pod 8287660362dbac81aca58481ffcddc71867c604b60cc623eb84f6b10458a896b, uid : Error: No such container: 8287660362dbac81aca58481ffcddc71867c604b60cc623eb84f6b10458a896b
```
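Since the forwarding session dies with the pod, a simple retry loop is a common way to keep a development tunnel alive; this is a rough sketch, not a production pattern:
```{.bash .copyable}
# Restart the port forward whenever it exits (for example, after the
# backing pod is deleted). Stop the loop with Ctrl-C.
while true; do
    kubectl port-forward -n "$namespace" svc/my-cluster-tidb 4000
    echo "port-forward exited; retrying in 1 second" >&2
    sleep 1
done
```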