owned this note
owned this note
Published
Linked with GitHub
---
tags: IT, Database, slide
disqus: kywk-moco
slideOptions:
transition: slide
---
[DB] Introduction to Vitess
===========================
Summary of [The Vitess Documentation](https://vitess.io/docs/)
---
## Overview
High-level information about Vitess
- What Is Vitess
- Scaling MySQL With Vitess
- Key Vitess Concepts
---
## What Is Vitess
Vitess is a database solution for deploying, scaling and managing large clusters of MySQL instances, help you with the following problems:
1. Scaling a MySQL database by allowing you to shard it, while keeping application changes to a minimum.
2. Migrating from baremetal to a private or public cloud.
3. Deploying and managing a large number of MySQL instances.
---
## Features
- Performance
- Connection pooling
- Query de-duping
- Transaction manager
----
## Features (cont.)
- Protection
- Query rewriting and sanitization
- Query blacklisting
- Query killer
- Table ACLs
----
## Features (cont.)
- Monitoring
- Performance analysis
- Query streaming
- Update stream
----
## Features (cont.)
- Topology Management Tools
- Master management tools (handles reparenting)
- Web-based management GUI
- Designed to work in multiple data centers / regions
----
## Features (cont.)
- Sharding
- Virtually seamless dynamic re-sharding
- Vertical and Horizontal sharding support
- Multiple sharding schemes, with the ability to plug-in custom ones
---
## Architecture

----
### Topology

----
### Topology
- __vtgate__: a light proxy server that routes traffic to the correct vttablet(s) and returns consolidated results back to the client. It is the server to which applications send queries.
- vtgate considers the sharding scheme, required latency, and the availability of the tablets.
----
### Topology (cont.)
- __vttablet__: a proxy server that sits in front of a MySQL database. A Vitess implementation has one vttablet for each MySQL instance.
- Its features include connection pooling, query rewriting, and query de-duping.
----
### Topology (cont.)
- _vtctl_: a command-line tool used to administer a Vitess cluster.
- _vtctld_: an HTTP server that lets you browse the information stored in the lockserver.
----
### Topology (cont.)
- __vtworker__: hosts long-running processes. It supports a plugin architecture and offers libraries.
- resharding differ jobs check data integrity during shard splits and joins
- vertical split differ jobs check data integrity during vertical splits and joins
----
### Other support tools: admin
- _mysqlctl_: Manage MySQL instances
- _zk_: Command-line ZooKeeper client and explorer
- _zkctl_: Manage ZooKeeper instances
----
### Other support tools: testing
- _vtcombo_: A single binary that contains all components of Vitess.
- _vtexplain_: A command line tool that is used to explore how Vitess will handle queries based on a user-supplied schema and topology, without needing to set up a full cluster.
---
## Scaling MySQL With Vitess
Running Vitess on Local Kubernetes (minikube)
----
### Prerequisites
- Install Minikube
```
$ minikube start --cpus=4 --memory=5000
```
- Install etcd operator
- setup basic rule for etcd operator
```
$ example/rabc/create_role.sh
```
- install etcd operator
```
$ kubectl create -f example/depolyment.yaml
```
- Install helm
```
$ helm init
```
---
## Starting a single keyspace cluster
- Download and create vitess chart
```
$ cd example/helm
$ helm install ../../helm/vitess -f 101_initial_cluster.yaml
```
----
### Topolgy chart
```
topology:
cells:
- name: "zone1"
etcd:
replicas: 1
vtctld:
replicas: 1
vtgate:
replicas: 1
mysqlProtocol:
enabled: true
authType: "none"
keyspaces:
- name: "commerce"
shards:
- name: "0"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
schema:
initial: |-
create table product(
sku varbinary(128),
description varbinary(128),
price bigint,
primary key(sku)
);
create table customer(
customer_id bigint not null auto_increment,
email varbinary(128),
primary key(customer_id)
);
create table corder(
order_id bigint not null auto_increment,
customer_id bigint,
sku varbinary(128),
price bigint,
primary key(order_id)
);
vschema:
initial: |-
{
"tables": {
"product": {},
"customer": {},
"corder": {}
}
}
etcd:
replicas: 1
resources:
vtctld:
serviceType: "NodePort"
resources:
vtgate:
serviceType: "NodePort"
resources:
vttablet:
mysqlSize: "test"
resources:
mysqlResources:
vtworker:
resources:
pmm:
enabled: false
orchestrator:
enabled: false
```
----
### Verify cluster
```
$ kubectl get pods,jobs
NAME READY STATUS RESTARTS AGE
pod/etcd-global-tr49f52728 1/1 Running 0 69m
pod/etcd-operator-6c54c6cff6-kk8zw 1/1 Running 0 70m
pod/etcd-zone1-fzmv6thdrb 1/1 Running 0 69m
pod/vtctld-5dc4466d88-gfwbc 1/1 Running 3 69m
pod/vtgate-zone1-9c8589688-vj2rw 1/1 Running 3 69m
pod/zone1-commerce-0-rdonly-0 6/6 Running 0 69m
pod/zone1-commerce-0-replica-0 6/6 Running 0 69m
pod/zone1-commerce-0-replica-1 6/6 Running 0 69m
NAME COMPLETIONS DURATION AGE
job.batch/commerce-apply-schema-initial 1/1 2m9s 69m
job.batch/commerce-apply-vschema-initial 1/1 2m5s 69m
job.batch/zone1-commerce-0-init-shard-master 1/1 2m 69m
```
----
### Example database content
```
$ ./kmysql.sh
mysql> show tables;
+--------------------+
| Tables_in_commerce |
+--------------------+
| corder |
| customer |
| product |
+--------------------+
3 rows in set (0.01 sec)
```
----
__vtctld: Admin panel__ (`$ ./kvtctld.sh`)

---
## Vertical Split
----
__Prepare sample data__
```
$ ./kmysql.sh < ../common/insert_commerce_data.sql
$ ./kmysql.sh --table < ../common/select_commerce_data.sql
Using commerce/0
Customer
+-------------+--------------------+
| customer_id | email |
+-------------+--------------------+
| 1 | alice@domain.com |
| 2 | bob@domain.com |
| 3 | charlie@domain.com |
| 4 | dan@domain.com |
| 5 | eve@domain.com |
+-------------+--------------------+
Product
+----------+-------------+-------+
| sku | description | price |
+----------+-------------+-------+
| SKU-1001 | Monitor | 100 |
| SKU-1002 | Keyboard | 30 |
+----------+-------------+-------+
COrder
+----------+-------------+----------+-------+
| order_id | customer_id | sku | price |
+----------+-------------+----------+-------+
| 1 | 1 | SKU-1001 | 100 |
| 2 | 2 | SKU-1002 | 30 |
| 3 | 3 | SKU-1002 | 30 |
| 4 | 4 | SKU-1002 | 30 |
| 5 | 5 | SKU-1002 | 30 |
+----------+-------------+----------+-------+
```
----
### Create Keyspace
1. Create a special `served_from` keyspace.
- This keyspace starts off as an alias for the `commerce` keyspace.
- Vertically split tables into the new keyspace without change app.
2. Create another keyspace.
----
__Upgrade vitess chart__
```
$ export release=$(helm ls -q)
$ helm upgrade $release ../../helm/vitess/ -f 201_customer_keyspace.yaml
```
```
jobs:
- name: "create-customer-ks"
kind: "vtctlclient"
command: "CreateKeyspace -served_from='master:commerce,replica:commerce,rdonly:commerce' customer"
NAME DESIRED SUCCESSFUL AGE
jobs/vtctlclient-create-customer-ks 1 1 10s
```
----
__Customer Tablets__
```
keyspaces:
- name: "commerce"
shards:
- name: "0"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
vschema:
vsplit: |-
{
"tables": {
"product": {}
}
}
- name: "customer"
shards:
- name: "0"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
copySchema:
source: "commerce/0"
tables:
- "customer"
- "corder"
vschema:
vsplit: |-
{
"tables": {
"customer": {},
"corder": {}
}
}
```
----
```
$ helm upgrade $release ../../helm/vitess/ -f 202_customer_tablets.yaml
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
pod/etcd-global-tr49f52728 1/1 Running 0 69m
pod/etcd-operator-6c54c6cff6-kk8zw 1/1 Running 0 70m
pod/etcd-zone1-fzmv6thdrb 1/1 Running 0 69m
pod/vtctld-5dc4466d88-gfwbc 1/1 Running 3 69m
pod/vtgate-zone1-9c8589688-vj2rw 1/1 Running 3 69m
pod/zone1-commerce-0-rdonly-0 6/6 Running 0 69m
pod/zone1-commerce-0-replica-0 6/6 Running 0 69m
pod/zone1-commerce-0-replica-1 6/6 Running 0 69m
pod/zone1-customer-0-rdonly-0 6/6 Running 0 18m
pod/zone1-customer-0-replica-0 6/6 Running 0 18m
pod/zone1-customer-0-replica-1 6/6 Running 0 18m
NAME DESIRED SUCCESSFUL AGE
jobs/commerce-apply-vschema-vsplit 1 1 5m
jobs/customer-apply-vschema-vsplit 1 1 5m
jobs/customer-copy-schema-0 1 1 5m
jobs/zone1-customer-0-init-shard-master 1 1 5m
```
----
### VerticalSplitClone ###
```
jobs:
- name: "vertical-split"
kind: "vtworker"
cell: "zone1"
command: "VerticalSplitClone -min_healthy_rdonly_tablets=1 -tables=customer,corder customer/0"
```
----
````
./kmysql.sh --table < ../common/select_customer0_data.sql
Using customer/0
Customer
+-------------+--------------------+
| customer_id | email |
+-------------+--------------------+
| 1 | alice@domain.com |
| 2 | bob@domain.com |
| 3 | charlie@domain.com |
| 4 | dan@domain.com |
| 5 | eve@domain.com |
+-------------+--------------------+
COrder
+----------+-------------+----------+-------+
| order_id | customer_id | sku | price |
+----------+-------------+----------+-------+
| 1 | 1 | SKU-1001 | 100 |
| 2 | 2 | SKU-1002 | 30 |
| 3 | 3 | SKU-1002 | 30 |
| 4 | 4 | SKU-1002 | 30 |
| 5 | 5 | SKU-1002 | 30 |
+----------+-------------+----------+-------+
````
----
### Cut over
__migrate_replicas__
```
jobs:
- name: "msf1"
kind: "vtctlclient"
command: "MigrateServedFrom customer/0 rdonly"
- name: "msf2"
kind: "vtctlclient"
command: "MigrateServedFrom customer/0 replica"
```
__migrate_master__
```
- name: "msf3"
kind: "vtctlclient"
command: "MigrateServedFrom customer/0 master"
```
----
### Clean up
```
schema:
postsplit: |-
drop table customer;
drop table corder;
```
```
jobs:
- name: "vclean1"
kind: "vtctlclient"
command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 rdonly"
- name: "vclean2"
kind: "vtctlclient"
command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 replica"
- name: "vclean3"
kind: "vtctlclient"
command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 master"
```
----
```
./kmysql.sh --table < ../common/select_commerce_data.sql
Using commerce/0
Customer
ERROR 1105 (HY000) at line 4: vtgate: http://vtgate-zone1-5ff9c47db6-7rmld:15001/: target: commerce.0.master, used tablet: zone1-1564760600 (zone1-commerce-0-replica-0.vttablet), vttablet: rpc error: code = InvalidArgument desc = table customer not found in schema (CallerID: userData1)
```
---
## Horizontal sharding
----
### Sequences
The customer and corder have auto-increment columns.
This scheme does not work well in a sharded setup. Instead, Vitess provides an equivalent feature through sequences.
The syntax to generate an id is: `select next :n values from customer_seq`.
----
```
create table customer_seq(id int, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence';
insert into customer_seq(id, next_id, cache) values(0, 1000, 100);
create table order_seq(id int, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence';
insert into order_seq(id, next_id, cache) values(0, 1000, 100);
```
- `id` is always 0
- `next_id` is set to `1000`
- `cache` specifies the number of values to cache before vttablet updates `next_id`.
----
The VTGates also need to know about the sequence tables.
```
{
"tables": {
"customer_seq": {
"type": "sequence"
},
"order_seq": {
"type": "sequence"
},
"product": {}
}
}
```
----
### Vindexes
The next decision is about the sharding keys, aka Primary Vindexes.
- What are the highest QPS queries, and what are the where clauses for them?
- Cardinality of the column; it must be high.
- Do we want some rows to live together to support in-shard joins?
- Do we want certain rows that will be in the same transaction to live together?
----
__VSchema for `customer`__
```
{
"sharded": true,
"vindexes": {
"hash": {
"type": "hash"
}
},
"tables": {
"customer": {
"column_vindexes": [
{
"column": "customer_id",
"name": "hash"
}
],
"auto_increment": {
"column": "customer_id",
"sequence": "customer_seq"
}
},
"corder": {
"column_vindexes": [
{
"column": "customer_id",
"name": "hash"
}
],
"auto_increment": {
"column": "order_id",
"sequence": "order_seq"
}
}
}
}
```
----
__Upgrade vitess chart__
```
$ helm upgrade $release ../../helm/vitess/ -f 301_customer_sharded.yaml
```
```
NAME DESIRED SUCCESSFUL AGE
jobs/commerce-apply-schema-seq 1 1 19s
jobs/commerce-apply-vschema-seq 1 1 19s
jobs/customer-apply-schema-sharded 1 1 19s
jobs/customer-apply-vschema-sharded 1 1 19s
```
----
### Create new shards
```
- name: "customer"
shards:
- name: "0"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
- name: "-80"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
copySchema:
source: "customer/0"
- name: "80-"
tablets:
- type: "replica"
vttablet:
replicas: 2
- type: "rdonly"
vttablet:
replicas: 1
copySchema:
source: "customer/0"
```
----
### Shard naming
What is the meaning of -80 and 80-? :
- They represent a range, where the left number is included, but the right is not.
- Their notation is hexadecimal.
- They are left justified.
- A - prefix means: anything less than the RHS value.
- A - postfix means: anything greater than or equal to the LHS value.
- A plain - denotes the full keyrange.
----
### SplitClone
```
jobs:
- name: "horizontal-split"
kind: "vtworker"
cell: "zone1"
command: "SplitClone -min_healthy_rdonly_tablets=1 customer/0"
```
----
### Cut over
__migrate replicas__
```
jobs:
- name: "horizontal-split"
kind: "vtworker"
cell: "zone1"
command: "SplitClone -min_healthy_rdonly_tablets=1 customer/0"
```
__migrate_master__
```
jobs:
- name: "mst3"
kind: "vtctlclient"
command: "MigrateServedTypes customer/0 master"
```
----
### Result
```
./kmysql.sh --table < ../common/select_customer-80_data.sql
Using customer/-80
Customer
+-------------+--------------------+
| customer_id | email |
+-------------+--------------------+
| 1 | alice@domain.com |
| 2 | bob@domain.com |
| 3 | charlie@domain.com |
| 5 | eve@domain.com |
+-------------+--------------------+
COrder
+----------+-------------+----------+-------+
| order_id | customer_id | sku | price |
+----------+-------------+----------+-------+
| 1 | 1 | SKU-1001 | 100 |
| 2 | 2 | SKU-1002 | 30 |
| 3 | 3 | SKU-1002 | 30 |
| 5 | 5 | SKU-1002 | 30 |
+----------+-------------+----------+-------+
./kmysql.sh --table < ../common/select_customer80-_data.sql
Using customer/80-
Customer
+-------------+----------------+
| customer_id | email |
+-------------+----------------+
| 4 | dan@domain.com |
+-------------+----------------+
COrder
+----------+-------------+----------+-------+
| order_id | customer_id | sku | price |
+----------+-------------+----------+-------+
| 4 | 4 | SKU-1002 | 30 |
+----------+-------------+----------+-------+
```