[DB] Introduction to Vitess =========================== Summary of [The Vitess Documentation](https://vitess.io/docs/) --- ## Overview High-level information about Vitess - What Is Vitess - Scaling MySQL With Vitess - Key Vitess Concepts --- ## What Is Vitess Vitess is a database solution for deploying, scaling and managing large clusters of MySQL instances, help you with the following problems: 1. Scaling a MySQL database by allowing you to shard it, while keeping application changes to a minimum. 2. Migrating from baremetal to a private or public cloud. 3. Deploying and managing a large number of MySQL instances. --- ## Features - Performance - Connection pooling - Query de-duping - Transaction manager ---- ## Features (cont.) - Protection - Query rewriting and sanitization - Query blacklisting - Query killer - Table ACLs ---- ## Features (cont.) - Monitoring - Performance analysis - Query streaming - Update stream ---- ## Features (cont.) - Topology Management Tools - Master management tools (handles reparenting) - Web-based management GUI - Designed to work in multiple data centers / regions ---- ## Features (cont.) - Sharding - Virtually seamless dynamic re-sharding - Vertical and Horizontal sharding support - Multiple sharding schemes, with the ability to plug-in custom ones --- ## Architecture ![Vitess Architecture](https://vitess.io/docs/overview/img/VitessOverview.png) ---- ### Topology ![](https://lh3.googleusercontent.com/gEKxYMxsmGpzOwgLltLDnSeOfbDnQLOAO8zaDq3UTAsm9JLr5y8TK_ztkkTB2b1hxkvcVz2i2oX2tEdh4ZwWFkOEkVgVd7paEN2DiPV1xIEo-h-HoACHOvJrNBoBr_sVXhsogSeK79ce5Pke-cg0UWXgP9tJno0fOHFHEy3w9zbn0R4k9xZvPWN3Fy69_B8RAYx-vP3T_Sa0jUZfqVTwQYlBkApEJul7m6Rw4W0riqQm2JWFPUglSfRK-GRHWSNGL7rA319dqdo4u4zfuqf0HlaqVE6zUcM_qzKJUHjLq7ZrW2VCUWBXVrO6iopLOGQjHVRvJ3iVLjFQa5LxBhOJ1q5LIak8kVIyaLfEGlqWPgH9aygtTbeXH230GSD_zitb2hHJHvrCxPD5euTolZX9DbaXqTyjOxBMMfwXpEHOEe2B_AGRlkIRsYeDkm_jCrPM-3NQqXm9o-QxQDgTfUAjgsDMzEsYPQDTUZri082j5tR0470IIkRM5TmJU_lW6MuEoP6ijtotudbCCUN2czNhcQUwSNln2pItfKREjKMhWZaueAVgpsCmwyGcfxMLb0XtTpxGK65vhPSfPS7IqayoRnLHy6PqPZ45LMrkTfcK8Q84JRJlgnTjFxO44bfbsrhPm1NRXBs9pdPElKLgYPSVTjP4vv3ePxdPwbf5rQabo_mWHLxEemn5bqza_LC_57kSsoqh1_o7jCNILVRt_505VxU3Sg=w959-h486-no) ---- ### Topology - __vtgate__: a light proxy server that routes traffic to the correct vttablet(s) and returns consolidated results back to the client. It is the server to which applications send queries. - vtgate considers the sharding scheme, required latency, and the availability of the tablets. ---- ### Topology (cont.) - __vttablet__: a proxy server that sits in front of a MySQL database. A Vitess implementation has one vttablet for each MySQL instance. - Its features include connection pooling, query rewriting, and query de-duping. ---- ### Topology (cont.) - _vtctl_: a command-line tool used to administer a Vitess cluster. - _vtctld_: an HTTP server that lets you browse the information stored in the lockserver. ---- ### Topology (cont.) - __vtworker__: hosts long-running processes. It supports a plugin architecture and offers libraries. - resharding differ jobs check data integrity during shard splits and joins - vertical split differ jobs check data integrity during vertical splits and joins ---- ### Other support tools: admin - _mysqlctl_: Manage MySQL instances - _zk_: Command-line ZooKeeper client and explorer - _zkctl_: Manage ZooKeeper instances ---- ### Other support tools: testing - _vtcombo_: A single binary that contains all components of Vitess. - _vtexplain_: A command line tool that is used to explore how Vitess will handle queries based on a user-supplied schema and topology, without needing to set up a full cluster. --- ## Scaling MySQL With Vitess Running Vitess on Local Kubernetes (minikube) ---- ### Prerequisites - Install Minikube ``` $ minikube start --cpus=4 --memory=5000 ``` - Install etcd operator - setup basic rule for etcd operator ``` $ example/rabc/create_role.sh ``` - install etcd operator ``` $ kubectl create -f example/depolyment.yaml ``` - Install helm ``` $ helm init ``` --- ## Starting a single keyspace cluster - Download and create vitess chart ``` $ cd example/helm $ helm install ../../helm/vitess -f 101_initial_cluster.yaml ``` ---- ### Topolgy chart ``` topology: cells: - name: "zone1" etcd: replicas: 1 vtctld: replicas: 1 vtgate: replicas: 1 mysqlProtocol: enabled: true authType: "none" keyspaces: - name: "commerce" shards: - name: "0" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 schema: initial: |- create table product( sku varbinary(128), description varbinary(128), price bigint, primary key(sku) ); create table customer( customer_id bigint not null auto_increment, email varbinary(128), primary key(customer_id) ); create table corder( order_id bigint not null auto_increment, customer_id bigint, sku varbinary(128), price bigint, primary key(order_id) ); vschema: initial: |- { "tables": { "product": {}, "customer": {}, "corder": {} } } etcd: replicas: 1 resources: vtctld: serviceType: "NodePort" resources: vtgate: serviceType: "NodePort" resources: vttablet: mysqlSize: "test" resources: mysqlResources: vtworker: resources: pmm: enabled: false orchestrator: enabled: false ``` ---- ### Verify cluster ``` $ kubectl get pods,jobs NAME READY STATUS RESTARTS AGE pod/etcd-global-tr49f52728 1/1 Running 0 69m pod/etcd-operator-6c54c6cff6-kk8zw 1/1 Running 0 70m pod/etcd-zone1-fzmv6thdrb 1/1 Running 0 69m pod/vtctld-5dc4466d88-gfwbc 1/1 Running 3 69m pod/vtgate-zone1-9c8589688-vj2rw 1/1 Running 3 69m pod/zone1-commerce-0-rdonly-0 6/6 Running 0 69m pod/zone1-commerce-0-replica-0 6/6 Running 0 69m pod/zone1-commerce-0-replica-1 6/6 Running 0 69m NAME COMPLETIONS DURATION AGE job.batch/commerce-apply-schema-initial 1/1 2m9s 69m job.batch/commerce-apply-vschema-initial 1/1 2m5s 69m job.batch/zone1-commerce-0-init-shard-master 1/1 2m 69m ``` ---- ### Example database content ``` $ ./kmysql.sh mysql> show tables; +--------------------+ | Tables_in_commerce | +--------------------+ | corder | | customer | | product | +--------------------+ 3 rows in set (0.01 sec) ``` ---- __vtctld: Admin panel__ (`$ ./kvtctld.sh`) ![](https://lh3.googleusercontent.com/Qv8SZtpqUZKIxJI-iMyxx0GN0_wEV5MmoAhiPCWWmRrO3hgmtX2QgrTm8ByWS7Yd9gDLNZjqsIScEt6R7iR1hwYS0LsUv_f1pknX4M-mZZQHeq037fslBzH2VDNbkqj2uw_WTWqj2SOMzaJscgMUDZizu7uudsFDs1T35x_xxbA-lGQoXRgPbsFFobQ2hCX1F72FPnJl82wwMQzzVDHYCbl0xYGxo3X95WuiNov6BaOTAnPkY_nrigPLqmDJQfnl7fxqJ4k4Y_BwgtbWw7DSdmxUx2SaZecZT5LBlGPGUAUBQXKFp1NKVSyzmn2RVLrr_wjsjv8b-tsB8VLje-Aug72hSxNkPkVm1Ko8mwiVHTKGPy5M7CjqZqjtPDgTzUJOqbejrw6z3rndzQeVRw0Xy6pUHFNPLilx_atSBL57WD2vZqesoz12jKen3DzCphqWARYchYXeZXmuO6BDyTBBiTJWVRGh5eQ4dxC9bIrgL2l0ECGPv4XmQ3XrKrvtOaAin_aUd9w3vO0p4jLZOmfDgPdEcGqbYQPHx1vOzzkFY8JBagEGVWTNfkMbSEigTA8fOrtf4aAT3VrDWhOdX_1U-Gmg0yDBSyYGpK9bFjmd9b2M8DKPvY413KvRbu35iZ2x0mAUFO2zINvEgpzyUY3VgHWUVoe_RsjYf1cBJqPNOu1DvdeOP47S8Yww2Zd_s_mvzR0SFnLtAX_H7AP9jBiEPvxwaw=w743-h606-no) --- ## Vertical Split ---- __Prepare sample data__ ``` $ ./kmysql.sh < ../common/insert_commerce_data.sql $ ./kmysql.sh --table < ../common/select_commerce_data.sql Using commerce/0 Customer +-------------+--------------------+ | customer_id | email | +-------------+--------------------+ | 1 | alice@domain.com | | 2 | bob@domain.com | | 3 | charlie@domain.com | | 4 | dan@domain.com | | 5 | eve@domain.com | +-------------+--------------------+ Product +----------+-------------+-------+ | sku | description | price | +----------+-------------+-------+ | SKU-1001 | Monitor | 100 | | SKU-1002 | Keyboard | 30 | +----------+-------------+-------+ COrder +----------+-------------+----------+-------+ | order_id | customer_id | sku | price | +----------+-------------+----------+-------+ | 1 | 1 | SKU-1001 | 100 | | 2 | 2 | SKU-1002 | 30 | | 3 | 3 | SKU-1002 | 30 | | 4 | 4 | SKU-1002 | 30 | | 5 | 5 | SKU-1002 | 30 | +----------+-------------+----------+-------+ ``` ---- ### Create Keyspace 1. Create a special `served_from` keyspace. - This keyspace starts off as an alias for the `commerce` keyspace. - Vertically split tables into the new keyspace without change app. 2. Create another keyspace. ---- __Upgrade vitess chart__ ``` $ export release=$(helm ls -q) $ helm upgrade $release ../../helm/vitess/ -f 201_customer_keyspace.yaml ``` ``` jobs: - name: "create-customer-ks" kind: "vtctlclient" command: "CreateKeyspace -served_from='master:commerce,replica:commerce,rdonly:commerce' customer" NAME DESIRED SUCCESSFUL AGE jobs/vtctlclient-create-customer-ks 1 1 10s ``` ---- __Customer Tablets__ ``` keyspaces: - name: "commerce" shards: - name: "0" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 vschema: vsplit: |- { "tables": { "product": {} } } - name: "customer" shards: - name: "0" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 copySchema: source: "commerce/0" tables: - "customer" - "corder" vschema: vsplit: |- { "tables": { "customer": {}, "corder": {} } } ``` ---- ``` $ helm upgrade $release ../../helm/vitess/ -f 202_customer_tablets.yaml $ kubectl get pods NAME READY STATUS RESTARTS AGE pod/etcd-global-tr49f52728 1/1 Running 0 69m pod/etcd-operator-6c54c6cff6-kk8zw 1/1 Running 0 70m pod/etcd-zone1-fzmv6thdrb 1/1 Running 0 69m pod/vtctld-5dc4466d88-gfwbc 1/1 Running 3 69m pod/vtgate-zone1-9c8589688-vj2rw 1/1 Running 3 69m pod/zone1-commerce-0-rdonly-0 6/6 Running 0 69m pod/zone1-commerce-0-replica-0 6/6 Running 0 69m pod/zone1-commerce-0-replica-1 6/6 Running 0 69m pod/zone1-customer-0-rdonly-0 6/6 Running 0 18m pod/zone1-customer-0-replica-0 6/6 Running 0 18m pod/zone1-customer-0-replica-1 6/6 Running 0 18m NAME DESIRED SUCCESSFUL AGE jobs/commerce-apply-vschema-vsplit 1 1 5m jobs/customer-apply-vschema-vsplit 1 1 5m jobs/customer-copy-schema-0 1 1 5m jobs/zone1-customer-0-init-shard-master 1 1 5m ``` ---- ### VerticalSplitClone ### ``` jobs: - name: "vertical-split" kind: "vtworker" cell: "zone1" command: "VerticalSplitClone -min_healthy_rdonly_tablets=1 -tables=customer,corder customer/0" ``` ---- ```` ./kmysql.sh --table < ../common/select_customer0_data.sql Using customer/0 Customer +-------------+--------------------+ | customer_id | email | +-------------+--------------------+ | 1 | alice@domain.com | | 2 | bob@domain.com | | 3 | charlie@domain.com | | 4 | dan@domain.com | | 5 | eve@domain.com | +-------------+--------------------+ COrder +----------+-------------+----------+-------+ | order_id | customer_id | sku | price | +----------+-------------+----------+-------+ | 1 | 1 | SKU-1001 | 100 | | 2 | 2 | SKU-1002 | 30 | | 3 | 3 | SKU-1002 | 30 | | 4 | 4 | SKU-1002 | 30 | | 5 | 5 | SKU-1002 | 30 | +----------+-------------+----------+-------+ ```` ---- ### Cut over __migrate_replicas__ ``` jobs: - name: "msf1" kind: "vtctlclient" command: "MigrateServedFrom customer/0 rdonly" - name: "msf2" kind: "vtctlclient" command: "MigrateServedFrom customer/0 replica" ``` __migrate_master__ ``` - name: "msf3" kind: "vtctlclient" command: "MigrateServedFrom customer/0 master" ``` ---- ### Clean up ``` schema: postsplit: |- drop table customer; drop table corder; ``` ``` jobs: - name: "vclean1" kind: "vtctlclient" command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 rdonly" - name: "vclean2" kind: "vtctlclient" command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 replica" - name: "vclean3" kind: "vtctlclient" command: "SetShardTabletControl -blacklisted_tables=customer,corder -remove commerce/0 master" ``` ---- ``` ./kmysql.sh --table < ../common/select_commerce_data.sql Using commerce/0 Customer ERROR 1105 (HY000) at line 4: vtgate: http://vtgate-zone1-5ff9c47db6-7rmld:15001/: target: commerce.0.master, used tablet: zone1-1564760600 (zone1-commerce-0-replica-0.vttablet), vttablet: rpc error: code = InvalidArgument desc = table customer not found in schema (CallerID: userData1) ``` --- ## Horizontal sharding ---- ### Sequences The customer and corder have auto-increment columns. This scheme does not work well in a sharded setup. Instead, Vitess provides an equivalent feature through sequences. The syntax to generate an id is: `select next :n values from customer_seq`. ---- ``` create table customer_seq(id int, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence'; insert into customer_seq(id, next_id, cache) values(0, 1000, 100); create table order_seq(id int, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence'; insert into order_seq(id, next_id, cache) values(0, 1000, 100); ``` - `id` is always 0 - `next_id` is set to `1000` - `cache` specifies the number of values to cache before vttablet updates `next_id`. ---- The VTGates also need to know about the sequence tables. ``` { "tables": { "customer_seq": { "type": "sequence" }, "order_seq": { "type": "sequence" }, "product": {} } } ``` ---- ### Vindexes The next decision is about the sharding keys, aka Primary Vindexes. - What are the highest QPS queries, and what are the where clauses for them? - Cardinality of the column; it must be high. - Do we want some rows to live together to support in-shard joins? - Do we want certain rows that will be in the same transaction to live together? ---- __VSchema for `customer`__ ``` { "sharded": true, "vindexes": { "hash": { "type": "hash" } }, "tables": { "customer": { "column_vindexes": [ { "column": "customer_id", "name": "hash" } ], "auto_increment": { "column": "customer_id", "sequence": "customer_seq" } }, "corder": { "column_vindexes": [ { "column": "customer_id", "name": "hash" } ], "auto_increment": { "column": "order_id", "sequence": "order_seq" } } } } ``` ---- __Upgrade vitess chart__ ``` $ helm upgrade $release ../../helm/vitess/ -f 301_customer_sharded.yaml ``` ``` NAME DESIRED SUCCESSFUL AGE jobs/commerce-apply-schema-seq 1 1 19s jobs/commerce-apply-vschema-seq 1 1 19s jobs/customer-apply-schema-sharded 1 1 19s jobs/customer-apply-vschema-sharded 1 1 19s ``` ---- ### Create new shards ``` - name: "customer" shards: - name: "0" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 - name: "-80" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 copySchema: source: "customer/0" - name: "80-" tablets: - type: "replica" vttablet: replicas: 2 - type: "rdonly" vttablet: replicas: 1 copySchema: source: "customer/0" ``` ---- ### Shard naming What is the meaning of -80 and 80-? : - They represent a range, where the left number is included, but the right is not. - Their notation is hexadecimal. - They are left justified. - A - prefix means: anything less than the RHS value. - A - postfix means: anything greater than or equal to the LHS value. - A plain - denotes the full keyrange. ---- ### SplitClone ``` jobs: - name: "horizontal-split" kind: "vtworker" cell: "zone1" command: "SplitClone -min_healthy_rdonly_tablets=1 customer/0" ``` ---- ### Cut over __migrate replicas__ ``` jobs: - name: "horizontal-split" kind: "vtworker" cell: "zone1" command: "SplitClone -min_healthy_rdonly_tablets=1 customer/0" ``` __migrate_master__ ``` jobs: - name: "mst3" kind: "vtctlclient" command: "MigrateServedTypes customer/0 master" ``` ---- ### Result ``` ./kmysql.sh --table < ../common/select_customer-80_data.sql Using customer/-80 Customer +-------------+--------------------+ | customer_id | email | +-------------+--------------------+ | 1 | alice@domain.com | | 2 | bob@domain.com | | 3 | charlie@domain.com | | 5 | eve@domain.com | +-------------+--------------------+ COrder +----------+-------------+----------+-------+ | order_id | customer_id | sku | price | +----------+-------------+----------+-------+ | 1 | 1 | SKU-1001 | 100 | | 2 | 2 | SKU-1002 | 30 | | 3 | 3 | SKU-1002 | 30 | | 5 | 5 | SKU-1002 | 30 | +----------+-------------+----------+-------+ ./kmysql.sh --table < ../common/select_customer80-_data.sql Using customer/80- Customer +-------------+----------------+ | customer_id | email | +-------------+----------------+ | 4 | dan@domain.com | +-------------+----------------+ COrder +----------+-------------+----------+-------+ | order_id | customer_id | sku | price | +----------+-------------+----------+-------+ | 4 | 4 | SKU-1002 | 30 | +----------+-------------+----------+-------+ ```
{"metaMigratedAt":"2023-06-14T20:12:09.534Z","metaMigratedFrom":"YAML","title":"[DB] Introduction to Vitess","breaks":true,"disqus":"kywk-moco","slideOptions":"{\"transition\":\"slide\"}","contributors":"[{\"id\":\"048cd709-7321-493c-b45c-52ef6600afc2\",\"add\":23747,\"del\":2858}]"}
    1324 views