# MongoDB Basic Cluster Administration<br />ch3 Sharding
###### tags: `MongoDB University M103`
## What is Sharding?
* Shards
* store distributed collections
* Config Server
* store metadata about each shards
* Mongos
* routes queries to shards
## When to Shard
* When we reach the most powerful servers available, maximizing our vertical scale options.
* Sharding can provide an alternative to vertical scaling.
* Data sovereignty laws require data to be located in a specific geography.
* Sharding allows us to store different pieces of data in specific countries or regions.
* When holding more than 5TB per server and operational costs increase dramatically.
* Generally, when our deployment reaches 2-5TB per server, we should consider sharding.
## Setting Up a Sharded Cluster
### Config Server config file
```yaml=
sharding:
clusterRole: configsvr
replication:
replSetName: m103-csrs
security:
keyFile: /var/mongodb/pki/m103-keyfile
net:
bindIp: 127.0.0.1,192.168.103.100
port: 27000
systemLog:
destination: file
path: /var/mongodb/db/csrs1.log
logAppend: true
processManagement:
fork: true
storage:
dbPath: /var/mongodb/db/csrs1
```
### Mongos config file
```yaml=
sharding:
configDB: m103-csrs/192.168.103.100:26001,192.168.103.100:26002,192.168.103.100:26003
security:
keyFile: /var/mongodb/pki/m103-keyfile
net:
bindIp: 127.0.0.1,192.168.103.100
port: 26000
systemLog:
destination: file
path: /var/mongodb/db/mongos.log
logAppend: true
processManagement:
fork: true
```
## Config DB
* Never write any data to it
* informations
* shards
* chunks
* mongos
## Shard Keys
* Shard key field must be indexed
* must be present in every document in the collection
* Shard key are immutable
* You cannot change the shard key field post-sharding
* before 4.2, You cannot change or update the value of the shard key field value post-sharding
* Shard key are permanent
* You cannot unshard a sharded collection
* If the collection is empty, sh.shardCollection() creates the index on the shard key if such an index does not already exists.
* If the collection is not empty, you must create the index first before using sh.shardCollection().
### How to shard
1. Use ==sh.enableSharding("<database>")== to enable sharding for the specified database
2. Use ==db.collection.createIndex()== to create the index for your shard key fields
3. Use ==sh.shardCollection( "<database>.<collection>", { shard key } )== to shard the collection
## Picking a Good Shard Key
### Cardinality
* High cardinality - many possible unique value gives more chunk
### Frequency
* Low frequency - low repetition of values, distribution is going to be more even.
### Monotonic Change
* Avoid shard key that change monotonically
## Hashed Shard Keys
* shard key with hashed index
* Even distribution of shard keys on monotonically changing fields like dates
* Lost fast sorts, targeted queries on ranges of shard key values, or geographically isolated workloads.
* Hashed indexes are single field, non-array
### Using hashed shard key
1. Enable sharding for the specified database ==sh.enableSharding("<database>")==
2. Create the index for shard key field ==db.collection.createIndex("<field>": "hashed")==
3. Shard the collection sh.shardCollection("<database>.<collection>", { <shard key field>: "hashed" })
## Chunks
* ChunkSize = 64MB by default
* 1MB <= ChunkSize <= 1024MB
### Jumbo Chunks
* shard key 單向遞增或遞減
* 無法分出而過於肥大的 chunk
## Balancing
* sh.startBalancer(timeout, interval)
* sh.stopBalancer(timeout, interval)
* sh.setBalncerState(boolean)
### Config Server
:::info
Starting in MongoDB 3.4, the balancer runs on the primary of the config server replica set (CSRS)
:::
* Must have zero arbiters.
* Must have no delayed members.
* Must build indexes
* Users **should avoid writing directly to the config database** in the course of normal operation or maintenance.
* How to split chunks manually
```javascript=
db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
```
* How to merge chunks manually
```javascript=
db.adminCommand( {
mergeChunks: "test.members",
bounds: [ { "username" : "user69816" },
{ "username" : "user96401" } ]
} )
```
:::warning
if the operations do not require running on the database’s primary shard, these operations will route the results to a **random shard** to merge the results to avoid overloading the primary shard for that database. The **$out** stage and the ==$lookup== stage require running on the database’s **primary shard**.
:::
### Remove an exist shard
```javascript=
db.adminCommand( { removeShard: "mongodb0" } )
```