# MongoDB cluster formation
## MongoDB
MongoDB is a document database that pairs a flexible document model and horizontal scalability with rich querying and indexing.
#### Scalability:
- Horizontal: You scale by adding more machines to your pool of resources.
This document covers horizontal scaling, which MongoDB provides through sharding.
## Install MongoDB
- To install MongoDB on the machine, follow the steps below.
```bash
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
```
- Note: Check your Ubuntu codename with **lsb_release -dc** before running the next command. If the codename is bionic, proceed with the command below. Otherwise, follow the [installation guide](https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/) for your release.
```bash
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
```
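The codename check from the note can also be scripted. The following is a minimal sketch, not part of any MongoDB tooling: `mongo_repo_line` is a hypothetical helper name, and it assumes MongoDB publishes a 4.0 repository for the detected codename, which is not true for every Ubuntu release. It only prints the source line so that you can review it before writing it to disk.

```bash
# Hypothetical helper: print the apt source line for a given Ubuntu codename.
# Assumption: a mongodb-org/4.0 repository exists for that codename.
mongo_repo_line() {
    printf 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu %s/mongodb-org/4.0 multiverse\n' "$1"
}

# Detect the codename when lsb_release is available; otherwise assume bionic.
if command -v lsb_release >/dev/null 2>&1; then
    detected="$(lsb_release -sc)"
else
    detected="bionic"
fi

# Print the line for review only; nothing is written to /etc here.
mongo_repo_line "$detected"
```

Once the printed line looks right, it can be piped to `sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list` as in the command above.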
- To install the latest MongoDB version available from this repository:
```bash
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
```
- To install a specific MongoDB version:
```bash
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org=<version-number>
```
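apt pins a version with the `package=version` syntax. Pinning only the `mongodb-org` metapackage can leave the component packages at other versions, so MongoDB's Ubuntu installation guide pins each component as well. The sketch below builds such a command; `pinned_install_cmd` is a hypothetical helper name, and `4.0.28` is only an example value (`apt-cache madison mongodb-org` lists the versions your repository actually offers).

```bash
# Hypothetical helper: print an apt-get command that pins every MongoDB 4.0
# component package to the same version.
pinned_install_cmd() {
    version="$1"
    cmd="sudo apt-get install -y"
    for pkg in mongodb-org mongodb-org-server mongodb-org-shell mongodb-org-mongos mongodb-org-tools; do
        cmd="$cmd $pkg=$version"
    done
    printf '%s\n' "$cmd"
}

# 4.0.28 is an example value, not a recommendation.
pinned_install_cmd 4.0.28
```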
## MongoDB cluster architecture
![MongoDB sharded cluster architecture](https://i.imgur.com/nRd5ccL.png)
- **shard**: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
- **config server**: Config servers store metadata and configuration settings for the cluster. The metadata reflects the state and organization of all data and components within the sharded cluster, including the list of chunks on every shard.
- **mongos**: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
## Form a Cluster
- There are two ways to form a cluster:
  - Through a configuration file
  - Through the command line
- This document covers the command-line approach.
## Through command line
### Create the shard server Replica set
- Start a mongod shard instance for each member of the replica set.
```bash
mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --shardsvr
```
- Next, to initiate the shard server replica set, connect a **mongo** shell to one of the shard server members.
```bash
mongo --port <port>
```
- Initiate the replica set with the command below in the mongo shell.
```bash
rs.initiate(
{
_id : "<replSetName>",
members: [
{ _id : 0, host : "s1-mongo1.example.net:27017" },
{ _id : 1, host : "s1-mongo2.example.net:27018" },
{ _id : 2, host : "s1-mongo3.example.net:27019" }
]
}
)
```
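The document passed to `rs.initiate()` has the same shape for every shard: the replica set name plus one numbered member per host. As a rough sketch, it can be generated from a host list instead of typed by hand. `rs_initiate_js` is a hypothetical helper name, and the hosts in the example call are placeholders.

```bash
# Hypothetical helper: print an rs.initiate() call for a replica set.
# Usage: rs_initiate_js <replSetName> <host:port> [<host:port> ...]
rs_initiate_js() {
    name="$1"
    shift
    id=0
    members=""
    for host in "$@"; do
        # Separate member documents with ", " after the first one.
        if [ -n "$members" ]; then
            members="$members, "
        fi
        members="$members{ _id: $id, host: \"$host\" }"
        id=$((id + 1))
    done
    printf 'rs.initiate({ _id: "%s", members: [ %s ] })\n' "$name" "$members"
}

# Example with placeholder hosts; this prints the JavaScript and runs nothing.
rs_initiate_js s0 localhost:37017 localhost:37018 localhost:37019
```

The printed statement can be pasted into the mongo shell, or passed to `mongo --port <port> --eval "..."` once you have checked it.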
### Create the config server Replica set
- Start a mongod config server instance for each member of the replica set.
```bash
mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --configsvr
```
- Next, to initiate the config server replica set, connect a **mongo** shell to one of the config server members.
```bash
mongo --port <port>
```
- Initiate the replica set with the command below in the mongo shell.
```bash
rs.initiate(
{
_id: "<replSetName>",
configsvr: true,
members: [
{ _id : 0, host : "cfg1.example.net:27019" },
{ _id : 1, host : "cfg2.example.net:27019" },
{ _id : 2, host : "cfg3.example.net:27019" }
]
}
)
```
### Mongos
- Run the command below to start a mongos instance.
```bash
mongos --logpath "mongos-1.log" --configdb "<configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019" --port 27017
```
- Now add the shards to the cluster by connecting a **mongo** shell to the mongos instance (not to a config server).
```bash
mongo --port <port>
```
- Add shards to the cluster
```bash
sh.addShard("<replSetName>/s1-mongo1.example.net:27017,s1-mongo2.example.net:27018,s1-mongo3.example.net:27019")
```
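Both `--configdb` and `sh.addShard()` take a single string of the form `<replSetName>/host1:port,host2:port,...`, with the hosts separated by commas and no spaces. The sketch below assembles that string from its parts. `replset_uri` is a hypothetical helper name, and the set names and ports are example values.

```bash
# Hypothetical helper: join hosts into "<replSetName>/host1,host2,..." with no spaces.
# Usage: replset_uri <replSetName> <host:port> [<host:port> ...]
replset_uri() {
    name="$1"
    shift
    hosts=""
    for host in "$@"; do
        if [ -n "$hosts" ]; then
            hosts="$hosts,$host"
        else
            hosts="$host"
        fi
    done
    printf '%s/%s\n' "$name" "$hosts"
}

# The same string shape serves the mongos --configdb option and sh.addShard().
printf 'mongos --configdb "%s" --port 27017\n' "$(replset_uri cf localhost:57040 localhost:57041 localhost:57042)"
printf 'sh.addShard("%s")\n' "$(replset_uri s0 localhost:37017 localhost:37018 localhost:37019)"
```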
### Enable sharding
- To proceed, you must be connected to a mongos associated with the target sharded cluster.
- Before you can shard a collection, you must enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data, but it makes it possible to shard the collections in that database.
- Enable sharding on the target database with
```bash
sh.enableSharding("<database>")
```
- Enable sharding on the target collection with
```bash
sh.shardCollection("<database>.<collection>", { <key> : <direction> } )
```
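The `<direction>` placeholder is `1` for ranged sharding on the key, or the string `"hashed"` for hashed sharding. As an illustration only, the sketch below prints both forms of the call. `shard_collection_js` is a hypothetical helper name, and the database, collection, and key names are example values.

```bash
# Hypothetical helper: print a sh.shardCollection() call.
# Usage: shard_collection_js <database> <collection> <key> <ranged|hashed>
shard_collection_js() {
    case "$4" in
        ranged) direction='1' ;;
        hashed) direction='"hashed"' ;;
        *) echo "strategy must be 'ranged' or 'hashed'" >&2; return 1 ;;
    esac
    printf 'sh.shardCollection("%s.%s", { %s: %s })\n' "$1" "$2" "$3" "$direction"
}

# Example values; this prints the JavaScript and runs nothing.
shard_collection_js school students student_id ranged
shard_collection_js school students student_id hashed
```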
## Example:
- Here we form a cluster on a local machine by following the steps described above.
### Clean everything:
- Kill any existing mongod and mongos instances and remove their data directories.
```bash
$ killall mongod
$ killall mongos
$ rm -rf /data/config
$ rm -rf /data/shard*
```
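Because `rm -rf` with a glob cannot be undone, it can help to list what would be removed before running the cleanup. The following is a small dry-run sketch. `cleanup_targets` is a hypothetical helper name, and it assumes the `config` and `shard*` directory layout used in this example.

```bash
# Hypothetical helper: list the directories the cleanup step would delete,
# without deleting anything.
# Usage: cleanup_targets <data-root>   (this example uses /data)
cleanup_targets() {
    root="$1"
    for dir in "$root"/config "$root"/shard*; do
        # An unmatched glob stays literal, so only report paths that exist.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
    return 0
}

# Prints nothing when no matching directories exist.
cleanup_targets /data
```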
### Create Shard Replica Sets:
#### Create Shard0(s0) with replica sets:
- Run the following command to create the data directories.
```bash
mkdir -p /data/shard0/rs0 /data/shard0/rs1 /data/shard0/rs2
```
- Run the following commands to create shard instances.
```bash
mongod --replSet s0 --logpath "s0-r0.log" --dbpath /data/shard0/rs0 --port 37017 --fork --shardsvr
mongod --replSet s0 --logpath "s0-r1.log" --dbpath /data/shard0/rs1 --port 37018 --fork --shardsvr
mongod --replSet s0 --logpath "s0-r2.log" --dbpath /data/shard0/rs2 --port 37019 --fork --shardsvr
```
##### Configuring s0 replica set:
- Connect to one of the shard servers and initiate the replica set.
```bash
mongo --port 37017
config = { _id: "s0", members:[
{ _id : 0, host : "localhost:37017" },
{ _id : 1, host : "localhost:37018" },
{ _id : 2, host : "localhost:37019" }]};
rs.initiate(config)
```
#### Create Shard1(s1) with replica sets:
- Run the following command to create the data directories.
```bash
mkdir -p /data/shard1/rs0 /data/shard1/rs1 /data/shard1/rs2
```
- Run the following commands to create the shard instances.
```bash
mongod --replSet s1 --logpath "s1-r0.log" --dbpath /data/shard1/rs0 --port 47017 --fork --shardsvr
mongod --replSet s1 --logpath "s1-r1.log" --dbpath /data/shard1/rs1 --port 47018 --fork --shardsvr
mongod --replSet s1 --logpath "s1-r2.log" --dbpath /data/shard1/rs2 --port 47019 --fork --shardsvr
```
##### Configuring s1 replica set
- Connect to one of the shard servers and initiate the replica set.
```bash
mongo --port 47017
config = { _id: "s1", members:[
{ _id : 0, host : "localhost:47017" },
{ _id : 1, host : "localhost:47018" },
{ _id : 2, host : "localhost:47019" }]};
rs.initiate(config)
```
#### Create Shard2(s2) with replica sets:
- Run the following command to create the data directories for shard2.
```bash
mkdir -p /data/shard2/rs0 /data/shard2/rs1 /data/shard2/rs2
```
- Run the following commands to create the shard mongo instances.
```bash
mongod --replSet s2 --logpath "s2-r0.log" --dbpath /data/shard2/rs0 --port 57017 --fork --shardsvr
mongod --replSet s2 --logpath "s2-r1.log" --dbpath /data/shard2/rs1 --port 57018 --fork --shardsvr
mongod --replSet s2 --logpath "s2-r2.log" --dbpath /data/shard2/rs2 --port 57019 --fork --shardsvr
```
##### Configuring s2 replica set:
- Connect to one of the shard servers and initiate the replica set.
```bash
mongo --port 57017
config = { _id: "s2", members:[
{ _id : 0, host : "localhost:57017" },
{ _id : 1, host : "localhost:57018" },
{ _id : 2, host : "localhost:57019" }]};
rs.initiate(config)
```
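The three shards above differ only in the shard index and the base port, so the directory and startup commands can be generated rather than copied. The sketch below prints the commands instead of running them so that they can be reviewed first. `shard_cmds` is a hypothetical helper name, and the paths and ports follow this example's layout.

```bash
# Hypothetical helper: print the commands that start one three-member shard.
# Usage: shard_cmds <shard-index> <base-port>
shard_cmds() {
    shard="$1"
    base_port="$2"
    for member in 0 1 2; do
        dbpath="/data/shard$shard/rs$member"
        port=$((base_port + member))
        printf 'mkdir -p %s\n' "$dbpath"
        printf 'mongod --replSet s%s --logpath "s%s-r%s.log" --dbpath %s --port %s --fork --shardsvr\n' \
            "$shard" "$shard" "$member" "$dbpath" "$port"
    done
}

# Same layout as the example: s0 on 37017-37019, s1 on 47017-47019, s2 on 57017-57019.
shard_cmds 0 37017
shard_cmds 1 47017
shard_cmds 2 57017
```

After checking the output, the commands for one shard can be executed by piping them to a shell, for example `shard_cmds 0 37017 | sh`.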
### Start config servers
- Run the following command to create the data directories for the config servers.
```bash
mkdir -p /data/config/config-a /data/config/config-b /data/config/config-c
```
- Next, run the following commands to start the mongod instances for the config servers.
```bash
mongod --replSet cf --logpath "cfg-a.log" --dbpath /data/config/config-a --port 57040 --fork --configsvr
mongod --replSet cf --logpath "cfg-b.log" --dbpath /data/config/config-b --port 57041 --fork --configsvr
mongod --replSet cf --logpath "cfg-c.log" --dbpath /data/config/config-c --port 57042 --fork --configsvr
```
#### Configuring replica set for config servers:
- Connect to one of the config servers and initiate the replica set.
```bash
mongo --port 57040
rs.initiate(
{
_id: "cf",
configsvr: true,
members:[
{ _id : 0, host : "localhost:57040" },
{ _id : 1, host : "localhost:57041" },
{ _id : 2, host : "localhost:57042" }]
}
)
```
### Mongos:
- Now start the mongos on the standard port (27017) by running the following command.
```bash
mongos --logpath "mongos-1.log" --configdb "cf/localhost:57040,localhost:57041,localhost:57042" --port 27017 --fork
```
### Enable sharding:
- Connect to the mongos and add the three shards.
```bash
mongo --port 27017
sh.addShard('s0/localhost:37017,localhost:37018,localhost:37019')
sh.addShard('s1/localhost:47017,localhost:47018,localhost:47019')
sh.addShard('s2/localhost:57017,localhost:57018,localhost:57019')
```
- Enable sharding for the database.
```bash
use school;
db.adminCommand({enableSharding: "school"});
```
- Enable sharding for the collection.
```bash
db.createCollection("students");
db.adminCommand({shardCollection: "school.students", key: {student_id:1}});
```