# MongoDB cluster formation

## MongoDB

MongoDB is a document database with the scalability and flexibility that you want with the querying and indexing that you need.

#### Scalability:

- Horizontal: Horizontal scaling means adding more machines to your pool of resources. This document covers horizontal scaling.

## Install MongoDB

- To install MongoDB on the machine, follow the steps below. First, import the MongoDB public GPG key.

```bash
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
```

- Note: Check your Ubuntu codename with **lsb_release -dc** before running the command below. If the codename is bionic, proceed with the command below; otherwise, follow the [installation](https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/) guide.

```bash
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
```

- To install the latest MongoDB version

```bash
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
```

- To install a specific MongoDB version, specify the version for each component package

```bash
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org=<version-number> mongodb-org-server=<version-number> mongodb-org-shell=<version-number> mongodb-org-mongos=<version-number> mongodb-org-tools=<version-number>
```

## MongoDB cluster architecture

![](https://i.imgur.com/nRd5ccL.png)

**shard**: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.

**config server**: Config servers store metadata and configuration settings for the cluster. The metadata reflects state and organization for all data and components within the sharded cluster. The metadata includes the list of chunks on every shard.

**mongos**: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.

## Form a Cluster

- There are two ways to form a cluster.
  - Through configuration file
  - Through command line

## Through command line

### Create the shard server Replica set

- Start a mongod instance for each member of the shard replica set.
```bash
mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --shardsvr
```

- Next, to initiate the shard server replica set, connect a **mongo** shell to one of the shard server members.

```bash
mongo --port <port>
```

- Initiate the replica set with the command below in the mongo shell.

```bash
rs.initiate(
  {
    _id : "<replicaSetName>",
    members: [
      { _id : 0, host : "s1-mongo1.example.net:27017" },
      { _id : 1, host : "s1-mongo2.example.net:27018" },
      { _id : 2, host : "s1-mongo3.example.net:27019" }
    ]
  }
)
```

### Create the config server Replica set

- Start a mongod instance for each member of the config server replica set.

```bash
mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --configsvr
```

- Next, to initiate the config server replica set, connect a **mongo** shell to one of the config server members.

```bash
mongo --port <port>
```

- Initiate the replica set with the command below in the mongo shell.

```bash
rs.initiate(
  {
    _id: "<replSetName>",
    configsvr: true,
    members: [
      { _id : 0, host : "cfg1.example.net:27019" },
      { _id : 1, host : "cfg2.example.net:27019" },
      { _id : 2, host : "cfg3.example.net:27019" }
    ]
  }
)
```

### Mongos

- Run the command below to start a mongos instance that points at the config server replica set.

```bash
mongos --logpath "mongos-1.log" --configdb "<configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019" --port 27017
```

- Now add the shards to the cluster by connecting a **mongo** shell to the mongos.

```bash
mongo --port <port>
```

- Add each shard replica set to the cluster. The replica set name and its members go in a single string.

```bash
sh.addShard("<replSetName>/s1-mongo1.example.net:27017,s1-mongo2.example.net:27018,s1-mongo3.example.net:27019")
```

### Enable sharding

- To proceed, you must be connected to a mongos associated with the target sharded cluster.
- Before you can shard a collection, you must enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data, but it makes it possible to shard the collections in that database.
- Enable sharding on the target database with

```bash
sh.enableSharding("<database>")
```

- Enable sharding on the target collection with

```bash
sh.shardCollection("<database>.<collection>", { <key> : <direction> } )
```

## Example:

- Here we form a cluster on our local machine by following the steps above: three shards (s0, s1, s2), each a three-member replica set, a three-member config server replica set, and one mongos.

### Clean everything:

- Kill any existing mongod and mongos instances and remove the old data directories.

```bash
$ killall mongod
$ killall mongos
$ rm -rf /data/config
$ rm -rf /data/shard*
```

### Create Shard Replica Sets:

#### Create Shard0(s0) with replica sets:

- Run the following command to create the data directories.

```bash
mkdir -p /data/shard0/rs0 /data/shard0/rs1 /data/shard0/rs2
```

- Run the following commands to create the shard instances.

```bash
mongod --replSet s0 --logpath "s0-r0.log" --dbpath /data/shard0/rs0 --port 37017 --fork --shardsvr
mongod --replSet s0 --logpath "s0-r1.log" --dbpath /data/shard0/rs1 --port 37018 --fork --shardsvr
mongod --replSet s0 --logpath "s0-r2.log" --dbpath /data/shard0/rs2 --port 37019 --fork --shardsvr
```

##### Configuring s0 replica set:

- Connect to one of the shard servers and initiate the replica set.

```bash
mongo --port 37017
config = { _id: "s0", members:[
  { _id : 0, host : "localhost:37017" },
  { _id : 1, host : "localhost:37018" },
  { _id : 2, host : "localhost:37019" }]};
rs.initiate(config)
```

#### Create Shard1(s1) with replica sets:

- Run the following command to create the data directories.

```bash
mkdir -p /data/shard1/rs0 /data/shard1/rs1 /data/shard1/rs2
```

- Run the following commands to create the shard instances.
```bash
mongod --replSet s1 --logpath "s1-r0.log" --dbpath /data/shard1/rs0 --port 47017 --fork --shardsvr
mongod --replSet s1 --logpath "s1-r1.log" --dbpath /data/shard1/rs1 --port 47018 --fork --shardsvr
mongod --replSet s1 --logpath "s1-r2.log" --dbpath /data/shard1/rs2 --port 47019 --fork --shardsvr
```

##### Configuring s1 replica set:

- Connect to one of the shard servers and initiate the replica set.

```bash
mongo --port 47017
config = { _id: "s1", members:[
  { _id : 0, host : "localhost:47017" },
  { _id : 1, host : "localhost:47018" },
  { _id : 2, host : "localhost:47019" }]};
rs.initiate(config)
```

#### Create Shard2(s2) with replica sets:

- Run the following command to create the data directories.

```bash
mkdir -p /data/shard2/rs0 /data/shard2/rs1 /data/shard2/rs2
```

- Run the following commands to create the shard instances.

```bash
mongod --replSet s2 --logpath "s2-r0.log" --dbpath /data/shard2/rs0 --port 57017 --fork --shardsvr
mongod --replSet s2 --logpath "s2-r1.log" --dbpath /data/shard2/rs1 --port 57018 --fork --shardsvr
mongod --replSet s2 --logpath "s2-r2.log" --dbpath /data/shard2/rs2 --port 57019 --fork --shardsvr
```

##### Configuring s2 replica set:

- Connect to one of the shard servers and initiate the replica set.

```bash
mongo --port 57017
config = { _id: "s2", members:[
  { _id : 0, host : "localhost:57017" },
  { _id : 1, host : "localhost:57018" },
  { _id : 2, host : "localhost:57019" }]};
rs.initiate(config)
```

### Start config servers

- Run the following command to create the data directories for the config servers.

```bash
mkdir -p /data/config/config-a /data/config/config-b /data/config/config-c
```

- Next, run the following commands to create the mongod instances for the config servers.
```bash
mongod --replSet cf --logpath "cfg-a.log" --dbpath /data/config/config-a --port 57040 --fork --configsvr
mongod --replSet cf --logpath "cfg-b.log" --dbpath /data/config/config-b --port 57041 --fork --configsvr
mongod --replSet cf --logpath "cfg-c.log" --dbpath /data/config/config-c --port 57042 --fork --configsvr
```

#### Configuring replica set for config servers:

- Connect to one of the config servers and initiate the replica set.

```bash
mongo --port 57040
rs.initiate(
  {
    _id: "cf",
    configsvr: true,
    members:[
      { _id : 0, host : "localhost:57040" },
      { _id : 1, host : "localhost:57041" },
      { _id : 2, host : "localhost:57042" }]
  }
)
```

### Mongos:

- Now start the mongos on the standard port.
- Run the following command.

```bash
mongos --logpath "mongos-1.log" --configdb "cf/localhost:57040,localhost:57041,localhost:57042" --port 27017 --fork
```

### Enable sharding:

- Connect to the mongos and add the three shards, each under its own replica set name.

```bash
mongo --port 27017
sh.addShard('s0/localhost:37017,localhost:37018,localhost:37019')
sh.addShard('s1/localhost:47017,localhost:47018,localhost:47019')
sh.addShard('s2/localhost:57017,localhost:57018,localhost:57019')
```

- Enable sharding for the database.

```bash
use school
db.adminCommand({enableSharding: "school"});
```

- Enable sharding for the collection.

```bash
db.createCollection("students");
db.adminCommand({shardCollection: "school.students", key: {student_id:1}});
```
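The three shard sections in the example differ only in the replica set name, the shard directory index, and the base port. As a rough sketch of how that repetition could be scripted, the helper below prints the `mkdir` and `mongod` commands for one shard without running them. The function name `shard_commands` is made up for illustration and is not part of MongoDB; the paths, log names, and ports simply mirror the values used in the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a MongoDB tool): print, but do not run, the
# commands that create one three-member shard replica set on localhost.
# Usage: shard_commands <replSetName> <shardIndex> <basePort>
shard_commands() {
  local name="$1" index="$2" base_port="$3"
  local member port dbpath
  for member in 0 1 2; do
    # Members use consecutive ports starting at the base port.
    port=$((base_port + member))
    dbpath="/data/shard${index}/rs${member}"
    echo "mkdir -p ${dbpath}"
    echo "mongod --replSet ${name} --logpath \"${name}-r${member}.log\" --dbpath ${dbpath} --port ${port} --fork --shardsvr"
  done
}

# Print the commands for the three shards used in the example.
shard_commands s0 0 37017
shard_commands s1 1 47017
shard_commands s2 2 57017
```

Because the helper only prints, the output can be reviewed first and then piped to a shell (for example `shard_commands s0 0 37017 | bash`) once it looks right. Initiating each replica set and adding the shards through the mongos still follow the steps above.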