---
# System prepended metadata

title: MongoDB cluster formation

---

# MongoDB cluster formation
## MongoDB
MongoDB is a document database with the scalability and flexibility that you want with the querying and indexing that you need.
#### Scalability:
- Horizontal: Horizontal scaling means adding more machines to your pool of resources, rather than adding capacity to a single machine.

This document covers horizontal scaling, which MongoDB implements through sharding.
## Install MongoDB 
- To install MongoDB on the machine, first import the MongoDB public GPG key.
    ```bash
    $ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
    ```
- Note: Check your Ubuntu codename with **lsb_release -dc** before running the command below. If the codename is bionic, proceed with the command below. Otherwise, follow the [installation guide](https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/) for your release.
    ```bash
    $ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
    ```
- To install the latest MongoDB version available in this repository
    ```bash
    $ sudo apt-get update
    $ sudo apt-get install -y mongodb-org
    ```
- To install a specific MongoDB version
    ```bash
    $ sudo apt-get update
    $ sudo apt-get install -y mongodb-org=<version-number>
    ```
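The codename check in the note above can be scripted. The following is a minimal sketch, assuming a hypothetical helper named `mongodb_repo_line` (it is not part of MongoDB or apt). It only prints the repository line for a given codename and release series, so the output can be reviewed before it is written to `/etc/apt/sources.list.d/`.

```shell
# Hypothetical helper: print the apt source line for a given Ubuntu codename
# and MongoDB release series. It only prints; it does not modify the system.
mongodb_repo_line() {
    codename="$1"   # e.g. the output of: lsb_release -sc
    series="$2"     # e.g. 4.0
    printf 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu %s/mongodb-org/%s multiverse\n' \
        "$codename" "$series"
}

# Example: print the line for Ubuntu bionic and MongoDB 4.0.
mongodb_repo_line bionic 4.0
```

Not every codename and series combination is published in the MongoDB repository, so confirm the pair against the installation guide before using the printed line.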
## MongoDB cluster architecture
![](https://i.imgur.com/nRd5ccL.png)

- **shard**: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
- **config server**: Config servers store metadata and configuration settings for the cluster. The metadata reflects the state and organization of all data and components within the sharded cluster, and includes the list of chunks on every shard.
- **mongos**: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.

## Form a Cluster
- There are two ways to form a cluster. This document covers the command-line approach.
   - Through a configuration file
   - Through the command line

## Through command line
### Create the shard server Replica set
- Create a mongod shard instance for each member of the replica set.
    ```bash
    mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --shardsvr
    ```
- Next, to initiate the shard server replica set, connect a **mongo** shell to one of the shard server members.
    ```bash
    mongo --port <port>
    ```
- Initiate the replica set with the command below in the mongo shell.
    ```bash
    rs.initiate(
      {
        _id : "<replicaSetName>",
        members: [
          { _id : 0, host : "s1-mongo1.example.net:27017" },
          { _id : 1, host : "s1-mongo2.example.net:27018" },
          { _id : 2, host : "s1-mongo3.example.net:27019" }
        ]
      }
    )
    ```
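The member list passed to `rs.initiate()` follows a fixed pattern: each member gets a sequential `_id` and a `host:port` string. The sketch below generates that call from a list of hosts. The helper name `rs_initiate_js` is hypothetical, and it only prints the statement rather than running it.

```shell
# Hypothetical helper: print an rs.initiate() call for a replica set name
# followed by any number of host:port members.
rs_initiate_js() {
    name="$1"; shift
    id=0
    members=""
    for host in "$@"; do
        # Separate members with a comma after the first one.
        if [ -n "$members" ]; then
            members="$members, "
        fi
        members="$members{ _id: $id, host: \"$host\" }"
        id=$((id + 1))
    done
    printf 'rs.initiate({ _id: "%s", members: [ %s ] })\n' "$name" "$members"
}

# Example: print the call for a three-member shard replica set.
rs_initiate_js s0 localhost:37017 localhost:37018 localhost:37019
```

After reviewing the printed statement, it can be passed to one member with `mongo --port <port> --eval "<statement>"`.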

### Create the config server Replica set
- Create a mongod config server instance for each member of the replica set.
    ```bash
    mongod --replSet <replSetname> --logpath <logpath> --dbpath <datapath> --port <port number> --fork --configsvr
    ```
- Next, to initiate the config server replica set, connect a **mongo** shell to one of the config server members.
    ```bash
    mongo --port <port>
    ```
- Initiate the replica set with the command below in the mongo shell.
    ```bash
    rs.initiate(
      {
        _id: "<replSetName>",
        configsvr: true,
        members: [
          { _id : 0, host : "cfg1.example.net:27019" },
          { _id : 1, host : "cfg2.example.net:27019" },
          { _id : 2, host : "cfg3.example.net:27019" }
        ]
      }
    )
    ```

### Mongos 
- Run the command below to create a mongos instance.
    ```bash
    mongos --logpath "mongos-1.log" --configdb "<configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019" --port 27017
    ```
- Now add the shards to the cluster by connecting a **mongo** shell to the mongos.
    ```bash
    mongo --port <port>
    ```
- Add each shard replica set to the cluster.
    ```bash
    sh.addShard("<replSetName>/s1-mongo1.example.net:27017,s1-mongo2.example.net:27018,s1-mongo3.example.net:27019")
    ```
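The argument to `sh.addShard()` is a single string: the replica set name, a slash, and a comma-separated member list. A small sketch of building that string from its parts is shown below. `shard_conn_string` is a hypothetical helper name, and the hosts in the example are the ones used later in this document.

```shell
# Hypothetical helper: join a replica set name and its members into the
# "<replSetName>/host1:port,host2:port" form that sh.addShard() accepts.
shard_conn_string() {
    name="$1"; shift
    hosts=""
    for host in "$@"; do
        if [ -n "$hosts" ]; then
            hosts="$hosts,$host"
        else
            hosts="$host"
        fi
    done
    printf '%s/%s\n' "$name" "$hosts"
}

# Example: connection string for a shard replica set named s1.
shard_conn_string s1 localhost:47017 localhost:47018 localhost:47019
```

The same form is used for the `--configdb` value of mongos, with the config server replica set name in place of the shard name.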
 
### Enable sharding
- To proceed, you must be connected to a mongos associated with the target sharded cluster.
- Before you can shard a collection, you must enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data, but it makes it possible to shard the collections in that database.
- Enable sharding on the target database with
    ```bash
    sh.enableSharding("<database>")
    ```
- Enable sharding on the target collection with
    ```bash
    sh.shardCollection("<database>.<collection>", { <key> : <direction> } )
    ```
    
## Example:
- Here we form a cluster on a local machine by following the steps described above.

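### Clean everything:
- Kill any existing mongod and mongos instances and remove their data directories. This deletes all data under `/data/config` and `/data/shard*`.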
    ```bash
    $ killall mongod
    $ killall mongos
    $ rm -rf /data/config
    $ rm -rf /data/shard*
    ```


### Create Shard Replica Sets:

#### Create Shard0(s0) with replica sets:

- Run the following command to create the data directories.
    ```bash
    mkdir -p /data/shard0/rs0 /data/shard0/rs1 /data/shard0/rs2
    ```
- Run the following commands to create shard instances.
    ```bash
    mongod --replSet s0 --logpath "s0-r0.log" --dbpath /data/shard0/rs0 --port 37017 --fork --shardsvr
    mongod --replSet s0 --logpath "s0-r1.log" --dbpath /data/shard0/rs1 --port 37018 --fork --shardsvr
    mongod --replSet s0 --logpath "s0-r2.log" --dbpath /data/shard0/rs2 --port 37019 --fork --shardsvr
    ```
##### Configuring s0 replica set:
- Connect to one of the shard servers and initiate the replica set.
    ```bash
    mongo --port 37017
    config = { _id: "s0", members:[
          { _id : 0, host : "localhost:37017" },
          { _id : 1, host : "localhost:37018" },
          { _id : 2, host : "localhost:37019" }]};
    rs.initiate(config)
    ```

#### Create Shard1(s1) with replica sets:
- Run the following command to create the data directories.
    ```bash
    mkdir -p /data/shard1/rs0 /data/shard1/rs1 /data/shard1/rs2
    ```
- Run the following commands to create the shard instances.
    ```bash
    mongod --replSet s1 --logpath "s1-r0.log" --dbpath /data/shard1/rs0 --port 47017 --fork --shardsvr
    mongod --replSet s1 --logpath "s1-r1.log" --dbpath /data/shard1/rs1 --port 47018 --fork --shardsvr
    mongod --replSet s1 --logpath "s1-r2.log" --dbpath /data/shard1/rs2 --port 47019 --fork --shardsvr
    ```

##### Configuring s1 replica set
- Connect to one of the shard servers and initiate the replica set.
    ```bash
    mongo --port 47017
    config = { _id: "s1", members:[
          { _id : 0, host : "localhost:47017" },
          { _id : 1, host : "localhost:47018" },
          { _id : 2, host : "localhost:47019" }]};
    rs.initiate(config)
    ```
#### Create Shard2(s2) with replica sets:
- Run the following command to create the data directories for shard2.
    ```bash
    mkdir -p /data/shard2/rs0 /data/shard2/rs1 /data/shard2/rs2
    ```
- Run the following commands to create the shard mongo instances.
    ```bash
    mongod --replSet s2 --logpath "s2-r0.log" --dbpath /data/shard2/rs0 --port 57017 --fork --shardsvr
    mongod --replSet s2 --logpath "s2-r1.log" --dbpath /data/shard2/rs1 --port 57018 --fork --shardsvr
    mongod --replSet s2 --logpath "s2-r2.log" --dbpath /data/shard2/rs2 --port 57019 --fork --shardsvr
    ```

##### Configuring s2 replica set:

- Connect to one of the shard servers and initiate the replica set.
    ```bash
    mongo --port 57017
    config = { _id: "s2", members:[
          { _id : 0, host : "localhost:57017" },
          { _id : 1, host : "localhost:57018" },
          { _id : 2, host : "localhost:57019" }]};
    rs.initiate(config)
    ```
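The shard sections above repeat one pattern with a different shard number and base port each time. As a sketch of that pattern, the hypothetical helper `shard_member_commands` below prints the `mkdir` and `mongod` commands for one three-member shard. It prints only, so the commands can be reviewed before they are run; the paths, log names, and ports follow the ones used in this example.

```shell
# Hypothetical helper: print the commands that create one three-member shard
# replica set. It prints only; nothing is started or created.
shard_member_commands() {
    shard="$1"       # shard index, e.g. 0 for replica set s0
    base_port="$2"   # port of the first member, e.g. 37017
    for i in 0 1 2; do
        port=$((base_port + i))
        dir="/data/shard$shard/rs$i"
        printf 'mkdir -p %s\n' "$dir"
        printf 'mongod --replSet s%s --logpath "s%s-r%s.log" --dbpath %s --port %s --fork --shardsvr\n' \
            "$shard" "$shard" "$i" "$dir" "$port"
    done
}

# Example: the three shards used in this walkthrough.
shard_member_commands 0 37017
shard_member_commands 1 47017
shard_member_commands 2 57017
```

To execute the commands after checking them, pipe one call to a shell, for example `shard_member_commands 0 37017 | sh`. Each replica set still has to be initiated from a mongo shell as described above.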


### Start config servers
- Run the following command to create the data directories for the config servers.
    ```bash
    mkdir -p /data/config/config-a /data/config/config-b /data/config/config-c
    ```
- Next run the following commands to create mongo instances for config servers.
    ```bash
    mongod --replSet cf --logpath "cfg-a.log" --dbpath /data/config/config-a --port 57040 --fork --configsvr
    mongod --replSet cf --logpath "cfg-b.log" --dbpath /data/config/config-b --port 57041 --fork --configsvr
    mongod --replSet cf --logpath "cfg-c.log" --dbpath /data/config/config-c --port 57042 --fork --configsvr
    ```
#### Configuring replica set for config servers:

- Connect to one of the config servers and initiate the replica set.
    ```bash
    mongo --port 57040
    rs.initiate(
    {
        _id: "cf",
        configsvr: true,
        members:[
          { _id : 0, host : "localhost:57040" },
          { _id : 1, host : "localhost:57041" },
          { _id : 2, host : "localhost:57042" }]
    }
    )
    ```

### Mongos:
- Now start the mongos on a standard port.
- Run the following command
    ```bash
    mongos --logpath "mongos-1.log" --configdb "cf/localhost:57040,localhost:57041,localhost:57042" --port 27017 --fork
    ```

### Enable sharding:
- Connect to the mongos and add the three shards.
    ```bash
    mongo --port 27017
    sh.addShard('s0/localhost:37017,localhost:37018,localhost:37019')
    sh.addShard('s1/localhost:47017,localhost:47018,localhost:47019')
    sh.addShard('s2/localhost:57017,localhost:57018,localhost:57019')
    ```

- Enable sharding for the database.
    ```bash
    use school;
    db.adminCommand({enableSharding: "school"});
    ```
- Enable sharding for the collection.
    ```bash
    db.createCollection("students");
    db.adminCommand({shardCollection: "school.students", key: {student_id:1}});
    ```
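Once the collection is sharded, it helps to confirm that the cluster sees the shards and the sharded collection. The sketch below prints a few mongo shell statements that insert sample documents and report the cluster state. The helper name `verify_sharding_js`, the document count, and the `name` field are assumptions made for illustration; `sh.status()` and `getShardDistribution()` are standard mongo shell helpers.

```shell
# Hypothetical helper: print mongo shell statements that insert sample
# documents and report how the cluster distributes them.
verify_sharding_js() {
    cat <<'EOF'
db = db.getSiblingDB("school");
for (var i = 0; i < 1000; i++) {
    db.students.insert({ student_id: i, name: "student-" + i });
}
sh.status();
db.students.getShardDistribution();
EOF
}

# Review the statements, then pipe them to the mongos started above:
#   verify_sharding_js | mongo --port 27017
verify_sharding_js
```

With this little data, the documents will likely sit in a single chunk on one shard, because chunks are split only once they grow large enough. The output of `sh.status()` should still list s0, s1, and s2 as shards and show `school.students` with its shard key.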