MongoDB Dev and Admin (14 Blogs) Become a Certified Professional

Concept of Sharding in MongoDB

Last updated on May 22,2019 2.8K Views


myMock Interview Service for Real Tech Jobs


myMock Interview Service for Real Tech Jobs

  • Mock interview in latest tech domains i.e JAVA, AI, DEVOPS,etc
  • Get interviewed by leading tech experts
  • Real time assessment report and video recording


When the data is so large, it can’t be stored and scaled in a single machine. It can be too expensive to store exponentially growing data in a single machine. Moreover, as the size of data increases, data storage in a single machine may not provide an acceptable read and write throughput.

What is Sharding?

Sharding is the process of storing data records across multiple machines. It provides support to meet the demands of data growth. It is not replication of data, but amassing different data from different machines. Sharding allows horizontal scaling of data stored in multiple shards. With Sharding, we can add more machines to meet the demands of growing data and the demands of read and write operations. The more machines you add, the more read and write operations your database can support.

Why do we need Sharding?

  • In replication, all writes go to master node. The master node is latency sensitive.
  • Each of the single replica set has the limitation of 12 nodes
  • The memory can’t be large enough when the active data set is large enough. There’s a limit up to which main memory can be increased.
  • The local disk is not big enough to store the large amount of data.
  • Verticle scaling is too expensive, e.g. RDBMS

Sharding Architecture

There are number of replica sets in a MongoDB cluster, each of which contains 3 or more mongod nodes. There are multiple shards within the clusters. Mongos communicate with each of the Shards, and the App server in turn communicates with the query router, Mongos. This way the data is partitioned.

For example, if there are 6 million employee documents, they can’t be stored in a single machine as there is a limit to its storage capacity, and read and write throughput. In such a case, Sharding helps in storing and managing data across multiple shards. If data is to be horizontally divided across the 6 shards, based on the employee id of each employee, every shard will have 1 million employee ids. This way, the large set of data can be easily scaled.

Got a question for us? Mention them in the comments section and we will get back to you. 

Related Posts:

MongoDB: The Database for Big Data Processing

Why Big Data Professionals need to Learn MongoDB?

Real World Use Cases of MongoDB

Learn MongoDB


Browse Categories

Subscribe to our Newsletter, and get personalized recommendations.