
Concept of Sharding in MongoDB

Last updated on Oct 23, 2020

MongoDB is a cross-platform NoSQL database that stores data as JSON-like documents made up of field-value pairs. Sharding is one of its most important concepts. In layman's terms, it means breaking up a large data set into smaller subsets.

So let us begin with the article.

What is the issue?

When a data set grows very large, it can no longer be stored and scaled on a single machine. Keeping rapidly growing data on one machine becomes too expensive. Moreover, as the size of the data increases, a single machine may not provide acceptable read and write throughput.

What is Sharding?

Sharding is the process of storing data records across multiple machines so that the database can keep up with data growth. It is not replication: instead of every machine holding a copy of the same data, each machine (shard) holds a different subset, and the subsets together make up the full data set. Sharding therefore scales the database horizontally. We can add more machines to meet the demands of growing data and of read and write operations, and in general the more machines you add, the more read and write operations the database can support.
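To make the difference from replication concrete, here is a minimal Python sketch that spreads keys over a few shards. It is only an illustration: the helper names `shard_for` and `distribute` are made up for this example, and the MD5-based hash is an assumption chosen for simplicity, not the hash function MongoDB itself uses.

```python
import hashlib


def shard_for(key, num_shards):
    """Pick a shard index from a stable hash of the key (illustrative only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(keys, num_shards):
    """Return one set per shard, each holding the keys assigned to that shard."""
    shards = [set() for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].add(key)
    return shards


# Hypothetical data set: 10,000 record keys spread over 4 shards.
shards = distribute(range(1, 10001), 4)

# Every key lives on exactly one shard, so the shard sizes add up to the total.
print(sum(len(s) for s in shards))  # 10000

# Each shard holds only a part of the data (sizes are roughly, not exactly, equal).
print([len(s) for s in shards])
```

With replication, every machine would hold all 10,000 keys. Here each machine holds only its own share, which is why adding shards adds capacity.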

Why do we need Sharding?

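  • In replication, all writes go to the primary (master) node, so that one node becomes the bottleneck and is latency sensitive.
  • A single replica set has a member limit (50 members, at most 7 of them voting, in current MongoDB versions; 12 members in versions before 3.0), and adding members does not add write capacity.
  • When the active data set is large, it may not fit in main memory, and there is a limit up to which a machine's memory can be increased.
  • The local disk is not big enough to store the large amount of data.
  • Vertical scaling (moving to a bigger machine, the usual route for a traditional RDBMS) becomes too expensive.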
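The memory and disk limits above can be turned into a rough back-of-the-envelope estimate. The sketch below is an assumption-laden illustration, not a sizing rule: the function `shards_needed` and all the numbers are hypothetical, and real capacity planning also has to account for indexes, replication and growth.

```python
import math


def shards_needed(data_gb, disk_per_shard_gb, working_set_gb, ram_per_shard_gb):
    """Smallest shard count that fits the data on disk and the active set in RAM."""
    by_disk = math.ceil(data_gb / disk_per_shard_gb)
    by_ram = math.ceil(working_set_gb / ram_per_shard_gb)
    return max(by_disk, by_ram, 1)


# Hypothetical numbers: 10 TB of data, 2 TB of disk per shard,
# a 600 GB active data set and 128 GB of RAM per shard.
print(shards_needed(10_000, 2_000, 600, 128))  # 5
```

Whichever resource runs out first (disk or memory) decides how many machines are needed.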

Sharding Architecture

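A sharded MongoDB cluster contains a number of shards, and each shard is typically deployed as a replica set of 3 or more mongod nodes. The application server does not talk to the shards directly. It communicates with the query router, mongos, which in turn communicates with each of the shards. Config servers hold the cluster metadata that tells mongos which shard owns which part of the data. This way the data is partitioned while the application still sees a single database.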
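The flow from application to query router to shard can be sketched with a toy router in Python. This is a simplified model under stated assumptions, not the real mongos implementation: the `Router` class, the shard names and the split points are all hypothetical, and the split points simply stand in for the routing metadata a real cluster keeps.

```python
from bisect import bisect_right


class Router:
    """Toy stand-in for a query router; each shard is just an in-memory dict."""

    def __init__(self, split_points, shard_names):
        # split_points[i] is the first key owned by shard_names[i + 1].
        if len(shard_names) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = split_points
        self.shard_names = shard_names
        self.shards = {name: {} for name in shard_names}

    def _target(self, key):
        """Find the shard whose key range contains the given key."""
        return self.shard_names[bisect_right(self.split_points, key)]

    def insert(self, key, doc):
        self.shards[self._target(key)][key] = doc

    def find_by_key(self, key):
        """Targeted query: only the shard that owns the key is contacted."""
        name = self._target(key)
        return name, self.shards[name].get(key)

    def find_where(self, predicate):
        """Broadcast query: every shard is asked and the results are merged."""
        return [
            doc
            for store in self.shards.values()
            for doc in store.values()
            if predicate(doc)
        ]


# Keys below 100 go to shardA, 100-199 to shardB, 200 and above to shardC.
router = Router([100, 200], ["shardA", "shardB", "shardC"])
for emp_id in (5, 100, 150, 250):
    dept = "eng" if emp_id % 2 else "ops"
    router.insert(emp_id, {"emp_id": emp_id, "dept": dept})

print(router.find_by_key(150))  # ('shardB', {'emp_id': 150, 'dept': 'ops'})
print(router.find_where(lambda d: d["dept"] == "eng"))  # [{'emp_id': 5, 'dept': 'eng'}]
```

The application only ever calls the router. A lookup by the partitioning key touches one shard, while a query on any other field has to visit all of them.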

For example, suppose there are 6 million employee documents and a single machine cannot offer the storage capacity or the read and write throughput they need. In such a case, sharding helps store and manage the data across multiple shards. If the data is divided horizontally across 6 shards based on the employee id, and the ids are spread evenly, every shard holds about 1 million employee documents. This way, a large data set can be scaled easily.
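The arithmetic of this example can be checked with a few lines of Python. The sketch assumes, purely for illustration, that employee ids are the consecutive integers 1 to 6,000,000 and that each shard owns one contiguous range. The names `shard_of` and `shard_range` are hypothetical helpers.

```python
NUM_EMPLOYEES = 6_000_000
NUM_SHARDS = 6
PER_SHARD = NUM_EMPLOYEES // NUM_SHARDS  # 1,000,000 ids per shard


def shard_of(employee_id):
    """Map ids 1..6,000,000 to shard numbers 0..5 by contiguous range."""
    return (employee_id - 1) // PER_SHARD


def shard_range(shard):
    """Return the first and last employee id stored on the given shard."""
    first = shard * PER_SHARD + 1
    last = (shard + 1) * PER_SHARD
    return first, last


for shard in range(NUM_SHARDS):
    first, last = shard_range(shard)
    # Each line shows the shard number, its id range and how many ids it holds.
    print(shard, first, last, last - first + 1)

print(shard_of(1_000_000))  # 0
print(shard_of(1_000_001))  # 1
```

Each of the 6 shards ends up owning exactly 1,000,000 ids under these assumptions. With real, unevenly distributed ids the split would only be approximately even.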
