AWS Architect Certification Training
- 53k Enrolled Learners
- Live Class
Setting up Cassandra is quite simple and the maintenance is automatically done. The platform is quite fast even when it is scaled up or a node is added. Cassandra also takes care of re-syncing, balancing or distribution of data. The platform is known to provide high velocity random read writes compared to other NoSQL platforms since it has columnar storage capability and distributed decentralized architecture.
Flexible Sparse & Wide Column requirements talk about capability to increase your columns as and when you need. It is suitable only in those cases where secondary index needs are less, which means you have it absolutely de-normalized. In other words all information is available to serve a specific query and does not go across multiple tables to get server specific client query.
It is important to know that Cassandra is suitable with non-group by kind of models. For applications that have requirement of group-by functionality, Cassandra would not be the right system to choose. This also includes bringing in secondary indexes, which will result into overall performance of system going down.
Cassandra is the most suitable platform where there is less secondary index needs, simple setup, and maintenance, very high velocity of random read & writes & wide column requirements.
Take example of twitter, where there is massive scale & availability.
For example, if Angelina Jolie tweets saying that if anybody retweets her tweet he shall get a date with her. This is in turn will result in millions of people retweeting. Twitter must be equipped to handle such large traffic.
Let’s take an example to explain consistency. Say if a system has to be highly available, which means that if you have multiple clients querying, then something must be available to service your requirements which calls for a need to have multiple replicas within the system.
When one has multiple replicas, its important to make sure that all replicas are absolutely in sync to determine consistency. So when one does a write operation and sets tuneable consistency at the highest which implies that all have to be properly in sync. So every time one does a write operation, it writes on the replica but the write does not come back with success until all the replicas in cluster are in sync with the data. Thus, the latency of the write increases because of the consistency in data before you have written a success for your write. This is basically the consistency concept. Thus, every request coming from client application results in the same requirement going back.
Got a question for us? Mention them in the comments section and we will get back to you.