Hadoop on Cassandra database

I am using Cassandra to store my data and Hive to process it. I have set up Cassandra on 5 machines, and I use 2 machines as analytics nodes (where Hive runs). My question: does Hive run MapReduce only on the two analytics nodes and bring the data there, or does it also move the computation to the 5 Cassandra nodes and process the data on those machines? (As I understand it, in Hadoop the computation moves to the data, not the data to the computation.)

Mar 26, 2018 in Big Data Hadoop by Shubham

1 answer to this question.

There is a tradeoff here. MapReduce tasks run only on machines that host Hadoop worker daemons, so the answer depends on where you run Hadoop and Hive:

a) If you run Hadoop/Hive on separate nodes, you lose data locality, and your data throughput is therefore limited by your network bandwidth.
b) If you run Hadoop/Hive on the same nodes as Cassandra, you get data locality, but the MapReduce processing behind Hive queries might clog your network (and other resources) and thereby degrade the quality of service you get from Cassandra.

My suggestion is to keep separate Hive nodes if the performance of your Cassandra cluster is critical.
If Cassandra is mostly used as a data store and does not handle real-time requests, then running Hive on each node will improve performance and hardware utilization.
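
To get a feel for the network limit in option (a), here is a rough back-of-envelope sketch in Python. All figures in it (500 GB of data, 1 Gbit/s links, one link per analytics node) are hypothetical and not taken from the question, and the function name `transfer_seconds` is made up for illustration. It ignores replication, compression, disk speed, and protocol overhead, so treat the result as a lower bound rather than a prediction.

```python
def transfer_seconds(data_gb: float, link_gbit_per_s: float, links: int = 1) -> float:
    """Lower bound on the time to move data_gb gigabytes over `links` parallel links.

    Assumes every link runs at full speed and the load is spread evenly.
    """
    if data_gb < 0 or link_gbit_per_s <= 0 or links < 1:
        raise ValueError("data size must be >= 0, link speed > 0, links >= 1")
    data_gbit = data_gb * 8  # gigabytes -> gigabits
    return data_gbit / (link_gbit_per_s * links)


if __name__ == "__main__":
    # Hypothetical numbers: adjust them to your own cluster.
    data_gb = 500          # data a Hive query has to scan
    link_gbit_per_s = 1.0  # network link speed of each analytics node
    analytics_nodes = 2    # nodes pulling data from Cassandra

    seconds = transfer_seconds(data_gb, link_gbit_per_s, analytics_nodes)
    print(f"Network-bound scan time: at least {seconds / 60:.1f} minutes")
```

If the estimate for your own numbers is acceptable for batch work, separate Hive nodes are a reasonable choice; if it is far too slow, co-locating Hive with Cassandra as in option (b) becomes more attractive.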

answered Mar 26, 2018 by nitinrawat895