Hadoop on Cassandra database

0 votes

I am using Cassandra to store my data and Hive to process it. I have set up Cassandra on 5 machines, and I use 2 more machines as analytics nodes (where Hive runs). My question: does Hive run MapReduce only on the two analytics nodes and pull the data there, or does it also move the computation to the 5 Cassandra nodes and process the data on those machines? (As far as I know, in Hadoop the computation moves to the data, not the data to the computation.)

Mar 26, 2018 in Big Data Hadoop by Shubham
• 13,300 points
63 views

1 answer to this question.

0 votes

Regarding your question, there is a tradeoff:

a) If you run Hadoop / Hive on separate nodes, you lose data locality, so your data throughput is limited by your network bandwidth.
b) If you run Hadoop / Hive on the same nodes as Cassandra, you get data locality, but the MapReduce processing behind Hive queries might clog your network (and other resources) and thereby degrade the quality of service you get from Cassandra.

My suggestion is to use separate Hive nodes if the performance of your Cassandra cluster is critical.
If your Cassandra cluster is mostly used as a data store and does not handle real-time requests, then running Hive on each Cassandra node will improve performance and hardware utilization.
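
To get a feel for the tradeoff, here is a rough back-of-the-envelope sketch in Python. Every number in it (data size, link speed, disk read rate) is a made-up assumption for illustration, not a measurement, and the helper `scan_seconds` is hypothetical. It only compares how long a full scan might take when the 2 analytics nodes pull everything over the network versus when all 5 Cassandra nodes read their own data locally. It ignores replication, CPU cost, and the impact on Cassandra's own traffic.

```python
def scan_seconds(data_gb, nodes, per_node_gb_per_s):
    """Time to read data_gb when `nodes` readers each sustain per_node_gb_per_s."""
    return data_gb / (nodes * per_node_gb_per_s)


# All values below are assumptions chosen for illustration only.
DATA_GB = 1000.0          # total data to scan
NETWORK_GB_PER_S = 0.125  # roughly a 1 Gbit/s link per analytics node
DISK_GB_PER_S = 0.2       # assumed local read rate per Cassandra node

# Case a) separate analytics nodes: 2 readers, each limited by its network link.
remote = scan_seconds(DATA_GB, nodes=2, per_node_gb_per_s=NETWORK_GB_PER_S)

# Case b) Hive tasks on the Cassandra nodes: 5 readers, each reading local disk.
local = scan_seconds(DATA_GB, nodes=5, per_node_gb_per_s=DISK_GB_PER_S)

print(f"separate analytics nodes:      {remote:.0f} s")
print(f"co-located on Cassandra nodes: {local:.0f} s")
```

Plug in your own measured bandwidth and disk figures. If the co-located case is not much faster, the isolation you get from separate Hive nodes is probably worth keeping.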

answered Mar 26, 2018 by nitinrawat895
• 10,690 points
