Why is Impala faster than Hive in terms of query response?

I am querying large CSV data sets stored in HDFS using Hive and Impala, and I am seeing better response times with Impala than with Hive for my queries.

Can anyone tell me some use cases where Impala is best suited and where Hive is best suited?

Why is Impala faster than Hive in terms of query response?
Mar 21, 2018 in Big Data Hadoop by coldcode

1 answer to this question.


Impala responds faster because it uses MPP (massively parallel processing), whereas Hive traditionally runs queries as MapReduce jobs under the hood, which carries some initial overhead (as Charles has pointed out). Massively parallel processing is a style of computing in which many separate CPUs run in parallel to execute a single program, with each CPU having its own dedicated memory. Because Impala is MPP-based, it avoids the overheads of MapReduce jobs, namely job setup and creation, slot assignment, split creation, map generation and so on, and that is what makes it so fast.
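To make the overhead argument concrete, here is a toy cost model, not a benchmark. Every name (`estimate_seconds`, `compare`) and every number (startup times, scan rate, worker count) is a made-up assumption, chosen only to show how a fixed per-job startup cost dominates short queries and fades on long ones.

```python
# Toy latency model: fixed startup cost per job plus scan time spread over workers.
# Every constant below is a hypothetical illustration, not a measured value.

def estimate_seconds(data_gb, jobs, startup_s, scan_gb_per_s, workers):
    """Return startup overhead plus parallel scan time, in seconds."""
    overhead = jobs * startup_s
    scan = data_gb / (scan_gb_per_s * workers)
    return overhead + scan

def compare(data_gb):
    # Assumption: a MapReduce-style plan pays about 20 s of setup for each of 2 chained jobs.
    batch = estimate_seconds(data_gb, jobs=2, startup_s=20.0,
                             scan_gb_per_s=0.1, workers=10)
    # Assumption: already-running MPP daemons start a query in about half a second.
    mpp = estimate_seconds(data_gb, jobs=1, startup_s=0.5,
                           scan_gb_per_s=0.1, workers=10)
    return batch, mpp

if __name__ == "__main__":
    for size_gb in (1, 100, 10_000):
        batch, mpp = compare(size_gb)
        print(f"{size_gb:>6} GB  batch-style: {batch:9.1f} s  "
              f"MPP-style: {mpp:9.1f} s  ratio: {batch / mpp:.1f}x")
```

In this sketch the startup cost makes up almost all of the batch-style time on the small input and a negligible share on the largest one, so the relative advantage of skipping job setup shrinks as the scan itself gets longer. The model deliberately ignores memory limits, shuffles and file formats.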

But that doesn't mean Impala is the solution to all your problems. Being MPP-based, it is highly memory-intensive, so it is not a good fit for tasks that involve heavy data operations such as large joins, because you simply can't fit everything into memory. This is where Hive is a better fit.

So, if you need real-time, ad-hoc queries over a subset of your data, go for Impala. If you have batch-processing needs over your Big Data, go for Hive.

answered Mar 21, 2018 by nitinrawat895
