When to use Hadoop HBase Hive and Pig

+1 vote

What are the benefits of using either Hadoop or HBase or Hive ?

From my understanding, HBase avoids using map-reduce and has a column oriented storage on top of HDFS. Hive is a sql-like interface for Hadoop and HBase.

I would also like to know how Hive compares with Pig

Nov 20, 2020 in Big Data Hadoop by anonymous
• 10,400 points
76 views

1 answer to this question.

+1 vote

MapReduce is just a computing framework. HBase has nothing to do with it. That said, you can efficiently put or fetch data to/from HBase by writing MapReduce jobs. Alternatively, you can write sequential programs using other HBase APIs, such as Java, to put or fetch the data. But we use Hadoop, HBase, etc to deal with gigantic amounts of data, so that doesn't make much sense. Using normal sequential programs would be highly inefficient when your data is too huge.

Coming back to the first part of your question, Hadoop is basically 2 things: a Distributed FileSystem (HDFS) + a Computation or Processing framework (MapReduce). Like all other FS, HDFS also provides our storage, but in a fault-tolerant manner with high throughput and lower risk of data loss (because of the replication). But, being an FS, HDFS lacks random read and write access. This is where HBase comes into the picture. It's a distributed, scalable, big data store, modeled after Google's BigTable. It stores data as key/value pairs.

Coming to Hive. It provides us data warehousing facilities on top of an existing Hadoop cluster. Along with that, it provides an SQL-like interface which makes your work easier, in case you are coming from an SQL background. You can create tables in Hive and store data there. Along with that you can even map your existing HBase tables to Hive and operate on them.

While Pig is basically a dataflow language that allows us to process enormous amounts of data very easily and quickly. Pig basically has 2 parts: the Pig Interpreter and the language, PigLatin. You write Pig script in PigLatin and using Pig interpreter process them. Pig makes our life a lot easier, otherwise writing MapReduce is always not easy. In fact, in some cases, it can really become a pain.

answered Nov 20, 2020 by Gitika
• 65,850 points

Related Questions In Big Data Hadoop

0 votes
1 answer

Hadoop Hive Hbase: How to insert data into Hbase using Hive (JSON file)?

You can use the get_json_object function to parse the ...READ MORE

answered Nov 15, 2018 in Big Data Hadoop by Omkar
• 69,130 points
1,526 views
0 votes
1 answer

Can anyone help me out to install the following packages R-MR, R-HDFS, and R-HBase on R-HAdoop?

I have understood your problem, I will ...READ MORE

answered May 31, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
132 views
0 votes
1 answer

When and when not to use PigStore?

The Pig storage is not used only ...READ MORE

answered Jul 10, 2019 in Big Data Hadoop by Rushil
84 views
0 votes
1 answer

How to install and configure a multi-node Hadoop cluster?

I would recommend you to install Cent ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by Shubham
• 13,460 points
1,167 views
+1 vote
2 answers

Hadoop 3 compatibility with older versions of Hive, Pig, Sqoop and Spark

Hadoop 3 is not widely used in ...READ MORE

answered Apr 20, 2018 in Apache Spark by kurt_cobain
• 9,390 points
3,623 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
6,650 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
45,444 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
2,512 views
0 votes
2 answers

How to know Hive and Hadoop versions from command prompt?

Hi, Hadoop and hive have their individual commands. ...READ MORE

answered Dec 18, 2020 in Big Data Hadoop by akhtar
• 38,120 points
51 views
0 votes
1 answer

What are the relational operators available related to loading and storing in pig language?

Hey, For loading data and storing it into ...READ MORE

answered May 3, 2019 in Big Data Hadoop by Gitika
• 65,850 points
89 views