When to use Hadoop, HBase, Hive and Pig?

Question

What are the benefits of using either Hadoop or HBase or Hive? From my understanding, HBase avoids using map-reduce and has a column-oriented storage on top of HDFS. Hive is a SQL-like interface for Hadoop and HBase. I would also like to know how Hive compares with Pig.

Gitika · Answer

MapReduce is just a computing framework. HBase does not depend on it. That said, you can efficiently put data into or fetch data from HBase by writing MapReduce jobs. Alternatively, you can write sequential programs using the HBase client APIs, such as the Java API, to put or fetch the data. But we use Hadoop, HBase, etc. to deal with gigantic amounts of data, so that doesn't make much sense. Normal sequential programs would be highly inefficient when your data is that large.

Coming back to the first part of your question, Hadoop is basically 2 things: a Distributed FileSystem (HDFS) + a Computation or Processing framework (MapReduce). Like any other FS, HDFS provides our storage, but in a fault-tolerant manner with high throughput and a lower risk of data loss (because of replication). But, being an FS, HDFS lacks random read and write access. This is where HBase comes into the picture. It's a distributed, scalable, big data store, modeled after Google's BigTable. It stores data as key/value pairs.

Coming to Hive: it provides us data warehousing facilities on top of an existing Hadoop cluster. Along with that, it provides an SQL-like interface, which makes your work easier if you are coming from an SQL background. You can create tables in Hive and store data there. You can even map your existing HBase tables to Hive and operate on them.

Pig, on the other hand, is basically a dataflow language that allows us to process enormous amounts of data very easily and quickly. Pig has 2 parts: the Pig Interpreter and the language, PigLatin. You write Pig scripts in PigLatin and process them using the Pig interpreter. Pig makes our life a lot easier, because writing MapReduce is not always easy. In fact, in some cases, it can really become a pain.
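To make the MapReduce model mentioned in the answer a bit more concrete, here is a minimal sketch of a word count in plain Python. It does not use the Hadoop API at all and runs in a single process; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It only shows the shape of the computation that Hadoop would spread across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is just an in-memory dictionary (an illustrative stand-in).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["hive pig hbase", "pig hive", "hive"]
    print(word_count(sample))
```

Even for a task this small you have to write three separate steps. In Hive the same result is roughly a `SELECT word, COUNT(*) ... GROUP BY word` query, and in Pig a short group-and-count script, which is the convenience the answer is pointing at.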
