Hadoop Distribution Differences

0 votes

Can somebody outline the differences between the various Hadoop distributions available, using the Apache Hadoop distro as a baseline?

Is there a good reason to use one of these distributions over the standard Apache Hadoop distro?

Feb 18, 2019 in Big Data Hadoop by Neha
• 6,300 points
678 views

1 answer to this question.

0 votes

The Yahoo distribution is a version of Hadoop 0.20 that they run (or ran) on some subset of their clusters. It includes a set of patches for stability and bug fixes. It is a source-only release; it does not come with admin-friendly packaging such as RPM or Debian packages.

The Cloudera distribution is packaged as RPMs and debs (the source is also available), so you can get updates through the standard package-manager channels. It also includes stability and bug-fix patches. It is constantly maintained (which is not to say Yahoo's isn't; one could go on GitHub and check when it was last updated). It also packages Pig and Hive.

Cloudera's distribution of Hadoop 0.20 is in beta, and 0.18 is considered stable (more on this on the Cloudera blog). The 0.18 version also includes packages for Hive and Pig; for 0.20, you have to build them yourself (there aren't official releases of Pig or Hive that support 0.20 yet, although patches exist). There may well be significant overlap between the Cloudera and Yahoo versions of 0.20; both provide manifests, so you can check. The latest documentation for Cloudera's distros is at http://archive.cloudera.com
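If you want to check the overlap yourself, one rough way is to pull the JIRA issue keys out of each manifest and compare the two sets. The following is only a sketch: it assumes the manifests are plain text that mentions Apache JIRA keys such as HADOOP-1234, and the function names and sample data are made up for illustration, not taken from either distribution.

```python
import re

# Assumption: each manifest is plain text that mentions Apache JIRA keys
# such as HADOOP-1234, HDFS-567 or MAPREDUCE-89. The real manifest layout
# may differ, so adjust the pattern to match the files you download.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|MAPREDUCE)-\d+\b")


def jira_keys(manifest_text):
    """Return the set of JIRA keys mentioned in a manifest."""
    return set(JIRA_KEY.findall(manifest_text))


def compare_manifests(text_a, text_b):
    """Split the patches into shared, only-in-A and only-in-B sets."""
    a, b = jira_keys(text_a), jira_keys(text_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


if __name__ == "__main__":
    # Made-up sample data, not real patch lists from either vendor.
    cloudera = "HADOOP-1111 fix\nHDFS-22 fix\nMAPREDUCE-333 fix\n"
    yahoo = "HADOOP-1111 fix\nHADOOP-4444 fix\n"
    result = compare_manifests(cloudera, yahoo)
    for name in ("shared", "only_a", "only_b"):
        print(name, sorted(result[name]))
```

In practice you would read the two manifest files from disk and pass their contents to compare_manifests; a large "shared" set would suggest the two distributions are closer to each other than to the vanilla Apache release.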

Yahoo does not provide support for their distribution; they publish their patched version as a service to the community, so that anyone interested can build what Yahoo runs internally. Given the size of Yahoo's clusters, that's a significant contribution, especially if you aren't a Hadoop developer who follows the JIRAs all the time. Cloudera supports their distribution commercially, and also provides some community support via the Hadoop mailing lists and, for distro-specific issues, on their GetSatisfaction page.

Both are pretty different from the vanilla Apache distro, since they apply patches in between releases (the Cloudera version of 0.20 has 60+ patches!).

answered Feb 18, 2019 by Frankie
• 9,830 points
