Apache Hadoop 2.0 and YARN: The News in Hadoop Community
The Apache Software Foundation (ASF), the open source group that manages Hadoop development, has announced on its blog that Hadoop 2.0 is now Generally Available (GA). After a long wait, Apache Hadoop 2.0 and YARN are ready for production deployment.
Here is a Guide on “Apache Hadoop 2.0 Installation and Single Node Cluster Configuration on Ubuntu”
Apache Hadoop 2.0 and YARN: What’s the Fuss about?
Hadoop 2.0 is a major milestone towards increased Hadoop adoption among businesses. The new release brings several enterprise-class features, namely:
YARN framework (MapReduce 2.0): YARN provides better resource management in Hadoop, resulting in improved cluster efficiency and application performance. This feature not only improves MapReduce data processing but also lets Hadoop run data processing applications other than MapReduce.
HDFS High Availability (aka NameNode HA): In Hadoop 1.0, the NameNode was the single point of failure in a cluster: if it failed, the whole cluster was unavailable until it was restored. Hadoop 2.0 removes this single point of failure by supporting a standby NameNode that can take over from the active one. The NameNode HA feature will make Hadoop more attractive to enterprises.
HDFS Federation: This feature allows the Hadoop file system to scale horizontally by letting multiple independent NameNodes each manage part of the namespace. It is another much sought-after feature among enterprise-class Hadoop users such as Amazon and eBay.
Additional features such as data snapshots, support for Windows, and NFS access will further increase Hadoop adoption in the industry for solving Big Data problems.
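To make the NameNode HA feature more concrete, here is a minimal sketch in Python that generates the client-side part of an hdfs-site.xml for one HA nameservice. The property names are the ones used by Hadoop 2 for HA; the nameservice ID "mycluster", the NameNode IDs "nn1" and "nn2", and the hostnames are hypothetical placeholders. The sketch is illustrative only: a working setup would also need settings such as dfs.namenode.shared.edits.dir and dfs.ha.fencing.methods, which are left out here.

```python
import xml.etree.ElementTree as ET


def ha_properties(nameservice, namenodes):
    """Build client-side HDFS HA properties for one nameservice.

    `namenodes` maps a logical NameNode ID to its RPC address (host:port).
    All IDs and addresses passed in are placeholders chosen by the caller.
    """
    props = {
        "dfs.nameservices": nameservice,
        # Logical IDs of the NameNodes that make up the HA pair.
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        # Class that clients use to locate the currently active NameNode.
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = address
    return props


def to_hdfs_site_xml(props):
    """Render a property dict as a Hadoop <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical nameservice with two NameNodes (one active, one standby).
props = ha_properties(
    "mycluster",
    {
        "nn1": "namenode1.example.com:8020",
        "nn2": "namenode2.example.com:8020",
    },
)
print(to_hdfs_site_xml(props))
```

Clients then address the file system by the logical nameservice name rather than by a single NameNode host, so a failover from the active to the standby NameNode does not require any client-side changes.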
Want to learn more? Review the Apache Hadoop 2.0 and YARN tutorial from one of our experts.