In a very short span of time, Apache Hadoop has moved from an emerging technology to an established solution for the Big Data problems faced by today’s enterprises.
Likewise, since its launch in 2006, Amazon Web Services (AWS) has become synonymous with Cloud Computing. The CIA’s recent contract with Amazon to build a private cloud service inside the CIA’s data centers is evidence of the growing popularity and reliability of AWS as a Cloud Computing vendor for organizations of all kinds.
If you are a Big Data, Hadoop, and Cloud Computing enthusiast, you can start your journey by creating an Apache Hadoop Cluster on Amazon EC2 without spending a single penny from your pocket! This exercise will not only help you understand the nitty-gritty of an Apache Hadoop Cluster but also familiarize you with the AWS Cloud Computing ecosystem.
Amazon also provides a hosted solution for Apache Hadoop, named Amazon Elastic MapReduce (EMR). As of now, no free tier is available for EMR, and only Pig and Hive are available for use on it.
Apache Hadoop Cluster on Amazon EC2
The first step towards your first Apache Hadoop Cluster is to create an account on Amazon.
The next step is to launch Amazon EC2 servers and configure them for the Apache Hadoop installation.
The complete process can be summarized in three simple steps:
Create your own AWS account. It is free, intuitively designed, and the best place to start your Cloud Computing journey. Then launch ‘t1.micro’ servers (free tier eligible) for your cluster.
Prepare these AWS EC2 servers for the Hadoop installation: upgrade the OS packages, install JDK 1.6, and set up the hosts file and password-less SSH from the master to the slaves.
Edit the core Hadoop configuration files to set up the cluster, copy them to the Secondary Name Node and slave nodes, and then start the HDFS and MapReduce services.
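The hosts and configuration work in the steps above can be sketched in Python. This is only an illustration under stated assumptions, not the guide's exact procedure: the hostnames, private IPs, and ports are hypothetical, and the file and property names (core-site.xml with fs.default.name, hdfs-site.xml with dfs.replication, mapred-site.xml with mapred.job.tracker, plus the masters and slaves files) are assumed from the usual Hadoop 1.x layout, since the article does not list them.

```python
from xml.etree import ElementTree as ET

# Hypothetical hostnames and private IPs; replace with your EC2 instances' values.
NODES = {
    "master": "10.0.0.10",
    "secondary": "10.0.0.11",
    "slave1": "10.0.0.12",
    "slave2": "10.0.0.13",
}


def hosts_entries(nodes):
    """Return /etc/hosts lines mapping each private IP to its hostname."""
    return [f"{ip}\t{name}" for name, ip in nodes.items()]


def site_xml(properties):
    """Render a Hadoop *-site.xml document from a dict of properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def cluster_files(nodes, master="master", secondary="secondary"):
    """Build the contents of the config files every node should receive."""
    slaves = [n for n in nodes if n not in (master, secondary)]
    # Never replicate more blocks than there are slave nodes (assumed cap of 3).
    replication = max(1, min(3, len(slaves)))
    return {
        # Ports 8020 and 8021 are assumptions; use whatever your setup expects.
        "core-site.xml": site_xml({"fs.default.name": f"hdfs://{master}:8020"}),
        "hdfs-site.xml": site_xml({"dfs.replication": replication}),
        "mapred-site.xml": site_xml({"mapred.job.tracker": f"{master}:8021"}),
        # In Hadoop 1.x, 'masters' names the Secondary Name Node host.
        "masters": secondary + "\n",
        "slaves": "\n".join(slaves) + "\n",
    }


if __name__ == "__main__":
    print("\n".join(hosts_entries(NODES)))
    for filename, content in cluster_files(NODES).items():
        print(f"--- {filename} ---\n{content}")
```

Generating the files from a single node list keeps the master, Secondary Name Node, and slaves consistent with one another before the files are copied to each server. On a Hadoop 1.x installation, the HDFS and MapReduce services are then typically started from the master with the start-dfs.sh and start-mapred.sh scripts.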
So, isn’t it easy to install an Apache Hadoop Cluster on Amazon EC2 free tier Ubuntu servers in just 30 minutes?
You can download the free Cloud Computing PDF guide for more details about the Apache Hadoop Cluster setup.
This guide provides step-by-step details, with corresponding screenshots, for setting up a multi-node Apache Hadoop Cluster on AWS EC2.
Some Useful References: