AWS: Four node cluster on Hadoop

+1 vote

Which instance type can be used to deploy a 4-node Hadoop cluster on AWS?

May 31, 2018 in Cloud Computing by DragonLord999
• 8,380 points
42 views

2 answers to this question.

0 votes

First, let’s understand what actually happens in a Hadoop cluster. A Hadoop cluster follows a master-slave design. The master machine coordinates the cluster: it runs the NameNode, which keeps the HDFS metadata in memory, and in Hadoop 2 the YARN ResourceManager, which schedules the jobs. The slave machines act as data nodes: they store the data and run the tasks assigned to them.

Since all the storage happens on the slaves, higher-capacity disks are recommended there, and since the master keeps the metadata in memory and coordinates every job, it benefits from more RAM and a better CPU. You can therefore select the configuration of each machine depending on your workload. For example, in this case a c4.8xlarge could be used for the master machine, whereas for the slave machines we can select a storage-optimized I2 instance such as i2.xlarge.

If you don’t want to deal with configuring your instances and installing the Hadoop cluster manually, you can launch an Amazon EMR (Elastic MapReduce) cluster, which automatically configures the servers for you. You put the data to be processed in S3; EMR picks it up from there, processes it, and writes the results back to S3.
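As a rough illustration of the EMR route, the sketch below builds an `aws emr create-cluster` command for a 4-node cluster and only prints it unless `DRY_RUN=0` is set. The cluster name, release label, instance type, and key pair name are placeholder assumptions, not values from the answer above; it also assumes the AWS CLI is configured and the default EMR roles already exist (they can be created with `aws emr create-default-roles`).

```shell
#!/bin/sh
# Sketch only: every value below is an assumption to adapt to your account.
CLUSTER_NAME="${CLUSTER_NAME:-hadoop-4-node}"   # hypothetical name
RELEASE_LABEL="${RELEASE_LABEL:-emr-6.15.0}"    # assumed release; check the EMR release list
INSTANCE_TYPE="${INSTANCE_TYPE:-m5.xlarge}"     # assumed type; size it for your workload
INSTANCE_COUNT=4                                # 1 master + 3 core nodes
KEY_NAME="${KEY_NAME:-my-key-pair}"             # hypothetical EC2 key pair

# Build the command as positional parameters so quoting stays intact.
set -- aws emr create-cluster \
  --name "$CLUSTER_NAME" \
  --release-label "$RELEASE_LABEL" \
  --applications Name=Hadoop \
  --instance-type "$INSTANCE_TYPE" \
  --instance-count "$INSTANCE_COUNT" \
  --use-default-roles \
  --ec2-attributes "KeyName=$KEY_NAME"

LAUNCH_CMD="$*"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: print the command without contacting AWS.
  echo "$LAUNCH_CMD"
else
  "$@"
fi
```

With a uniform `--instance-type`, EMR uses the same type for the master and the core nodes; mixing types (for example a compute-heavy master with storage-heavy workers) needs the `--instance-groups` option instead.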

answered May 31, 2018 by Meci Matt
• 9,420 points
0 votes

Follow these steps one by one and you are good to go:

Install Java And Hadoop

$ sudo apt-get update && sudo apt-get dist-upgrade

Install OpenJDK

  • Install OpenJDK 8
$ sudo apt-get install openjdk-8-jdk

Installing Hadoop

  • Download Hadoop from one of these mirrors, selecting the appropriate version number. The command below downloads the gzipped tarball and saves it to the Downloads directory given with the -P parameter (wget creates the directory if it does not exist).
$ wget http://apache.mirrors.tds.net/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz -P ~/Downloads
  • Extract it to /usr/local.
$ sudo tar zxvf ~/Downloads/hadoop-* -C /usr/local
  • Rename the extracted hadoop-* directory to hadoop under /usr/local.
$ sudo mv /usr/local/hadoop-* /usr/local/hadoop
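The `hadoop-*` wildcards in the steps above match every Hadoop tarball or directory that happens to be present, so it can help to pin the version in one variable. The sketch below only builds and prints the download URL; the `hadoop_tarball_url` helper is hypothetical, and the archive.apache.org base URL is an assumption in case the mirror above no longer carries that release.

```shell
#!/bin/sh
# Sketch: derive the tarball URL from a single version variable.
HADOOP_VERSION="${HADOOP_VERSION:-2.8.1}"
# Assumption: the Apache archive keeps old releases under this layout.
HADOOP_BASE_URL="${HADOOP_BASE_URL:-https://archive.apache.org/dist/hadoop/common}"

# Hypothetical helper: print the tarball URL for the given version.
hadoop_tarball_url() {
  printf '%s/hadoop-%s/hadoop-%s.tar.gz\n' "$HADOOP_BASE_URL" "$1" "$1"
}

TARBALL_URL="$(hadoop_tarball_url "$HADOOP_VERSION")"
echo "Would download: $TARBALL_URL"

# To actually install, run the same three steps with explicit names:
# wget "$TARBALL_URL" -P ~/Downloads
# sudo tar zxf ~/Downloads/"hadoop-$HADOOP_VERSION.tar.gz" -C /usr/local
# sudo mv "/usr/local/hadoop-$HADOOP_VERSION" /usr/local/hadoop
```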

Setting up Environmental Variables

  • To find out where Java is installed (where the java executable is), execute the command below. The path may be different for you.

[image: the command and its output]
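The command from the image is not available here. One common way to locate the Java installation, offered as an assumption rather than as the answer's exact command, is to resolve the `java` symlink and strip the trailing `bin/java` (and `jre`, if present). The `java_home_from_path` helper name is hypothetical, and `readlink -f` assumes GNU coreutils as found on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: turn the resolved java binary path into a JAVA_HOME candidate.
java_home_from_path() {
  p="$1"
  p="${p%/bin/java}"   # drop the executable and its bin directory
  p="${p%/jre}"        # OpenJDK 8 packages keep java under jre/bin
  printf '%s\n' "$p"
}

# Only query the system if java is actually installed.
if command -v java >/dev/null 2>&1; then
  java_home_from_path "$(readlink -f "$(command -v java)")"
fi
```

On a machine where the binary resolves to /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java, the helper yields /usr/lib/jvm/java-8-openjdk-amd64, which matches the JAVA_HOME value used in this answer.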

  • Open the .bashrc file in your home directory with your favorite editor and add the lines below.
$ vi ~/.bashrc

For Java:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin

For Hadoop:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

For Hadoop Configuration directory:

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
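If you repeat this setup on all four nodes, editing the file by hand each time is easy to get wrong. As a minimal sketch, the hypothetical `append_once` helper below adds a line to a file only when that exact line is not already there; the demo writes to a temporary file, so point it at ~/.bashrc yourself once you are happy with it.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file unless the exact line already exists.
append_once() {
  file="$1"
  line="$2"
  touch "$file"
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a temporary file instead of the real ~/.bashrc.
demo_rc="$(mktemp)"
append_once "$demo_rc" 'export HADOOP_HOME=/usr/local/hadoop'
append_once "$demo_rc" 'export HADOOP_HOME=/usr/local/hadoop'   # second call is a no-op
append_once "$demo_rc" 'export PATH=$PATH:$HADOOP_HOME/bin'
cat "$demo_rc"
```

After updating the real file, run `source ~/.bashrc` (or open a new shell) so the variables take effect; `hadoop version` should then print the installed release.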


For further steps, follow:

https://medium.com/@jeevananandanne/setup-4-node-hadoop-cluster-on-aws-ec2-instances-1c1eeb4453bd

    answered Aug 21, 2018 by Priyaj
    • 56,940 points
