Hive & Yarn Get Electrified By Spark

Last updated on Jul 04, 2019
Awanish
Awanish is a Sr. Research Analyst at Edureka. He has rich expertise in Big Data technologies like Hadoop, Spark, Storm, Kafka, Flink. Awanish also...


In this blog, let us see how to build Spark for a specific Hadoop version.

We will also learn how to build Spark with HIVE and YARN.

This walkthrough assumes that Hadoop, the JDK, Maven (mvn) and Git are already installed and configured on your system.
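Before starting, it can help to confirm that these tools are actually on your PATH. The snippet below is a minimal sketch of such a check; `missing_tools` is a hypothetical helper name introduced here, and the command names `java`, `mvn`, `git` and `hadoop` are assumed to be how the tools are installed on your machine.

```shell
# Print the names of the given tools that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of Spark or Hadoop.
missing_tools() {
  local missing="" tool
  for tool in "$@"; do
    # command -v succeeds only when the tool can be resolved on PATH
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Trim the leading space before printing the list
  echo "${missing# }"
  # Succeed only when nothing is missing
  [ -z "$missing" ]
}

# Usage: missing_tools java mvn git hadoop || echo "install the tools listed above first"
```

If the function prints nothing and returns success, all four tools were found.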

Open the Mozilla Firefox browser and download Spark using the link below.

https://edureka.wistia.com/medias/k14eamzaza/

Open terminal.

Command: tar -xvf Downloads/spark-1.1.1.tgz

Command: ls

Open the spark-1.1.1 directory.

You can open the pom.xml file, which lists all the dependencies the build needs.

To stay out of trouble, do not edit it.

Command: cd spark-1.1.1/

Command: sudo gedit sbt/sbt-launch-lib.bash

Edit the file as shown in the snapshot below, then save and close it.

[Snapshot: the edited sbt/sbt-launch-lib.bash file]

We reduce the memory to avoid the object heap space error shown in the snapshot below.

[Snapshot: the object heap space error]
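For orientation, launcher scripts of this kind typically take one heap size in megabytes and derive the JVM memory flags from it, so lowering that one number lowers everything. The function below is a standalone sketch of that idea, not a copy of the file: `mem_opts`, the 2048 MB default and the ratios are assumptions made for illustration, so check your own sbt/sbt-launch-lib.bash before editing it.

```shell
# Sketch: derive JVM memory flags from a single heap size in MB.
# mem_opts and its ratios are illustrative assumptions, not Spark's actual code.
mem_opts() {
  local mem="${1:-2048}"
  # PermGen is a quarter of the heap, kept between 256 MB and 4096 MB
  local perm=$(( mem / 4 ))
  [ "$perm" -gt 256 ] || perm=256
  [ "$perm" -lt 4096 ] || perm=4096
  # Code cache is half of PermGen
  local codecache=$(( perm / 2 ))
  echo "-Xms${mem}m -Xmx${mem}m -XX:MaxPermSize=${perm}m -XX:ReservedCodeCacheSize=${codecache}m"
}

# Usage: mem_opts 1024
```

Calling `mem_opts 1024` instead of `mem_opts 2048` halves the requested heap, which is the kind of reduction that helps on a machine with little free memory.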

Now run the command below in the terminal to build Spark for Hadoop 2.2.0 with Hive and YARN.

Command: ./sbt/sbt -Pyarn -Phive -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests assembly

Note: My Hadoop version is 2.2.0. If yours is different, change the Hadoop profile and the -Dhadoop.version value to match it.

For other Hadoop versions, use the following options:

# Apache Hadoop 2.0.5-alpha

-Dhadoop.version=2.0.5-alpha

# Cloudera CDH 4.2.0

-Dhadoop.version=2.0.0-cdh4.2.0

# Apache Hadoop 0.23.x

-Phadoop-0.23 -Dhadoop.version=0.23.7

# Apache Hadoop 2.3.X

-Phadoop-2.3 -Dhadoop.version=2.3.0

# Apache Hadoop 2.4.X

-Phadoop-2.4 -Dhadoop.version=2.4.0
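The mapping above can be sketched as a small shell function that prints the Hadoop-specific flags for a given version. `hadoop_flags` is a hypothetical helper introduced here. It covers only the versions listed above and assumes that any other version needs just `-Dhadoop.version`, which may not hold for your distribution.

```shell
# Print the Hadoop-specific build flags for a given Hadoop version.
# hadoop_flags is a hypothetical helper; the profile names come from the list above.
hadoop_flags() {
  local version="$1" profile=""
  # Versions such as 2.0.5-alpha or 2.0.0-cdh4.2.0 are listed without a profile,
  # so they fall through the case statement with an empty profile.
  case "$version" in
    0.23.*) profile="-Phadoop-0.23" ;;
    2.2.*)  profile="-Phadoop-2.2" ;;
    2.3.*)  profile="-Phadoop-2.3" ;;
    2.4.*)  profile="-Phadoop-2.4" ;;
  esac
  # Emit the profile (if any) followed by the exact version
  echo "${profile:+$profile }-Dhadoop.version=$version"
}

# Usage: hadoop_flags 2.4.0
```

The printed flags take the place of `-Phadoop-2.2 -Dhadoop.version=2.2.0` in the build command.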

Compiling and packaging take some time, so wait until the build completes.

Two jars, spark-assembly-1.1.1-hadoop2.2.0.jar and spark-examples-1.1.1-hadoop2.2.0.jar, are created.

Path of spark-assembly-1.1.1-hadoop2.2.0.jar : /home/edureka/spark-1.1.1/assembly/target/scala-2.10/spark-assembly-1.1.1-hadoop2.2.0.jar

Path of spark-examples-1.1.1-hadoop2.2.0.jar : /home/edureka/spark-1.1.1/examples/target/scala-2.10/spark-examples-1.1.1-hadoop2.2.0.jar
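To confirm that the build really produced both jars, you could check for them from the terminal. The function below is a minimal sketch; `check_spark_jars` is a hypothetical helper name, and the relative paths assume the Spark 1.1.1 / Hadoop 2.2.0 / Scala 2.10 layout shown above.

```shell
# Check that both build jars exist under a Spark source directory.
# check_spark_jars is a hypothetical helper; paths follow the layout shown above.
check_spark_jars() {
  local spark_home="$1" status=0 jar
  for jar in \
    "assembly/target/scala-2.10/spark-assembly-1.1.1-hadoop2.2.0.jar" \
    "examples/target/scala-2.10/spark-examples-1.1.1-hadoop2.2.0.jar"; do
    if [ -f "$spark_home/$jar" ]; then
      echo "found: $jar"
    else
      echo "missing: $jar"
      status=1
    fi
  done
  # Non-zero when at least one jar is missing
  return "$status"
}

# Usage: check_spark_jars /home/edureka/spark-1.1.1
```

Adjust the jar names inside the function if you built for a different Hadoop version.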

Congratulations, you have successfully built Spark for Hive & Yarn.

Got a question for us? Please mention it in the comments section and we will get back to you.
