Building Yarn and Hive on Spark - Edureka Blog

In this blog, let us see how to build Spark for a specific Hadoop version.

We will also learn how to build Spark with HIVE and YARN.

This walkthrough assumes that Hadoop, the JDK, Maven (mvn) and git are already installed and configured on your system.
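
Before starting, it can help to confirm that these tools are actually reachable from the terminal. The snippet below is a minimal sketch, not part of the original steps: the helper name missing_tools is hypothetical, and the executable names hadoop, javac, mvn and git are assumptions about how the prerequisites are exposed on your PATH.

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH, one per line.
# (missing_tools is a hypothetical helper written for this post.)
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
  return 0
}

# Assumed executable names: hadoop, javac (from the JDK), mvn, git.
missing=$(missing_tools hadoop javac mvn git)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All prerequisites found."
fi
```

If anything is reported as not found, install it or fix your PATH before continuing with the build.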

Open the Mozilla Firefox browser and download Spark using the link below.

https://edureka.wistia.com/medias/k14eamzaza/

Open terminal.

Command: tar -xvf Downloads/spark-1.1.1.tgz

Command: ls

Open the spark-1.1.1 directory.

You can open the pom.xml file to see all the dependencies the build needs. Do not edit it, to stay out of trouble.

Command: cd spark-1.1.1/

Command: sudo gedit sbt/sbt-launch-lib.bash

Edit the file as shown in the snapshot below, then save it and close it.

We are reducing the memory setting to avoid the object heap space issue shown in the snapshot below.

Now run the command below in the terminal to build Spark for Hadoop 2.2.0 with Hive and YARN.

Command: ./sbt/sbt -Pyarn -Phive -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests assembly

Note: My Hadoop version is 2.2.0; change the profile and version options to match your own Hadoop version.

For other Hadoop versions, replace the Hadoop options in the command with the following:

# Apache Hadoop 2.0.5-alpha

-Dhadoop.version=2.0.5-alpha

# Cloudera CDH 4.2.0

-Dhadoop.version=2.0.0-cdh4.2.0

# Apache Hadoop 0.23.x

-Phadoop-0.23 -Dhadoop.version=0.23.7

# Apache Hadoop 2.3.X

-Phadoop-2.3 -Dhadoop.version=2.3.0

# Apache Hadoop 2.4.X

-Phadoop-2.4 -Dhadoop.version=2.4.0
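
The version list above can be folded into a small helper that picks the Hadoop options for you. This is only a sketch under stated assumptions: the function name spark_hadoop_flags is hypothetical, it covers only the versions listed in this post, and any other version falls back to -Dhadoop.version alone, which may not be right for your distribution.

```shell
#!/bin/sh
# Map a Hadoop version string to the build options listed in this post.
# (spark_hadoop_flags is a hypothetical helper, not part of Spark.)
spark_hadoop_flags() {
  version="$1"
  case "$version" in
    0.23.*) profile="-Phadoop-0.23" ;;
    2.2.*)  profile="-Phadoop-2.2" ;;
    2.3.*)  profile="-Phadoop-2.3" ;;
    2.4.*)  profile="-Phadoop-2.4" ;;
    *)      profile="" ;;  # e.g. 2.0.5-alpha, 2.0.0-cdh4.2.0: no profile listed
  esac
  if [ -n "$profile" ]; then
    printf '%s\n' "$profile -Dhadoop.version=$version"
  else
    printf '%s\n' "-Dhadoop.version=$version"
  fi
}

# Print (without running) the full build command for Hadoop 2.2.0.
printf '%s\n' "./sbt/sbt -Pyarn -Phive $(spark_hadoop_flags 2.2.0) -DskipTests assembly"
```

The last line only prints the command so that you can review it before running it from the spark-1.1.1 directory.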

Compiling and packaging will take some time; please wait until the build completes.

Two jars, spark-assembly-1.1.1-hadoop2.2.0.jar and spark-examples-1.1.1-hadoop2.2.0.jar, are created.

Path of spark-assembly-1.1.1-hadoop2.2.0.jar : /home/edureka/spark-1.1.1/assembly/target/scala-2.10/spark-assembly-1.1.1-hadoop2.2.0.jar

Path of spark-examples-1.1.1-hadoop2.2.0.jar : /home/edureka/spark-1.1.1/examples/target/scala-2.10/spark-examples-1.1.1-hadoop2.2.0.jar
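
To confirm that both jars were produced, you can check for them from the terminal. The following is a minimal sketch: check_spark_jars is a hypothetical helper, and the directory /home/edureka/spark-1.1.1 is taken from the paths above, so adjust it to wherever you extracted Spark.

```shell
#!/bin/sh
# Report whether the assembly and examples jars exist under a Spark directory.
# Arguments: Spark directory, Spark version, Hadoop version.
# (check_spark_jars is a hypothetical helper written for this post.)
check_spark_jars() {
  spark_dir="$1"
  spark_ver="$2"
  hadoop_ver="$3"
  status=0
  for jar in \
    "assembly/target/scala-2.10/spark-assembly-$spark_ver-hadoop$hadoop_ver.jar" \
    "examples/target/scala-2.10/spark-examples-$spark_ver-hadoop$hadoop_ver.jar"
  do
    if [ -f "$spark_dir/$jar" ]; then
      echo "found:   $jar"
    else
      echo "missing: $jar"
      status=1
    fi
  done
  return $status
}

# Example call with the paths from this post; "|| true" keeps the
# script going even if a jar is missing.
check_spark_jars /home/edureka/spark-1.1.1 1.1.1 2.2.0 || true
```

The function returns a non-zero status when either jar is missing, so it can also be used as a guard in a larger script.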

Congratulations, you have successfully built Spark with Hive and YARN.

Got a question for us? Please mention it in the comments section and we will get back to you.
