How to use the FTP scheme with YARN in a Spark application

0 votes
Hi. I want to use FTP in my Spark application. I have learned that Spark supports the ftp scheme, but I am running Spark on YARN, and YARN does not support it. The resources need to be on the local disk first. How can I get resources with such schemes downloaded for YARN? Please help.
Mar 28, 2019 in Apache Spark by Rishi
944 views

1 answer to this question.

0 votes

If YARN does not support a scheme that Spark supports, resources using that scheme have to be downloaded to the local disk before they are added to YARN's distributed cache. Spark has a built-in setting for this, spark.yarn.dist.forceDownloadSchemes, which takes a comma-separated list of the schemes to download. Pass it when you submit the application, as in the command below:

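// In the application: create the context from an empty SparkConf so that settings can be supplied at submit time
val sc = new SparkContext(new SparkConf())

# At submit time: pass the setting with --conf (no space after the "=")
./bin/spark-submit <all your existing options> --conf spark.yarn.dist.forceDownloadSchemes=<list of schemes>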
answered Mar 28, 2019 by Raj
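
For the FTP case from the question, the submit command might look like the sketch below. The application class, jar name, and FTP URL are hypothetical placeholders, and cluster deploy mode is only an assumption; the one setting taken from the answer is spark.yarn.dist.forceDownloadSchemes. The script only prints the command so it can be inspected first.

```shell
# Schemes that Spark should download to local disk before handing
# the resources to YARN's distributed cache (comma-separated).
FORCE_DOWNLOAD_SCHEMES="ftp"
CONF_FLAG="spark.yarn.dist.forceDownloadSchemes=${FORCE_DOWNLOAD_SCHEMES}"

# Hypothetical application details, for illustration only.
APP_CLASS="com.example.MyApp"
APP_JAR="my-app.jar"
REMOTE_FILE="ftp://ftp.example.com/data/lookup.csv"

# Print the command; remove the leading "echo" to actually submit.
echo ./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf "$CONF_FLAG" \
  --files "$REMOTE_FILE" \
  --class "$APP_CLASS" \
  "$APP_JAR"
```

To cover several schemes, list them separated by commas, for example FORCE_DOWNLOAD_SCHEMES="http,https,ftp".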
