Unable to submit a Spark job in cluster deploy mode on a multinode Ubuntu cluster with YARN as master

0 votes
I get the exception below when submitting the application with spark-submit. Please suggest which configuration is missing.

java.lang.Exception: You must specify a valid link name
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:75)
    at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$distribute$1(Client.scala:409)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6$anonfun$apply$3.apply(Client.scala:471)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6$anonfun$apply$3.apply(Client.scala:470)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6.apply(Client.scala:470)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6.apply(Client.scala:468)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
Jul 29, 2020 in Apache Spark by Ganendra
• 140 points

Can you share your code or the command you ran? That will make it easier to see what the issue is.
Thanks for your reply.

I am using the spark-submit command as below:

spark-submit --deploy-mode cluster --master yarn --jars /dependency_jars_path --class class-name /classjar_path

Please tell me when this invalid link name error usually occurs, and which configuration needs to be set to fix it.

Thanks & Regards,


I think there is a problem with your path. Check the link below to get the idea.
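One way a path can cause trouble here, offered as an assumption because the real paths were not posted: `--jars` takes a comma-separated list of jar files, and directory expansion does not work with it. If /dependency_jars_path is a directory, a list of the jars inside it may be what spark-submit needs. The sketch below builds such a list; `build_jar_list` and `/path/to/dependency_jars` are hypothetical names that do not come from the question.

```shell
#!/usr/bin/env bash
# Sketch only: join every *.jar in a directory into the comma-separated
# form that --jars expects. build_jar_list is a hypothetical helper.
build_jar_list() {
  local dir="${1%/}"   # drop one trailing slash, if present
  local list=""
  local jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue     # skip the unexpanded glob when no jars exist
    list="${list:+$list,}$jar"    # add a comma only between entries
  done
  printf '%s\n' "$list"
}

# Usage (hypothetical directory; class-name and /classjar_path are the
# placeholders from the question):
#   spark-submit --deploy-mode cluster --master yarn \
#     --jars "$(build_jar_list /path/to/dependency_jars)" \
#     --class class-name /classjar_path
```

Printing the list first and checking that every entry ends in a real file name is a cheap way to rule out an empty or directory-only entry before submitting.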


1 answer to this question.

0 votes


Since you launched a multinode cluster, you have to use the spark-submit command. You cannot run yarn-cluster mode via spark-shell, because in cluster mode the driver program runs inside the application master container/process rather than on the machine where the shell was started. So it is not possible to run cluster mode via spark-shell.

$ spark-submit --class com.df.SparkWordCount SparkWC.jar yarn-client
$ spark-submit --class com.df.SparkWordCount SparkWC.jar yarn-cluster
answered Jul 29, 2020 by MD
• 95,440 points
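On Spark releases that accept `--master yarn` together with `--deploy-mode` (the form used in the question), the same client/cluster choice can be written with explicit flags. The following is a minimal dry-run sketch that only prints the command rather than running it; `submit_cmd` is a hypothetical helper, and the class and jar names are the ones from the answer above.

```shell
# Sketch only: print the spark-submit command for a given deploy mode
# instead of running it, so the flags can be checked first.
submit_cmd() {
  local mode="$1"
  case "$mode" in
    client|cluster) ;;   # the two deploy modes discussed in this thread
    *) echo "deploy mode must be 'client' or 'cluster'" >&2; return 1 ;;
  esac
  echo "spark-submit --master yarn --deploy-mode $mode --class com.df.SparkWordCount SparkWC.jar"
}

submit_cmd cluster
```

Once the printed line looks right, it can be pasted into the terminal or run with `eval "$(submit_cmd cluster)"`.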
Thanks for your reply.

I am using spark-submit only, not spark-shell. Below is a similar command:

spark-submit --deploy-mode cluster --master yarn --jars /dependency_jars_path --class class-name /classjar_path

I think there is a problem with your path. Check the link below to get the idea.

