Unable to submit the Spark job in cluster deploy mode - multi-node cluster (Ubuntu machines) with YARN master

0 votes
Getting the below exception while submitting the application using spark-submit. Please suggest which configuration is missing.

java.lang.Exception: You must specify a valid link name
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:75)
    at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$distribute$1(Client.scala:409)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6$anonfun$apply$3.apply(Client.scala:471)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6$anonfun$apply$3.apply(Client.scala:470)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6.apply(Client.scala:470)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$6.apply(Client.scala:468)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
Jul 29 in Apache Spark by Ganendra
• 140 points
114 views
Hi@Ganendra,

Can you share your code or command? It will be easier to understand what the issue is.
Thanks for your reply.

I am using spark-submit command as below:

spark-submit --deploy-mode cluster   --master yarn --jars /dependency_jars_path --class class-name  /classjar_path

Please suggest when this "invalid link name" error usually comes up, and which configuration needs to be set to fix it.

Thanks & Regards,

Gani

I think there is a problem with your path. You can check the link below; it should give you an idea.

https://community.cloudera.com/t5/Support-Questions/Spark-job-fails-in-cluster-mode/td-p/58772
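For reference, Spark's `ClientDistributedCacheManager.addResource` raises "You must specify a valid link name" when the distributed-cache link name for a resource comes out empty. One plausible trigger, going only by the stack trace (this is an assumption, not confirmed from the actual logs), is a `--jars` entry that is a directory or ends in a slash, so its basename is blank. A minimal pre-flight check, with hypothetical example paths:

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a comma-separated --jars list before spark-submit.
# The paths below are hypothetical examples, not taken from the question.

check_jars() {
  local list="$1" entry ok=0 entries
  IFS=',' read -ra entries <<< "$list"
  for entry in "${entries[@]}"; do
    # A trailing slash (or a bare directory) leaves the distributed-cache
    # link name empty, which is what the "valid link name" check rejects.
    if [[ "$entry" == */ || -z "$(basename "$entry")" ]]; then
      echo "bad entry (empty basename): $entry"
      ok=1
    fi
  done
  return $ok
}

check_jars "/opt/app/libs/dep1.jar,/opt/app/libs/dep2.jar" \
  && echo "jars list looks well-formed"
check_jars "/opt/app/libs/" || echo "second list rejected"
```

If the check flags an entry, list the jar files explicitly instead of pointing `--jars` at a directory.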

1 answer to this question.

0 votes

Hi@Ganendra,

As you said, you launched a multi-node cluster, so you have to use the spark-submit command. You cannot run yarn cluster mode via spark-shell, because in cluster mode the driver program runs as part of the ApplicationMaster container/process, so an interactive shell is not possible.

$ spark-submit --class com.df.SparkWordCount --master yarn --deploy-mode client SparkWC.jar
$ spark-submit --class com.df.SparkWordCount --master yarn --deploy-mode cluster SparkWC.jar
answered Jul 29 by MD
• 57,640 points
Thanks for your reply.

I am using spark-submit only, not spark-shell. Below is the command:

spark-submit --deploy-mode cluster   --master yarn --jars /dependency_jars_path --class class-name  /classjar_path
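One thing worth double-checking in that command: `--jars` takes a comma-separated list of jar files, not a directory, and `--class` needs the fully qualified class name. If `/dependency_jars_path` is a directory, a small helper can expand it into the list form (a sketch; the placeholder path is taken from the command above):

```shell
#!/usr/bin/env bash
# Sketch: expand a directory of jars into the comma-separated list that
# --jars expects. /dependency_jars_path is the placeholder from the
# command above, not a real path.

build_jars_arg() {
  local dir="$1" out="" f
  for f in "$dir"/*.jar; do
    [ -e "$f" ] || continue   # skip when no jars matched the glob
    out="${out:+$out,}$f"
  done
  printf '%s\n' "$out"
}

JARS="$(build_jars_arg /dependency_jars_path)"
# Then submit with the expanded list, e.g. (class and app jar hypothetical):
#   spark-submit --master yarn --deploy-mode cluster \
#     --class com.example.Main --jars "$JARS" /path/to/app.jar
```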

