Cannot load file into Spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

0 votes

I am trying to load a file from HDFS into Spark, but it is not working. Please help.

scala> val dataRDD = spark.read.textFile("file:///user/edureka_565414/Module5/AppleStore.csv").rdd
org.apache.spark.sql.AnalysisException: Path does not exist: file:/user/edureka_565414/Module5/AppleStore.csv;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:382)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:506)
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:542)
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:515)
... 48 elided
Jul 31, 2019 in Apache Spark by Ritu
5,497 views

1 answer to this question.

0 votes

Since the file is in HDFS, you have to use the hdfs:// scheme instead of file:// when specifying the path of the dataset. Use the HDFS path and it should work:

scala> val dataRDD = spark.read.textFile("hdfs:///user/edureka_565414/Module5/AppleStore.csv").rdd
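
If you are not sure the file is really there, you can first confirm it exists in HDFS from the command line (the directory below is just the one from the question):

hdfs dfs -ls /user/edureka_565414/Module5/

Also, if fs.defaultFS on your cluster points to the HDFS NameNode (the usual cluster setup), you can leave the scheme out entirely and Spark will resolve the path against HDFS:

scala> val dataRDD = spark.read.textFile("/user/edureka_565414/Module5/AppleStore.csv").rdd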
answered Jul 31, 2019 by Tina
