Cannot load file to spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

0 votes

I am trying to upload a file from hdfs to Spark, but it is not working. Please help.

scala> val dataRDD = spark.read.textFile("file:///user/edureka_565414/Module5/AppleStore.csv").rdd
org.apache.spark.sql.AnalysisException: Path does not exist: file:/user/edureka_565414/Module5/AppleStore.csv;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:382)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:506)
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:542)
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:515)
... 48 elided
Jul 31 in Apache Spark by Ritu
132 views

1 answer to this question.

0 votes

Since the file is in HDFS so you have to give the hdfs link instead of using file while mentioning the path of the dataset. Use the hdfs path, it should work:

scala> val dataRDD = spark.read.textFile("hdfs:///user/edureka_565414/Module5/AppleStore.csv").rdd
answered Jul 31 by Tina

Related Questions In Apache Spark

0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve given input columns

The string Productivity has to be enclosed between single ...READ MORE

answered Jul 10 in Apache Spark by Tina
633 views
0 votes
1 answer
0 votes
1 answer

Error : split value is not a member of org.apache.spark.sql.Row

spark.read.csv is used when loading into a ...READ MORE

answered Jul 10 in Apache Spark by Rishi
179 views
0 votes
1 answer

Error : split value is not a member of org.apache.spark.sql.Row

spark.read.csv is used when loading into a ...READ MORE

answered Jul 22 in Apache Spark by Firoz
203 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,670 points
2,739 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,670 points
289 views
0 votes
10 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyFr ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
13,572 views
0 votes
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
995 views
0 votes
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 13,290 points
1,203 views
0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Simple and easy: line.foreach(println) READ MORE

answered Dec 10, 2018 in Apache Spark by Kuber
9,792 views