Spark: java.io.FileNotFoundException


While executing a query, I am getting the below error:

val ddd=spark.sql("select Year, sum(countTotal) as total_count from df3 group by Year order by total_count desc limit 10 ")
ddd.show()

df3: org.apache.spark.sql.DataFrame = [holiday: int, workingday: int ... 13 more fields]
+-------+----------+-----+------+--------+---------+------+----------+----------+---+-----+----+---------+------------+-------------+
|holiday|workingday| temp| atemp|humidity|windspeed|casual|registered|countTotal|Day|Month|Year|EventTime|season_|weather_|
+-------+----------+-----+------+--------+---------+------+----------+----------+---+-----+----+---------+------------+-------------+
| 0| 0| 9.84|14.395| 81| 0.0| 3| 13| 16| 1| 1|2011| 0:0:0| 1| 1|
| 0| 0| 9.02|13.635| 80| 0.0| 8| 32| 40| 1| 1|2011| 1:0:0| 1| 1|
| 0| 0| 9.02|13.635| 80| 0.0| 5| 27| 32| 1| 1|2011| 2:0:0| 1| 1|
| 0| 0| 9.84|14.395| 75| 0.0| 3| 10| 13| 1| 1|2011| 3:0:0| 1| 1|
| 0| 0| 9.84|14.395| 75| 0.0| 0| 1| 1| 1| 1|2011| 4:0:0| 1| 1|
| 0| 0| 9.84| 12.88| 75| 6.0032| 0| 1| 1| 1| 1|2011| 5:0:0| 1| 2|
| 0| 0| 9.02|13.635| 80| 0.0| 2| 0| 2| 1| 1|2011| 6:0:0| 1| 1|
| 0| 0| 8.2| 12.88| 86| 0.0| 1| 2| 3| 1| 1|2011| 7:0:0| 1| 1|
| 0| 0| 9.84|14.395| 75| 0.0| 1| 7| 8| 1| 1|2011| 8:0:0| 1| 1|
| 0| 0|13.12|17.425| 76| 0.0| 8| 6| 14| 1| 1|2011| 9:0:0| 1| 1|
| 0| 0|15.58|19.695| 76| 16.9979| 12| 24| 36| 1| 1|2011| 10:0:0| 1| 1|
| 0| 0|14.76|16.665| 81| 19.0012| 26| 30| 56| 1| 1|2011| 11:0:0| 1| 1|
| 0| 0|17.22| 21.21| 77| 19.0012| 29| 55| 84| 1| 1|2011| 12:0:0| 1| 1|
| 0| 0|18.86|22.725| 72| 19.9995| 47| 47| 94| 1| 1|2011| 13:0:0| 1| 2|
| 0| 0|18.86|22.725| 72| 19.0012| 35| 71| 106| 1| 1|2011| 14:0:0| 1| 2|
| 0| 0|18.04| 21.97| 77| 19.9995| 40| 70| 110| 1| 1|2011| 15:0:0| 1| 2|
| 0| 0|17.22| 21.21| 82| 19.9995| 41| 52| 93| 1| 1|2011| 16:0:0| 1| 2|
| 0| 0|18.04| 21.97| 82| 19.0012| 15| 52| 67| 1| 1|2011| 17:0:0| 1| 2|
| 0| 0|17.22| 21.21| 88| 16.9979| 9| 26| 35| 1| 1|2011| 18:0:0| 1| 3|
| 0| 0|17.22| 21.21| 88| 16.9979| 6| 31| 37| 1| 1|2011| 19:0:0| 1| 3|
+-------+----------+-----+------+--------+---------+------+----------+----------+---+-----+----+---------+------------+-------------+

ddd: org.apache.spark.sql.DataFrame = [Year: int, sum(countTotal): bigint]
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 178.0 failed 1 times, most recent failure: Lost task 0.0 in stage 178.0 (TID 1220, localhost, executor driver): java.io.FileNotFoundException: /tmp/blockmgr-a6766964-7801-4d25-bb63-cdcd5bc5fd6d/03/temp_shuffle_d5df2b2a-c0b3-4414-bc7d-ff85578f5cb0 (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:102)
at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:115)
at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:229)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:152)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
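A `FileNotFoundException` on a `temp_shuffle_*` file under `blockmgr-*` usually means the shuffle scratch directory (by default under `/tmp`, controlled by `spark.local.dir`) was deleted while the job was running — for example by an OS temp-file cleaner such as `tmpwatch` or `systemd-tmpfiles` — or the disk holding `/tmp` ran out of space. A common workaround is to point `spark.local.dir` at a durable directory that is not subject to automatic cleanup. A minimal sketch, assuming a hypothetical path `/var/spark-tmp` (substitute any writable directory with enough free space):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: move Spark's shuffle/scratch space away from /tmp so that
// OS temp-file cleaners cannot delete blockmgr-*/temp_shuffle_* files
// mid-job. "/var/spark-tmp" is a placeholder path, not a required value.
val spark = SparkSession.builder()
  .appName("shuffle-dir-workaround")
  .master("local[*]")
  .config("spark.local.dir", "/var/spark-tmp")
  .getOrCreate()
```

The same setting can be passed on the command line with `--conf spark.local.dir=/var/spark-tmp` to `spark-submit` or `spark-shell`. Note that on a cluster the worker-side `SPARK_LOCAL_DIRS` environment variable (or, on YARN, the node manager's configured local dirs) takes precedence over this property.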
Jul 16 in Apache Spark by Tilka
Same here — I get this error with PySpark 2.4.0 when I train a GMM.

