spark metadata 0 doesn t exist while Compacting batch 9 Structured streaming error

+1 vote

We have Streaming Application implemented using Spark Structured Streaming. which tries to read data from kafka topics and write it to HDFS Location.

Sometimes application fails giving error :

_spark_metadata/0 doesn't exist while compacting batch 9

java.lang.IllegalStateException: history/1523305060336/_spark_metadata/9.compact doesn't exist when compacting batch 19 (compactInterval: 10)

not able to resolve this issue.

only one solution i found and that is to delete checkpoint location files which will read topic/data from beginning if we run the application again, which is not feasible solution for production application.

can any one tell some solution for this error so i need not have to delete checkpoint and i can continue from where last run was failed.

Deleting check point location which will start application from beginning and read all previous data.

sample code of application:

spark.
readStream.
format("kafka")
.option("kafka.bootstrap.servers", <server list>)
.option("subscribe", <topic>)
.load()

 spark.
 writeStream.
 format("csv").
 option("format", "append").
 option("path",hdfsPath).
 option("checkpointlocation","")
 .outputmode(append).start

need solution without deleting check pointing location.

May 31, 2019 in Apache Spark by AzimKangda
• 130 points
4,839 views
HI,

I encountered the same behavior

same code

No answer to this question. Be the first to respond.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.

Related Questions In Apache Spark

0 votes
1 answer

Getting error while connecting zookeeper in Kafka - Spark Streaming integration

I guess you need provide this kafka.bootstrap.servers ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 13,490 points
3,386 views
0 votes
1 answer

Spark streaming with Kafka dependency error

Your error is with the version of ...READ MORE

answered Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points
1,858 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

answered Feb 4, 2019 in Apache Spark by Omkar
• 69,180 points
1,243 views
0 votes
1 answer

Error while reading multiline Json

peopleDF: org.apache.spark.sql.DataFrame = [_corrupt_record: string] The above that ...READ MORE

answered May 23, 2019 in Apache Spark by Conny
3,297 views
0 votes
1 answer

Spark: Error while instantiating "org.apache.spark.sql.hive.HiveSessionState"

Seems like you have not started the ...READ MORE

answered Jul 25, 2019 in Apache Spark by Rohit
8,913 views
+1 vote
2 answers
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
13,562 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
4,457 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
116,586 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP