"_spark_metadata/0 doesn't exist while compacting batch 9" error in Spark Structured Streaming


We have a streaming application built with Spark Structured Streaming that reads data from Kafka topics and writes it to an HDFS location.

The application sometimes fails with this error:

_spark_metadata/0 doesn't exist while compacting batch 9

java.lang.IllegalStateException: history/1523305060336/_spark_metadata/9.compact doesn't exist when compacting batch 19 (compactInterval: 10)
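
For context on the file names in these messages: as far as I understand Spark's file sink, it records every batch in a log under _spark_metadata and, once per compactInterval batches, merges the earlier entries into a single <batchId>.compact file. The sketch below is a standalone approximation of that arithmetic, not Spark's own code; the object and function names are hypothetical and chosen only for illustration. It shows which log files a compaction batch expects to find.

```scala
// Standalone sketch with no Spark dependency.
// Assumption: mirrors how the file sink's metadata log picks compaction batches.
object CompactionSketch {
  // A batch is a compaction batch when (batchId + 1) is a multiple of the interval.
  def isCompactionBatch(batchId: Long, compactInterval: Int): Boolean =
    (batchId + 1) % compactInterval == 0

  // Log file name for a batch: compaction batches carry a ".compact" suffix.
  def fileNameFor(batchId: Long, compactInterval: Int): String =
    if (isCompactionBatch(batchId, compactInterval)) s"$batchId.compact"
    else batchId.toString

  // Files that must already exist when the given compaction batch is written.
  def expectedFiles(compactionBatchId: Long, compactInterval: Int): Seq[String] = {
    require(isCompactionBatch(compactionBatchId, compactInterval),
      s"$compactionBatchId is not a compaction batch")
    val first = math.max(0L, compactionBatchId - compactInterval)
    (first until compactionBatchId).map(fileNameFor(_, compactInterval))
  }

  def main(args: Array[String]): Unit = {
    println(expectedFiles(9, 10).mkString(", "))
    println(expectedFiles(19, 10).mkString(", "))
  }
}
```

With compactInterval 10, batch 9 expects files 0 through 8, and batch 19 expects 9.compact plus 10 through 18, which matches the two messages above. If any of those files is missing from _spark_metadata, the compaction step cannot proceed.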

I have not been able to resolve this issue.

The only solution I have found is to delete the checkpoint location files. The application then starts from the beginning on the next run and reads all previous data from the topic again, which is not feasible for a production application.

Can anyone suggest a fix for this error so that I do not have to delete the checkpoint and can continue from where the last run failed?

Sample code of the application:

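// Read from Kafka; <server list> and <topic> are placeholders.
val df = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", <server list>)
  .option("subscribe", <topic>)
  .load()

// Assumption: the Kafka value is cast to a string, since the CSV sink cannot write binary columns.
df.selectExpr("CAST(value AS STRING)")
  .writeStream
  .format("csv")
  .option("path", hdfsPath)
  .option("checkpointLocation", "")  // checkpoint path left empty, as in the original post
  .outputMode("append")
  .start()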

I need a solution that does not require deleting the checkpoint location.
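
To help narrow this down, here is a minimal diagnostic sketch for listing what is actually present in the sink's metadata log. It assumes the log sits in a _spark_metadata directory under the output path (as the path in the error message suggests) and that the Hadoop client libraries are on the classpath. The object and function names are hypothetical; only Path, Configuration, getFileSystem and listStatus come from Hadoop.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import scala.util.Try

// Hypothetical helper for inspecting a file sink's metadata log.
object MetadataLogCheck {
  // Parse the batch id from a log file name such as "18" or "9.compact".
  // Anything else (checksum or temp files, for example) yields None.
  def batchIdOf(fileName: String): Option[Long] =
    Try(fileName.stripSuffix(".compact").toLong).toOption

  // List (batchId, fileName) pairs found in the metadata directory, sorted by batch id.
  def listBatchFiles(metadataDir: String,
                     conf: Configuration = new Configuration()): Seq[(Long, String)] = {
    val dir = new Path(metadataDir)
    val fs = dir.getFileSystem(conf)
    fs.listStatus(dir).toSeq
      .map(_.getPath.getName)
      .flatMap(name => batchIdOf(name).map(id => (id, name)))
      .sortBy(_._1)
  }
}
```

Calling MetadataLogCheck.listBatchFiles(s"$hdfsPath/_spark_metadata").foreach(println) before restarting should show whether the file named in the exception is really absent. This only inspects the directory; it does not repair anything.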

May 31 in Apache Spark by AzimKangda
Hi,

I encountered the same behavior with the same code.


