Getting error while connecting zookeeper in Kafka - Spark Streaming integration

0 votes

I am trying to read from a Kafka topic using Spark streaming. But my Zookeeper connection keeps on disconnecting. While checking the zookeeper logs I am getting following:

 
WARN clients.NetworkClient: Bootstrap broker [zk host]:2181 disconnected

 

I am using CDH 5.11 distribution & I have installed Spark 2.1 using parcels. My error is as following:

java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:154)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:135)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:183)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:974)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets$1.apply(KafkaSource.scala:374)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets$1.apply(KafkaSource.scala:372)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply$mcV$sp(KafkaSource.scala:442)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply(KafkaSource.scala:441)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply(KafkaSource.scala:441)
at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:79)
at org.apache.spark.sql.kafka010.KafkaSource.withRetriesWithoutInterrupt(KafkaSource.scala:440)
at org.apache.spark.sql.kafka010.KafkaSource.org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets(KafkaSource.scala:372)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:141)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:138)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets$lzycompute(KafkaSource.scala:138)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets(KafkaSource.scala:121)
at org.apache.spark.sql.kafka010.KafkaSource.getOffset(KafkaSource.scala:157)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9$$anonfun$apply$5.apply(StreamExecution.scala:391)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9$$anonfun$apply$5.apply(StreamExecution.scala:391)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:265)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9.apply(StreamExecution.scala:390)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9.apply(StreamExecution.scala:388)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$constructNextBatch(StreamExecution.scala:388)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$populateStartOffsets(StreamExecution.scala:362)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply$mcV$sp(StreamExecution.scala:260)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:265)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:43)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:252)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:187)
18/03/21 10:47:27 DEBUG clients.NetworkClient: Node -2 disconnected.
18/03/21 10:47:27 WARN clients.NetworkClient: Bootstrap broker [zk host]:2181 disconnected
18/03/21 10:47:27 DEBUG clients.NetworkClient: Sending metadata request {topics=[topic2345]} to node -1
18/03/21 10:47:27 DEBUG network.Selector: Connection with /[zk host] disconnected
May 24, 2018 in Apache Spark by user-code
2,576 views

1 answer to this question.

0 votes
I guess you need provide this kafka.bootstrap.servers parameter and then pass the list of kafka brokers instead of zookeeper’s address. The new Kafka Consumer uses bootstrap server instead of zookeeper address.
answered May 24, 2018 by Shubham
• 13,490 points

Related Questions In Apache Spark

+1 vote
1 answer

getting null values in spark dataframe while reading data from hbase

Can you share the screenshots for the ...READ MORE

answered Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,390 points
2,066 views
0 votes
1 answer

What are the levels of parallelism in spark streaming ?

> In order to reduce the processing ...READ MORE

answered Jul 27, 2018 in Apache Spark by zombie
• 3,790 points
4,418 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

answered Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
549 views
0 votes
1 answer
+1 vote
2 answers
0 votes
1 answer

Is it possible to run Apache Spark without Hadoop?

Though Spark and Hadoop were the frameworks designed ...READ MORE

answered May 3, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
983 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,521 views
0 votes
1 answer

Spark streaming with Kafka dependency error

Your error is with the version of ...READ MORE

answered Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points
1,131 views
0 votes
1 answer
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP