Getting error while connecting zookeeper in Kafka - Spark Streaming integration

0 votes

I am trying to read from a Kafka topic using Spark streaming. But my Zookeeper connection keeps on disconnecting. While checking the zookeeper logs I am getting following:

 
WARN clients.NetworkClient: Bootstrap broker [zk host]:2181 disconnected

 

I am using CDH 5.11 distribution & I have installed Spark 2.1 using parcels. My error is as following:

java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:154)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:135)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:183)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:974)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets$1.apply(KafkaSource.scala:374)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets$1.apply(KafkaSource.scala:372)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply$mcV$sp(KafkaSource.scala:442)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply(KafkaSource.scala:441)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$withRetriesWithoutInterrupt$1.apply(KafkaSource.scala:441)
at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:79)
at org.apache.spark.sql.kafka010.KafkaSource.withRetriesWithoutInterrupt(KafkaSource.scala:440)
at org.apache.spark.sql.kafka010.KafkaSource.org$apache$spark$sql$kafka010$KafkaSource$$fetchLatestOffsets(KafkaSource.scala:372)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:141)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:138)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets$lzycompute(KafkaSource.scala:138)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets(KafkaSource.scala:121)
at org.apache.spark.sql.kafka010.KafkaSource.getOffset(KafkaSource.scala:157)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9$$anonfun$apply$5.apply(StreamExecution.scala:391)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9$$anonfun$apply$5.apply(StreamExecution.scala:391)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:265)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9.apply(StreamExecution.scala:390)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$9.apply(StreamExecution.scala:388)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$constructNextBatch(StreamExecution.scala:388)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$populateStartOffsets(StreamExecution.scala:362)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply$mcV$sp(StreamExecution.scala:260)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:265)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:257)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:43)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:252)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:187)
18/03/21 10:47:27 DEBUG clients.NetworkClient: Node -2 disconnected.
18/03/21 10:47:27 WARN clients.NetworkClient: Bootstrap broker [zk host]:2181 disconnected
18/03/21 10:47:27 DEBUG clients.NetworkClient: Sending metadata request {topics=[topic2345]} to node -1
18/03/21 10:47:27 DEBUG network.Selector: Connection with /[zk host] disconnected
May 24, 2018 in Apache Spark by user-code
344 views

1 answer to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes
I guess you need provide this kafka.bootstrap.servers parameter and then pass the list of kafka brokers instead of zookeeper’s address. The new Kafka Consumer uses bootstrap server instead of zookeeper address.
answered May 24, 2018 by Shubham
• 12,270 points

Related Questions In Apache Spark

+1 vote
1 answer

getting null values in spark dataframe while reading data from hbase

Can you share the screenshots for the ...READ MORE

answered Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,260 points
192 views
0 votes
1 answer

What are the levels of parallelism in spark streaming ?

> In order to reduce the processing ...READ MORE

answered Jul 26, 2018 in Apache Spark by zombie
• 3,690 points
198 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

answered Feb 4 in Apache Spark by Omkar
• 66,050 points
20 views
0 votes
1 answer

Error reading avro dataset in spark

For avro, you need to download and ...READ MORE

answered Feb 4 in Apache Spark by Omkar
• 66,050 points
89 views
0 votes
1 answer
0 votes
0 answers
0 votes
1 answer

Is it possible to run Apache Spark without Hadoop?

Though Spark and Hadoop were the frameworks designed ...READ MORE

answered May 2 in Big Data Hadoop by ravikiran
• 1,540 points
18 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 9,070 points
1,684 views
0 votes
1 answer

Spark streaming with Kafka dependency error

Your error is with the version of ...READ MORE

answered Jul 5, 2018 in Apache Spark by Shubham
• 12,270 points
110 views
0 votes
1 answer

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.