How to commit message offsets in Kafka for a reliable data pipeline

0 votes
I am building a highly fault-tolerant data pipeline in which I don't want to miss any transaction or re-read any transaction. I am using Kafka for this application.

So, I want to know: how can I commit the offset of each transaction after consuming it? Also, should I commit the offsets to ZooKeeper, or should I store them locally?
Jul 10, 2018 in Apache Kafka by coldcode
• 2,080 points
2,305 views

1 answer to this question.

0 votes

You can set auto.commit.enable to true so that the consumer commits offsets automatically (the newer Java consumer names this property enable.auto.commit), and you can set auto.commit.interval.ms to control how often the offsets are committed. Measure the rate at which your messages are consumed and choose the interval accordingly: any messages consumed after the last commit will be read again if the consumer restarts, so a longer interval means more possible re-reads.

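Generally, a very short commit interval increases the read/write load on ZooKeeper and slows it down, because ZooKeeper is also tracking the whole Kafka cluster and maintaining its metadata. This applies when offsets are stored in ZooKeeper, as with the old consumer; newer consumers store committed offsets in an internal Kafka topic instead.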
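
To make the trade-off concrete, here is a rough sketch in Python of how the commit interval affects both sides. It is only back-of-the-envelope arithmetic, not a Kafka client: replay_window and commits_per_second are hypothetical helper names, and the rate of 200 messages per second and the 10 consumers are made-up numbers that you should replace with your own measurements.

```python
# Sketch only: the rates and consumer counts are illustrative assumptions,
# and replay_window / commits_per_second are hypothetical helpers.

def replay_window(messages_per_second: float, commit_interval_ms: int) -> int:
    """Rough upper bound on messages consumed since the last periodic commit.

    These are the messages a consumer could read again after a restart.
    """
    return int(messages_per_second * commit_interval_ms / 1000)


def commits_per_second(consumers: int, commit_interval_ms: int) -> float:
    """Approximate number of periodic commit rounds per second, all consumers."""
    return consumers * 1000 / commit_interval_ms


# Consumer settings named in the answer (old-consumer property names).
consumer_config = {
    "auto.commit.enable": "true",
    "auto.commit.interval.ms": "5000",
}

if __name__ == "__main__":
    configured_ms = int(consumer_config["auto.commit.interval.ms"])
    for candidate_ms in (100, 1000, configured_ms):
        print(
            f"interval={candidate_ms} ms",
            f"possible re-reads<={replay_window(200, candidate_ms)} msgs",
            f"commit rounds/s={commits_per_second(10, candidate_ms):.1f}",
        )
```

A shorter interval shrinks the number of messages that may be re-read but multiplies the commit writes, which is the overhead described above.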

Hope this helps!

Thanks.

answered Jul 10, 2018 by Shubham
• 13,490 points
