How to commit message offsets in Kafka for a reliable data pipeline?

0 votes
I am building a highly fault-tolerant data pipeline in which I don't want to miss any transaction or re-read any transaction. I am using Kafka for this application.

So, how can I commit the offset of every transaction after consuming it? Also, should I commit the offsets to ZooKeeper, or should I store them locally?
Jul 9, 2018 in Apache Kafka by coldcode
• 2,020 points
339 views

1 answer to this question.

0 votes

You can set auto.commit.enable to true (the newer Java consumer names this setting enable.auto.commit) so that the consumer commits offsets automatically, and set auto.commit.interval.ms to specify how often the offsets are committed. Measure the rate at which your messages are being consumed and choose the interval accordingly. Keep in mind that auto-commit alone does not guarantee that nothing is re-read: if the consumer crashes, messages consumed since the last commit can be delivered again.
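As a minimal sketch of the settings above, the following builds the consumer configuration with plain java.util.Properties, using the property names of the newer Java consumer. The class name AutoCommitConfig, the broker address localhost:9092, the group id pipeline-group, and the 5000 ms interval are assumptions for illustration and do not come from the question.

```java
import java.util.Properties;

// Sketch only: builds consumer settings using the standard library.
// The broker address and group id used in main are placeholders.
class AutoCommitConfig {

    // Returns settings for the newer Java consumer, which names the switch
    // enable.auto.commit (the old ZooKeeper-based consumer used auto.commit.enable).
    static Properties build(String bootstrapServers, String groupId, int commitIntervalMs) {
        if (commitIntervalMs <= 0) {
            throw new IllegalArgumentException("commit interval must be positive");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Let the consumer commit offsets in the background.
        props.setProperty("enable.auto.commit", "true");
        // How often, in milliseconds, offsets are committed.
        props.setProperty("auto.commit.interval.ms", Integer.toString(commitIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:9092", "pipeline-group", 5000);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties, together with key.deserializer and value.deserializer, would be passed to the KafkaConsumer constructor from the kafka-clients library.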

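Generally, a very short commit interval increases the read/write load on ZooKeeper and slows it down, because ZooKeeper is also monitoring the whole Kafka cluster and maintaining its metadata. This applies to the old consumer, which stores offsets in ZooKeeper; the newer consumer commits offsets to Kafka's internal __consumer_offsets topic instead.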
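To make the trade-off concrete, here is a rough sizing sketch. It assumes one commit per consumer per interval and a steady consumption rate; the class name CommitIntervalTradeoff and the rates used in main are made-up inputs, not measurements.

```java
// Rough sizing helper; the numbers used in main are illustrative inputs only.
class CommitIntervalTradeoff {

    // Rough estimate of messages consumed since the last auto-commit, i.e. messages
    // that may be delivered again if the consumer crashes just before the next commit.
    static long maxRedelivered(long messagesPerSecond, long commitIntervalMs) {
        return messagesPerSecond * commitIntervalMs / 1000;
    }

    // Offset commits per second for a group of consumers,
    // assuming each one commits once per interval.
    static double commitsPerSecond(int consumers, long commitIntervalMs) {
        return consumers * 1000.0 / commitIntervalMs;
    }

    public static void main(String[] args) {
        long messagesPerSecond = 2000; // assumed consumption rate
        int consumers = 10;            // assumed group size
        for (long intervalMs : new long[] {100, 1000, 5000}) {
            System.out.printf("interval=%d ms, possible redelivery=%d, commits/s=%.1f%n",
                    intervalMs,
                    maxRedelivered(messagesPerSecond, intervalMs),
                    commitsPerSecond(consumers, intervalMs));
        }
    }
}
```

A shorter interval shrinks the number of messages that might be re-read after a crash but raises the commit load, which is why the interval should be chosen from a measured consumption rate.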

answered Jul 9, 2018 by Shubham
• 13,300 points
