How to commit message offsets in Kafka for a reliable data pipeline

Question

I am building a highly fault-tolerant data pipeline in which I don’t want to miss any transaction or re-read any transaction. I am using Kafka for this application.

So I want to know: how can I commit the offset of every transaction after consuming it? I am also wondering whether I should commit the offsets to ZooKeeper or store them locally.

Shubham · Answer 1 · Jul 10, 2018

You can set auto.commit.enable (named enable.auto.commit in the newer Java consumer) to true so that the consumer commits offsets automatically, and set auto.commit.interval.ms to control how often those commits happen. To pick a sensible interval, measure the rate at which your messages are being consumed and set the interval accordingly.
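As a rough illustration of the two settings above, here is a minimal sketch that builds a consumer configuration as a plain Python dictionary. The key names follow the newer consumer naming (enable.auto.commit rather than auto.commit.enable). The helper name build_consumer_config, the broker address, the group name and the interval values are all assumptions made for this example, not values from the question.

```python
def build_consumer_config(bootstrap_servers, group_id, commit_interval_ms=5000):
    """Return a consumer config dict with periodic auto-commit enabled.

    build_consumer_config is a hypothetical helper for this example;
    the dict keys are standard Kafka consumer property names.
    """
    if commit_interval_ms <= 0:
        raise ValueError("commit_interval_ms must be positive")
    return {
        "bootstrap.servers": bootstrap_servers,   # assumed broker address
        "group.id": group_id,                     # assumed consumer group name
        "enable.auto.commit": True,               # let the consumer commit offsets itself
        "auto.commit.interval.ms": commit_interval_ms,  # how often offsets are committed
    }


# Example values only; tune the interval after measuring your consumption rate.
config = build_consumer_config("localhost:9092", "pipeline-consumers", 1000)
print(config)
```

A dictionary like this is the shape that Python clients such as confluent-kafka accept when creating a consumer, but check the documentation of the client you actually use before relying on the exact keys.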

Generally, a very short commit interval increases the read/write load on ZooKeeper and slows it down, because ZooKeeper is also monitoring the whole Kafka cluster and maintaining its metadata.
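To get a feel for this trade-off, the sketch below estimates, for a given consumption rate, how many commits per second an interval produces and roughly how many consumed messages could still be uncommitted (and so re-read after a consumer crash) at any moment. The function name commit_tradeoff and the rate of 2000 messages per second are assumptions for illustration, and the model is deliberately simple: it assumes a steady rate and one consumer.

```python
def commit_tradeoff(messages_per_second, commit_interval_ms):
    """Estimate commit load versus exposure for one consumer.

    Returns (commits_per_second, max_uncommitted_messages).
    Hypothetical helper; assumes a steady consumption rate.
    """
    if commit_interval_ms <= 0:
        raise ValueError("commit_interval_ms must be positive")
    # Shorter intervals mean more offset writes per second.
    commits_per_second = 1000.0 / commit_interval_ms
    # Messages consumed since the last commit may be read again after a crash.
    max_uncommitted = messages_per_second * commit_interval_ms / 1000.0
    return commits_per_second, max_uncommitted


# Compare a few candidate intervals at an assumed rate of 2000 messages/s.
for interval_ms in (100, 1000, 5000):
    commits, exposed = commit_tradeoff(2000, interval_ms)
    print(f"{interval_ms} ms: {commits:.1f} commits/s, "
          f"up to {exposed:.0f} messages re-read after a crash")
```

A shorter interval reduces the number of messages that might be re-read but multiplies the offset writes, so the measured consumption rate is what tells you where the balance lies.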

Hope this helps!


Thanks.