  Kafka / KAFKA-4682

Committed offsets should not be deleted if a consumer is still active (KIP-211)

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Kafka will delete committed offsets that are older than offsets.retention.minutes.

      If there is an active consumer on a low-traffic partition, it is possible that Kafka will delete the committed offset for that consumer. Once the offset is deleted, a restart or a rebalance of that consumer will cause the consumer to find no committed offset and to start consuming from earliest/latest (depending on auto.offset.reset). I'm not sure, but a broker failover (a broker restart, or a coordinator failover) might also cause you to start reading from auto.offset.reset.
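
      The settings interacting here are the broker-side offsets.retention.minutes (set in server.properties on the brokers; the default at the time of this discussion was 1 day) and the consumer-side auto.offset.reset. A minimal illustration of the consumer side, with made-up values:

          import java.util.Properties;
          import org.apache.kafka.clients.consumer.ConsumerConfig;

          Properties props = new Properties();
          props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
          props.put(ConsumerConfig.GROUP_ID_CONFIG, "low-traffic-group");
          // Only consulted when no committed offset exists for the group --
          // which is exactly what happens after the broker expires the offset.
          props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // or "latest"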

      I think that Kafka should only delete offsets for inactive consumers. The timer should only start after a consumer group goes inactive. For example, if a consumer group goes inactive, then after 1 week, delete the offsets for that consumer group. This is a solution that Jun Rao mentioned in https://issues.apache.org/jira/browse/KAFKA-3806?focusedCommentId=15323521&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15323521

      The current workarounds are to:

      1. Commit an offset on every partition you own on a regular basis, making sure the commit interval is shorter than offsets.retention.minutes (a broker-side setting that a consumer might not even be aware of),
        or
      2. Turn the value of offsets.retention.minutes up really, really high. You have to make sure it is longer than the longest expected gap between messages on any low-traffic topic you want to support. For example, if you want to support a topic where someone produces once a month, you would have to set offsets.retention.minutes to at least 1 month,
        or
      3. Turn on enable.auto.commit (this is essentially #1, but easier to implement).

      None of these are ideal.

      #1 can be spammy. It requires that your consumers know something about how the brokers are configured, which is sometimes out of your control. MirrorMaker, for example, only commits offsets on partitions where it receives data. And it is logic that you would have to duplicate in all of your consumers.
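
      For illustration, a minimal sketch of workaround #1 (the class and method names are made up; it assumes a KafkaConsumer driven by an ordinary poll loop): re-commit the current position on every assigned partition, whether or not it received new records, at an interval shorter than offsets.retention.minutes.

          import java.util.HashMap;
          import java.util.Map;
          import org.apache.kafka.clients.consumer.KafkaConsumer;
          import org.apache.kafka.clients.consumer.OffsetAndMetadata;
          import org.apache.kafka.common.TopicPartition;

          public class PeriodicCommitter {
              // Re-commit the current position on every assigned partition, even for
              // partitions that received no new records. Call this from the poll loop
              // more often than the broker's offsets.retention.minutes.
              public static void recommitAllPositions(KafkaConsumer<?, ?> consumer) {
                  Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                  for (TopicPartition tp : consumer.assignment()) {
                      offsets.put(tp, new OffsetAndMetadata(consumer.position(tp)));
                  }
                  consumer.commitSync(offsets);
              }
          }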

      #2 has a disk-space impact on the broker (in __consumer_offsets) as well as a memory impact on the broker (the offsets cache used to answer OffsetFetch requests).

      #3 has, I think, the potential for message loss (the consumer might commit offsets for messages that have not yet been fully processed).


          Activity

          Jeff Widman added a comment - edited

          Now that consumers have a background heartbeat thread, it should be much easier to identify whether a consumer is dead or alive, so this makes sense to me. However, this would make KAFKA-2000 more important, because you can't count on offsets expiring.

          We also had a production problem where a couple of topics' log files were totally cleared, but the offsets weren't cleared, so we had negative lag where the consumer offset was higher than the broker high watermark. This was with ZooKeeper offset storage, but regardless, I could envision something getting screwed up, or someone resetting a cluster without understanding what they're doing and leaving the offsets screwed up. If this were implemented, those old offsets would never go away unless manually cleaned up as well. So I'd want to make sure that's protected against somehow... like if a broker ever encounters a consumer offset that's higher than the high watermark, either an exception is thrown or those consumer offsets get reset to the broker high watermark. Probably safest to just throw an exception in case something else funky is going on.
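
          For illustration only, a hypothetical client-side version of that check, using the existing committed()/endOffsets() consumer APIs (whether the broker itself should throw or reset is exactly the open question raised here):

              import java.util.Map;
              import java.util.Set;
              import org.apache.kafka.clients.consumer.KafkaConsumer;
              import org.apache.kafka.clients.consumer.OffsetAndMetadata;
              import org.apache.kafka.common.TopicPartition;

              public class NegativeLagCheck {
                  // Fail fast if any committed offset is ahead of the broker's log end offset.
                  public static void check(KafkaConsumer<?, ?> consumer, Set<TopicPartition> partitions) {
                      Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
                      for (TopicPartition tp : partitions) {
                          OffsetAndMetadata committed = consumer.committed(tp);
                          if (committed != null && committed.offset() > endOffsets.get(tp)) {
                              throw new IllegalStateException("Committed offset " + committed.offset()
                                      + " for " + tp + " is beyond the log end offset " + endOffsets.get(tp));
                          }
                      }
                  }
              }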

          Jason Gustafson added a comment -

          I think the suggestion makes sense. If we did that, I wonder if it is still necessary to enforce each offset's retention time individually. It might make more sense to drop the retention_time field from the offset commit request (which we don't use in the new consumer anyway) and expire all offsets for the group at the same time. I'm not sure if we would then need to provide another avenue to override the collective expiration time though (seems few people need it?). A change to the protocol would require a short KIP, so another way would be to continue allowing offset expiration individually, but don't start the timeout until the group is empty.

          One minor issue is the handling of simple consumers, but perhaps we can just let the expiration timer reset after every new offset commit.

          James Cheng added a comment -

          I think that if we decide to start the timer when the partition no longer has an active consumer, then we no longer need an expiration time on the individual commit.

          You said "so another way would be to continue allowing offset expiration individually, but don't start the timeout until the group is empty". I'm not sure that will work. According to the code and protocol doc, the offset in the API call carries a commit time, as well as an expiration time (calculated upon receipt). The expiration time is an absolute "expire at so-and-so time" value, and there's no way to compare that against the "time the partition was last assigned to an active consumer".

          I suppose you could take (expiration_time - commit_time) and use that to calculate a duration, and then start the timer for that duration when the partition loses an active consumer. That would work, but it'd be very roundabout.
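
          In other words (variable names are illustrative, not actual broker code):

              // Recover the originally requested retention from the stored timestamps,
              // then restart the clock from the moment the partition lost its active consumer.
              long retentionMs = expirationTimestamp - commitTimestamp;
              long expireAt = lostActiveConsumerTimestamp + retentionMs;
              boolean expired = System.currentTimeMillis() >= expireAt;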

          What do you mean by "expire all offsets for the group at the same time"? And "collective expiration time"?

          Jason Gustafson added a comment -

          Yeah, the hack you mentioned is what I had in mind. We could also just change the message format so that instead of storing the commit expiration times, we only store the retention time. Obviously I would prefer the KIP route to take the retention time out of the OffsetCommit API (since it's not used anyway). Are you interested in taking this on?

          By "collective expiration time," I was referring to the duration before which all offsets are expired. I'm unsure whether users would want to be able to control this, or whether a single global config on the broker would be sufficient

          Vahid Hashemian added a comment -

          Jason Gustafson et al.
          I'm interested in taking this on. Just to make sure I understand the proper approach (KIP) before starting to write the draft:

          We don't want to expire group offsets while the group is active, so we would want to phase out individual offset expirations inside the group (that would mean removing the unused retention time field from the OffsetCommit protocol). On the other hand, we now seem to need some sort of expiration time per consumer group, so we can remove the group's offsets once that expiration time has passed and the group is no longer active. This expiration time is set, and takes effect, only when the group becomes empty.

          Is this a reasonable summary of what needs to happen?

          James Cheng added a comment -

          Yes, that's a reasonable summary.

          Other details to consider:

          • Jason Gustafson asked whether we have a single broker-level config for the expiration time that applies to all groups, or if we need to allow individual consumer groups to specify their own expiration time. In the discussion for KIP-186, Jason Gustafson raised the same question. I'm not sure.
          • Offsets are normally saved per (partition, groupId). Do we want to allow offsets to be expired for individual partitions separately from the group? As an example, say I have a groupId="foo" that commits for (Topic A, partition 0) and (Topic B, partition 0). And then groupId stops subscribing to Topic B, and only subscribes to (Topic A, partition 0). Should the offset for (Topic B, partition 0) stay around as long as the group is active? Or, should it be expired, since it is not part of the group anymore?
          Vahid Hashemian added a comment - edited

          James Cheng Thanks for your feedback. Regarding the other details you brought up:

          • Jason Gustafson's suggestion on KIP-186 makes sense to me. The OffsetCommit API can be used to override the default broker-level property offsets.retention.minutes for specific group/topic/partitions. This means we probably wouldn't need a group-level retention config. What a potential KIP for this JIRA would add is that the retention timer kicks off the moment the group becomes empty, and while the group is stable no offset will be removed (as the retention timer is not ticking yet).
          • Regarding your second point, I guess we could pick either method. It all would depend on the criteria for triggering the retention timer for a partition. If we trigger it when the group is empty (as in the previous bullet) then we would be expiring the offset for B-0 with all other group partitions. If, on the other hand, we decide to trigger the timer when the partition stops being consumed within the group, then B-0's offset could expire while the group is still active. I'm not sure how common this scenario is in real applications. If it's not that common perhaps it wouldn't cost a lot to keep B-0's offsets around with the rest of the group. In any case, we should be able to pick one approach or the other depending on what you and others believe is more reasonable.

          What do you think? Jason Gustafson, what are your thoughts on this?

          Jason Gustafson added a comment -

          Sorry for the late response. There seem to be a few open questions:

          1. Does the consumer need the ability to control the retention timeout or is a broker config sufficient? I am not too sure about this. There is at least one use case (ConsoleConsumer) where we might intentionally set a low value, but I'm not sure how bad it would be to let it stick with the default. It certainly would have been helpful prior to Ewen's KIP.

          2. Do we still need offset-level expiration or should we move it to the group? Personally, it feels a little odd to expire offsets at different times once a group is empty. It's a little more intuitive to expire them all at once. Another way to view this would be that we deprecate the offset retention setting and add a group metadata retention setting. Once the group has gone empty, we start its retention timer. If it expires, we clear all of its state including offsets.

          3. Do we need to change the format of the offset metadata messages? Currently the offset metadata that is stored in the log includes an expiration timestamp. This won't make much sense any more because we won't know what timestamp to use when the offset is first stored. While we're at it, we could probably also remove the commit timestamp and use the timestamp from the message itself. This also depends on the answer to the first question. (A rough before/after sketch follows this list.)

          4. Should we start the expiration timer for an individual offset if the group is no longer subscribed to the corresponding topic? My inclination is to keep it simple and say no, but I guess there is a risk that this tends to grow the cache more than existing behavior. If we're concerned about this, then we probably need to keep the individual offset expiration timer. Unfortunately because of the generic group protocol (which is also used in Connect), we don't currently have the ability to inspect subscriptions to know if a topic is still subscribed.
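
          A rough before/after sketch for question 3 (field names are assumptions drawn from this discussion, not the actual on-disk schema):

              // Today: the stored offset commit value carries its own timestamps.
              class OffsetCommitValueToday {
                  long offset;
                  String metadata;
                  long commitTimestamp;  // when the offset was committed
                  long expireTimestamp;  // meaningless if expiration only starts when the group goes empty
              }

              // Proposed: drop both timestamps; the commit time comes from the message's own
              // timestamp, and expiration is driven by a group-level retention timer.
              class OffsetCommitValueProposed {
                  long offset;
                  String metadata;
              }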

          Vahid Hashemian added a comment - edited

          Jason Gustafson Thank you for your comments. You seem to be looking at this with an inclination to get rid of the retention time from the OffsetCommit protocol. I think with my comments below I'm considering the alternative:

          1. Ewen's KIP proposes to increase the default retention from 1 day to 7 days. So, allowing consumers to set a lower timeout (for the console consumer) seems to be helpful after his KIP; the same way allowing them to set a higher timeout (for actual consumer applications) is helpful before his KIP.
          2. Even if we have offset-level expiration, all offsets in the group should expire together, because the expiration timer starts ticking for all partitions at the same time (when the group becomes empty). The only exception is when a consumer has set a non-default retention time for particular partitions (e.g. using the OffsetCommit API).
          3. Agreed. The expiration timestamp won't make sense. Perhaps the retention time should be stored, and whether to expire or not could be calculated on the fly from the time the group becomes empty plus the retention time (we would need to somehow keep the timestamp of the group becoming empty). This expiration check needs to be performed only if the group is empty; otherwise there is no need to expire at all. (See the sketch at the end of this comment.)
          4. I don't have a strong feeling about this. It's for sure simpler to let all offsets expire at the same time. And if we keep the individual offset retention it would be easier to change this in case the cache size becomes an issue.

          I think there is a risk involved in removing the individual retention from the protocol: could some requirement arise in the future that makes us bring it back to the protocol? One option is to let that field stay for now, and remove it later once we are more certain that it won't be needed back.
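
          To make point 3 above concrete, a minimal sketch of the check (names and signature are assumptions, not broker code):

              // An offset is only eligible for expiration once its group is empty, and the
              // timer is measured from the moment the group became empty, not the commit time.
              boolean shouldExpire(boolean groupIsEmpty, long groupEmptyTimestampMs,
                                   long retentionMs, long nowMs) {
                  if (!groupIsEmpty)
                      return false;  // never expire offsets of an active group
                  // retentionMs is the per-partition override from OffsetCommit if one was
                  // given, otherwise the broker's offsets.retention.minutes converted to ms.
                  return nowMs >= groupEmptyTimestampMs + retentionMs;
              }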

          Vahid Hashemian added a comment -

          Jason Gustafson I have started drafting a KIP for the changes discussed here. Could you please clarify what you mean by

          ... we could probably also remove the commit timestamp and use the timestamp from the message itself. ...

          I see that the commit timestamp is set to the time the request is processed (which supposedly is when the offset is committed). So I'm not clear what you mean by "timestamp from the message itself".
          Thanks.

          Vahid Hashemian added a comment -

          I just started a KIP discussion for this JIRA: KIP-211.

          Drew Kutcharian added a comment -

          This just happened to us and I just stumbled upon this JIRA while trying to figure out the cause. A few questions:

          1. Aren't consumer offset topics compacted? Shouldn't at least the last entry stay on disk after cleanup?

          2. Considering that they are compacted, what is the real concern with workaround 2 in the description: "2. Turn the value of offsets.retention.minutes up really really high"?

          3. As a workaround, would it make sense to set offsets.retention.minutes so it matches log.retention.ms, and auto.offset.reset to earliest? That way consumers and logs would "reset" at the same time?

          4. Is there a timeline for the release of KIP-211?

          John Crowley added a comment -

          Just found this entry - had previously commented on https://issues.apache.org/jira/browse/KAFKA-3806

          Is it possible to allow the offsets.retention.minutes to be set per groupId (in a similar way that retention.ms can be set per topic)?

          This would allow a fairly short default - 1 day as is current - to remove abandoned groupId metadata yet allow the user to indicate that a particular groupId should be handled differently. Example in 3806 was a PubSub using Kafka as a persistent, reliable store supporting multiple subscribers. Some of the source data has very low volatility - e.g. next year's holiday calendar for a company, which probably only changes once a year. A consumer must still poll in case an error update is posted, but will in the normal case not do a real commit for 12 months!


            People

            • Assignee: Vahid Hashemian
            • Reporter: James Cheng
            • Votes: 6
            • Watchers: 13
