Kafka / KAFKA-1006

Consumer loses messages of a new topic with auto.offset.reset = largest

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The consumer currently uses auto.offset.reset = largest by default. When a new topic is created, the consumer's topic watcher fires. The consumer first finishes partition reassignment as part of the rebalance and only then starts consuming from the tail of each partition. Any messages the server appends to the new topic before the reassignment completes are never consumed. Thus, multiple batches of messages may be lost when a topic is newly created.

      The fix is to start consuming from the earliest offset for newly created topics.
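
The race described above can be illustrated with a toy model. This is only a sketch, not Kafka client code: the function names (first_fetch_offset, missed_messages) are hypothetical, and it assumes a single partition whose log starts at offset 0 and a consumer with no committed offset.

```python
# Toy model of one partition of a newly created topic; not Kafka client code.
# All names here are hypothetical and exist only for illustration.

def first_fetch_offset(log_end_offset, reset_policy):
    """Offset a consumer with no committed offset starts reading from."""
    if reset_policy == "largest":
        return log_end_offset  # tail of the log at the time the consumer starts
    if reset_policy == "smallest":
        return 0  # assumed earliest offset of a brand-new partition
    raise ValueError(f"unknown reset policy: {reset_policy}")


def missed_messages(appended_during_rebalance, reset_policy):
    """Messages appended before the rebalance finished that are never read."""
    log = [f"m{i}" for i in range(appended_during_rebalance)]
    start = first_fetch_offset(len(log), reset_policy)
    return log[:start]


print(missed_messages(3, "largest"))   # ['m0', 'm1', 'm2']
print(missed_messages(3, "smallest"))  # []
```

With largest, everything appended while the rebalance was in progress sits below the starting offset and is skipped; with smallest, nothing is skipped.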

        Activity

        Swapnil Ghike created issue -
        Swapnil Ghike made changes -
        Field: Description
        Original Value:
          Mirror maker currently uses auto.offset.reset = largest on the consumer side by default. If a new topic is created, consumer's topic watcher is fired. The consumer will first finish partition reassignment as part of rebalance and then start consuming from the tail of each partition. Until the partition reassignment is over, the server may have appended new messages to the new topic, mirror maker won't consume these messages. Thus, multiple batches of messages may be lost when a topic is newly created.

          The fix is to start consuming from the earliest offset for newly created topics.
        New Value:
          Consumer currently uses auto.offset.reset = largest by default. If a new topic is created, consumer's topic watcher is fired. The consumer will first finish partition reassignment as part of rebalance and then start consuming from the tail of each partition. Until the partition reassignment is over, the server may have appended new messages to the new topic, consumer won't consume these messages. Thus, multiple batches of messages may be lost when a topic is newly created.

          The fix is to start consuming from the earliest offset for newly created topics.
        Swapnil Ghike made changes -
        Field: Summary
        Original Value: Mirror maker loses messages of a new topic
        New Value: Consumer loses messages of a new topic with auto.offset.reset = largest
        Guozhang Wang made changes -
        Field: Assignee
        New Value: Guozhang Wang [ guozhang ]
        Guozhang Wang added a comment -

        I propose the following fix:

        1. Add one more property to ConsumerConfig alongside auto.offset.reset, named new.topic.offset.reset, which can be either largest or smallest and defaults to smallest.

        2. In handleTopicEvent, when a new topic is added, record it in a list.

        3. In handleOffsetOutOfRange, if the topic is recorded as a new topic, use the new config; otherwise use the global config.

        4. The list will be checked and cleared when offsets are committed.
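
The four steps above could be sketched roughly as follows. This is a hedged illustration in Python rather than the consumer's actual code: the class OffsetResetPolicy and its method names are hypothetical stand-ins for handleTopicEvent, handleOffsetOutOfRange and the commit path, a set is used in place of the proposed list, and step 4 is interpreted (as an assumption) as "a topic stops being new once an offset has been committed for it".

```python
# Sketch of the proposed bookkeeping; hypothetical names, not Kafka code.

class OffsetResetPolicy:
    def __init__(self, auto_offset_reset="largest",
                 new_topic_offset_reset="smallest"):
        # Step 1: a second property next to auto.offset.reset.
        self.auto_offset_reset = auto_offset_reset
        self.new_topic_offset_reset = new_topic_offset_reset
        self.new_topics = set()

    def handle_topic_event(self, known_topics, current_topics):
        # Step 2: remember topics that appeared since the last event.
        self.new_topics |= set(current_topics) - set(known_topics)

    def handle_offset_out_of_range(self, topic):
        # Step 3: new topics use the new config, all others the global one.
        if topic in self.new_topics:
            return self.new_topic_offset_reset
        return self.auto_offset_reset

    def commit_offsets(self, committed_topics):
        # Step 4 (assumed semantics): once offsets are committed for a
        # topic, it is no longer treated as new.
        self.new_topics -= set(committed_topics)


policy = OffsetResetPolicy()
policy.handle_topic_event({"orders"}, {"orders", "clicks"})
print(policy.handle_offset_out_of_range("clicks"))  # smallest
print(policy.handle_offset_out_of_range("orders"))  # largest
policy.commit_offsets({"clicks"})
print(policy.handle_offset_out_of_range("clicks"))  # largest
```

Clearing the record on commit keeps the special case short-lived: after the first commit, an out-of-range offset on that topic falls back to the global auto.offset.reset setting.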

        Neha Narkhede added a comment -

        Instead of adding a new property, we can treat this as the right behavior in all cases where a consumer discovers a new topic. I think this is a fairly easy change that has a big impact on usability and we should probably fix it even before the consumer rewrite in 0.9.

        Neha Narkhede made changes -
        Field: Labels
        New Value: usability

          People

          • Assignee: Guozhang Wang
          • Reporter: Swapnil Ghike
          • Votes: 1
          • Watchers: 4

            Dates

            • Created:
              Updated:
