Kafka
  1. Kafka
  2. KAFKA-346

Don't call commitOffsets() during rebalance

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7, 0.7.1, 0.8.0
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None

      Description

      A sample use-case

      If I read the source correctly then offsets can be committed at any
      time (whenever there is a change in consumer or broker zk registry).
      Our application doesn't use auto-commit in order to batch some
      messages together, process them and if everything went fine, we call
      commitOffsets(). If, for any reason, the processing of messages does
      not succeed, we rely on Kafka's promise to re-deliver the messages.

      But if ZKRebalancerListener triggers a rebalance before our "batch" of
      messages is full, then offsets will be committed even if the messages
      have not been processed yet by our application. So if then processing
      of these messages fails, we basically lost them, right?

      (See discussion at http://www.mail-archive.com/kafka-users@incubator.apache.org/msg01415.html)

      1. KAFKA-346.patch
        11 kB
        Peter Romianowski

        Issue Links

          Activity

          Hide
          Peter Romianowski added a comment -

          I could now write a patch that - as Jun suggested - does not commit offsets, if autoCommit has been turned off. Together with KAFKA-345 that adds a listener for rebalance-events, we could simply use a listener and throw away all messages in our batch upon a rebalance. That would work fine for us.

          My question is: Do you think this is the behavior others would expect? Even though the current behavior is broken IMHO, the new one would introduce quite some duplicate messages which have to be handled by the application.

          Are there better ideas on how to fix this?

          Show
          Peter Romianowski added a comment - I could now write a patch that - as Jun suggested - does not commit offsets, if autoCommit has been turned off. Together with KAFKA-345 that adds a listener for rebalance-events, we could simply use a listener and throw away all messages in our batch upon a rebalance. That would work fine for us. My question is: Do you think this is the behavior others would expect? Even though the current behavior is broken IMHO, the new one would introduce quite some duplicate messages which have to be handled by the application. Are there better ideas on how to fix this?
          Hide
          Jay Kreps added a comment -

          Personally I think this makes sense. If you set auto-commit we commit for you. If you say no to auto-commit you handle commits.

          Show
          Jay Kreps added a comment - Personally I think this makes sense. If you set auto-commit we commit for you. If you say no to auto-commit you handle commits.
          Hide
          Peter Romianowski added a comment -

          Added a patch against trunk. In order to compile KAFKA-345 has to be applied.

          Show
          Peter Romianowski added a comment - Added a patch against trunk. In order to compile KAFKA-345 has to be applied.
          Hide
          Gwen Shapira added a comment -

          This is already fixed on trunk.

          Show
          Gwen Shapira added a comment - This is already fixed on trunk.

            People

            • Assignee:
              Unassigned
              Reporter:
              Peter Romianowski
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development