KAFKA-262

Bug in the consumer rebalancing logic causes one consumer to release partitions that it does not own

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.7.1
    • Component/s: core
    • Labels:
      None

      Description

      The consumer maintains a cache of the topics and partitions it owns, along with the corresponding fetcher queues. However, this cache is not cleared when the consumer releases partition ownership. As a result, the consumer can release a partition that it no longer owns. It can also commit offsets for partitions that it no longer consumes from.

      The rebalance operation goes through the following steps:

      1. Close fetchers.
      2. Commit offsets.
      3. Release partition ownership.
      4. Rebalance: add topics, partitions and fetcher queues to the topic registry for all topics that the consumer process currently wants to own.
      5. If the consumer runs into a conflict for any topic or partition, the rebalancing attempt fails and it goes back to step 1.
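
The steps above can be sketched as a retry loop. This is a minimal illustration, not the Kafka implementation: `rebalance_with_retries`, `claim`, `release`, `ConflictError` and `max_retries` are hypothetical names, closing fetchers and committing offsets are omitted, and clearing the cache right after the release step is an assumed reading of what the description implies, not something taken from the attached patches.

```python
from typing import Callable, Dict, Set, Tuple

# (topic, "broker-partition"), e.g. ("topic1", "0-1"); hypothetical representation
Partition = Tuple[str, str]


class ConflictError(Exception):
    """Hypothetical: raised when a partition to be claimed already has an owner."""


def rebalance_with_retries(
    topic_registry: Dict[Partition, list],
    claim: Callable[[], Dict[Partition, list]],
    release: Callable[[Set[Partition]], None],
    max_retries: int = 4,
) -> bool:
    """Run steps 3-5; steps 1-2 (close fetchers, commit offsets) are omitted."""
    for _ in range(max_retries):
        # Step 3: release whatever the local cache says this consumer owns.
        release(set(topic_registry))
        # Assumed fix: forget the released partitions so that a retry
        # cannot release them a second time.
        topic_registry.clear()
        try:
            # Step 4: claim partitions and record them with their fetcher queues.
            topic_registry.update(claim())
            return True
        except ConflictError:
            # Step 5: conflict, so start the next attempt.
            continue
    return False
```

Without the `topic_registry.clear()` line, a second attempt would pass the same partitions to `release` again, even though they were already given up in the first attempt.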

      Say there are two consumers in a group, c1 and c2. Both consume topic1, which has partitions 0-0, 0-1 and 1-0. c1 owns 0-0 and 0-1, and c2 owns 1-0.

      1. Broker 1 goes down. This triggers a rebalancing attempt in c1 and c2.
      2. c1 releases partition ownership and then, during step 4 (above), fails to rebalance.
      3. Meanwhile, c2 completes rebalancing successfully, owns partition 0-1 and starts consuming data.
      4. c1 starts its next rebalancing attempt. During step 3 (above), it releases partition 0-1 from its stale cache, although c2 now owns that partition. During step 4, it owns partition 0-0 again and starts consuming data.
      5. Effectively, rebalancing has completed successfully, but there is no owner for partition 0-1 registered in ZooKeeper.
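
The scenario above can be replayed with a small simulation. It is only a sketch under stated assumptions: a plain dictionary stands in for the ownership registry in ZooKeeper, release is modeled as deleting an ownership entry without checking who holds it, and `Consumer`, `run_scenario` and `clear_cache_on_release` are hypothetical names that do not come from the Kafka code base.

```python
from typing import Dict, Iterable, Set


class Consumer:
    """Hypothetical consumer holding a local cache of the partitions it owns."""

    def __init__(self, name: str, owned: Iterable[str], clear_cache_on_release: bool):
        self.name = name
        self.cache: Set[str] = set(owned)
        self.clear_cache_on_release = clear_cache_on_release

    def release(self, owners: Dict[str, str]) -> None:
        # Assumption: the ownership entry is deleted without checking the owner.
        for partition in self.cache:
            owners.pop(partition, None)
        if self.clear_cache_on_release:
            self.cache.clear()  # assumed fix: forget what was just released

    def claim(self, owners: Dict[str, str], partitions: Iterable[str]) -> None:
        for partition in partitions:
            owners[partition] = self.name
            self.cache.add(partition)


def run_scenario(clear_cache_on_release: bool) -> Dict[str, str]:
    # Initial state: c1 owns 0-0 and 0-1, c2 owns 1-0.
    owners = {"0-0": "c1", "0-1": "c1", "1-0": "c2"}
    c1 = Consumer("c1", ["0-0", "0-1"], clear_cache_on_release)
    c2 = Consumer("c2", ["1-0"], clear_cache_on_release)

    # 1. Broker 1 goes down; both consumers start rebalancing.
    # 2. c1 releases its partitions, then fails in step 4 and claims nothing.
    c1.release(owners)
    # 3. c2 completes its rebalance and now owns 0-1.
    c2.release(owners)
    c2.claim(owners, ["0-1"])
    # 4. c1 retries: it releases whatever is in its cache, then claims 0-0.
    c1.release(owners)
    c1.claim(owners, ["0-0"])
    return owners


buggy = run_scenario(clear_cache_on_release=False)
fixed = run_scenario(clear_cache_on_release=True)
```

With the stale cache, `buggy` ends up with an entry for 0-0 only, so partition 0-1 has no registered owner. With the cache cleared on release, `fixed` keeps c2 as the owner of 0-1.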

      1. kafka-262.patch
        25 kB
        Neha Narkhede
      2. kafka-262-v3.patch
        24 kB
        Neha Narkhede

          People

          • Assignee: Neha Narkhede
          • Reporter: Neha Narkhede

              Time Tracking

              • Original Estimate: 24h
              • Remaining Estimate: 24h
              • Time Spent: Not Specified
