KAFKA-262

Bug in the consumer rebalancing logic causes one consumer to release partitions that it does not own

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.7.1
    • Component/s: core
    • Labels: None

    Description

      The consumer maintains a cache of the topics and partitions it owns, along with the corresponding fetcher queues. This cache is not cleared when the consumer releases partition ownership. As a result, the consumer can later release a partition that it no longer owns, and it can also commit offsets for partitions that it no longer consumes from.
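      The stale-cache problem described above can be illustrated with a toy model. This is only a sketch, not Kafka's actual consumer code: the `Consumer` class, the `topic_registry` set and the `zk_owners` dictionary (a stand-in for the ownership entries in ZooKeeper) are all hypothetical names, and clearing the cache on release is shown as one plausible fix rather than as the content of the attached patch.

```python
class Consumer:
    """Toy model of a consumer; hypothetical, not Kafka's real classes."""

    def __init__(self, name, zk_owners):
        self.name = name
        self.zk_owners = zk_owners    # stand-in for ZooKeeper: partition -> owner name
        self.topic_registry = set()   # local cache of partitions this consumer believes it owns

    def claim(self, partition):
        self.zk_owners[partition] = self.name
        self.topic_registry.add(partition)

    def release_buggy(self):
        # Removes the ownership entry for every cached partition,
        # but leaves the cache itself untouched.
        for partition in self.topic_registry:
            self.zk_owners.pop(partition, None)

    def release_fixed(self):
        # Same as above, then forgets the released partitions.
        for partition in self.topic_registry:
            self.zk_owners.pop(partition, None)
        self.topic_registry.clear()


zk = {}
c1 = Consumer("c1", zk)
c1.claim("0-1")
c1.release_buggy()
stale = set(c1.topic_registry)    # the cache still lists 0-1 after release

zk_fixed = {}
c1_fixed = Consumer("c1", zk_fixed)
c1_fixed.claim("0-1")
c1_fixed.release_fixed()
fresh = set(c1_fixed.topic_registry)    # the cache is empty after release
```

      With the buggy release, any later release or offset commit driven by `topic_registry` still acts on partition 0-1, even though the consumer gave it up.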

      The rebalance operation goes through the following steps:

      1. Close fetchers.
      2. Commit offsets.
      3. Release partition ownership.
      4. Rebalance: add topics, partitions and fetcher queues to the topic registry for all topics that the consumer process currently wants to own.
      5. If the consumer runs into a conflict for any topic or partition, the rebalancing attempt fails and it goes back to step 1.
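      The five steps above amount to a retry loop. The sketch below shows only the control flow; the four callbacks and the `max_retries` limit are assumptions introduced for illustration and do not correspond to named functions or settings in the Kafka code base.

```python
def rebalance_with_retries(close_fetchers, commit_offsets, release_ownership,
                           try_claim, max_retries=4):
    """Run steps 1-4 repeatedly until try_claim() succeeds.

    Returns the number of attempts used. All parameters are hypothetical
    callbacks standing in for the real consumer operations.
    """
    for attempt in range(1, max_retries + 1):
        close_fetchers()        # step 1
        commit_offsets()        # step 2
        release_ownership()     # step 3
        if try_claim():         # step 4: True when every partition was claimed
            return attempt
        # step 5: a conflict occurred, so start again from step 1
    raise RuntimeError("rebalance failed after %d attempts" % max_retries)


# Usage: the first claim attempt hits a conflict, the second succeeds.
log = []
claim_results = iter([False, True])

attempts = rebalance_with_retries(
    close_fetchers=lambda: log.append("close"),
    commit_offsets=lambda: log.append("commit"),
    release_ownership=lambda: log.append("release"),
    try_claim=lambda: next(claim_results),
)
```

      Note that steps 2 and 3 run on every attempt, so whatever the consumer's cache lists at that moment is committed and released again on each retry.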

      Say there are 2 consumers in a group, c1 and c2. Both are consuming topic1, which has partitions 0-0, 0-1 and 1-0. Say c1 owns 0-0 and 0-1, and c2 owns 1-0.

      1. Broker 1 goes down. This triggers a rebalancing attempt in both c1 and c2.
      2. c1 releases partition ownership and then, during step 4 (above), fails to rebalance.
      3. Meanwhile, c2 completes rebalancing successfully, owns partition 0-1 and starts consuming data.
      4. c1 starts its next rebalancing attempt. During step 3 (above), it releases partition 0-1, which its stale cache still lists even though c2 now owns it. During step 4, it owns partition 0-0 again and starts consuming data.
      5. Effectively, rebalancing has completed successfully, but there is no owner for partition 0-1 registered in ZooKeeper.
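      The scenario above can be replayed as a small simulation. It is a simplified sketch under stated assumptions: `zk_owners` is a plain dictionary standing in for the ownership entries in ZooKeeper, `c1_cache` stands in for c1's topic registry, the loss of partition 1-0 is modeled as a single removal, and the `clear_cache_on_release` flag is a hypothetical switch used to compare the reported behavior with one plausible fix.

```python
def run_scenario(clear_cache_on_release):
    """Replay the c1/c2 scenario and return the final partition -> owner map."""
    zk_owners = {"0-0": "c1", "0-1": "c1", "1-0": "c2"}
    c1_cache = {"0-0", "0-1"}

    def c1_release():
        # c1 removes the ownership entry for every partition in its cache.
        for partition in list(c1_cache):
            zk_owners.pop(partition, None)
        if clear_cache_on_release:
            c1_cache.clear()

    # 1. Broker 1 goes down; partition 1-0 is no longer available (simplified).
    zk_owners.pop("1-0")

    # 2. c1 releases ownership, then fails to claim anything.
    c1_release()

    # 3. c2 rebalances successfully and claims 0-1.
    zk_owners["0-1"] = "c2"

    # 4. c1 retries: it releases whatever its cache lists, then claims 0-0.
    c1_release()
    zk_owners["0-0"] = "c1"
    c1_cache.add("0-0")

    # 5. Final ownership as registered in the ZooKeeper stand-in.
    return zk_owners


buggy = run_scenario(clear_cache_on_release=False)
fixed = run_scenario(clear_cache_on_release=True)
```

      In the buggy run, the second release deletes c2's entry for 0-1, so the partition ends up with no registered owner. In the run that clears the cache, c1's second release is a no-op and both partitions keep an owner.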

      Attachments

        1. kafka-262.patch
          25 kB
          Neha Narkhede
        2. kafka-262-v3.patch
          24 kB
          Neha Narkhede

          People

            Assignee: Neha Narkhede
            Reporter: Neha Narkhede

              Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified