Kafka / KAFKA-8951

Avoid unnecessary rebalances and downtime for "safe" partitions

Details

    Description

      With cooperative rebalancing, any partition that is encoded in one consumer's Subscription cannot be reassigned to a different consumer during that rebalance. The partition must first be removed from the assignment and revoked by its old owner; only then is a second rebalance triggered, during which it can be assigned to its new owner. This enforces a synchronization barrier so that no two consumers can ever own the same partition at the same time.
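
      The barrier described above can be sketched in a few lines of plain Java. This is only an illustration, not Kafka's implementation: the class `CooperativeBarrierSketch`, the method `firstRoundAssignment`, and the use of plain strings such as "t-0" for partitions are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch, not Kafka code: partitions are plain strings such as "t-0".
final class CooperativeBarrierSketch {

    /**
     * Given what each consumer reported as owned and the assignor's target
     * assignment, returns what can be handed out in THIS rebalance. A partition
     * still owned by a different consumer is withheld; it only becomes
     * assignable in a later rebalance, after its old owner has revoked it.
     */
    static Map<String, Set<String>> firstRoundAssignment(
            Map<String, Set<String>> owned,
            Map<String, Set<String>> target) {
        // Index every owned partition by its current owner.
        Map<String, String> ownerOf = new HashMap<>();
        owned.forEach((consumer, partitions) ->
                partitions.forEach(p -> ownerOf.put(p, consumer)));

        Map<String, Set<String>> result = new TreeMap<>();
        target.forEach((consumer, partitions) -> {
            Set<String> allowed = new TreeSet<>();
            for (String p : partitions) {
                String owner = ownerOf.get(p);
                // Unowned partitions and partitions the consumer already owns are safe.
                if (owner == null || owner.equals(consumer)) {
                    allowed.add(p);
                }
            }
            result.put(consumer, allowed);
        });
        return result;
    }
}
```

      In this sketch, if consumer A owns "t-0" and "t-1" and the target moves "t-1" to consumer B, B receives nothing in the first round. That gap is the downtime and extra rebalance this ticket is concerned with.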

      This leads to downtime for that partition plus a second rebalance, which may not always be necessary. In Streams, for example, the consumer pauses all partitions of an active task until the task is running (i.e. has been initialized and restored). It should be safe to give these partitions away, provided they are not resumed between sending the joinGroup request and receiving the syncGroup response.
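
      One way to picture the "not resumed between joinGroup and syncGroup" condition is a small tracker that snapshots the paused partitions when the join request goes out and refuses to resume any of them until the sync response arrives. This is a hypothetical sketch: `SafePartitionTracker` and its methods are invented for illustration and do not exist in Kafka or Kafka Streams.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Kafka Streams code: partitions are plain strings.
final class SafePartitionTracker {
    private final Set<String> paused = new HashSet<>();
    private Set<String> offeredAtJoin = Collections.emptySet();
    private boolean joinInFlight = false;

    /** Marks a partition as paused, e.g. while its task is still restoring. */
    void pause(String partition) {
        paused.add(partition);
    }

    /**
     * Called when the join request is sent. Snapshots the currently paused
     * partitions; these are the ones treated as safe to give away.
     */
    Set<String> onJoinSent() {
        offeredAtJoin = new HashSet<>(paused);
        joinInFlight = true;
        return Collections.unmodifiableSet(offeredAtJoin);
    }

    /**
     * Resumes a partition unless that would break the promise made at join
     * time. Returns false if the resume was refused.
     */
    boolean tryResume(String partition) {
        if (joinInFlight && offeredAtJoin.contains(partition)) {
            return false;
        }
        paused.remove(partition);
        return true;
    }

    /** Called when the sync response is received; the promise no longer applies. */
    void onSyncReceived() {
        joinInFlight = false;
        offeredAtJoin = Collections.emptySet();
    }
}
```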

      One proposal would be to modify two methods in the ConsumerPartitionAssignor interface:

      1) ConsumerPartitionAssignor#subscriptionUserData would be passed the set of `ownedPartitions` that will be included in the subscription, allowing the assignor to remove any that it knows are safe to give away.

      2) ConsumerPartitionAssignor#onAssignment would be passed the set of revoked partitions, allowing the assignor to remove any that it knows were already reassigned and should not trigger another rebalance.
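
      The two proposed changes above might look roughly like the following. This is a sketch under stated assumptions, not the existing API: the interface `ProposedAssignorHooks`, both method signatures, the mutable-set convention, and the class `StreamsLikeAssignor` are hypothetical, and partitions are modelled as plain strings rather than Kafka's `TopicPartition`.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

// Hypothetical shape of the proposal; these signatures are NOT the current
// ConsumerPartitionAssignor API.
interface ProposedAssignorHooks {
    // 1) Receives the mutable set of owned partitions about to be encoded.
    ByteBuffer subscriptionUserData(Set<String> topics, Set<String> ownedPartitions);

    // 2) Receives the mutable set of partitions being revoked.
    void onAssignment(Set<String> assigned, Set<String> revokedPartitions);
}

final class StreamsLikeAssignor implements ProposedAssignorHooks {
    private final Set<String> safeToGiveAway;            // e.g. paused partitions
    private final Set<String> offered = new HashSet<>(); // dropped from the last subscription

    StreamsLikeAssignor(Set<String> safeToGiveAway) {
        this.safeToGiveAway = new HashSet<>(safeToGiveAway);
    }

    @Override
    public ByteBuffer subscriptionUserData(Set<String> topics, Set<String> ownedPartitions) {
        // Drop partitions known to be safe, so the leader may reassign them immediately.
        for (String p : safeToGiveAway) {
            if (ownedPartitions.remove(p)) {
                offered.add(p);
            }
        }
        return ByteBuffer.allocate(0); // no user data in this sketch
    }

    @Override
    public void onAssignment(Set<String> assigned, Set<String> revokedPartitions) {
        // Partitions we offered up front may already have a new owner, so their
        // revocation should not trigger a follow-up rebalance.
        revokedPartitions.removeAll(offered);
        offered.clear();
    }
}
```

      Mutating the passed-in sets is only one possible convention; returning filtered copies would work equally well and is arguably less surprising for callers.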

            People

              Assignee: Unassigned
              Reporter: A. Sophie Blee-Goldman (ableegoldman)
              Votes: 0
              Watchers: 4
