Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-8767

Optimize StickyAssignor for Cooperative mode

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.4.0
    • None
    • clients, consumer
    • None

    Description

      In some rare cases, the StickyAssignor will fail to balance an assignment without violating stickiness despite a balanced and sticky assignment being possible. The implications of this for cooperative rebalancing are that an unnecessary additional rebalance will be triggered.

      This was seen to happen for example when each consumer is subscribed to some random subset of all topics and all their subscriptions change to a different random subset, as occurs in AbstractStickyAssignorTest#testReassignmentWithRandomSubscriptionsAndChanges.

      The initial assignment after the random subscription change obviously involved migrating partitions, so following the cooperative protocol those partitions are removed from the balanced first assignment, and a second rebalance is triggered. In some cases, during the second rebalance the assignor was unable to reach a balanced assignment without migrating a few partitions, even though one must have been possible (since the first assignment was balanced). A third rebalance was needed to reach a stable balanced state.

      Under the conditions in the previously mentioned test (between 20-40 consumers, 10-20 topics (with 0-20 partitions) this third rebalance was required roughly 30% of the time. Some initial improvements to the sticky assignment logic reduced this to under 15%, but we should consider closing this gap and optimizing the cooperative sticky assignment

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            ableegoldman A. Sophie Blee-Goldman
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: