  1. Kafka
  2. KAFKA-6039 Improve TaskAssignor to be more load balanced
  3. KAFKA-4969

State-store workload-aware StreamsPartitionAssignor


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: streams
    • Labels: None

    Description

      Currently, StreamsPartitionAssignor does not distinguish between different "types" of tasks. For example, a task can be stateless or have one or multiple stores.

      This can lead to a suboptimal task placement: assume there are 2 stateless and 2 stateful tasks and the app is running with 2 instances. To share the "store load", it would be good to place one stateless and one stateful task on each instance. Right now there is no guarantee of this, and it can happen that one instance processes both stateless tasks while the other processes both stateful tasks.

      We should improve StreamsPartitionAssignor and introduce "task types", including a cost model for task placement. We should consider the following parameters:

      • number of stores
      • number of sources/sinks
      • number of processors
      • regular task vs standby task
      • in the case of standby tasks, which tasks have progressed the most with respect to restoration
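
      To make the idea of a cost model concrete, here is a minimal sketch, not the assignor's actual implementation. The Task class, the weights in cost(), and the greedy least-loaded placement are all assumptions chosen for illustration. Restoration progress of standby tasks is not modeled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TaskCostSketch {

    /** Hypothetical task descriptor; this is not a Kafka Streams class. */
    static final class Task {
        final String id;
        final int stores;
        final int sourcesAndSinks;
        final int processors;
        final boolean standby;

        Task(String id, int stores, int sourcesAndSinks, int processors, boolean standby) {
            this.id = id;
            this.stores = stores;
            this.sourcesAndSinks = sourcesAndSinks;
            this.processors = processors;
            this.standby = standby;
        }

        /** Illustrative weights only: stores dominate, standby tasks count half. */
        double cost() {
            double base = 10.0 * stores + 2.0 * sourcesAndSinks + 1.0 * processors;
            return standby ? 0.5 * base : base;
        }
    }

    /** Greedy placement: most expensive task first, onto the least-loaded instance. */
    static Map<String, List<String>> assign(List<Task> tasks, List<String> instances) {
        Map<String, List<String>> placement = new LinkedHashMap<>();
        Map<String, Double> load = new LinkedHashMap<>();
        for (String instance : instances) {
            placement.put(instance, new ArrayList<>());
            load.put(instance, 0.0);
        }

        List<Task> sorted = new ArrayList<>(tasks);
        sorted.sort(Comparator.<Task>comparingDouble(Task::cost).reversed());

        for (Task task : sorted) {
            // Pick the instance with the lowest accumulated cost (first one wins ties).
            String target = instances.get(0);
            for (String instance : instances) {
                if (load.get(instance) < load.get(target)) {
                    target = instance;
                }
            }
            placement.get(target).add(task.id);
            load.merge(target, task.cost(), Double::sum);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(
            new Task("0_0", 1, 2, 3, false),  // stateful
            new Task("0_1", 1, 2, 3, false),  // stateful
            new Task("1_0", 0, 2, 1, false),  // stateless
            new Task("1_1", 0, 2, 1, false)); // stateless
        // Prints {A=[0_0, 1_0], B=[0_1, 1_1]}
        System.out.println(assign(tasks, List.of("A", "B")));
    }
}
```

      With these assumed weights, the 2 stateful and 2 stateless tasks from the example above end up as one of each per instance. A real cost model would need measured weights and would also have to respect existing state, so that tasks are not moved away from instances that already hold their stores.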

      This improvement should be backed by a design document in the project wiki (no KIP required though) as it's a fairly complex change.

      There has been some additional discussion of task assignment on a related PR: https://github.com/apache/kafka/pull/5390

            People

              bbejeck Bill Bejeck
              mjsax Matthias J. Sax
              Votes: 3
              Watchers: 11
