Currently, StreamsPartitionAssignor does not distinguish between different "types" of tasks. For example, a task can be stateless or have one or multiple stores.
This can lead to suboptimal task placement: assume there are 2 stateless and 2 stateful tasks and the app is running with 2 instances. To share the "store load", it would be good to place one stateless and one stateful task on each instance. Right now there is no guarantee of this, and it can happen that one instance processes both stateless tasks while the other processes both stateful tasks.
We should improve StreamsPartitionAssignor and introduce "task types", including a cost model for task placement. The cost model should consider the following parameters:
- number of stores
- number of sources/sinks
- number of processors
- regular task vs standby task
- in the case of standby tasks, which tasks have progressed the most with respect to restoration
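To make the idea concrete, here is a minimal sketch of what such a cost model could look like. It is not the existing assignor code and not a proposed design. The class name `TaskCostSketch`, the weights, and the greedy placement strategy are all assumptions made for illustration. It computes a cost per task from the parameters listed above (restoration progress of standby tasks is left out) and places the most expensive tasks first, always on the instance with the lowest accumulated cost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these names exist in Kafka Streams.
final class TaskCostSketch {
    // Assumed weights; real values would need to come from the design doc.
    static final double STORE_WEIGHT = 10.0;
    static final double SOURCE_SINK_WEIGHT = 1.0;
    static final double PROCESSOR_WEIGHT = 0.5;
    static final double STANDBY_FACTOR = 0.5; // assume standby tasks are cheaper

    static final class Task {
        final String id;
        final int stores;
        final int sourcesAndSinks;
        final int processors;
        final boolean standby;

        Task(String id, int stores, int sourcesAndSinks, int processors, boolean standby) {
            this.id = id;
            this.stores = stores;
            this.sourcesAndSinks = sourcesAndSinks;
            this.processors = processors;
            this.standby = standby;
        }

        // Weighted sum of the task's parameters, scaled down for standby tasks.
        double cost() {
            double base = STORE_WEIGHT * stores
                    + SOURCE_SINK_WEIGHT * sourcesAndSinks
                    + PROCESSOR_WEIGHT * processors;
            return standby ? base * STANDBY_FACTOR : base;
        }
    }

    // Greedy placement: most expensive task first, onto the least-loaded instance.
    static List<List<Task>> assign(List<Task> tasks, int numInstances) {
        List<List<Task>> assignment = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            assignment.add(new ArrayList<>());
        }
        double[] load = new double[numInstances];

        List<Task> sorted = new ArrayList<>(tasks);
        sorted.sort((a, b) -> Double.compare(b.cost(), a.cost())); // descending cost

        for (Task task : sorted) {
            int target = 0;
            for (int i = 1; i < numInstances; i++) {
                if (load[i] < load[target]) {
                    target = i;
                }
            }
            assignment.get(target).add(task);
            load[target] += task.cost();
        }
        return assignment;
    }
}
```

With these assumed weights, the 2 stateless plus 2 stateful example above ends up with one stateful and one stateless task per instance, because the two stateful tasks are placed first and land on different instances.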
This improvement should be backed by a design document in the project wiki (no KIP required though) as it's a fairly complex change.
There has been some additional discussion around task assignment on a related PR: https://github.com/apache/kafka/pull/5390