Details
Type: Sub-task
Status: Closed
Priority: Minor
Resolution: Won't Fix
1.5.4, 1.6.1
Description
In credit-based network flow control, the receiver calculates its required credits as the sender's backlog plus the initial credit, which is equal to the value of the parameter taskmanager.network.memory.buffers-per-channel. We add the initial credit on top of the backlog as overhead in order to reduce the chance that the sender has to wait for credits. Ideally, the sender and receiver work concurrently and do not block each other.
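The calculation described above can be sketched as follows. This is a minimal illustration, not Flink's actual code: the class and method names are hypothetical, and the assumption that the receiver only requests the buffers it does not already hold is added for the example.

```java
// Hypothetical sketch of the receiver-side credit calculation.
// Names are illustrative and are not taken from the Flink code base.
final class CreditSketch {

    // Required credits = sender backlog + initial credit, where the initial
    // credit equals taskmanager.network.memory.buffers-per-channel.
    static int requiredCredits(int senderBacklog, int initialCredit) {
        return senderBacklog + initialCredit;
    }

    // Assumption: the receiver requests only the difference between what it
    // requires and what it already has available, never a negative amount.
    static int buffersToRequest(int senderBacklog, int initialCredit, int availableBuffers) {
        int required = requiredCredits(senderBacklog, initialCredit);
        return Math.max(0, required - availableBuffers);
    }

    public static void main(String[] args) {
        // Backlog of 6, initial credit of 2, and 2 buffers already available.
        System.out.println(buffersToRequest(6, 2, 2)); // prints 6
    }
}
```

With a small initial credit, the overhead on top of the backlog is also small, so the sender is more likely to run out of credits before the next announcement arrives.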
We found a bad case in some rebalance or rescale scenarios: the outqueue usage reaches 100% on the sender side, while the inqueue usage is only about 50% or less. That means the announced credits are not enough for the sender even though the receiver still has many free credit resources, so those resources are wasted.
It would be better if we could adjust the credit overhead online to debug performance. This needs a separate parameter to define the initial credit, so that it is no longer tied to taskmanager.network.memory.buffers-per-channel.
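One way to picture the proposal is a receiver that keeps the two values apart. The sketch below is only an assumption about how that could look: `DecoupledCreditSketch`, `exclusiveBuffersPerChannel`, and `creditOverhead` are hypothetical names, and no such setting exists in Flink as far as this issue states.

```java
// Hypothetical sketch: the credit overhead is configured separately from the
// number of exclusive buffers per channel. All names here are illustrative.
final class DecoupledCreditSketch {

    // Corresponds to taskmanager.network.memory.buffers-per-channel.
    private final int exclusiveBuffersPerChannel;

    // Hypothetical separate setting for the overhead added to the backlog.
    private final int creditOverhead;

    DecoupledCreditSketch(int exclusiveBuffersPerChannel, int creditOverhead) {
        this.exclusiveBuffersPerChannel = exclusiveBuffersPerChannel;
        this.creditOverhead = creditOverhead;
    }

    int exclusiveBuffersPerChannel() {
        return exclusiveBuffersPerChannel;
    }

    // Required credits no longer depend on the exclusive buffer count.
    int requiredCredits(int senderBacklog) {
        return senderBacklog + creditOverhead;
    }

    public static void main(String[] args) {
        // Same exclusive buffers (2), but a larger overhead (8) for tuning.
        DecoupledCreditSketch receiver = new DecoupledCreditSketch(2, 8);
        System.out.println(receiver.requiredCredits(6)); // prints 14
    }
}
```

Keeping the overhead in its own field means it could be raised to announce more credits without changing how many exclusive buffers each channel reserves.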