Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Later
-
1.13.0
-
None
Description
Based on the scheduler benchmark introduced in FLINK-21731, we find that there's a procedure related to DefaultScheduler#allocateSlots that has O(N^2) complexity, which is: ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers.
The original implementation is:
for all SchedulingExecutionVertex in DefaultScheduler: for all ConsumedPartitionGroup of the SchedulingExecutionVertex: for all IntermediateResultPartition in the ConsumedPartitionGroup: get producer of the IntermediateResultPartition
This procedure has O(N^2) complexity.
We can see that for each SchedulingExecutionVertex, the producers of its ConsumedPartitionGroup is calculated separately. For the SchedulingExecutionVertices in the same ConsumerVertexGroup, they have the same ConsumedPartitionGroup. Therefore, we don't need to calculate the producers over and over again. We can use a local cache to cache the producers. This will decrease the complexity from O(N^2) to O(N).
Attachments
Issue Links
- links to