  1. Flink
  2. FLINK-10429 Redesign Flink Scheduling, introducing dedicated Scheduler component
  3. FLINK-14362

Change DefaultSchedulingResultPartition to return correct partition state

Details

    Description

      Currently, DefaultSchedulingResultPartition#getState() derives the state of a partition from the state of the partition's producer. The scheduler uses this state to make scheduling decisions.

      However, this does not correctly reflect the true state of the partition.
      For example, when a producer task switches to RUNNING but has not produced any data yet, its consumers should not be scheduled in lazy scheduling mode, so as to avoid unnecessary resource cost. However, DefaultSchedulingResultPartition will report the partition state as RUNNING, which triggers the scheduling of its consumers. As a result, some vertices may be scheduled earlier than expected with no data to consume, which wastes resources.

      I propose changing the enum values in ResultPartitionState to:

      • CREATED // partition is just created or is just reset
      • CONSUMABLE // a pipelined partition has produced data, or a blocking partition's parent result has finished. Corresponds to IntermediateResultPartition#isConsumable.

      The CONSUMABLE state is what the scheduler actually cares about when making scheduling decisions.
      The DefaultResultPartition should be fixed to return a ResultPartitionState based on the true partition state, rather than on the vertex state.
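
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Flink change: SketchResultPartition and SketchIntermediatePartition are hypothetical stand-ins for the scheduling partition and for IntermediateResultPartition, and only the two enum values and the isConsumable check come from the description.

```java
import java.util.function.BooleanSupplier;

public class PartitionStateSketch {

    /** The two states proposed in the description. */
    enum ResultPartitionState {
        CREATED,    // partition is just created or is just reset
        CONSUMABLE  // pipelined partition has data, or blocking partition's parent result finished
    }

    /** Hypothetical stand-in for IntermediateResultPartition; tracks only consumability. */
    static final class SketchIntermediatePartition {
        private boolean consumable;

        void markConsumable() { consumable = true; }

        void reset() { consumable = false; }

        boolean isConsumable() { return consumable; }
    }

    /** Hypothetical stand-in for the scheduling-side partition. */
    static final class SketchResultPartition {
        // Assumption: the state is looked up lazily from the underlying partition,
        // not derived from the producer vertex's execution state.
        private final BooleanSupplier consumableSupplier;

        SketchResultPartition(BooleanSupplier consumableSupplier) {
            this.consumableSupplier = consumableSupplier;
        }

        ResultPartitionState getState() {
            return consumableSupplier.getAsBoolean()
                    ? ResultPartitionState.CONSUMABLE
                    : ResultPartitionState.CREATED;
        }
    }

    public static void main(String[] args) {
        SketchIntermediatePartition source = new SketchIntermediatePartition();
        SketchResultPartition partition = new SketchResultPartition(source::isConsumable);

        // Producer may already be RUNNING here, but no data has been produced yet.
        System.out.println(partition.getState()); // prints CREATED

        source.markConsumable();
        System.out.println(partition.getState()); // prints CONSUMABLE

        source.reset();
        System.out.println(partition.getState()); // prints CREATED
    }
}
```

With this shape, a lazy scheduling strategy would only see CONSUMABLE once the partition itself reports that it can be consumed, so a producer that is RUNNING without output would not trigger its consumers.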

            People

              zhuzh Zhu Zhu
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m