Flink / FLINK-16430

FLIP-119 Pipelined Region Scheduling

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.12.0
    • Component/s: Runtime / Coordination
    • Labels: None
    • Release Note:
      Starting with Flink 1.12, jobs are scheduled in units of pipelined regions. A pipelined region is a set of tasks connected by pipelined data exchanges. For streaming jobs that consist of multiple regions, the scheduler no longer waits for all tasks to acquire slots before it starts deploying tasks; instead, any region can be deployed once it has acquired enough slots for the tasks within it. For batch jobs, tasks are no longer assigned slots and deployed individually; instead, a task is deployed together with all other tasks in the same region once the region has acquired enough slots.
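
The region definition in the release note can be illustrated with a small sketch. It groups tasks into regions by merging the two endpoints of every pipelined data exchange (a union-find pass) and checks whether a region has enough slots to be deployed. The names PipelinedRegionSketch, Edge, computeRegions and canDeploy are hypothetical and are not Flink classes or APIs; the sketch also assumes one slot per task, which ignores slot sharing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these names exist in Flink. */
final class PipelinedRegionSketch {

    /** Hypothetical data exchange between two tasks, identified by index. */
    static final class Edge {
        final int producer;
        final int consumer;
        final boolean pipelined; // false means a blocking exchange

        Edge(int producer, int consumer, boolean pipelined) {
            this.producer = producer;
            this.consumer = consumer;
            this.pipelined = pipelined;
        }
    }

    /**
     * Groups tasks 0..taskCount-1 into regions: two tasks share a region
     * if they are connected through pipelined edges only.
     */
    static List<List<Integer>> computeRegions(int taskCount, List<Edge> edges) {
        int[] parent = new int[taskCount];
        for (int i = 0; i < taskCount; i++) {
            parent[i] = i; // every task starts in its own region
        }
        for (Edge e : edges) {
            if (e.pipelined) {
                // Merge the regions of producer and consumer.
                parent[find(parent, e.producer)] = find(parent, e.consumer);
            }
        }
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int t = 0; t < taskCount; t++) {
            byRoot.computeIfAbsent(find(parent, t), k -> new ArrayList<>()).add(t);
        }
        return new ArrayList<>(byRoot.values());
    }

    /** Union-find lookup with path halving. */
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Assumption: one slot per task, so a region needs as many slots as it has tasks. */
    static boolean canDeploy(List<Integer> region, int availableSlots) {
        return availableSlots >= region.size();
    }
}
```

For a four-task chain in which only the exchange between tasks 1 and 2 is blocking, computeRegions yields two regions, {0, 1} and {2, 3}, and each can be deployed as soon as two slots are available to it.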

    Description

      Pipelined region scheduling aims to allow batch jobs with PIPELINED data exchanges to run without the risk of encountering a resource deadlock.

      For more details, see FLIP-119.
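
The resource deadlock mentioned above can be illustrated with a toy slot pool. With per-task allocation, two pipelined regions can each grab part of the slots they need and then wait on each other forever; with all-or-nothing allocation per region, one region runs to completion and frees its slots for the next. This is a simplified sketch under stated assumptions: the class BulkSlotAllocationSketch and its methods are hypothetical, and the real scheduler's slot request and timeout handling is not modeled.

```java
/** Illustrative sketch only; not Flink's slot pool. */
final class BulkSlotAllocationSketch {

    private int freeSlots;

    BulkSlotAllocationSketch(int totalSlots) {
        this.freeSlots = totalSlots;
    }

    /** Per-task allocation: takes a single slot if one is free. */
    boolean tryAllocateOne() {
        if (freeSlots == 0) {
            return false;
        }
        freeSlots--;
        return true;
    }

    /** Region-level allocation: takes all requested slots or none of them. */
    boolean tryAllocateBulk(int slotsNeeded) {
        if (slotsNeeded > freeSlots) {
            return false; // leave the pool untouched so another region can proceed
        }
        freeSlots -= slotsNeeded;
        return true;
    }

    /** Returns slots to the pool, for example when a region finishes. */
    void release(int slots) {
        freeSlots += slots;
    }

    int freeSlots() {
        return freeSlots;
    }

    public static void main(String[] args) {
        // Two regions A and B, each with a pipelined producer and consumer; 2 slots in total.
        BulkSlotAllocationSketch perTask = new BulkSlotAllocationSketch(2);
        perTask.tryAllocateOne(); // producer of A
        perTask.tryAllocateOne(); // producer of B
        // Neither consumer can get a slot, so neither producer can make progress.
        System.out.println("consumer of A scheduled: " + perTask.tryAllocateOne());

        BulkSlotAllocationSketch bulk = new BulkSlotAllocationSketch(2);
        System.out.println("region A scheduled: " + bulk.tryAllocateBulk(2));
        System.out.println("region B scheduled: " + bulk.tryAllocateBulk(2));
        bulk.release(2); // region A finishes
        System.out.println("region B scheduled after A: " + bulk.tryAllocateBulk(2));
    }
}
```

The design point is that a region never holds a partial set of slots, so slots held by one region cannot starve the consumers that another region's producers are waiting for.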

        Sub-Tasks

          1. Replace FailoverTopology with SchedulingTopology (Sub-task, Closed, Gary Yao)
          2. Add PipelinedRegion Interface to Topology (Sub-task, Closed, Gary Yao)
          3. Implement PipelinedRegionSchedulingStrategy (Sub-task, Closed, Zhu Zhu)
          4. Introduce GlobalDataExchangeMode for JobGraph Generation (Sub-task, Closed, Zhu Zhu)
          5. Blink Planner set GlobalDataExchangeMode (Sub-task, Closed, Zhu Zhu)
          6. Simplify SchedulingStrategy#onPartitionConsumable(...) parameters (Sub-task, Closed, Zhu Zhu)
          7. Implement PipelinedRegion interface for SchedulingTopology (Sub-task, Closed, Gary Yao)
          8. Drop generic Types in SchedulingTopology Interface (Sub-task, Closed, Gary Yao)
          9. Migrate RestartPipelinedRegionFailoverStrategyBuildingTest to PipelinedRegionComputeUtilTest (Sub-task, Closed, Gary Yao)
          10. Implements bulk allocation for physical slots (Sub-task, Closed, Zhu Zhu)
          11. Introduce PreferredLocationsRetriever (Sub-task, Closed, Zhu Zhu)
          12. Allocates slots in bulks on pipelined region scheduling (Sub-task, Closed, Zhu Zhu)
          13. Implement FIFO Physical Slot Assignment in SlotPoolImpl (Sub-task, Closed, Zhu Zhu)
          14. Integrate pipelined region scheduling (Sub-task, Closed, Zhu Zhu)
          15. Unify slot request timeout handling for streaming and batch tasks (Sub-task, Closed, Zhu Zhu)
          16. Avoid scheduling deadlocks caused by cyclic input dependencies between regions (Sub-task, Closed, Zhu Zhu)
          17. Expose more details in logs for debugging bulk slot allocation failures (Sub-task, Closed, Unassigned)
          18. Simplify tests of SlotPoolImpl (Sub-task, Closed, Zhilong Hong)
          19. Enable pipelined scheduling by default (Sub-task, Closed, Zhu Zhu)
          20. Improve pipelined region scheduling performance (Sub-task, Closed, Zhu Zhu)

            People

              Assignee: Zhu Zhu (zhuzh)
              Reporter: Zhu Zhu (zhuzh)
              Votes: 0
              Watchers: 13

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 337h 40m