  1. Flink
  2. FLINK-29344

Make Adaptive Scheduler support Fine-Grained Resource Management



    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Coordination
    • Labels: None


      This ticket is a reflection of the following Slack discussion:

      Donatien Schmitz
      Adaptive Scheduler thread:
      Hey all, it seems that the Adaptive Scheduler does not support fine-grained resource management. I have fixed this and would like to know whether you would be interested in a PR, or whether it was deliberately designed not to support fine-grained resource management.

      rmetzger
      @Donatien Schmitz: I’m concerned that we don’t have a lot of review capacity right now, and I’m not aware of any users asking for it.

      I couldn’t find a ticket for adding this feature. Did you find one?
      If not, can you add one? That would at least make the feature show up on Google, and people can comment on it if they need it.

      If the change is fairly self-contained and unlikely to cause instabilities, we can also consider merging it.

      @Xintong Song what do you think?

      Xintong Song
      @rmetzger, thanks for involving me.
      @Donatien Schmitz, thanks for bringing this up, and for volunteering to fix it. Could you explain a bit more about how you plan to fix this?
      Fine-grained resource management is not yet supported by the adaptive scheduler because there’s an issue that we haven’t found a good solution for. Namely, if only part of the resource requirements can be fulfilled, how do we decide which requirements should be fulfilled? E.g., say the job declares that it needs 10 slots with resource 1 for map tasks, and another 10 slots with resource 2 for reduce tasks. If there are not enough resources (say only 10 slots can be allocated, for simplicity), how many slots should be allocated for map and for reduce tasks? Obviously, <10 map, 0 reduce> and <0 map, 10 reduce> would not work. For this example, a proportional scale-down (<5 map, 5 reduce>) seems reasonable. However, a proportional scale-down is not always easy (e.g., when the requirements are <100 map, 1 reduce>), and the issue grows more complicated once you take many stages and differences in slot sizes into consideration.
      I’d like to see the adaptive scheduler support fine-grained resource management as well. If there’s a good solution to the above issue, I’d love to help review the effort.
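
To make the proportional scale-down concrete, here is a minimal sketch of one possible rounding rule (largest remainder). It is not Flink code: the class `ProportionalScaleDown` and the method `scaleDown` are hypothetical names chosen for illustration, and the sketch assumes all slots have the same size, which is exactly the simplification that fine-grained resource management cannot make.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProportionalScaleDown {

    /**
     * Hypothetical helper (not part of Flink): splits `available` slots across
     * task groups in proportion to the requested counts, rounding with the
     * largest-remainder method. Assumes uniform slot sizes.
     */
    static Map<String, Integer> scaleDown(Map<String, Integer> requested, int available) {
        long total = requested.values().stream().mapToLong(Integer::longValue).sum();
        Map<String, Integer> result = new LinkedHashMap<>();
        if (total <= available) {
            // Everything fits, no scale-down needed.
            result.putAll(requested);
            return result;
        }
        int assigned = 0;
        for (Map.Entry<String, Integer> e : requested.entrySet()) {
            // Integer part of the proportional share.
            int share = (int) (e.getValue().longValue() * available / total);
            result.put(e.getKey(), share);
            assigned += share;
        }
        // Hand the leftover slots to the groups with the largest fractional remainder.
        List<String> byRemainder = new ArrayList<>(requested.keySet());
        byRemainder.sort(Comparator.comparingLong(
                (String k) -> requested.get(k).longValue() * available % total).reversed());
        int leftover = available - assigned;
        for (int i = 0; i < leftover; i++) {
            result.merge(byRemainder.get(i), 1, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> even = new LinkedHashMap<>();
        even.put("map", 10);
        even.put("reduce", 10);
        System.out.println(scaleDown(even, 10)); // {map=5, reduce=5}

        Map<String, Integer> skewed = new LinkedHashMap<>();
        skewed.put("map", 100);
        skewed.put("reduce", 1);
        System.out.println(scaleDown(skewed, 10)); // {map=10, reduce=0}
    }
}
```

The second call shows why a purely proportional rule is not enough: for <100 map, 1 reduce> the reduce stage ends up with zero slots, so the job could not run. Any real solution would presumably need a per-stage minimum and would have to account for differing slot sizes.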

      Donatien Schmitz
      Dear Robert and Xintong, thanks for reading and reacting to my message! I'll reply tomorrow (GMT+1) if that's all right with you. Best, Donatien Schmitz

      Donatien Schmitz
      @Xintong Song

      • We are working on fine-grained scheduling for resource optimisation of long-running or periodic jobs. One of the features we are experimenting with is a "rescheduling plan", a mapping of operators to Resource Profiles that can be dynamically applied to a running job. This rescheduling would be triggered by policies on certain metrics (with a focus on RocksDB in our case).
      • While developing this new feature, we decided to implement it on the Adaptive Scheduler instead of the Base Scheduler because its existing state machine made this more natural: the job transitions through the states Executing -> Cancelling -> Rescheduling -> Waiting for Resources -> Creating -> Executing
      • In our case we are working on a POC and thus focusing on a really simple job with a parallelism of 1. The issue you brought up is indeed something we have faced while raising the parallelism of the job.
      • If you create a Jira Ticket we can discuss it over there if you'd like!
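
The rescheduling cycle listed above can be sketched as a tiny state machine. This is an illustration only: `ReschedulingCycle`, its `State` enum and `next` are hypothetical names that mirror the states named in the message, not the Adaptive Scheduler's actual classes, and real transitions would also have to handle failures and cancellation by the user.

```java
public class ReschedulingCycle {

    /** States taken from the cycle described above (names are illustrative). */
    enum State { EXECUTING, CANCELLING, RESCHEDULING, WAITING_FOR_RESOURCES, CREATING }

    /** Returns the successor of a state in the happy-path rescheduling cycle. */
    static State next(State current) {
        switch (current) {
            case EXECUTING:
                return State.CANCELLING;            // a policy triggered a rescheduling plan
            case CANCELLING:
                return State.RESCHEDULING;          // running tasks have been cancelled
            case RESCHEDULING:
                return State.WAITING_FOR_RESOURCES; // new Resource Profiles were declared
            case WAITING_FOR_RESOURCES:
                return State.CREATING;              // required slots are available
            case CREATING:
                return State.EXECUTING;             // job is deployed again
            default:
                throw new IllegalStateException("Unknown state: " + current);
        }
    }

    public static void main(String[] args) {
        State s = State.EXECUTING;
        // Walk one full cycle and print each transition.
        for (int i = 0; i < State.values().length; i++) {
            State n = next(s);
            System.out.println(s + " -> " + n);
            s = n;
        }
    }
}
```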

      Donatien Schmitz
      @rmetzger The changes do not break the default resource management, but they do not fix the issue raised by Xintong.


              chesnay Chesnay Schepler
              xtsong Xintong Song