Hadoop YARN
  YARN-1197

Support changing resources of an allocated container

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0-beta
    • Fix Version/s: None
    • Labels:
      None

      Description

      The current YARN resource management logic assumes that the resources allocated to a container are fixed for its lifetime. When users want to change the resources
      of an allocated container, the only way is to release it and allocate a new container of the expected size.
      Allowing the resources of an allocated container to be changed at run time would give applications better control over their resource usage.
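
The difference between the two approaches described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `grow_by_reallocating`, and `grow_in_place` are hypothetical names invented for illustration and are not YARN classes or APIs. The model tracks memory on a single node and shows that release-and-reallocate yields a new container, while an in-place resize keeps the same one.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy memory bookkeeping for one node (hypothetical, not a YARN class)."""
    capacity_gb: int
    containers: dict = field(default_factory=dict)  # container id -> allocated GB
    next_id: int = 1

    def free_gb(self):
        return self.capacity_gb - sum(self.containers.values())

    def allocate(self, size_gb):
        if size_gb > self.free_gb():
            raise RuntimeError("not enough free memory on node")
        cid = self.next_id
        self.next_id += 1
        self.containers[cid] = size_gb
        return cid

    def release(self, cid):
        del self.containers[cid]

    def resize(self, cid, new_size_gb):
        # Only the difference has to fit into the free memory.
        delta = new_size_gb - self.containers[cid]
        if delta > self.free_gb():
            raise RuntimeError("not enough free memory to grow container")
        self.containers[cid] = new_size_gb


def grow_by_reallocating(node, cid, new_size_gb):
    """Workaround from the description: release, then allocate a new container."""
    node.release(cid)
    return node.allocate(new_size_gb)  # a different container id comes back


def grow_in_place(node, cid, new_size_gb):
    """Behaviour this issue asks for: the same container changes size."""
    node.resize(cid, new_size_gb)
    return cid
```

In this sketch, the first function hands back a different container id, so whatever was running in the old container would have to be started again, while the second keeps the id unchanged.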

      1. mapreduce-project.patch.ver.1
        6 kB
        Wangda Tan (No longer used)
      2. tools-project.patch.ver.1
        2 kB
        Wangda Tan (No longer used)
      3. yarn-1197.pdf
        132 kB
        Wangda Tan (No longer used)
      4. yarn-1197-scheduler-v1.pdf
        137 kB
        Wangda Tan (No longer used)
      5. yarn-1197-v2.pdf
        162 kB
        Wangda Tan (No longer used)
      6. yarn-1197-v3.pdf
        162 kB
        Wangda Tan (No longer used)
      7. yarn-1197-v4.pdf
        162 kB
        Wangda Tan (No longer used)
      8. yarn-1197-v5.pdf
        162 kB
        Wangda Tan (No longer used)
      9. yarn-api-protocol.patch.ver.1
        18 kB
        Wangda Tan (No longer used)
      10. yarn-pb-impl.patch.ver.1
        70 kB
        Wangda Tan (No longer used)
      11. yarn-server-common.patch.ver.1
        10 kB
        Wangda Tan (No longer used)
      12. yarn-server-nodemanager.patch.ver.1
        33 kB
        Wangda Tan (No longer used)
      13. yarn-server-resourcemanager.patch.ver.1
        115 kB
        Wangda Tan (No longer used)

        Issue Links

        1. Common PB type definitions for container resizing Sub-task Closed Wangda Tan (No longer used)
         
        2. AM-RM protocol changes to support container resizing Sub-task Closed Wangda Tan (No longer used)
         
        3. Protocol changes and implementations in NM side to support change container resource Sub-task Open Wangda Tan
         
        4. Make AMRMClient support send increase container request and get increased/decreased containers Sub-task Open Wangda Tan (No longer used)
         
        5. Protocol changes and implementations in RM side to support change container resource Sub-task Open Wangda Tan (No longer used)
         
        6. Make NMClient support change container resources Sub-task Open Wangda Tan (No longer used)
         
        7. [YARN-1197] Make ContainersMonitor can support change monitoring size of an allocated container in NM side Sub-task Open Wangda Tan (No longer used)
         
        8. [YARN-1197] Add newly decreased container to NodeStatus in NM side Sub-task Open Wangda Tan (No longer used)
         
        9. [YARN-1197] Add changeContainersResource interface and implementations to ContainerManagementProtocol Sub-task Open Wangda Tan (No longer used)
         
        10. [YARN-1197] Add increase container request to YarnScheduler allocate API Sub-task Open Wangda Tan (No longer used)
         
        11. [YARN-1197] Add increased/decreased container to Allocation Sub-task Open Wangda Tan (No longer used)
         
        12. [YARN-1197] Modify ApplicationMasterService to support changing container resource Sub-task Open Wangda Tan (No longer used)
         
        13. [YARN-1197] Modify ResourceTrackerService to support passing decreased containers to RMNode Sub-task Open Wangda Tan (No longer used)
         
        14. [YARN-1197] Add pullDecreasedContainer API to RMNode which can be used by scheduler to get newly decreased Containers Sub-task Open Wangda Tan (No longer used)
         
        15. [YARN-1197] Add methods in FiCaSchedulerApp to support add/reserve/unreserve/allocate/pull change container requests/results Sub-task Open Wangda Tan (No longer used)
         
        16. [YARN-1197] Add methods in FiCaSchedulerNode to support increase/decrease/reserve/unreserve change container requests/results Sub-task Open Wangda Tan (No longer used)
         
        17. [YARN-1197] Add APIs in CSQueue to support decrease container resource and unreserve increase request Sub-task Open Wangda Tan (No longer used)
         
        18. [YARN-1197] Add implementations to CapacityScheduler to support increase/decrease container resource Sub-task Open Wangda Tan (No longer used)
         
        19. [YARN-1197] Add implementations to FairScheduler to support increase/decrease container resource Sub-task Open Sandy Ryza
         

          Activity

          Wangda Tan created issue -
          Bikas Saha made changes -
          Field Original Value New Value
          Description Currently, YARN does not support merging several containers on one node into one big container, which would let us request resources incrementally, merge them into a bigger container, and launch our processes. The user scenario is as follows.

          Some applications (like OpenMPI) have their own daemons on each node (one per node) in their original implementation, and their users' processes are launched directly by the local daemon (like the task-tracker in MRv1, but per-application). Many features depend on the pipes created when a process is forked by its parent, such as IO-forwarding and process monitoring (which does more than what the NM does for us), and may cause some scalability issues.

          A very common resource request in the MPI world is: "give me 100G of memory in the cluster, and I will launch 100 processes in it". In current YARN, we have the following two choices to make this happen:
          1) Send allocation requests for 1G of memory iteratively until we have 100G of memory in total, then ask the NMs to launch the 100 MPI processes. This causes problems such as no support for IO-forwarding, process monitoring, etc., as mentioned above.
          2) Send a larger resource request, like 10G. But we may encounter the following problems:
             2.1 Such a large resource request is hard to satisfy in one allocation.
             2.2 We cannot use more resources on the node than the amount we specified (we can only launch one daemon per node).
             2.3 It is hard to decide how much resource to ask for.

          So my proposal is:
          1) Send resource requests incrementally with small sizes, as before, until we have enough resources in total.
          2) Merge the resources on the same node, so that there is only one big container on each node.
          3) Launch a daemon on each node; the daemon will spawn its local processes and manage them.

          For example,
          we need to run 10 processes with 1G each, and we finally get
          containers 1, 2, 3, 4, 5 on node1,
          containers 6, 7, 8 on node2,
          containers 9, 10 on node3.
          Then we will
          merge [1, 2, 3, 4, 5] into container_11 with 5G and launch a daemon, which will launch 5 processes;
          merge [6, 7, 8] into container_12 with 3G and launch a daemon, which will launch 3 processes;
          merge [9, 10] into container_13 with 2G and launch a daemon, which will launch 2 processes.
          Currently, YARN does not support merging several containers on one node into one big container, which would let us request resources incrementally, merge them into a bigger container, and launch our processes. The user scenario is described in the comments.
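
The merge example in the original description above (containers 1 to 10 spread over node1, node2, and node3) amounts to grouping allocations by node and summing their memory. A minimal sketch of that arithmetic follows; `merge_per_node` and the tuple layout are assumptions made for illustration and do not correspond to any YARN API.

```python
from collections import defaultdict


def merge_per_node(allocations):
    """Group (container_id, node, memory_gb) tuples by node.

    Returns {node: (list of merged container ids, total memory in GB)}.
    """
    by_node = defaultdict(list)
    for container_id, node, memory_gb in allocations:
        by_node[node].append((container_id, memory_gb))
    return {
        node: ([cid for cid, _ in items], sum(mem for _, mem in items))
        for node, items in by_node.items()
    }


# The ten 1G containers from the example in the description.
allocations = (
    [(i, "node1", 1) for i in range(1, 6)]
    + [(i, "node2", 1) for i in range(6, 9)]
    + [(i, "node3", 1) for i in (9, 10)]
)
merged = merge_per_node(allocations)
```

Each entry of `merged` corresponds to one of the merged containers in the example: one daemon per node, launching as many processes as there are merged 1G containers.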
          Bikas Saha made changes -
          Link This issue is depended upon by YARN-896 [ YARN-896 ]
          Bikas Saha made changes -
          Summary Add container merge support in YARN Support increasing resources of an allocated container
          Bikas Saha made changes -
          Summary Support increasing resources of an allocated container Support changing resources of an allocated container
          Bikas Saha made changes -
          Assignee Tan, Wangda [ wangda ]
          Bikas Saha made changes -
          Assignee Tan, Wangda [ wangda ] Wangda Tan [ gp.leftnoteasy ]
          Bikas Saha made changes -
          Assignee Wangda Tan [ gp.leftnoteasy ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197.pdf [ 12604489 ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197-v2.pdf [ 12606770 ]
          Wangda Tan (No longer used) made changes -
          Assignee Wangda Tan [ gp.leftnoteasy ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197-v3.pdf [ 12607745 ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197-v4.pdf [ 12614261 ]
          Wangda Tan (No longer used) made changes -
          Attachment mapreduce-project.patch.ver.1 [ 12615527 ]
          Attachment tools-project.patch.ver.1 [ 12615528 ]
          Attachment yarn-api-protocol.patch.ver.1 [ 12615529 ]
          Attachment yarn-pb-impl.patch.ver.1 [ 12615530 ]
          Attachment yarn-server-common.patch.ver.1 [ 12615531 ]
          Attachment yarn-server-nodemanager.patch.ver.1 [ 12615532 ]
          Attachment yarn-server-resourcemanager.patch.ver.1 [ 12615533 ]
          Wangda Tan (No longer used) made changes -
          Description Currently, YARN does not support merging several containers on one node into one big container, which would let us request resources incrementally, merge them into a bigger container, and launch our processes. The user scenario is described in the comments.
          Wangda Tan (No longer used) made changes -
          Description The current YARN resource management logic assumes that the resources allocated to a container are fixed for its lifetime. When users want to change the resources
          of an allocated container, the only way is to release it and allocate a new container of the expected size.
          Allowing the resources of an allocated container to be changed at run time would give applications better control over their resource usage.
          Junping Du made changes -
          Link This issue relates to YARN-1011 [ YARN-1011 ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197-v5.pdf [ 12617834 ]
          Wangda Tan (No longer used) made changes -
          Attachment yarn-1197-scheduler-v1.pdf [ 12618425 ]
          Wangda Tan (No longer used) made changes -
          Assignee Wangda Tan [ gp.leftnoteasy ]
          Jeff Zhang made changes -
          Assignee Jeff Zhang [ zjffdu ]
          Jeff Hammerbacher made changes -
          Link This issue relates to SPARK-3174 [ SPARK-3174 ]
          Jeff Zhang made changes -
          Assignee Jeff Zhang [ zjffdu ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Wangda Tan
            • Votes:
              10
              Watchers:
              72

              Dates

              • Created:
                Updated:
