Hadoop YARN / YARN-4726

[Umbrella] Allocation reuse for application upgrades


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      See the overview doc at YARN-4692; the relevant sub-section is copied here to track all related efforts.

      Once auto-restart of containers is taken care of (YARN-4725), we need to address what I believe is the second most important reason for service containers to restart: upgrades. Once a service is running on YARN, the way the container allocation lifecycle works, any time the container exits, YARN will reclaim the resources. During an upgrade, with a multitude of other applications running in the system, giving up and getting back the resources allocated to the service is hard to manage. Features like NodeLabels in YARN help this cause but are not straightforward to use for app-specific use cases.

      We need a first-class way of letting an application reuse the same resource allocation for multiple launches of the processes inside the container. This is done by decoupling the allocation lifecycle from the process lifecycle.

      The JIRA YARN-1040 initiated this conversation. We need two things here:

      • (1) (Task) The ApplicationMaster should be able to use the same container allocation and issue multiple startContainer requests to the NodeManager.
      • (2) (Task) To support the upgrade of the ApplicationMaster itself, clients should be able to inform YARN to restart the AM within the same allocation but with new bits.
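
The decoupling described above can be illustrated with a small, self-contained Java sketch. It is only a toy model under stated assumptions: `Allocation`, `startProcess`, `processExited`, and `release` are hypothetical names invented for illustration and are not YARN classes or APIs. The point it shows is that a process exit leaves the allocation held, so a second launch (for example, upgraded bits) can reuse the same resources.

```java
import java.util.ArrayList;
import java.util.List;

public class AllocationReuseSketch {

    /** Hypothetical model of a container allocation; NOT a real YARN class. */
    static final class Allocation {
        private final String id;
        private final int memoryMb;
        private final int vcores;
        private boolean released = false;
        private String runningVersion = null; // null when no process is running
        private final List<String> launchHistory = new ArrayList<>();

        Allocation(String id, int memoryMb, int vcores) {
            this.id = id;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** Models one start request issued against this allocation. */
        void startProcess(String version) {
            if (released) {
                throw new IllegalStateException("allocation " + id + " was released");
            }
            if (runningVersion != null) {
                throw new IllegalStateException("a process is already running in " + id);
            }
            runningVersion = version;
            launchHistory.add(version);
        }

        /** Process lifecycle event: the process ends, the resources stay held. */
        void processExited() {
            runningVersion = null;
        }

        /** Allocation lifecycle event: the owner explicitly gives resources back. */
        void release() {
            runningVersion = null;
            released = true;
        }

        boolean isHeld() {
            return !released;
        }

        String runningVersion() {
            return runningVersion;
        }

        List<String> launchHistory() {
            return new ArrayList<>(launchHistory);
        }
    }

    public static void main(String[] args) {
        Allocation alloc = new Allocation("container_01", 2048, 2);
        alloc.startProcess("service-v1"); // first launch
        alloc.processExited();            // stop for upgrade; allocation is kept
        alloc.startProcess("service-v2"); // relaunch inside the same allocation
        System.out.println("held=" + alloc.isHeld()
                + " launches=" + alloc.launchHistory());
        alloc.release();                  // only now are the resources given up
    }
}
```

In this model the allocation is given up only by the explicit `release()` call, which is the behavioral change the two tasks ask for; how YARN would actually expose this is left to the sub-tasks.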

      The JIRAs YARN-3417 and YARN-4470 talk about the second task above ...


            People

              Assignee: Unassigned
              Reporter: Vinod Kumar Vavilapalli (vinodkv)
              Votes: 0
              Watchers: 31
