  1. Hadoop YARN
  2. YARN-556 [Umbrella] RM Restart phase 2 - Work preserving restart
  3. YARN-1367

After restart NM should resync with the RM without killing containers

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: resourcemanager
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      After an RM restart, the RM sends a resync response to every NM that heartbeats to it. Upon receiving the resync response, the NM currently kills all of its containers and re-registers with the RM. The NM should instead leave its containers running and, when it re-registers, inform the RM about all currently running containers, including details such as their allocations. After re-registering, the NM should send all pending container completions to the RM as usual.
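
      The requested behavior can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions, not the code from the attached patches: the class and method names (ResyncSketch, ContainerReport, onResync, and so on) are hypothetical and do not correspond to the real YARN NodeManager or ResourceManager APIs. It shows the NM keeping its containers on resync, reporting them with their allocations when it re-registers, and delivering pending completions on the following heartbeat.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a work-preserving resync; not the real YARN classes.
final class ResyncSketch {
    enum State { RUNNING, COMPLETE }

    /** Container status plus the allocation the restarted RM has to relearn. */
    static final class ContainerReport {
        final String containerId;
        final State state;
        final int memoryMb;
        final int vcores;

        ContainerReport(String containerId, State state, int memoryMb, int vcores) {
            this.containerId = containerId;
            this.state = state;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Stand-in for an RM that restarted and lost its in-memory container state. */
    static final class ResourceManager {
        final Map<String, ContainerReport> recovered = new LinkedHashMap<>();
        final List<String> completed = new ArrayList<>();

        // Re-registration carries the containers still running on the node.
        void register(String nodeId, List<ContainerReport> running) {
            for (ContainerReport r : running) {
                recovered.put(r.containerId, r);
            }
        }

        // A regular heartbeat carries completions, as before the restart.
        void heartbeat(String nodeId, List<ContainerReport> finished) {
            for (ContainerReport r : finished) {
                recovered.remove(r.containerId);
                completed.add(r.containerId);
            }
        }
    }

    static final class NodeManager {
        final String nodeId;
        final Map<String, ContainerReport> running = new LinkedHashMap<>();
        final List<ContainerReport> pendingCompletions = new ArrayList<>();

        NodeManager(String nodeId) {
            this.nodeId = nodeId;
        }

        void launch(String id, int memoryMb, int vcores) {
            running.put(id, new ContainerReport(id, State.RUNNING, memoryMb, vcores));
        }

        // A container finished while the RM was unreachable: queue its completion.
        void complete(String id) {
            ContainerReport r = running.remove(id);
            if (r != null) {
                pendingCompletions.add(
                        new ContainerReport(id, State.COMPLETE, r.memoryMb, r.vcores));
            }
        }

        // Resync handling: the old behavior would kill every container here.
        // Instead, keep them and report them while re-registering.
        void onResync(ResourceManager rm) {
            rm.register(nodeId, new ArrayList<>(running.values()));
        }

        // After re-registering, pending completions are sent as usual.
        void heartbeat(ResourceManager rm) {
            rm.heartbeat(nodeId, new ArrayList<>(pendingCompletions));
            pendingCompletions.clear();
        }
    }

    public static void main(String[] args) {
        NodeManager nm = new NodeManager("node-1");
        nm.launch("container_1", 1024, 1);
        nm.launch("container_2", 2048, 2);
        nm.complete("container_2");

        ResourceManager restartedRm = new ResourceManager();
        nm.onResync(restartedRm);
        nm.heartbeat(restartedRm);

        System.out.println("recovered=" + restartedRm.recovered.keySet());
        System.out.println("completed=" + restartedRm.completed);
    }
}
```

      In this sketch the resync and the completion delivery are kept as separate steps, which mirrors the ordering in the description: re-register first, then report completions on the next heartbeat.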

        Attachments

        1. YARN-1367.003.patch
          9 kB
          Anubhav Dhoot
        2. YARN-1367.002.patch
          21 kB
          Anubhav Dhoot
        3. YARN-1367.001.patch
          21 kB
          Anubhav Dhoot
        4. YARN-1367.prototype.patch
          36 kB
          Anubhav Dhoot

            People

            • Assignee: Anubhav Dhoot (adhoot)
            • Reporter: Bikas Saha (bikassaha)
            • Votes: 0
            • Watchers: 13

              Dates

              • Created:
                Updated:
                Resolved: