Hadoop YARN / YARN-5103

With NM recovery enabled, restarting NM multiple times results in AM restart

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: yarn
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      The ApplicationMaster (AM) is restarted when the NodeManager (NM) is restarted multiple times, even though NM recovery is enabled.

      Log from the NM on which AM attempt 1 was running:
       ERROR launcher.RecoveredContainerLaunch (RecoveredContainerLaunch.java:call(88)) - Unable to recover container container_e12_1463043063682_0002_01_000001
      java.io.IOException: java.lang.InterruptedException
      	at org.apache.hadoop.util.Shell.runCommand(Shell.java:579)
      	at org.apache.hadoop.util.Shell.run(Shell.java:487)
      	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
      	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:478)
      	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.isContainerProcessAlive(LinuxContainerExecutor.java:542)
      	at org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.reacquireContainer(ContainerExecutor.java:185)
      	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reacquireContainer(LinuxContainerExecutor.java:445)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:83)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:46)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
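
      The trace above shows an InterruptedException wrapped in an IOException while reacquireContainer was checking whether the container process was still alive. One plausible reading, stated here as an assumption and not confirmed by this report, is that the recovery thread was interrupted by the next NM shutdown and the resulting failure was handled like a real container exit. The sketch below is not the attached patch; RecoverySketch, LivenessProbe, Outcome, and reacquire are hypothetical names used only to illustrate how recovery code might tell an interrupted probe apart from a container that has actually exited.

      ```java
      import java.io.IOException;

      // Hypothetical illustration only; none of these types exist in Hadoop.
      class RecoverySketch {

          /** Possible results of trying to reacquire a container after an NM restart. */
          enum Outcome { STILL_RUNNING, EXITED, RECOVERY_INTERRUPTED }

          /** Stand-in for a liveness check that shells out to signal the process. */
          interface LivenessProbe {
              boolean isAlive() throws IOException;
          }

          /** Returns true if any throwable in the cause chain is an InterruptedException. */
          static boolean causedByInterrupt(Throwable t) {
              for (Throwable c = t; c != null; c = c.getCause()) {
                  if (c instanceof InterruptedException) {
                      return true;
                  }
              }
              return false;
          }

          /** Classifies the probe result without treating an interrupt as a container exit. */
          static Outcome reacquire(LivenessProbe probe) {
              try {
                  return probe.isAlive() ? Outcome.STILL_RUNNING : Outcome.EXITED;
              } catch (IOException e) {
                  if (causedByInterrupt(e)) {
                      // The probe itself was cut short, so the container state is unknown.
                      return Outcome.RECOVERY_INTERRUPTED;
                  }
                  // Assumption for this sketch: any other probe failure counts as an exit.
                  return Outcome.EXITED;
              }
          }

          public static void main(String[] args) {
              // Simulates the wrapped exception seen in the log.
              LivenessProbe interrupted = () -> {
                  throw new IOException(new InterruptedException());
              };
              System.out.println(reacquire(interrupted));
          }
      }
      ```

      With a distinct RECOVERY_INTERRUPTED outcome, a caller could skip reporting the container as failed and leave its stored state for the next recovery attempt.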
      

        Attachments

        1. YARN-5103-v2.patch
          2 kB
          Junping Du
        2. YARN-5103-demo.patch
          2 kB
          Junping Du
        3. YARN-5103.patch
          2 kB
          Junping Du

              People

              • Assignee: Junping Du (djp)
              • Reporter: Sumana Sathish (ssathish@hortonworks.com)
              • Votes: 0
              • Watchers: 6