Ambari / AMBARI-17646

Nodemanager is not started after installation

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.4.0
    • None
    • None

    Description

      NodeManager is down on one of the nodes after installation. This has affected
      most of the splits in today's run (ambari-2.4.0.0-817).
      NodeManager is down on one node of a 3-node cluster and is running on the
      other two nodes.
      A live cluster is available at <https://172.22.66.85:8443/> and will stay up
      for another 24 hours.

      The following error is seen in nodemanager.log:

      2016-07-10 04:40:59,678 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1022)) - Loaded NM state version info 1.0
      2016-07-10 04:40:59,889 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(195)) - Exit code from container executor initialization is : 24
      ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

      at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
      at org.apache.hadoop.util.Shell.run(Shell.java:487)
      at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
      at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
      2016-07-10 04:40:59,893 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(322)) -
      2016-07-10 04:40:59,893 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
      at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
      Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
      ... 3 more
      Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

      at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
      at org.apache.hadoop.util.Shell.run(Shell.java:487)
      at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
      ... 4 more
      2016-07-10 04:40:59,895 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(550)) - Error starting NodeManager
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
      at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
      Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
      at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
      ... 3 more
      Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

      at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
      at org.apache.hadoop.util.Shell.run(Shell.java:487)
      at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
      ... 4 more
      2016-07-10 04:40:59,898 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
      /************************************************************
      SHUTDOWN_MSG: Shutting down NodeManager at nat-d7-xals-ambarieu-newamb-242-1-1/172.22.66.62
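
      The root cause in the log is an ownership check: the container executor
      refuses to initialize because /etc/hadoop/2.4.2.0-258/0 is owned by uid 2530
      rather than root. A minimal Python sketch for reproducing that check on the
      affected node is shown here. It assumes the executor also expects every
      parent directory of the reported path to be owned by uid 0; that assumption
      and the helper name non_root_owned are illustrative, not taken from the
      patch attached to this issue.

```python
import os
from pathlib import Path
from typing import List, Tuple


def non_root_owned(path: str) -> List[Tuple[str, int]]:
    """Return (path, uid) for the path and each ancestor not owned by uid 0.

    Hypothetical helper: it approximates the ownership check reported in the
    log and is not the container executor's own logic.
    """
    offenders = []
    current = Path(path)
    # Check the path itself, then each parent directory up to "/".
    for candidate in [current, *current.parents]:
        try:
            uid = os.stat(candidate).st_uid  # follows symlinks
        except OSError:
            continue  # skip components that are missing or unreadable
        if uid != 0:
            offenders.append((str(candidate), uid))
    return offenders


if __name__ == "__main__":
    # Path copied from the ExitCodeException message in nodemanager.log.
    for found_path, uid in non_root_owned("/etc/hadoop/2.4.2.0-258/0"):
        print(f"{found_path} is owned by uid {uid}, expected 0 (root)")
```

      Running this on each of the three nodes should show whether only the failing
      node has the wrong owner, which would match NodeManager starting on the
      other two.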

      Attachments

        1. AMBARI-17646.patch
          2 kB
          Andrew Onischuk

            People

              aonishuk Andrew Onischuk