Hadoop YARN / YARN-1694

RM is shutting down when an NM is added to cluster without updating the hostname in /etc/hosts


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels: None

    Description

      A new NM is added to the cluster, but the hostname mapping for this NM is not added to /etc/hosts on the RM host.
      NM registration still succeeds without any problems.

      When a job is submitted, the RM shuts down with the exception below.

      2013-10-04 04:37:37,611 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
      java.lang.IllegalArgumentException: java.net.UnknownHostException: host-10-18-40-120
      at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
      at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
      at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1296)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1344)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1210)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1169)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:870)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:645)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:707)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:751)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:93)
      at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:449)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.net.UnknownHostException: host-10-18-40-120
      ... 15 more
      2013-10-04 04:37:37,614 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
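
The trace above shows the scheduler thread failing because the RM host cannot resolve the NM's hostname. As a rough illustration only, the sketch below checks whether a list of NM hostnames resolves from the machine it runs on, which could be used to spot a missing /etc/hosts entry before submitting a job. The class `NodeHostCheck`, the `Resolver` interface, and the `unresolvable` methods are hypothetical names and not part of Hadoop; the only real API used for the lookup is the JDK's `java.net.InetAddress.getByName`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; this is not Hadoop code.
class NodeHostCheck {

    /** Pluggable lookup so the check can be exercised without real DNS. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /** Returns the hosts the given resolver cannot resolve, in input order. */
    static List<String> unresolvable(List<String> hosts, Resolver resolver) {
        List<String> failed = new ArrayList<>();
        for (String host : hosts) {
            try {
                resolver.resolve(host);
            } catch (UnknownHostException e) {
                // Same exception type that appears as the cause in the trace.
                failed.add(host);
            }
        }
        return failed;
    }

    /** Same check using the JVM's normal lookup (hosts file, DNS). */
    static List<String> unresolvable(List<String> hosts) {
        return unresolvable(hosts, InetAddress::getByName);
    }

    public static void main(String[] args) {
        // Defaults to the hostname from the report when no arguments are given.
        List<String> hosts = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("host-10-18-40-120");
        for (String host : unresolvable(hosts)) {
            System.out.println("Cannot resolve NM host from this machine: " + host);
        }
    }
}
```

Run on the RM host with the NM hostnames as arguments, any name it prints would be a candidate for a missing mapping. This only diagnoses the environment; it does not change how the RM handles the exception.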

            People

              Assignee: Unassigned
              Reporter: Sunil G (sunilg)