  1. Hadoop YARN
  2. YARN-1011 [Umbrella] Schedule containers based on utilization of currently allocated containers
  3. YARN-9111

NM crashes because Fair scheduler promotes a container that has not been pulled by AM

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: YARN-1011
    • Fix Version/s: None
    • Component/s: fairscheduler, nodemanager
    • Labels: None

    Description

      2018-10-19 22:34:35,052 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
       java.lang.NullPointerException
       at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:323)
       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.handle(ContainerManagerImpl.java:1649)
       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.handle(ContainerManagerImpl.java:185)
       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
       at java.lang.Thread.run(Thread.java:748)
       2018-10-19 22:34:35,054 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
       2018-10-19 22:34:35,059 DEBUG org.apache.hadoop.service.AbstractService: Service: NodeManager entered state STOPPED

      When the RM allocates a container to an application, the container token is not generated until the AM pulls that container from the RM.

      However, if the scheduler decides to promote that container before the AM has pulled it, there is no container token to work with.

      The current code does not generate or update the container token in that case. When the container promotion is sent to the NM for processing, the NM crashes with the NPE shown above.
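
      The sequence described above can be illustrated with a minimal, self-contained sketch. It is not the YARN code: Token, RmContainer, pullByAm, promoteUnsafe, promoteGuarded and handleOnNm are hypothetical stand-ins, and the "OPPORTUNISTIC" and "GUARANTEED" labels are assumed names for the execution type before and after promotion. The guard shown is only one possible way to avoid handing the NM a missing token, not a proposed patch.

```java
// Hypothetical, simplified model of the race described above.
// None of these types are the real YARN classes.
final class ContainerPromotionSketch {

    /** Stand-in for a container token (the real one is a signed credential). */
    static final class Token {
        final String containerId;
        final String executionType;

        Token(String containerId, String executionType) {
            this.containerId = containerId;
            this.executionType = executionType;
        }
    }

    /** Stand-in for the RM-side record of an allocated container. */
    static final class RmContainer {
        final String id;
        String executionType = "OPPORTUNISTIC"; // assumed pre-promotion type
        Token token;                            // null until the AM pulls the container

        RmContainer(String id) {
            this.id = id;
        }
    }

    /** Models the AM pulling the container, which is when the token is first generated. */
    static Token pullByAm(RmContainer c) {
        c.token = new Token(c.id, c.executionType);
        return c.token;
    }

    /** Models the reported behaviour: promote and forward whatever token exists. */
    static Token promoteUnsafe(RmContainer c) {
        c.executionType = "GUARANTEED";
        return c.token; // still null if the AM has not pulled the container
    }

    /**
     * One possible guard (an assumption, not the actual fix): record the promotion,
     * but only produce an NM update when a token already exists. For an unpulled
     * container the promoted type is picked up when the token is generated at pull time.
     */
    static Token promoteGuarded(RmContainer c) {
        c.executionType = "GUARANTEED";
        if (c.token == null) {
            return null; // nothing to send to the NM yet
        }
        c.token = new Token(c.id, c.executionType);
        return c.token;
    }

    /** Stand-in for the NM handler, which dereferences the token it receives. */
    static String handleOnNm(Token t) {
        return t.containerId + " -> " + t.executionType;
    }

    public static void main(String[] args) {
        RmContainer unpulled = new RmContainer("container_1");
        try {
            handleOnNm(promoteUnsafe(unpulled));
        } catch (NullPointerException e) {
            System.out.println("unsafe promotion: NM handler hit an NPE");
        }

        RmContainer guarded = new RmContainer("container_2");
        if (promoteGuarded(guarded) == null) {
            System.out.println("guarded promotion: no NM update sent before the pull");
        }
        System.out.println("token at pull time: " + handleOnNm(pullByAm(guarded)));
    }
}
```

      In this model the unsafe path fails only because the handler dereferences a token that was never created, which matches the description of the crash. Whether the real fix should defer the NM update, generate the token eagerly at promotion time, or null-check on the NM side is left open here.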

          People

            Assignee: Unassigned
            Reporter: Haibo Chen (haibochen)