Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-9928

ATSv2 can make NM go down with a FATAL error while it is resyncing with RM

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: ATSv2
    • Labels:
      None

      Description

      Encountered the below FATAL errorĀ in the NodeManager which was under heavy load and was also resyncing with RM at the same. This caused the NM to go down.

      2019-09-18 11:22:44,899 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
      java.lang.NullPointerException
          at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerCreatedEvent(NMTimelinePublisher.java:216)
          at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerEvent(NMTimelinePublisher.java:383)
          at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1520)
          at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1511)
          at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
          at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
          at java.lang.Thread.run(Thread.java:748)
      

        Attachments

          Activity

            People

            • Assignee:
              tarunparimi Tarun Parimi
              Reporter:
              tarunparimi Tarun Parimi
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: