  1. Hadoop YARN
  2. YARN-6827

[ATS1/1.5] NPE thrown while publishing recovering applications into ATS during RM restart.

    Details

    • Hadoop Flags:
      Reviewed

      Description

      While applications are being recovered, the following NPE is thrown.

      2017-07-13 14:08:12,476 ERROR org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher: Error when publishing entity [YARN_APPLICATION,application_1499929227397_0001]
      java.lang.NullPointerException
      	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
      	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:368)
      

      This happens because of the order in which the RM starts its services. In the non-HA case, the active services are started first and the ATSv1 services are started later. In the HA case, the transitionToActive event arrives before the ATS services are started.

      In both cases the active services have enough time to start recovering applications, and recovery tries to publish each application into ATSv1. Because the ATS services are not started yet, the publish call throws an NPE.
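
      The start-order problem described above can be sketched in isolation. The following is a minimal, self-contained illustration and not Hadoop code: `SketchTimelineClient`, `SketchResourceManager`, and all of their methods are hypothetical names, and the assumption that the client holds a field that is only initialised in `start()` is made for the sketch. It is not necessarily how the attached patch resolves the issue; it only shows why publishing before the client is started fails and why starting the publisher first avoids it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a timeline client whose internals exist only after start(). */
class SketchTimelineClient {
    // Assumption for this sketch: the writer stays null until start() is called.
    private List<String> writer;

    void start() {
        writer = new ArrayList<>();
    }

    void putEntity(String entity) {
        // Throws NullPointerException when start() has not run yet.
        writer.add(entity);
    }

    int published() {
        return writer == null ? 0 : writer.size();
    }
}

/** Hypothetical stand-in for the RM; each method models one service start order. */
class SketchResourceManager {
    final SketchTimelineClient client = new SketchTimelineClient();

    /** Recovery publishes every recovered application to the timeline client. */
    private void recover(List<String> apps) {
        for (String app : apps) {
            client.putEntity(app);
        }
    }

    /** Order from the report: recovery runs before the publisher side is started. */
    boolean startActiveServicesFirst(List<String> apps) {
        try {
            recover(apps);
            client.start();
            return true;
        } catch (NullPointerException e) {
            return false; // publishing hit the unstarted client
        }
    }

    /** Alternative order: the publisher side is started before recovery begins. */
    boolean startPublisherFirst(List<String> apps) {
        client.start();
        recover(apps);
        return true;
    }
}

class StartOrderSketch {
    public static void main(String[] args) {
        List<String> apps = Arrays.asList("application_1499929227397_0001");
        System.out.println(new SketchResourceManager().startActiveServicesFirst(apps));
        System.out.println(new SketchResourceManager().startPublisherFirst(apps));
    }
}
```

      In this sketch the first ordering fails on the first publish attempt, while the second publishes the recovered application normally. A null check inside the publish call would also avoid the exception, but it would silently drop the recovered applications, which is why the sketch reorders the start sequence instead.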

        Attachments

        1. YARN-6827.01.patch
          1 kB
          Rohith Sharma K S


            People

            • Assignee:
              rohithsharma Rohith Sharma K S
              Reporter:
              rohithsharma Rohith Sharma K S
