  1. Hadoop YARN
  2. YARN-6827

[ATS1/1.5] NPE exception while publishing recovering applications into ATS during RM restart.

Details

    • Reviewed

    Description

      While recovering applications, the following NPE is thrown:

      2017-07-13 14:08:12,476 ERROR org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher: Error when publishing entity [YARN_APPLICATION,application_1499929227397_0001]
      java.lang.NullPointerException
      	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
      	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:368)
      

      This happens because of the order in which RM services start. In the non-HA case, the active services are started first and the ATSv1 services are started later. In the HA case, the transitionToActive event can arrive before the ATS services are started.

      This gives the active services enough time to recover applications, and recovery tries to publish each recovered application to ATSv1. Since the ATS services are not started yet, the publish call throws an NPE.
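
The ordering problem described above can be illustrated with a minimal, self-contained sketch. This is not the RM code and not necessarily what the attached patch does. The classes `SketchTimelineClient` and `SketchPublisher` are hypothetical stand-ins, assuming only that the real client holds a field that stays null until its service start has run.

```java
import java.util.ArrayList;
import java.util.List;

public class StartOrderSketch {

    /** Hypothetical stand-in for a timeline client whose writer exists only after start(). */
    static class SketchTimelineClient {
        private List<String> writer; // stays null until start() runs

        void start() {
            writer = new ArrayList<>();
        }

        void putEntity(String entity) {
            // Throws NullPointerException if start() has not been called yet.
            writer.add(entity);
        }
    }

    /** Hypothetical stand-in for the publisher that application recovery calls. */
    static class SketchPublisher {
        private final SketchTimelineClient client;

        SketchPublisher(SketchTimelineClient client) {
            this.client = client;
        }

        /** Returns true if the entity was published, false if publishing hit an NPE. */
        boolean publish(String appId) {
            try {
                client.putEntity(appId);
                return true;
            } catch (NullPointerException e) {
                return false;
            }
        }
    }

    /** Problematic order: recovery publishes before the client is started. */
    public static boolean recoverThenStart() {
        SketchTimelineClient client = new SketchTimelineClient();
        SketchPublisher publisher = new SketchPublisher(client);
        boolean published = publisher.publish("application_1499929227397_0001");
        client.start(); // too late, the publish above has already failed
        return published;
    }

    /** Safe order: the client is started before recovery publishes. */
    public static boolean startThenRecover() {
        SketchTimelineClient client = new SketchTimelineClient();
        client.start();
        SketchPublisher publisher = new SketchPublisher(client);
        return publisher.publish("application_1499929227397_0001");
    }

    public static void main(String[] args) {
        System.out.println("recover before start published: " + recoverThenStart());
        System.out.println("start before recover published: " + startThenRecover());
    }
}
```

In this sketch the only difference between the two methods is whether `start()` runs before the first publish, which is the same ordering issue the description attributes to RM startup.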

      Attachments

        1. YARN-6827.01.patch
          1 kB
          Rohith Sharma K S

          People

            rohithsharma Rohith Sharma K S
            rohithsharma Rohith Sharma K S