Hadoop YARN / YARN-2928 YARN Timeline Service v.2: alpha 1 / YARN-5210

NPE in Distributed Shell while publishing DS_CONTAINER_START event and other miscellaneous issues


Details

    Description

      Found a couple of issues while testing ATSv2.

      • An NPE is thrown while publishing the DS_CONTAINER_START event, so this event is never published.
        2016-06-07 23:19:00,020 [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #0] INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl: Unchecked exception is thrown from onContainerStarted for Container container_e77_1465311876353_0007_01_000002
        java.lang.NullPointerException
                at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:389)
                at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.putContainerEntity(ApplicationMaster.java:1284)
                at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerStartEvent(ApplicationMaster.java:1235)
                at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$1200(ApplicationMaster.java:175)
                at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$NMCallbackHandler.onContainerStarted(ApplicationMaster.java:986)
                at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$StatefulContainer$StartContainerTransition.transition(NMClientAsyncImpl.java:454)
                at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$StatefulContainer$StartContainerTransition.transition(NMClientAsyncImpl.java:436)
                at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
                at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
                at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
                at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
                at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$StatefulContainer.handle(NMClientAsyncImpl.java:617)
                at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$ContainerEventProcessor.run(NMClientAsyncImpl.java:676)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                at java.lang.Thread.run(Thread.java:745)
        
      • Created time is not reported from distributed shell for either DS_CONTAINER or DS_APP_ATTEMPT entities.
        As shown below, the response to a query for DS_APP_ATTEMPT entities does not contain createdtime.
          [
            {
              "metrics": [ ],
              "events": [ ],
              "type": "DS_APP_ATTEMPT",
              "id": "appattempt_1465246237936_0003_000001",
              "isrelatedto": { },
              "relatesto": { },
              "info": {
                "UID": "yarn-cluster!application_1465246237936_0003!DS_APP_ATTEMPT!appattempt_1465246237936_0003_000001"
              },
              "configs": { }
            }
          ]
        

        Similarly, the response to a query for a DS_CONTAINER entity contains no createdtime, and the DS_CONTAINER_START event is missing as well (due to the NPE described above).

          {
            "metrics": [ ],
            "events": [
              {
                "id": "DS_CONTAINER_END",
                "timestamp": 1465314587480,
                "info": {
                  "Exit Status": 0,
                  "State": "COMPLETE"
                }
              }
            ],
            "type": "DS_CONTAINER",
            "id": "container_e77_1465311876353_0003_01_000002",
            "isrelatedto": { },
            "relatesto": { },
            "info": {
              "UID": "yarn-cluster!application_1465311876353_0003!DS_CONTAINER!container_e77_1465311876353_0003_01_000002"
            },
            "configs": { }
          }
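
The two symptoms above can be checked mechanically against a reader response. The following is only a minimal sketch, not part of the attached patch: the class name `EntityResponseCheck` and its helper methods are hypothetical, the check is a naive textual match rather than real JSON parsing, and the sample string is an abbreviated copy of the DS_CONTAINER response shown above.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Hadoop code.
public class EntityResponseCheck {

    // True if the JSON text contains a key with the given name.
    // Naive textual match: it does not distinguish nesting levels.
    static boolean hasKey(String json, String key) {
        return Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:")
                .matcher(json).find();
    }

    // True if some object in the JSON text carries "id": "<eventId>".
    static boolean hasEvent(String json, String eventId) {
        return Pattern.compile("\"id\"\\s*:\\s*\"" + Pattern.quote(eventId) + "\"")
                .matcher(json).find();
    }

    public static void main(String[] args) {
        // Abbreviated copy of the DS_CONTAINER response from the description.
        String response = "{ \"metrics\": [ ], \"events\": [ { \"id\": \"DS_CONTAINER_END\", "
                + "\"timestamp\": 1465314587480 } ], \"type\": \"DS_CONTAINER\", "
                + "\"configs\": { } }";

        // Both of the first two checks report false for the response above,
        // which matches the two problems described in this issue.
        System.out.println("createdtime present: " + hasKey(response, "createdtime"));
        System.out.println("DS_CONTAINER_START present: "
                + hasEvent(response, "DS_CONTAINER_START"));
        System.out.println("DS_CONTAINER_END present: "
                + hasEvent(response, "DS_CONTAINER_END"));
    }
}
```

Once the fix is applied, one would expect the first two checks to report true for a freshly published DS_CONTAINER entity; a real test would use a proper JSON parser instead of regular expressions.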
        

      Attachments

        1. YARN-5210-YARN-2928.01.patch
          3 kB
          Varun Saxena

        Activity

          People

            varun_saxena Varun Saxena
            Votes: 0
            Watchers: 9
