  1. Hadoop YARN
  2. YARN-4427

NPE on handleNMContainerStatus when NM is registering to RM

Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved

    Description

      Seen the following in one of our environments, when the AM was allocated a container but the allocation failed to be updated in ZK while the cluster was having intermittent network problems (up and down).

      2015-12-07 16:39:38,489 | WARN  | IPC Server handler 49 on 26003 | IPC Server handler 49 on 26003, call org.apache.hadoop.yarn.server.api.ResourceTrackerPB.registerNodeManager from 9.91.8.220:52169 Call#17 Retry#0 | Server.java:2107
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.handleNMContainerStatus(ResourceTrackerService.java:286)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.registerNodeManager(ResourceTrackerService.java:395)
              at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceTrackerPBServiceImpl.registerNodeManager(ResourceTrackerPBServiceImpl.java:54)
              at org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$2.callBlockingMethod(ResourceTracker.java:79)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
      

      Corresponding code is below; it might not match branch-2.7/trunk because we have modified it internally.

       284	RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptId);
       285	Container masterContainer = rmAppAttempt.getMasterContainer();
       286	if (masterContainer.getId().equals(containerStatus.getContainerId())
       287       && containerStatus.getContainerState() == ContainerState.COMPLETE) {
       288     ContainerStatus status =
       289         ContainerStatus.newInstance(containerStatus.getContainerId(),
       290           containerStatus.getContainerState(), containerStatus.getDiagnostics(),
       291           containerStatus.getContainerExitStatus());
       292     // sending master container finished event.
       293     RMAppAttemptContainerFinishedEvent evt =
       294         new RMAppAttemptContainerFinishedEvent(appAttemptId, status,
       295             nodeId);
       296     rmContext.getDispatcher().getEventHandler().handle(evt);
       297   }
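
      The trace points at line 286, which dereferences masterContainer without a null check. The report does not say which reference was null, so the following is only a minimal sketch of a defensive guard, assuming the master container (or the attempt itself) can be missing when the attempt state was never saved to ZK. The types Container, Attempt and Status and the method shouldSendMasterFinished are hypothetical stand-ins, not the real YARN classes, and this is not the committed fix.

```java
public class MasterContainerGuard {
    // Hypothetical stand-ins for the YARN types used in the excerpt above.
    enum State { RUNNING, COMPLETE }

    static final class Container {
        final String id;
        Container(String id) { this.id = id; }
    }

    static final class Attempt {
        final Container masterContainer; // may be null if never recorded
        Attempt(Container masterContainer) { this.masterContainer = masterContainer; }
    }

    static final class Status {
        final String containerId;
        final State state;
        Status(String containerId, State state) {
            this.containerId = containerId;
            this.state = state;
        }
    }

    /**
     * Returns true only when the reported container is the attempt's master
     * container and it has completed. A missing attempt or a missing master
     * container yields false instead of a NullPointerException.
     */
    static boolean shouldSendMasterFinished(Attempt attempt, Status status) {
        if (attempt == null) {
            return false; // assumption: an unknown attempt is simply ignored
        }
        Container master = attempt.masterContainer;
        if (master == null) {
            return false; // assumption: nothing to finish if no master was recorded
        }
        return master.id.equals(status.containerId)
            && status.state == State.COMPLETE;
    }

    public static void main(String[] args) {
        Status done = new Status("container_1", State.COMPLETE);
        // Attempt with no master container: the unguarded code would throw here.
        System.out.println(shouldSendMasterFinished(new Attempt(null), done));
        // Attempt whose master container matches the completed container.
        System.out.println(shouldSendMasterFinished(
            new Attempt(new Container("container_1")), done));
    }
}
```

      Whether silently skipping the event is the right behavior for the RM, or whether the NM should instead be told to clean up the container, is a design question this sketch does not settle.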
      

          People

            brahmareddy Brahma Reddy Battula