Hadoop YARN / YARN-5837

NPE when getting node status of a decommissioned node after an RM restart

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.3, 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha2
    • Component/s: None
    • Labels: None

    Description

      If you decommission a node, the yarn node command shows it like this:

      >> bin/yarn node -list -all
      2016-11-04 08:54:37,169 INFO client.RMProxy: Connecting to ResourceManager at 0.0.0.0/0.0.0.0:8032
      Total Nodes:1
               Node-Id	     Node-State	Node-Http-Address	Number-of-Running-Containers
      192.168.1.69:57560	 DECOMMISSIONED	192.168.1.69:8042	                           0
      

      And a full report like this:

      >> bin/yarn node -status 192.168.1.69:57560
      2016-11-04 08:55:08,928 INFO client.RMProxy: Connecting to ResourceManager at 0.0.0.0/0.0.0.0:8032
      Node Report :
      	Node-Id : 192.168.1.69:57560
      	Rack : /default-rack
      	Node-State : DECOMMISSIONED
      	Node-Http-Address : 192.168.1.69:8042
      	Last-Health-Update : Fri 04/Nov/16 08:53:58:802PDT
      	Health-Report :
      	Containers : 0
      	Memory-Used : 0MB
      	Memory-Capacity : 8192MB
      	CPU-Used : 0 vcores
      	CPU-Capacity : 8 vcores
      	Node-Labels :
      	Resource Utilization by Node :
      	Resource Utilization by Containers : PMem:0 MB, VMem:0 MB, VCores:0.0
      

      If you then restart the ResourceManager, you get this report:

      >> bin/yarn node -list -all
      2016-11-04 08:57:18,512 INFO client.RMProxy: Connecting to ResourceManager at 0.0.0.0/0.0.0.0:8032
      Total Nodes:4
               Node-Id	     Node-State	Node-Http-Address	Number-of-Running-Containers
       192.168.1.69:-1	 DECOMMISSIONED	  192.168.1.69:-1	                           0
      

      And when you try to get the full report on the now "-1" node, you get an NPE:

      >> bin/yarn node -status 192.168.1.69:-1
      2016-11-04 08:57:57,385 INFO client.RMProxy: Connecting to ResourceManager at 0.0.0.0/0.0.0.0:8032
      Exception in thread "main" java.lang.NullPointerException
      	at org.apache.hadoop.yarn.client.cli.NodeCLI.printNodeStatus(NodeCLI.java:296)
      	at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:116)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
      	at org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:63)
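
      The trace shows only that something dereferenced at NodeCLI.java:296 was null; it does not say which field. As a minimal sketch of the general defensive pattern (not the actual patch attached to this issue), the Java below formats a report whose resource fields may be missing. SketchNodeReport, NodeStatusSketch, and the choice of memory fields as the nullable ones are all hypothetical and are not part of the YARN API.

```java
// Hypothetical stand-in for a node report. This is NOT the real
// org.apache.hadoop.yarn.api.records.NodeReport class.
final class SketchNodeReport {
    final String nodeId;
    final String state;
    // Assumption for illustration: these may be null for a node that the
    // restarted ResourceManager knows only by host (the ":-1" entry).
    final Integer memoryUsedMb;
    final Integer memoryCapacityMb;

    SketchNodeReport(String nodeId, String state,
                     Integer memoryUsedMb, Integer memoryCapacityMb) {
        this.nodeId = nodeId;
        this.state = state;
        this.memoryUsedMb = memoryUsedMb;
        this.memoryCapacityMb = memoryCapacityMb;
    }
}

final class NodeStatusSketch {
    // Render a nullable megabyte value, falling back to 0 instead of
    // dereferencing null.
    static String mb(Integer value) {
        return (value == null ? 0 : value) + "MB";
    }

    // Build the status text without assuming any resource field is present.
    static String format(SketchNodeReport report) {
        StringBuilder sb = new StringBuilder();
        sb.append("Node-Id : ").append(report.nodeId).append('\n');
        sb.append("Node-State : ").append(report.state).append('\n');
        sb.append("Memory-Used : ").append(mb(report.memoryUsedMb)).append('\n');
        sb.append("Memory-Capacity : ").append(mb(report.memoryCapacityMb)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // A report with no resource information, like the post-restart node.
        SketchNodeReport report =
            new SketchNodeReport("192.168.1.69:-1", "DECOMMISSIONED", null, null);
        System.out.print(format(report));
    }
}
```

      Printing a zero for a missing value is one possible choice; a real fix might instead omit the line or populate the report on the ResourceManager side.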
      

      Attachments

        1. YARN-5837.branch-2.7.001.patch
          6 kB
          Robert Kanter
        2. YARN-5837.001.patch
          6 kB
          Robert Kanter

          People

            Assignee: Robert Kanter (rkanter)
            Reporter: Robert Kanter (rkanter)
            Votes: 0
            Watchers: 5
