
NodesListManager$UnknownNodeId ClassCastException

    Details

    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      Saw the following in an RM log:

      2016-02-16 22:55:35,207 [IPC Server handler 5 on 8030] WARN ipc.Server: IPC Server handler 5 on 8030, call org.apache.hadoop.ipc.ProtobufRpcEngine$Server@6c403aff
      java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.NodesListManager$UnknownNodeId cannot be cast to org.apache.hadoop.yarn.api.records.impl.pb.NodeIdPBImpl
              at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:247)
              at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:271)
              at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:220)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.convertToProtoFormat(AllocateResponsePBImpl.java:712)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.access$500(AllocateResponsePBImpl.java:68)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:658)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:647)
              at com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336)
              at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323)
              at org.apache.hadoop.yarn.proto.YarnServiceProtos$AllocateResponseProto$Builder.addAllUpdatedNodes(YarnServiceProtos.java:9335)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToBuilder(AllocateResponsePBImpl.java:144)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToProto(AllocateResponsePBImpl.java:175)
              at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.getProto(AllocateResponsePBImpl.java:96)
              at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:61)
              at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:608)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
              at org.apache.hadoop.ipc.Server.call(Server.java:2267)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:648)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:615)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2217)
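
The top frames of the trace point at an unconditional downcast made while the allocate response is converted to its protobuf form. The following is a minimal sketch of that failure mode, not the Hadoop code: `NodeId`, `PbNodeId`, `UnknownNodeId`, `convertToProto`, and `failsToConvert` are hypothetical stand-ins invented for illustration. It assumes only what the exception message states, namely that a subclass which is not the protobuf-backed implementation reached code that casts to that implementation.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Hadoop classes.
final class CastSketch {

    /** Abstract record type that callers pass around. */
    abstract static class NodeId {
        abstract String getHost();
        abstract int getPort();
    }

    /** The only subclass the serializer below knows how to convert. */
    static final class PbNodeId extends NodeId {
        private final String host;
        private final int port;

        PbNodeId(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override String getHost() { return host; }
        @Override int getPort() { return port; }

        /** Stand-in for producing the wire representation. */
        String toWire() { return host + ":" + port; }
    }

    /** A second subclass, created elsewhere, that is not serializer-backed. */
    static final class UnknownNodeId extends NodeId {
        private final String host;

        UnknownNodeId(String host) { this.host = host; }

        @Override String getHost() { return host; }
        @Override int getPort() { return -1; } // assumed placeholder value
    }

    /** Serializer that downcasts without checking the runtime type. */
    static String convertToProto(NodeId id) {
        return ((PbNodeId) id).toWire();
    }

    /** Returns true if converting the given id throws ClassCastException. */
    static boolean failsToConvert(NodeId id) {
        try {
            convertToProto(id);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsToConvert(new PbNodeId("host1", 8041)));
        System.out.println(failsToConvert(new UnknownNodeId("host2")));
    }
}
```

In this sketch the first conversion succeeds and the second throws, because the cast is checked against the object's runtime class rather than its declared type. Whether the committed patch changes the cast or changes how such ids are created is not stated in this issue.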
      
      1. YARN-4723.001.patch
        4 kB
        Kuhu Shukla
      2. YARN-4723.002.patch
        4 kB
        Kuhu Shukla
      3. YARN-4723-branch-2.7.001.patch
        4 kB
        Kuhu Shukla

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9376 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9376/)
          YARN-4723. NodesListManager$UnknownNodeId ClassCastException. (jlowe: rev 6b0f813e898cbd14b2ae52ecfed6d30bce8cb6b7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
          • hadoop-yarn-project/CHANGES.txt
          jlowe Jason Lowe added a comment -

          Thanks to Kuhu for the contribution and to Daniel for additional review! I committed this to trunk, branch-2, branch-2.8, and branch-2.7.

          jlowe Jason Lowe added a comment -

          +1 committing this.

          templedf Daniel Templeton added a comment -

          The patch seems fine to me.

          kshukla Kuhu Shukla added a comment -

          This is for the 2.7 patch.

          kshukla Kuhu Shukla added a comment -

          Findbugs:

          Bug type SIC_INNER_SHOULD_BE_STATIC
          In class org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKSyncOperationCallback
          At ZKRMStateStore.java:[lines 118-127]

          The checkstyle warnings are unrelated to this patch, as are the ASF license warnings. The test failures are known issues.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 45s branch-2.7 passed
          +1 compile 0m 24s branch-2.7 passed with JDK v1.8.0_72
          +1 compile 0m 25s branch-2.7 passed with JDK v1.7.0_95
          +1 checkstyle 0m 23s branch-2.7 passed
          +1 mvnsite 0m 40s branch-2.7 passed
          +1 mvneclipse 0m 17s branch-2.7 passed
          -1 findbugs 1m 9s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in branch-2.7 has 1 extant Findbugs warnings.
          +1 javadoc 0m 21s branch-2.7 passed with JDK v1.8.0_72
          +1 javadoc 0m 23s branch-2.7 passed with JDK v1.7.0_95
          +1 mvninstall 0m 27s the patch passed
          +1 compile 0m 26s the patch passed with JDK v1.8.0_72
          +1 javac 0m 26s the patch passed
          +1 compile 0m 24s the patch passed with JDK v1.7.0_95
          +1 javac 0m 24s the patch passed
          +1 checkstyle 0m 18s the patch passed
          +1 mvnsite 0m 28s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          -1 whitespace 0m 0s The patch has 1486 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 whitespace 0m 41s The patch has 99 line(s) with tabs.
          +1 findbugs 1m 12s the patch passed
          +1 javadoc 0m 16s the patch passed with JDK v1.8.0_72
          +1 javadoc 0m 21s the patch passed with JDK v1.7.0_95
          -1 unit 49m 13s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_72.
          -1 unit 51m 4s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95.
          -1 asflicense 44m 56s Patch generated 77 ASF License warnings.
          163m 19s



          Reason Tests
          JDK v1.8.0_72 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestAMAuthorization
          JDK v1.7.0_95 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestAMAuthorization



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:c420dfe
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12789992/YARN-4723-branch-2.7.001.patch
          JIRA Issue YARN-4723
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 93157c9b39d4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.7 / 8ceb9f3
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/whitespace-tabs.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10638/testReport/
          asflicense https://builds.apache.org/job/PreCommit-YARN-Build/10638/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/10638/console
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kshukla Kuhu Shukla added a comment -

          TestSystemMetricsPublisher and TestZKRMStateStore pass locally, and I don't see any correlation between the change and the test failures. Requesting further comments/review from Jason Lowe. Thank you very much!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 27s trunk passed
          +1 compile 0m 25s trunk passed with JDK v1.8.0_72
          +1 compile 0m 28s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 35s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 6s trunk passed
          +1 javadoc 0m 20s trunk passed with JDK v1.8.0_72
          +1 javadoc 0m 26s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 29s the patch passed
          +1 compile 0m 25s the patch passed with JDK v1.8.0_72
          +1 javac 0m 25s the patch passed
          +1 compile 0m 26s the patch passed with JDK v1.7.0_95
          +1 javac 0m 26s the patch passed
          +1 checkstyle 0m 16s the patch passed
          +1 mvnsite 0m 33s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 1m 15s the patch passed
          +1 javadoc 0m 18s the patch passed with JDK v1.8.0_72
          +1 javadoc 0m 23s the patch passed with JDK v1.7.0_95
          -1 unit 66m 23s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_72.
          -1 unit 67m 30s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 17s Patch does not generate ASF License warnings.
          149m 55s



          Reason Tests
          JDK v1.8.0_72 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher
            hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore
            hadoop.yarn.server.resourcemanager.TestAMAuthorization
          JDK v1.7.0_95 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestAMAuthorization



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12789990/YARN-4723.002.patch
          JIRA Issue YARN-4723
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 50b9dccc411e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 8808779
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10637/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10637/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10637/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt https://builds.apache.org/job/PreCommit-YARN-Build/10637/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10637/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/10637/console
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kshukla Kuhu Shukla added a comment -

          Attaching patch for branch-2.7.

          kshukla Kuhu Shukla added a comment -

          Updated the patch to address the checkstyle issues and the test comment.

          jlowe Jason Lowe added a comment -

          Thanks, Kuhu! Patch looks pretty good, just a couple of nits:

          • In the test, the assert message should say NODE_UNUSABLE rather than NODE_USABLE.
          • Please look into the checkstyle issues.
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 9s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 46s trunk passed
          +1 compile 0m 30s trunk passed with JDK v1.8.0_72
          +1 compile 0m 30s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 36s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 1m 4s trunk passed
          +1 javadoc 0m 23s trunk passed with JDK v1.8.0_72
          +1 javadoc 0m 25s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 30s the patch passed
          +1 compile 0m 24s the patch passed with JDK v1.8.0_72
          +1 javac 0m 24s the patch passed
          +1 compile 0m 27s the patch passed with JDK v1.7.0_95
          +1 javac 0m 27s the patch passed
          -1 checkstyle 0m 16s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 3 new + 66 unchanged - 0 fixed = 69 total (was 66)
          +1 mvnsite 0m 31s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 1m 15s the patch passed
          +1 javadoc 0m 19s the patch passed with JDK v1.8.0_72
          +1 javadoc 0m 23s the patch passed with JDK v1.7.0_95
          -1 unit 67m 42s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_72.
          -1 unit 68m 13s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 18s Patch does not generate ASF License warnings.
          152m 24s



          Reason Tests
          JDK v1.8.0_72 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestAMAuthorization
          JDK v1.7.0_95 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestAMAuthorization



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12789649/YARN-4723.001.patch
          JIRA Issue YARN-4723
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4c03a178a475 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d6b181c
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10628/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10628/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/10628/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10628/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt https://builds.apache.org/job/PreCommit-YARN-Build/10628/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10628/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/10628/console
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kshukla Kuhu Shukla added a comment -

          Attaching a preliminary patch based on approach #2 suggested by Jason Lowe. The patch also puts such a node in the inactive RMNodes map rather than the active one.

          jlowe Jason Lowe added a comment -

          I haven't looked at it in detail, but can we simply avoid doing any node update processing for dummy nodes (e.g.: port == -1) when processing the decommission transition?
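
          As a rough illustration of that idea, here is a minimal, self-contained sketch. It assumes placeholder nodes can be recognized by port == -1; SimpleNodeId, isDummy and reportable are hypothetical names made up for the example, not YARN APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: skip node-update processing for placeholder nodes. */
final class DummyNodeFilterSketch {
  /** Port used for placeholder nodes, following the "port == -1" convention. */
  static final int UNKNOWN_PORT = -1;

  /** Minimal stand-in for a node identifier (not the real YARN NodeId). */
  static final class SimpleNodeId {
    final String host;
    final int port;

    SimpleNodeId(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public String toString() {
      return host + ":" + port;
    }
  }

  /** A node counts as a dummy when it carries the placeholder port. */
  static boolean isDummy(SimpleNodeId id) {
    return id.port == UNKNOWN_PORT;
  }

  /** Returns only the nodes whose updates should be passed on to applications. */
  static List<SimpleNodeId> reportable(List<SimpleNodeId> decommissioned) {
    List<SimpleNodeId> out = new ArrayList<>();
    for (SimpleNodeId id : decommissioned) {
      if (!isDummy(id)) {
        out.add(id);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<SimpleNodeId> nodes = Arrays.asList(
        new SimpleNodeId("host1", 45454),
        new SimpleNodeId("host2", UNKNOWN_PORT));
    System.out.println(reportable(nodes)); // prints [host1:45454]
  }
}
```

          In the real RM the check would presumably sit in the decommission transition itself; the sketch only shows the filtering rule.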

          kshukla Kuhu Shukla added a comment -

          The primary cause of this failure is the UnknownNodeId object. Even if we put this dummy node in inactiveRMNodes instead of the active RMNodes, the transition from NEW to DECOMMISSIONED marks the node unusable (NODE_UNUSABLE), and that still triggers a NODE_UPDATE, which populates updatedNodes in the AllocateResponse.

            @Override
            public void handle(NodesListManagerEvent event) {
              RMNode eventNode = event.getNode();
              switch (event.getType()) {
              case NODE_UNUSABLE:
                LOG.debug(eventNode + " reported unusable");
                unusableRMNodesConcurrentSet.add(eventNode);
                for (RMApp app : rmContext.getRMApps().values()) {
                  if (!app.isAppFinalStateStored()) {
                    this.rmContext
                        .getDispatcher()
                        .getEventHandler()
                        .handle(
                            new RMAppNodeUpdateEvent(app.getApplicationId(), eventNode,
                                RMAppNodeUpdateType.NODE_UNUSABLE));
                  }
                }
                // (excerpt ends here; the rest of the method is omitted)
          

          That being said, we should not add the node to the active list. The way to solve this problem, though, is to get rid of UnknownNodeId and use anonymous classes to initialize these dummy nodes.
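
          To make the failure mode concrete, here is a small self-contained sketch of the downcast seen in the stack trace. All types below are simplified stand-ins written for this example, not the real YARN classes, and building the placeholder through the PB-backed type is an assumption about how makeUnknownNodeId could work.

```java
/** Hypothetical sketch; these are simplified stand-ins, not the real YARN classes. */
final class NodeIdCastSketch {

  /** Stand-in for the abstract NodeId record. */
  abstract static class NodeId {
    abstract String getHost();
    abstract int getPort();
  }

  /** Stand-in for the protobuf-backed implementation that serialization expects. */
  static final class NodeIdPBImpl extends NodeId {
    private final String host;
    private final int port;

    NodeIdPBImpl(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override String getHost() { return host; }
    @Override int getPort() { return port; }

    String toWire() { return host + ":" + port; }
  }

  /** Stand-in for a hand-written NodeId subclass such as UnknownNodeId. */
  static final class UnknownNodeId extends NodeId {
    private final String host;

    UnknownNodeId(String host) { this.host = host; }

    @Override String getHost() { return host; }
    @Override int getPort() { return -1; }
  }

  /** Mirrors the unchecked downcast performed while serializing a node report. */
  static String serialize(NodeId id) {
    return ((NodeIdPBImpl) id).toWire();
  }

  /** Assumed fix: build the placeholder with the serializable implementation. */
  static NodeId makeUnknownNodeId(String host) {
    return new NodeIdPBImpl(host, -1);
  }

  public static void main(String[] args) {
    System.out.println(serialize(makeUnknownNodeId("host2"))); // prints host2:-1
    try {
      serialize(new UnknownNodeId("host2"));
    } catch (ClassCastException e) {
      System.out.println("cast failed, as in the RM log");
    }
  }
}
```

          The point is only that any NodeId reaching the serialization path has to be of the type the PB code casts to, whatever value it carries.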

          For the unit test, I called allocate for this scenario, but that did not reproduce the issue until I explicitly set updatedNodes to contain a report with an UnknownNodeId object.

          Asking Jason Lowe and Daniel Templeton for comments and corrections.

          Excerpt from a sample test:

            AllocateRequest allocateRequest =
                Records.newRecord(AllocateRequest.class);
            AllocateResponse resp = rmClient.allocate(allocateRequest);
            NodeReport report = new NodeReportPBImpl();
            report.setNodeId(new NodesListManager.UnknownNodeId("host2"));
            List<NodeReport> reports = new ArrayList<NodeReport>();
            reports.add(report);
            resp.setUpdatedNodes(reports);
            allocateRequest =
                Records.newRecord(AllocateRequest.class);
            // Serializing the response is what hits the cast to NodeIdPBImpl.
            YarnServiceProtos.AllocateResponseProto p =
                ((AllocateResponsePBImpl) resp).getProto();
          

          Proposed change in NodesListManager.java:

            private void setDecomissionedNMs() {
              Set<String> excludeList = hostsReader.getExcludedHosts();
              for (final String host : excludeList) {
                NodeId nodeId = makeUnknownNodeId(host);
                RMNodeImpl rmNode = new RMNodeImpl(nodeId,
                    rmContext, host, -1, -1, makeUnknownNode(host), null, null);
                rmContext.getInactiveRMNodes().putIfAbsent(
                    rmNode.getNodeID().getHost(), rmNode);
                rmNode.handle(new RMNodeEvent(rmNode.getNodeID(),
                    RMNodeEventType.DECOMMISSION));
              }
            }
          
            Node makeUnknownNode(final String host) {
              return new Node() {
                @Override
                public String getNetworkLocation() {
                  return null;
                }
          
                @Override
                public void setNetworkLocation(String location) {
          
                }
          
                @Override
                public String getName() {
                  return host;
                }
          
                @Override
                public Node getParent() {
                  return null;
                }
          
                @Override
                public void setParent(Node parent) {
          
                }
          
                @Override
                public int getLevel() {
                  return 0;
                }
          
                @Override
                public void setLevel(int i) {
          
                }
              };
            }
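
          One detail of the proposed change worth spelling out is the putIfAbsent call on the inactive map: an entry that already exists for a host is left alone. A minimal sketch of that behaviour, assuming a host-keyed map as in the snippet above (InactiveNodesSketch and register are hypothetical names, and plain strings stand in for RMNode objects):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of a host-keyed inactive-nodes map; not the real RMContext. */
final class InactiveNodesSketch {
  private final ConcurrentMap<String, String> inactiveNodes = new ConcurrentHashMap<>();

  /**
   * Records a node description for the host unless one is already present.
   * Returns the description that ends up stored for that host.
   */
  String register(String host, String description) {
    String previous = inactiveNodes.putIfAbsent(host, description);
    return previous != null ? previous : description;
  }

  public static void main(String[] args) {
    InactiveNodesSketch sketch = new InactiveNodesSketch();
    System.out.println(sketch.register("host2", "real node, decommissioned"));
    // A later placeholder for the same host does not replace the existing entry.
    System.out.println(sketch.register("host2", "placeholder"));
  }
}
```

          Both calls print the first description, so a placeholder created for an excluded host would not clobber a node that is already tracked as inactive.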
          
          kshukla Kuhu Shukla added a comment -

          The updatedNodes list in the AllocateResponse is picking up the UnknownNodeIds after the nodes transition from active to decommissioned. Will update shortly.

          jlowe Jason Lowe added a comment -

          This appears to be related to the change in YARN-3102. Kuhu Shukla, could you take a look? It appears the RM is trying to send the canned unknown nodes as part of the response to an allocate call.


            People

            • Assignee:
              kshukla Kuhu Shukla
              Reporter:
              jlowe Jason Lowe
            • Votes:
              0
              Watchers:
              14


                Development