  1. Hadoop YARN
  2. YARN-6536

TestAMRMClient.testAMRMClientWithSaslEncryption fails intermittently

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.1
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
    • Component/s: None
    • Labels:
      None

      Description

      java.lang.AssertionError: expected:<2> but was:<1>
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.failNotEquals(Assert.java:743)
      	at org.junit.Assert.assertEquals(Assert.java:118)
      	at org.junit.Assert.assertEquals(Assert.java:555)
      	at org.junit.Assert.assertEquals(Assert.java:542)
      	at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAllocation(TestAMRMClient.java:1005)
      	at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.registerAndAllocate(TestAMRMClient.java:703)
      	at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithSaslEncryption(TestAMRMClient.java:675)
      

        Activity

        ebadger Eric Badger added a comment -

        Steven Rand, Jian He, looks like this is a test that you recently added. Can you take a look into the failure?

        jlowe Jason Lowe added a comment -

        This is the assertion that is failing:

            assertEquals(2, amClient.release.size());
        

        The problem is that the code above is doing this:

            while (allocatedContainerCount < containersRequestedAny
                && iterationsLeft-- > 0) {
              AllocateResponse allocResponse = amClient.allocate(0.1f);
              assertEquals(0, amClient.ask.size());
              assertEquals(0, amClient.release.size());
              
              assertEquals(nodeCount, amClient.getClusterNodeCount());
              allocatedContainerCount += allocResponse.getAllocatedContainers().size();
              for(Container container : allocResponse.getAllocatedContainers()) {
                ContainerId rejectContainerId = container.getId();
                releases.add(rejectContainerId);
                amClient.releaseAssignedContainer(rejectContainerId);
              }
        [...]
        

        If it takes more than one iteration to get both containers allocated, then amClient.release.size() will be 1 instead of 2 and the test will fail: each allocate call empties amClient.release (hence the assertEquals(0, amClient.release.size()) inside the loop), so only the containers released in the final iteration remain. Part of the problem here is that the test uses a full-blown minicluster and does hacky 100 ms sleeps in the hope that the node heartbeats in the interim. Having the test use something like a MockRM and MockNM, so that it can explicitly control the heartbeats relative to the allocate calls, would make this test more deterministic (and faster).

        Minimally the test should be checking releases.size() to see if all of the containers were released across the iterations rather than amClient.release.size().
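
To make the difference between the two counts concrete, here is a minimal, self-contained sketch. It does not use the real AMRMClientImpl or a minicluster: FakeAmClient, its scripted responses, and runLoop are hypothetical stand-ins invented for illustration, and the only behaviour assumed is the one the excerpt itself asserts, namely that the pending release list is empty right after each allocate call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class ReleaseCountSketch {

  /** Hypothetical stand-in for the AM client; NOT the real AMRMClientImpl. */
  static class FakeAmClient {
    // Releases queued since the last allocate() call.
    final List<String> release = new ArrayList<>();
    // Scripted allocations, one list of container ids per allocate() call.
    private final Deque<List<String>> responses = new ArrayDeque<>();

    FakeAmClient(List<List<String>> scripted) {
      responses.addAll(scripted);
    }

    /** Sends (here: drops) the pending releases and returns the next scripted allocation. */
    List<String> allocate() {
      release.clear(); // assumed behaviour: pending releases are emptied on each call
      List<String> next = responses.poll();
      return next == null ? Collections.<String>emptyList() : next;
    }

    void releaseAssignedContainer(String containerId) {
      release.add(containerId);
    }
  }

  /** Mirrors the loop in the excerpt; returns {pending release count, cumulative release count}. */
  static int[] runLoop(FakeAmClient amClient, int containersRequested) {
    List<String> releases = new ArrayList<>(); // accumulates across all iterations
    int allocated = 0;
    int iterationsLeft = 3;
    while (allocated < containersRequested && iterationsLeft-- > 0) {
      List<String> containers = amClient.allocate();
      allocated += containers.size();
      for (String id : containers) {
        releases.add(id);
        amClient.releaseAssignedContainer(id);
      }
    }
    return new int[] {amClient.release.size(), releases.size()};
  }

  public static void main(String[] args) {
    // Both containers arrive in a single allocate() response.
    int[] together = runLoop(
        new FakeAmClient(Arrays.asList(Arrays.asList("c1", "c2"))), 2);
    // The containers arrive one per allocate() response.
    int[] split = runLoop(
        new FakeAmClient(Arrays.asList(Arrays.asList("c1"), Arrays.asList("c2"))), 2);

    System.out.println("one response:  pending=" + together[0] + " cumulative=" + together[1]);
    System.out.println("two responses: pending=" + split[0] + " cumulative=" + split[1]);
  }
}
```

In the split case the pending list holds only the container released in the last iteration, which matches the reported "expected:<2> but was:<1>", while the cumulative list still holds both ids regardless of how the allocations were spread across iterations.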

        jlowe Jason Lowe added a comment -

        I don't have time to rework the test to stop using a full minicluster, but here's a quick patch that should solve the sporadic failure that was reported.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 13s   Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
        +1    mvninstall  13m 20s  trunk passed
        +1    compile     0m 20s   trunk passed
        +1    checkstyle  0m 14s   trunk passed
        +1    mvnsite     0m 23s   trunk passed
        +1    mvneclipse  0m 17s   trunk passed
        -1    findbugs    0m 30s   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client in trunk has 2 extant Findbugs warnings.
        +1    javadoc     0m 14s   trunk passed
        +1    mvninstall  0m 19s   the patch passed
        +1    compile     0m 18s   the patch passed
        +1    javac       0m 18s   the patch passed
        +1    checkstyle  0m 12s   the patch passed
        +1    mvnsite     0m 20s   the patch passed
        +1    mvneclipse  0m 14s   the patch passed
        +1    whitespace  0m 0s    The patch has no whitespace issues.
        +1    findbugs    0m 34s   the patch passed
        +1    javadoc     0m 12s   the patch passed
        +1    unit        19m 17s  hadoop-yarn-client in the patch passed.
        +1    asflicense  0m 17s   The patch does not generate ASF License warnings.
                          38m 36s



        Subsystem       Report/Notes
        Docker          Image:yetus/hadoop:0ac17dc
        JIRA Issue      YARN-6536
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12865563/YARN-6536.001.patch
        Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname           Linux c206944de38a 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool      maven
        Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision    trunk / cb672a4
        Default Java    1.8.0_121
        findbugs        v3.1.0-RC1
        findbugs        https://builds.apache.org/job/PreCommit-YARN-Build/15775/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client-warnings.html
        Test Results    https://builds.apache.org/job/PreCommit-YARN-Build/15775/testReport/
        modules         C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/15775/console
        Powered by      Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        ebadger Eric Badger added a comment -

        +1 this makes sense to me. Not all allocations necessarily have to come back in a single iteration, so checking the aggregate allocations would be better than checking the most recent allocation.

        eepayne Eric Payne added a comment -

        Thanks Jason Lowe
        +1

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11651 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11651/)
        YARN-6536. TestAMRMClient.testAMRMClientWithSaslEncryption fails (epayne: rev fdf5192bbbb3c81e5fb221758297605271139dc9)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java

          People

          • Assignee:
            jlowe Jason Lowe
            Reporter:
            ebadger Eric Badger
          • Votes:
            0
            Watchers:
            5
