  1. Hadoop HDFS
  2. HDFS-10336

TestBalancer failing intermittently because of not resetting UserGroupInformation completely

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha1
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The unit test TestBalancer fails intermittently.

      I investigated and found two main causes:

      • 1st. The test TestBalancer#testBalancerWithKeytabs timed out.
        org.apache.hadoop.hdfs.server.balancer.TestBalancer
        testBalancerWithKeytabs(org.apache.hadoop.hdfs.server.balancer.TestBalancer)  Time elapsed: 300.41 sec  <<< ERROR!
        java.lang.Exception: test timed out after 300000 milliseconds
        	at java.lang.Thread.sleep(Native Method)
        	at org.apache.hadoop.hdfs.server.balancer.Dispatcher.waitForMoveCompletion(Dispatcher.java:1122)
        	at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchBlockMoves(Dispatcher.java:1096)
        	at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchAndCheckContinue(Dispatcher.java:1060)
        	at org.apache.hadoop.hdfs.server.balancer.Balancer.runOneIteration(Balancer.java:635)
        	at org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:689)
        	at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode(TestBalancer.java:1098)
        	at org.apache.hadoop.hdfs.server.balancer.TestBalancer.access$000(TestBalancer.java:125)
        
      • 2nd. The test TestBalancer#testBalancerWithKeytabs sometimes does not reset the UGI completely in its finally block. This causes other unit tests to throw an IOException, like this:
        testBalancerWithNonZeroThreadsForMove(org.apache.hadoop.hdfs.server.balancer.TestBalancer)  Time elapsed: 0 sec  <<< ERROR!
        java.io.IOException: Running in secure mode, but config doesn't have a keytab
        	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:300)
        

        More than one test is affected by this. We should add the following line before the existing UGI reset operation, which avoids the potential exception:

        UserGroupInformation.reset();
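
The leak described above can be sketched without Hadoop on the classpath. This is only an illustration under stated assumptions: FakeUgi, its fields, secureTest, userSeenByNextTest and the principal name are all made up for this sketch and are not the real UserGroupInformation internals. It models a class that keeps login state in static fields, where replacing the configuration alone leaves a cached login behind, so a later test in the same JVM sees stale state unless reset() is called first.

```java
public class UgiResetSketch {

  /** Made-up stand-in for a class holding process-wide login state in static fields. */
  static final class FakeUgi {
    private static boolean secure;      // taken from the configuration
    private static String cachedLogin;  // cached by a keytab login

    static void setConfiguration(boolean secureMode) {
      secure = secureMode;              // replaces the config, keeps cachedLogin
    }

    static void loginFromKeytab(String principal) {
      if (!secure) {
        throw new IllegalStateException("keytab login requires secure mode");
      }
      cachedLogin = principal;
    }

    static void reset() {
      secure = false;
      cachedLogin = null;
    }

    static String currentUser() {
      return cachedLogin != null ? cachedLogin : "local-user";
    }
  }

  /** Plays the role of a keytab test; fullReset toggles the extra reset() call. */
  static void secureTest(boolean fullReset) {
    try {
      FakeUgi.setConfiguration(true);
      FakeUgi.loginFromKeytab("balancer@EXAMPLE.COM");
    } finally {
      if (fullReset) {
        FakeUgi.reset();                // the proposed extra line
      }
      FakeUgi.setConfiguration(false);  // the cleanup that was already there
    }
  }

  /** Returns the user that a later, non-secure test would observe. */
  static String userSeenByNextTest(boolean fullReset) {
    FakeUgi.reset();                    // start from a clean slate
    secureTest(fullReset);
    return FakeUgi.currentUser();
  }

  public static void main(String[] args) {
    System.out.println("without reset(): " + userSeenByNextTest(false));
    System.out.println("with reset():    " + userSeenByNextTest(true));
  }
}
```

In this toy model the cleanup without reset() leaves the keytab principal visible to the next test, while calling reset() first restores the default user. Whether the real class leaks in exactly this way is not something the sketch establishes.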
        
      1. HDFS-10336.001.patch
        2 kB
        Yiqun Lin
      2. HDFS-10336.002.patch
        1 kB
        Yiqun Lin
      3. HDFS-10336.003.patch
        1 kB
        Yiqun Lin
      4. HDFS-10336.003-simplefix.patch
        0.7 kB
        Yiqun Lin

          Activity

          linyiqun Yiqun Lin added a comment -

          Attached an initial patch. Thanks for reviewing.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 10s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 13s Maven dependency ordering for branch
          +1 mvninstall 6m 30s trunk passed
          +1 compile 5m 39s trunk passed with JDK v1.8.0_92
          +1 compile 6m 42s trunk passed with JDK v1.7.0_95
          +1 checkstyle 1m 5s trunk passed
          +1 mvnsite 1m 47s trunk passed
          +1 mvneclipse 0m 27s trunk passed
          +1 findbugs 3m 28s trunk passed
          +1 javadoc 2m 0s trunk passed with JDK v1.8.0_92
          +1 javadoc 2m 49s trunk passed with JDK v1.7.0_95
          0 mvndep 0m 13s Maven dependency ordering for patch
          +1 mvninstall 1m 26s the patch passed
          +1 compile 5m 41s the patch passed with JDK v1.8.0_92
          +1 javac 5m 41s the patch passed
          +1 compile 6m 41s the patch passed with JDK v1.7.0_95
          +1 javac 6m 41s the patch passed
          +1 checkstyle 1m 4s the patch passed
          +1 mvnsite 1m 47s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 3m 58s the patch passed
          +1 javadoc 1m 59s the patch passed with JDK v1.8.0_92
          +1 javadoc 2m 46s the patch passed with JDK v1.7.0_95
          +1 unit 6m 52s hadoop-common in the patch passed with JDK v1.8.0_92.
          -1 unit 57m 55s hadoop-hdfs in the patch failed with JDK v1.8.0_92.
          +1 unit 7m 20s hadoop-common in the patch passed with JDK v1.7.0_95.
          +1 unit 53m 30s hadoop-hdfs in the patch passed with JDK v1.7.0_95.
          +1 asflicense 0m 25s Patch does not generate ASF License warnings.
          184m 14s



          Reason Tests
          JDK v1.8.0_92 Failed junit tests hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.TestHFlush
            hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:fbe3e86
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12800941/HDFS-10336.001.patch
          JIRA Issue HDFS-10336
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 6be656d62494 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4beff01
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_92 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15305/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_92.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15305/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_92.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15305/testReport/
          modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15305/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          rakeshr Rakesh R added a comment -

          Thanks Yiqun Lin for the contribution. Could you please rebase the patch, since HADOOP-13251 has modified the UserGroupInformation#reset code?

          1st. The test TestBalancer#testBalancerWithKeytabs timed out.

          Increasing the timeout is one approach, but I am interested to know the reason behind the 300000 ms timeout. Did you see any specific case that exceeded the current value?

          2nd. The test TestBalancer#testBalancerWithKeytabs sometimes does not reset the UGI completely in its finally block.

          +1 for UserGroupInformation.reset();

          linyiqun Yiqun Lin added a comment - - edited

          Thanks Rakesh R for the review.

          Increasing the timeout is one approach, but I am interested to know the reason behind the 300000 ms timeout. Did you see any specific case that exceeded the current value?

          I ran the test many times in my local environment; it passes and runs quickly. I'm not sure which case exceeds the 30s now. Posted a patch addressing your comments.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 29s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 10m 1s trunk passed
          +1 compile 1m 5s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 14s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 1m 59s trunk passed
          +1 javadoc 1m 4s trunk passed
          +1 mvninstall 1m 3s the patch passed
          +1 compile 1m 2s the patch passed
          +1 javac 1m 2s the patch passed
          +1 checkstyle 0m 25s the patch passed
          +1 mvnsite 1m 10s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 13s the patch passed
          +1 javadoc 1m 9s the patch passed
          -1 unit 84m 38s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 30s The patch does not generate ASF License warnings.
          110m 49s



          Reason Tests
          Failed junit tests hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
            hadoop.hdfs.server.namenode.TestFsck
            hadoop.hdfs.TestCrcCorruption
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.namenode.TestCacheDirectives



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:85209cc
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12814117/HDFS-10336.002.patch
          JIRA Issue HDFS-10336
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 394be24a9414 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 23c3ff8
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15933/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15933/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15933/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          Thanks for the contribution. I'm +1 (non-binding) to the reset too, and I think it's the goal of this jira judging from the title.

          I'm neutral on changing the timeouts. There's no harm in increasing them, but that doesn't solve the problem: a Jenkins slave might be super slow sometimes. Another source of intermittency is that TestBalancer#testBalancerWithKeytabs calls TestBalancer#testUnknownDatanode internally, which is reported in HDFS-7267.

          linyiqun Yiqun Lin added a comment -

          Thanks Xiao Chen for the review. I agree that a Jenkins slave can sometimes be very slow and cause the timeout. The reset operation is what I actually want to do in my patch.

          xiaochen Xiao Chen added a comment -

          Thanks Yiqun Lin.
          Please update the timeout of testUnknownDatanodeSimple to be the same, since it's calling the same underlying method.

          Also, looking closer, testBalancerWithKeytabs has a 5 minute timeout, not 30s. Are you sure the test passes after bumping this to 10 mins? Would prefer to allow the test to pass sooner if possible. I'm okay to defer this to a separate jira too.

          linyiqun Yiqun Lin added a comment -

          Hi Xiao Chen, as mentioned in the description, the unit test TestBalancer#testBalancerWithKeytabs timed out at 300s. I think 10 mins is long enough, because the test only takes around 15 seconds in my local environment. If these tests still time out, we should optimize them further rather than increase the timeout again. I suggest doing that optimization work in a separate jira. As the title says, this jira focuses on resetting the UGI in the test. Posted a new patch addressing the comments.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 27s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 8m 40s trunk passed
          +1 compile 1m 1s trunk passed
          +1 checkstyle 0m 27s trunk passed
          +1 mvnsite 0m 53s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 1m 46s trunk passed
          +1 javadoc 0m 57s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 45s the patch passed
          +1 javac 0m 45s the patch passed
          +1 checkstyle 0m 28s the patch passed
          +1 mvnsite 0m 53s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 50s the patch passed
          +1 javadoc 0m 58s the patch passed
          +1 unit 73m 12s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          95m 2s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12816744/HDFS-10336.003.patch
          JIRA Issue HDFS-10336
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux e3ea115e5ee6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9d46a49
          Default Java 1.8.0_91
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16004/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16004/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          +1 (non-binding) on patch 3. Thanks, Yiqun Lin.

          rakeshr Rakesh R added a comment -

          Thanks Yiqun Lin for the explanation and the work.

          As the title says, this jira focuses on resetting the UGI in the test.

          Agreed. In that case, how about making the simple fix, UserGroupInformation.reset();, in this jira, with a patch that contains only this one-line change as shown below? As you mentioned, please create another jira for the test timeout failures.

                 // Reset UGI so that other tests are not affected.
          +      UserGroupInformation.reset();
                 UserGroupInformation.setConfiguration(new Configuration());
          

          I think 10 mins is long enough, because the test only takes around 15 seconds in my local environment. If these tests still time out, we should optimize them further rather than increase the timeout again. I suggest doing that optimization work in a separate jira.

          I ran it 5 times; it took between 18 secs (minimum) and 45 secs (maximum) in my local env. Could you please create a jira, if one has not been raised yet, for handling the timeout case separately? It would also be good to add a reference link to this jira for future reference. Thanks!

          linyiqun Yiqun Lin added a comment -

          That seems like a good idea. I have created a separate jira, HDFS-10602, for handling the timeout case in TestBalancer. Posted a simple-fix patch as Rakesh R suggested.

          ajisakaa Akira Ajisaka added a comment -

          +1 for the simple fix.

          rakeshr Rakesh R added a comment -

          +1 (non-binding). Thanks Yiqun Lin for reporting and fixing it.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 32s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 10m 35s trunk passed
          +1 compile 1m 30s trunk passed
          +1 checkstyle 0m 42s trunk passed
          +1 mvnsite 1m 13s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 2m 27s trunk passed
          +1 javadoc 1m 21s trunk passed
          +1 mvninstall 1m 31s the patch passed
          +1 compile 1m 25s the patch passed
          +1 javac 1m 25s the patch passed
          +1 checkstyle 0m 43s the patch passed
          +1 mvnsite 1m 37s the patch passed
          +1 mvneclipse 0m 19s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 51s the patch passed
          +1 javadoc 1m 1s the patch passed
          -1 unit 85m 35s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          115m 42s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
            hadoop.hdfs.server.namenode.ha.TestHAAppend



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12816818/HDFS-10336.003-simplefix.patch
          JIRA Issue HDFS-10336
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 2910b8871ad2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5252562
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16008/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16008/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16008/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          ajisakaa Akira Ajisaka added a comment -

          Committed this to trunk. Thanks Yiqun Lin for the contribution and thanks Rakesh R and Xiao Chen for the reviews!

          linyiqun Yiqun Lin added a comment -

          Thanks Akira Ajisaka for the commit!

          zhz Zhe Zhang added a comment -

          I just backported this patch to branch-2, branch-2.8 and branch-2.7.


            People

            • Assignee:
              linyiqun Yiqun Lin
              Reporter:
              linyiqun Yiqun Lin
            • Votes:
              0
              Watchers:
              6
