Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1827

TestBlockReplacement waits forever, errs without giving information

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In method checkBlocks(), TestBlockReplacement waits forever on a condition. Failures result in Hudson/Jenkins "Timeout occurred" error message with no information about where or why. Need to replace with TimeoutException that throws a stack trace and useful info about the failure mode.

      Also investigate possible cause of failure.

      1. 1827_TestBlockReplacement_v3.patch
        5 kB
        Matt Foley
      2. 1827_TestBlockReplacement_v2.patch
        5 kB
        Matt Foley
      3. 1827_TestBlockReplacement_v2.patch
        5 kB
        Matt Foley
      4. TestBlockReplacement.java.patch
        5 kB
        Matt Foley

        Activity

        Hide
        Matt Foley added a comment -

        Replaced the infinite waits with a 20-second timeout. Throw TimeoutException WITH useful information as to current state. Now if it errs in the future, at least we'll be able to see why.

        Also found what appears to be an argument reversal in the assert in line 180. My best reading of the meaning of the args, the surrounding comments, and the intent of the caller, says that it should be changed. If I'm correct, this would explain the failure.

        Finally, added more comments for clarity, and replaced a spurious Boolean with boolean.

        Posted here for information, but I'm going to subordinate this bug to HDFS-1295 (which it was blocking) and submit a single patch to that Jira.

        Show
        Matt Foley added a comment - Replaced the infinite waits with a 20-second timeout. Throw TimeoutException WITH useful information as to current state. Now if it errs in the future, at least we'll be able to see why. Also found what appears to be an argument reversal in the assert in line 180. My best reading of the meaning of the args, the surrounding comments, and the intent of the caller, says that it should be changed. If I'm correct, this would explain the failure. Finally, added more comments for clarity, and replaced a spurious Boolean with boolean. Posted here for information, but I'm going to subordinate this bug to HDFS-1295 (which it was blocking) and submit a single patch to that Jira.
        Hide
        Matt Foley added a comment -

        As part of HDFS-1295, this patch caused TestBlockReplacement to pass in build PreCommit-HDFS-Build/346, whereas it had failed before.

        Show
        Matt Foley added a comment - As part of HDFS-1295 , this patch caused TestBlockReplacement to pass in build PreCommit-HDFS-Build/346, whereas it had failed before.
        Hide
        Matt Foley added a comment -

        Cleaned up a couple comments. Please review.

        Show
        Matt Foley added a comment - Cleaned up a couple comments. Please review.
        Hide
        Todd Lipcon added a comment -

        +1, nice test improvements.

        Show
        Todd Lipcon added a comment - +1, nice test improvements.
        Hide
        Matt Foley added a comment -

        Thanks, Todd.
        Uploading again to try to trigger Hudson QA.

        BTW, this fix will be submitted as a stand-alone patch, not as part of HDFS-1295 as previously stated.

        Show
        Matt Foley added a comment - Thanks, Todd. Uploading again to try to trigger Hudson QA. BTW, this fix will be submitted as a stand-alone patch, not as part of HDFS-1295 as previously stated.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12477617/1827_TestBlockReplacement_v2.patch
        against trunk revision 1097329.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12477617/1827_TestBlockReplacement_v2.patch against trunk revision 1097329. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/426//console This message is automatically generated.
        Hide
        Matt Foley added a comment -

        The failed test, org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.pipeline_Fi_17, appears unrelated to the changes in TestBlockReplacement, and also appears to have failed on both the "before-patch" and "after-patch" diff. Ignoring.

        Ready for commit.

        Show
        Matt Foley added a comment - The failed test, org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.pipeline_Fi_17, appears unrelated to the changes in TestBlockReplacement, and also appears to have failed on both the "before-patch" and "after-patch" diff. Ignoring. Ready for commit.
        Hide
        Matt Foley added a comment -

        Sat too long, have to re-base.

        Show
        Matt Foley added a comment - Sat too long, have to re-base.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12478656/1827_TestBlockReplacement_v3.patch
        against trunk revision 1101137.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478656/1827_TestBlockReplacement_v3.patch against trunk revision 1101137. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/471//console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > The failed test, org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.pipeline_Fi_17, ...

        There was a NullPointerException; created HDFS-1908.

        Show
        Tsz Wo Nicholas Sze added a comment - > The failed test, org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.pipeline_Fi_17, ... There was a NullPointerException; created HDFS-1908 .
        Hide
        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, Matt!

        Also thanks Todd for reviewing it.

        Show
        Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, Matt! Also thanks Todd for reviewing it.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #637 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #637 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #673 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #673 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/ )

          People

          • Assignee:
            Matt Foley
            Reporter:
            Matt Foley
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development