Hadoop HDFS

HDFS-927: DFSInputStream retries too many times for new block locations

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.20-append, 0.21.0, 0.22.0
    • Fix Version/s: 0.20.2, 0.21.0
    • Component/s: hdfs-client
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      I think this is a regression caused by HDFS-127: DFSInputStream is supposed to go back to the NN at most max.block.acquires times, but in trunk it goes back about twice that many. The default is 3, yet I am counting 7 calls to getBlockLocations before an exception is thrown.
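
      To make the intended semantics concrete, here is a minimal, self-contained sketch of how a bounded block-acquire retry is supposed to behave. It is a hypothetical illustration only, not the actual DFSClient/DFSInputStream code; the class, interface, and constant names below are invented for the example.

          import java.io.IOException;

          // Hypothetical sketch of the intended retry cap described above.
          // Names (BoundedRetrySketch, BlockFetcher, MAX_BLOCK_ACQUIRE_FAILURES)
          // are illustrative stand-ins, not real DFSClient members.
          public class BoundedRetrySketch {

            /** Stand-in for one getBlockLocations round trip plus the datanode read. */
            interface BlockFetcher {
              byte[] fetch(long offset) throws IOException;
            }

            static final int MAX_BLOCK_ACQUIRE_FAILURES = 3; // the default mentioned above

            private int failures = 0;
            private final BlockFetcher fetcher;

            public BoundedRetrySketch(BlockFetcher fetcher) {
              this.fetcher = fetcher;
            }

            public byte[] read(long offset) throws IOException {
              while (true) {
                if (failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                  // Intended behaviour: give up after the configured number of
                  // trips back to the NameNode, not twice that number.
                  throw new IOException("Could not obtain block at offset " + offset);
                }
                try {
                  return fetcher.fetch(offset);  // one getBlockLocations call
                } catch (IOException e) {
                  failures++;                    // count the failed acquire
                }
              }
            }
          }

      With the default of 3, a read that never succeeds makes exactly 3 getBlockLocations attempts before throwing, which is the behaviour the report says trunk violates.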

      Attachments

      1. hdfs-927-branch-0.21.txt
        14 kB
        Todd Lipcon
      2. hdfs-927-branch0.20.txt
        13 kB
        Todd Lipcon
      3. hdfs-927.txt
        14 kB
        Todd Lipcon


          Activity

          Nicolas Spiegelberg added a comment -

          This should be pulled into the branch-0.20-append branch.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #275 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/275/)

          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #146 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/146/)

          stack added a comment -

          We ran a vote on applying patch to 0.20 and 0.21 branches and vote passed (Follow the thread that starts here http://www.mail-archive.com/hbase-dev@hadoop.apache.org/msg17043.html).

          Patches have been applied to 0.20 and 0.21 branches.

          Thanks for the patch Todd Lipcon.

          Todd Lipcon added a comment -

          Tests passed on branch-0.20.

          Tsz Wo Nicholas Sze added a comment -

          Sure, 0.20 sounds good.

          stack added a comment -

          How do we want to run the vote on this? Will I start with 0.20 first?

          Todd Lipcon added a comment -

          Ran tests on branch-0.21. Everything passed except two unrelated failures already tracked by other jiras.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #183 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/183/)
          DFSInputStream retries too many times for new block location

          stack added a comment -

          I applied the TRUNK patch.

          Todd Lipcon added a comment -

          Patch for branch-0.21. I haven't run this one through the full suite of tests yet.

          Todd Lipcon added a comment -

          Here's a branch-0.20 patch (nee hdfs-127-branch20-redone-v2.txt from HDFS-127)

          Tsz Wo Nicholas Sze added a comment -

          Same for me. +1 on the patch.

          stack added a comment -

          > This would only be the case if you spanned all three blocks with a single read() op. Not likely to happen unless you're calling read() once with a 200MB buffer or something

          Yeah, I was just thinking this while out on an errand. My conjecture is way too contrived.

          So, @Tsz, extra +1 on the Todd patch.

          Todd Lipcon added a comment -

          > on a 3 block file, if we hiccupped the first read on each block,

          This would only be the case if you spanned all three blocks with a single read() op. Not likely to happen unless you're calling read() once with a 200MB buffer or something

          stack added a comment -

          > I actually mean "per read per block", i.e. within a single read, there are 3 retries for each block.

          That would be better: with the current patch, on a 3-block file, if we hiccupped on the first read of each block, we'd trip the failures count, whereas if we'd been counting on a per-block basis the read would have gone through.

          That said, I'd be fine with Todd's patch – it's nice and clean, semantically and code-wise – going in, and then working on Tsz's suggested improvement in a new issue.

          Todd Lipcon added a comment -

          Ah, I see what you're saying. So, if you do a read that crosses a block boundary A-B, and get 2 errors at the end of block A, and 2 errors at the start of block B, you should still be OK?

          I could go either way here. Part of me thinks that if you have errors on both sides of a block boundary for a single read, your client is probably in a bad state and you're likely to fail either way?

          Since some are considering this an 0.20.2 blocker, could we get this committed as a solid improvement over what's there now (which makes very little sense) and then discuss whether the block boundary case should be improved?

          Tsz Wo Nicholas Sze added a comment -

          > ... don't you think?
          I actually mean "per read per block", i.e. within a single read, there are 3 retries for each block.

          I have no problem if we want to do it "per read". Stack, what do you think?

          Todd Lipcon added a comment -

          Hey Nicholas. Thanks for taking a look.

          > Should the failure count be reset per block, but not per read?

          This doesn't match my expectation. Consider the case of HBase, where a region server opens a single region (which may very well be a single block) and holds it open for days at a time. During the time while it's open, it may experience sporadic errors every once in a while due to a network blip or what have you. Just because the reader saw an error at 12pm, 3pm, and 6pm doesn't mean it should fail when it sees one at 9pm. Any successful read operation should reset the count, regardless of which block is being accessed, don't you think?

          Tsz Wo Nicholas Sze added a comment -

          Should the failure count be reset per block, but not per read?

          Also, I tried the patch. TestReadWhileWriting and TestFiHFlush failed although they do not seem related to the patch.

          stack added a comment -

          I also ran full hdfs test suite. All tests passed though it failed in test-cactus:

          /Users/stack/checkouts/hdfs/trunk/build.xml:588: The following error occurred while executing this line:
          /Users/stack/checkouts/hdfs/trunk/build.xml:569: The following error occurred while executing this line:
          /Users/stack/checkouts/hdfs/trunk/src/contrib/build.xml:48: The following error occurred while executing this line:
          java.lang.NoSuchMethodError: org.apache.cactus.integration.ant.CactusTask.addClasspathEntry(Ljava/lang/String;)V
          
          Total time: 65 minutes 17 seconds
          

          No cactus on my machine...

          stack added a comment -

          +1 on commit.

          When I ran:

          $ ANT_HOME=/usr/bin/ant ant -Dfindbugs.home=/Users/stack/bin/findbugs-1.3.9 -Djava5.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Home/ -Dforrest.home=/Users/stack/bin/apache-forrest-0.8 -Dcurl.cmd=/usr/bin/curl -Dpatch.file=/tmp/hdfs-927.txt test-patch
          

          on my local machine I got this...

          ...
               [exec] There appear to be 114 release audit warnings before the patch and 114 release audit warnings after applying the patch.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 8 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] ======================================================================
               [exec] ======================================================================
               [exec]     Finished build.
               [exec] ======================================================================
               [exec] ======================================================================
               [exec] 
               [exec] 
          
          BUILD SUCCESSFUL
          Total time: 12 minutes 47 seconds
          
          stack added a comment -

          The patch looks good to me, resetting the 'failures' data member at the head of the two read methods, above the chooseDataNode calls. I like the comment clarifying the intent of 'failures'.

          I ran the test suite and it looks like it all passed, but the result fell off the top of my screen. I'll be back in the morning with whether all tests passed for me locally.
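
          For readers following along, a minimal sketch of the reset described above (the failure counter cleared at the top of each read, before any datanode is chosen). This is a simplified, hypothetical illustration under invented names, not the actual DFSInputStream code from the patch.

              import java.io.IOException;

              // Simplified illustration of "reset 'failures' at the head of each read
              // method, above the chooseDataNode calls". All names here are invented
              // stand-ins, not the real DFSInputStream members.
              public class PerReadResetSketch {

                /** Stand-in for chooseDataNode(): may re-fetch locations from the NameNode and throw. */
                interface DatanodeChooser {
                  String choose(long blockOffset) throws IOException;
                }

                private static final int MAX_BLOCK_ACQUIRE_FAILURES = 3;

                private int failures;            // counts NameNode re-fetches within one read
                private final DatanodeChooser chooser;

                public PerReadResetSketch(DatanodeChooser chooser) {
                  this.chooser = chooser;
                }

                public String read(long blockOffset) throws IOException {
                  failures = 0;                  // the reset: every read starts with a clean budget
                  while (true) {
                    try {
                      return chooser.choose(blockOffset);
                    } catch (IOException e) {
                      if (++failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                        throw e;                 // give up only after the per-read budget is spent
                      }
                    }
                  }
                }
              }

          In this sketch, errors seen by earlier reads no longer count against later reads, which matches the per-read behaviour discussed in the thread.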

          Todd Lipcon added a comment -

          The three test-patches above all failed, but none of them had any failures in common except for org.apache.hadoop.hdfsproxy.TestProxyUtil.testSendCommand, which is almost certainly related to recent security work...

          So please ignore the -1s. This patch is well unit tested by existing as well as new tests.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
          against trunk revision 903906.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/213/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/213/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/213/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/213/console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Yet another test-patch snafu... NoClassDefFound errors, etc. I guess I'll resubmit? Hudson is becoming useless.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
          against trunk revision 903906.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/108/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/108/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/108/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/108/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          In the test result, there are 51 fewer tests than in the previous build. I bet the build encountered some problem. Let's try again.

          Todd Lipcon added a comment -

          Can anyone explain this Hudson result? It says -1 core tests, but the Test results page shows no failures...

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
          against trunk revision 903381.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Argh, same flaky NoClassDefFound Hudson junk. Resubmitting.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
          against trunk revision 903381.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/console

          This message is automatically generated.

          Todd Lipcon added a comment -

          The crux of this issue is that the original HDFS-127 patch was bad. I'm not sure why it caused an infinite loop on 0.20 but not on later branches, but either way it doesn't do what it was supposed to.

          This patch adds test cases to check infinite loop behavior and also to verify that the correct number of retries are taken. I also took the approach I outlined at https://issues.apache.org/jira/browse/HDFS-127?focusedCommentId=12803077&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12803077 to fix HDFS-127's retry logic.


            People

             • Assignee: Todd Lipcon
             • Reporter: Todd Lipcon
             • Votes: 1
             • Watchers: 8
