Hadoop HDFS
HDFS-2034

Length in getBlockRange() becomes negative when reading only from a currently-being-written block

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: hdfs-client
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      This came up during HDFS-1907. Below is the example Todd posted in HDFS-1907 that brought out this issue.

      Here's an example sequence to describe what I mean:
      1. open file, write one and a half blocks
      2. call hflush
      3. another reader asks for the first byte of the second block

      In this case, since the offset is greater than the completed block length, the math in getBlockRange() of DFSInputStream.java will set "length" to a negative value.
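
      To make the failure concrete, here is a minimal sketch of the arithmetic, using made-up numbers and hypothetical variable names (the real logic lives in DFSInputStream.getBlockRange()):

      // Hypothetical illustration, not the actual HDFS code.
      // Assume 64 MB blocks: one finalized block plus a partially
      // written (hflushed) second block.
      long lengthOfCompleteBlk = 64L * 1024 * 1024; // bytes in finalized blocks
      long offset = lengthOfCompleteBlk + 1;        // a byte inside block 2
      long length = 1;                              // reader asks for one byte

      // Truncating the request to the finalized portion of the file
      // yields a negative length whenever the offset is past that portion:
      length = lengthOfCompleteBlk - offset;        // -1 here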

      Attachments

      1. HDFS-2034.patch (7 kB, John George)
      2. HDFS-2034-1.patch (7 kB, John George)
      3. HDFS-2034-1.patch (7 kB, John George)
      4. HDFS-2034-2.patch (7 kB, John George)
      5. HDFS-2034-3.patch (7 kB, John George)
      6. HDFS-2034-4.patch (7 kB, John George)
      7. HDFS-2034-5.patch (7 kB, John George)

        Activity

        John George added a comment -

        Taking clues from what Daryn, Nicholas and Todd were suggesting/finding in HDFS-1907.

        This patch addresses the case where length could go negative under certain circumstances.

        Todd Lipcon added a comment -

        Haven't had a chance to look over the patch in detail yet, but curious: if you add this test without changing the code itself, does it expose a bug as we expected?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12481436/HDFS-2034.patch
        against trunk revision 1131264.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestHDFSTrash

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/708//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/708//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/708//console

        This message is automatically generated.

        John George added a comment -

        Before I ran the test, I added a check in getBlockRange() to assert that length was not going negative. The tests failed on the assert, which is how I confirmed that the length was going negative. But I had not run the tests without any source change (I should have) before submitting this patch.
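
        For illustration, the temporary diagnostic could have looked roughly like this (a hypothetical reconstruction, not the code that was submitted):

        // Hypothetical temporary check in getBlockRange(): fail fast in test
        // runs (where asserts are enabled) if the computed length goes negative.
        assert length >= 0 : "length went negative in getBlockRange: " + length;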

        John George added a comment -

        I just ran the test without any source modification and it still passes. getFinalizedBlockRange(), when called with a negative or zero length, does the right thing. So, I am downgrading the bug from "major" to "minor".

        Daryn Sharp added a comment -
        • I'd suggest changing "lastblkcomplete but read past that" to something more grammatically correct (I'm not the police!) like "can't read past last completed block".
        • There's an assertion for reading past the last completed block, but not for reading past the last located block?
        • I found the names of the booleans unclear at face value, i.e. I couldn't tell what they meant without seeing the assignment.

        For your consideration, maybe something like this with a sprinkling of comments:

        if (locatedBlocks.isLastBlockComplete())
          assert (!readOffsetPastCompletedBlock) : "can't read past last completed block";

        if (!readOffsetPastCompletedBlock) {
          if (readLengthPastCompletedBlock) {
            length = locatedBlocks.getFileLength() - offset;
          }
          blocks = getFinalizedBlockRange(offset, length);
        }
        if (readOffsetPastCompletedBlock) {
          LocatedBlock lastBlock = locatedBlocks.getLastLocatedBlock();
          long lastBlockEndOffset = lastBlock.getStartOffset() + lastBlock.getBlockSize();
          assert (offset < lastBlockEndOffset) : "can't read past last located block";
          blocks.add(lastBlock);
        }
        
        John George added a comment -

        > Daryn Sharp commented on HDFS-2034:
        > -----------------------------------
        >
        > * I'd suggest changing "lastblkcomplete but read past that" to something
        > more grammatically correct (I'm not the police!) like {{"can't read past last
        > completed block"}}.

        Good point.

        >
        > * There's an assertion for reading past the last completed block, but not
        > for reading past the last located block?

        The reason I added the assert for completed blocks is that I was changing a previous assumption in that code, so I wanted to make sure I was not breaking the logic behind it. I should probably remove this assert.

        The reason I am hesitant about adding an assert for reading past the last block is that it would have to go out of its way to calculate the total length of all the blocks (which the caller of this function already does), and that does not seem to be this function's job.

        >
        > * I found the names of the booleans to not be clear at face value. Ie. I
        > couldn't tell what they meant if I didn't see the assignment.

        On reading the booleans today, I have to say I agree with you on two of the three boolean names.

        Changed readOnlyIncompleteBlk to readOffsetWithinCompleteBlk (now it means the opposite of what it meant before, but it seems to help the code logic).

        Changed readPastCompletedBlk to readLengthPastCompletedBlock.
        I would like to continue to use lengthOfCompleteBlk instead of locatedBlocks.getFileLength() because, as Todd suggested in HDFS-1907, it makes sense to call it something related to complete blocks as opposed to the file length.

        >
        > For your consideration, maybe something like this with a sprinkling of
        > comments:
        >

        > if (locatedBlocks.isLastBlockComplete())
        >   assert(!readOffsetPastCompletedBlock) : "can't read past last completed 
        > block";
        > 
        > if (!readOffsetPastCompletedBlock) {
        >   if (readLengthPastCompletedBlock) {
        >     length = locatedBlocks.getFileLength() - offset;
        >   }
        >   blocks = getFinalizedBlockRange(offset, length);
        > }
        > if (readOffsetPastCompletedBlock) {
        >   Block lastBlock = locatedBlocks.getLastLocatedBlock();
        >   long lastBlockEndOffset = lastBlock.getStartOffset() + 
        > lastBlock.getBlockSize();
        >   assert(offset < lastBlockEndOffset) : "can't read past last located block";
        >   blocks.add(lastLocatedBlock);
        > }
        > 

        >

        Thanks for your suggestions.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12481588/HDFS-2034-1.patch
        against trunk revision 1132698.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/716//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/716//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/716//console

        This message is automatically generated.

        Daryn Sharp added a comment -

        Looking good!

        The reason I am hesitant about adding an assert for reading past the last block is because it has to go out of its way to calculate the length of all the blocks (which is already being done by the caller of this function) and that does not seem to be the job of this function.

        Sure, I agree that it's usually ok to omit sanity checks in private methods, especially if the method will blow up with an exception when the arg is insane (e.g. not checking for nulls). In this case, though, omitting a sanity check of the offset creates another latent defect similar to the ones being fixed. Let's imagine that in the future a caller passes an offset beyond the length of the file. The caller will incorrectly get the last block, and some poor dev will have to debug it.

        I took a peek, and only one assert is needed, and it's really cheap because all the values are pre-computed and it's just a couple of field accesses. DFSInputStream's getFileLength() returns the full size of the located blocks plus the bytes in the last unwritten block. Perfect. The first line of the method could be:

        assert offset < getFileLength() : "offset exceeds file length";

        I'll defer to Todd if he wishes to contradict my opinion.

        John George added a comment -

        Good point about the "incomplete block being added to an offset larger than file size". I completely agree with you that the function should ensure that what it sends back is what it was asked for. In that case, we should probably just check the requested offset. My concern with that specific assert is that it makes this function control the behavior in cases where the user requested an offset > EOF, while the upper layer might specifically want a different behavior (like throwing an EOF exception or just returning 0). So my take is that as long as the function behaves as it is asked to, we should probably not assert.

        Adding a new patch based on your suggestion about getFileLength().

        Thanks again Daryn.

        Todd Lipcon added a comment -

        In general regarding asserts:

        • definitely don't rely on asserts to sanity-check user input - most people run with asserts off
        • I wouldn't be too concerned about the performance of asserts unless they do something like an RPC – since they're usually only enabled in unit tests, it's reasonable to favor safety over performance

        Some notes on the specific patch:

        • In this case, I think the assert that Daryn suggested is reasonable – given it's a private method, it shouldn't ever have this invalid input, since the input is already sanity checked in the read code path, right?
        • the new ArrayList<LocatedBlock>() is unneeded in the usual case, since the blocks variable is overwritten anyway. Perhaps you could push this to an else clause?
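
        For reference, such a restructuring might look roughly like this, using variable names assumed from the earlier comments rather than the committed code:

        List<LocatedBlock> blocks;
        if (readOffsetWithinCompleteBlk) {
          // read starts inside the finalized portion of the file
          blocks = getFinalizedBlockRange(offset, length);
        } else {
          // only the under-construction last block needs to be returned,
          // so the list is allocated on this path alone
          blocks = new ArrayList<LocatedBlock>();
          blocks.add(locatedBlocks.getLastLocatedBlock());
        }
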
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12481607/HDFS-2034-2.patch
        against trunk revision 1132715.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/723//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/723//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/723//console

        This message is automatically generated.

        John George added a comment -

        Thanks Todd. Attaching a patch with the following changes.

        • added assert to check if offset > file length
        • moved blocks = new ArrayList... to an else
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12481616/HDFS-2034-3.patch
        against trunk revision 1132715.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.server.namenode.TestBackupNode

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/725//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/725//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/725//console

        This message is automatically generated.

        Daryn Sharp added a comment -

        +1 Looks good! A minor suggestion: move the assert to the first line of the method, since there's no point assigning the other values, which the assert doesn't use. It's not a big deal, up to you.

        Todd Lipcon added a comment -

        Couple of small nits:

        • rather than reassigning the length parameter, I think it would be clearer to just do:
          blocks = getFinalizedBlockRange(offset, Math.min(length, lengthOfCompleteBlk - offset));
          

          don't you think?

        • when instantiating the ArrayList, pass a capacity of 1 since you know that it will have at most 1 element. (the default capacity for ArrayList is 10)
        • our style guide says to use curly braces even for one-line if statements. There are a couple missing in the patch.
        • another style thing: for short inline comments it's usually preferred to use // style comments instead of /* ... */ for whatever reason.
        John George added a comment -

        New patch with Todd's and Daryn's comments included.

        Daryn Sharp added a comment -

        I noticed that the last conditional, if (readLengthPastCompleteBlk && offset < getFileLength()), dropped the RHS. Presumably this is because it's the condition in the early assert? Todd mentioned earlier that most people run with asserts disabled, which means the method will return the wrong data if asserts are off and the constraint is violated. I'd lean towards making the assert an exception (not sure why we didn't discuss that option earlier), but I'd be happy if the RHS condition were re-added. Todd can be the judge.

        John George added a comment -

        Daryn,
        thank you for reviewing this patch again. I agree with you on changing the assert to an exception. The attached patch has that change.
        I have left out the check for offset > getFileLength() since the exception covers that case.
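
        Roughly, the assert becomes an unconditional check of this shape (a sketch only; the exact message and exception type in the patch may differ):

        // Sketch: unlike an assert, which is disabled at runtime by default,
        // this check always executes.
        if (offset >= getFileLength()) {
          throw new IOException("Offset: " + offset
              + " exceeds file length: " + getFileLength());
        }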

        Daryn Sharp added a comment -

        +1 Great! Now we're protected whether or not asserts are enabled.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12485740/HDFS-2034-5.patch
        against trunk revision 1144100.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestHDFSTrash

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/899//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/899//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/899//console

        This message is automatically generated.

        John George added a comment -

        I don't think the above tests fail because of this patch, but I don't see the tests failing in the patches before this one... When I run these tests manually, they seem to pass as well.

        Tsz Wo Nicholas Sze added a comment -

        I agree that the failed tests are not related. Some tests failed with:

        java.lang.IllegalArgumentException: port out of range:-1
        	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
        	at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:597)
                ...
        
        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, John.

        Also thanks Daryn for reviewing it.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #776 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/776/)
        HDFS-2034. Length in DFSInputStream.getBlockRange(..) becomes -ve when reading only from a currently being written block. Contributed by John George

        szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1144480
        Files :

        • /hadoop/common/trunk/hdfs/CHANGES.txt
        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java
        • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteRead.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #720 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/720/)
        HDFS-2034. Length in DFSInputStream.getBlockRange(..) becomes -ve when reading only from a currently being written block. Contributed by John George

        szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1144480
        Files :

        • /hadoop/common/trunk/hdfs/CHANGES.txt
        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java
        • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteRead.java

          People

          • Assignee: John George
          • Reporter: John George
          • Votes: 0
          • Watchers: 4

            Dates

            • Created:
            • Updated:
            • Resolved: