Hadoop HDFS
HDFS-1907

BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: hdfs-client
    • Labels:
      None
    • Environment:

      Run on a real cluster. Using the latest 0.23 build.

    • Hadoop Flags:
      Reviewed

      Description

      BlockMissingException is thrown under this test scenario:
      Two different processes perform concurrent r/w on the same file: one reads while the other writes.

      • the writer keeps writing to the file
      • the reader repeatedly does positional reads from the beginning of the file to the visible end of file

      The reader is basically doing:
      byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound);
      where currentPosition = 0, buffer is a byte array, and byteToReadThisRound = 1024*10000.

      Usually it does not fail right away. I have to read, close the file, and re-open the same file a few times to create the problem. I'll post a test program to repro this problem after I've cleaned up my current test program a bit.
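
      A rough sketch of such a reader, assuming an existing FileSystem handle (hypothetical code, not the reporter's actual test program; the class and variable names are illustrative):

        import org.apache.hadoop.fs.FSDataInputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        class ReaderSketch {
          // Repeatedly open the file, position-read from offset 0 up to the
          // currently visible end of file, then close and re-open.
          static void readLoop(FileSystem fs, Path path) throws Exception {
            final byte[] buffer = new byte[1024 * 10000]; // byteToReadThisRound
            for (int pass = 0; pass < 10; pass++) {
              FSDataInputStream in = fs.open(path);
              long currentPosition = 0;
              long visibleEnd = fs.getFileStatus(path).getLen();
              while (currentPosition < visibleEnd) {
                int byteRead = in.read(currentPosition, buffer, 0, buffer.length);
                if (byteRead < 0) break;
                currentPosition += byteRead;
              }
              in.close(); // the failure usually appears only after a few open/close cycles
            }
          }
        }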

      1. HDFS-1907.patch
        1 kB
        John George
      2. HDFS-1907-2.patch
        3 kB
        John George
      3. HDFS-1907-3.patch
        3 kB
        John George
      4. HDFS-1907-4.patch
        2 kB
        John George
      5. HDFS-1907-5.patch
        2 kB
        John George
      6. HDFS-1907-5.patch
        2 kB
        John George

          Activity

          CW Chung added a comment -

          Stack Trace:
          =========================
          11/05/08 19:05:48 WARN hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xxx:1004 for file /tmp/N/909NF for block
          BP-1632719171-xxx.xx.xx.xxx-1303748685682:blk_-8940328094159486414_3653:java.io.IOException: Got error for
          OP_READ_BLOCK, self=/xxx.xx.xx.xxx:36405, remote=/xxx.xx.xx.xxx:1004, for file /tmp/N/909NF, for pool
          BP-1632719171-xxx.xx.xx.xxx-1303748685682 block -8940328094159486414_3653
          at org.apache.hadoop.hdfs.BlockReader.newBlockReader(BlockReader.java:398)
          at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:631)
          at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:704)
          at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:51)
          ....
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:192)

          Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
          BP-1632719171-xxx.xx.xx.xxx-1303748685682:blk_-8940328094159486414_3653 file=/tmp/N/909NF
          at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:570)
          at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:618)
          at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:704)
          at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:51)


          John George added a comment -

          When a request is made for data from a file that has blocks yet to be finalized, DFSInputStream assumes that the request included the unfinalized block as well. This causes the offset to go negative.
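
          As a rough numeric illustration of that failure mode (hypothetical values, not the actual pre-patch code): suppose the completed blocks cover 64 MB, so the unfinalized last block starts at offset 64 MB. If a request for 10 MB starting at offset 60 MB is mapped onto the unfinalized block as though it were already part of the finalized range, the in-block offset works out to 60 MB - 64 MB = -4 MB, i.e. negative.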

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481036/HDFS-1907.patch
          against trunk revision 1129942.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/671//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/671//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/671//console

          This message is automatically generated.

          John George added a comment -

          The test in HDFS-1968 is the one that caused this bug to show up, so I'm hoping to use it as the unit test; hence no additional tests were added. I don't think the TestDFSUpgradeFromImage failure was caused by this patch, since it fails in build #669 as well.

          Daryn Sharp added a comment -

          I'd suggest a temporary boolean for whether the read went past the end of the finalized block. You might even consider simplifying all the logic to:

          -    final List<LocatedBlock> blocks;
          -    if (locatedBlocks.isLastBlockComplete()) {
          -      blocks = getFinalizedBlockRange(offset, length);
          -    }
          -    else {
          -      if (length + offset > locatedBlocks.getFileLength()) {
          -        length = locatedBlocks.getFileLength() - offset;
          -      }
          -      blocks = getFinalizedBlockRange(offset, length);
          +    boolean readPastEnd = (offset + length > locatedBlocks.getFileLength());
          +    if (readPastEnd) length = locatedBlocks.getFileLength() - offset;
          +
          +    final List<LocatedBlock> blocks = getFinalizedBlockRange(offset, length);
          +    if (readPastEnd && !locatedBlocks.isLastBlockComplete()) {
                 blocks.add(locatedBlocks.getLastLocatedBlock());
               }
          
          John George added a comment -

          Thanks Daryn. Attaching another patch incorporating Daryn's comments and also enabling position-based testing in TestWriteRead.java.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481145/HDFS-1907-2.patch
          against trunk revision 1130262.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//console

          This message is automatically generated.

          John George added a comment -

          patch synced up to the trunk

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481225/HDFS-1907-3.patch
          against trunk revision 1130381.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/685//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/685//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/685//console

          This message is automatically generated.

          John George added a comment -

          updating patch to the trunk again

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481269/HDFS-1907-4.patch
          against trunk revision 1130693.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/690//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/690//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/690//console

          This message is automatically generated.

          Daryn Sharp added a comment -

          I just did a cursory read of getFinalizedBlockRange. If the file is committed, it appears to throw an exception if the client attempts to read past the end of file. However, in the case where the file has an uncompleted block, I think there's a secondary bug where no exception is thrown if the client is attempting to read past the end of the uncompleted block. Basically a final bounds check should be added.

          Small issue with the test. I know you probably copied-n-pasted, however assertTrue(stat == 0) is an anti-pattern: assertTrue should be used to test a real boolean value, not a fabricated boolean within the assert. This condition needs to be assertEquals(0, stat). The reason is that if assertTrue fails, JUnit can't report the bad value, whereas assertEquals will report "I expected this, but got that".

          If you feel esp. industrious, you might pay it forward and use a regexp to search and replace the other similar conditions to assertEquals... The next dev that breaks a test will thank you. How much you do is up to you, but at least fix the new test.
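
          A minimal JUnit sketch of the difference (hypothetical test, shown only to contrast the failure messages):

            import static org.junit.Assert.assertEquals;
            import static org.junit.Assert.assertTrue;

            class AssertStyleSketch {
              static void checkStatus(int stat) {
                // Anti-pattern: on failure JUnit can only say the condition was false.
                assertTrue(stat == 0);
                // Preferred: on failure JUnit reports expected:<0> but was:<actual>.
                assertEquals(0, stat);
              }
            }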

          John George added a comment -

          Thanks for the comments, Daryn.
          Fixed the test with assertEquals().

          As for the "IOException while trying to read past EOF" issue, I do not think any IOException is thrown (or I don't see any).

          In fact, I think that if a call is made beyond the size that the NN knows, it sets "blocks[]" to null and lastLocatedBlock to the last located-block value (which to me seems like a bug on the NN side). But maybe that is the reason the function is called getFinalizedBlockRange(): only the finalized block range is requested from the namenode.

          The file size is known before this call, so the length is adjusted to a correct value before the call is made.

          Either way, I appreciate you taking the time to look over the code.

          Daryn Sharp added a comment -

          I might very well be misreading the code... It looks like the only way to leave getFinalizedBlockRange is either:

          • Locate all blocks for the range spanned by (offset, length)
          • Exceed the file bounds and fail on
            assert curOff >= blk.getStartOffset() : "Block not found";

          Thus when all blocks are finalized, reading past EOF appears to trigger an assertion.

          However, when the last block is unfinalized, the behavior appears to be different/inconsistent:

          1. getFinalizedBlockRange is called with a clipped length – perhaps specifically to avoid the assertion?
          2. Retrieves the unfinalized block.
          3. TODO? Assert if the read request exceeds the length of the unfinalized + finalized blocks.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481303/HDFS-1907-5.patch
          against trunk revision 1130870.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestHDFSTrash

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/694//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/694//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/694//console

          This message is automatically generated.

          John George added a comment -
          assert curOff >= blk.getStartOffset() : "Block not found";
          

          The only way (that I can think of) for this assert to be hit is if bytesRead becomes negative and/or a wrong block is returned.

          TODO? Assert if the read request exceeds the length of the unfinalized + finalized blocks.
          

          Sorry, but I do not get the point of this TODO. I do not see how it brings consistency. Can you elaborate a little more? I am also a little lost as to how the two asserts are related.

          Daryn Sharp added a comment -

          +1
          Ah. I didn't go further up the call stack to see that the length is already being trimmed, so, as you say, the assert shouldn't happen even when all blocks are finalized. Looks good! Your change is fine; however, some of the surrounding code appears specious...

          Todd Lipcon added a comment -

          nits:

          • positionReadOption is already assigned to false for the first test a few lines above where you added it
          • the new test method shouldn't start with a capital T - our method names are camelCase, not UpperCamelCase

          One question on the logic:

          • if the last block is in-progress, but you read from within that block, then readPastEnd will be false. Hence we will only call getFinalizedBlockRange. But the last block is not finalized. Does this case work? If so, then that means getFinalizedBlockRange also includes non-finalized blocks, which is contrary to its name. Here's an example sequence to describe what I mean (sketched in code after the list):
          • open file, write one and a half blocks
          • call hflush
          • another reader asks for the first byte of the second block
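
          A minimal sketch of that sequence, assuming an existing FileSystem handle and known block size (cluster setup omitted; the class and variable names are illustrative):

            import org.apache.hadoop.fs.FSDataInputStream;
            import org.apache.hadoop.fs.FSDataOutputStream;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.Path;

            class HflushReadSketch {
              static void run(FileSystem fs, Path path, int blockSize) throws Exception {
                // Write one and a half blocks without closing the file.
                FSDataOutputStream out = fs.create(path);
                out.write(new byte[blockSize + blockSize / 2]);
                out.hflush(); // make the written bytes visible to readers

                // A second reader asks for the first byte of the second,
                // still-being-written block via a positional read.
                FSDataInputStream in = fs.open(path);
                byte[] buf = new byte[1];
                int n = in.read(blockSize, buf, 0, 1);
                in.close();
              }
            }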

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, John!

          Also thanks Daryn for reviewing the patches.

          Tsz Wo Nicholas Sze added a comment -

          Todd,
          Oops, sorry that I did not see your comments when I was committing the patch. Let's fix the nits separately.

          Tsz Wo Nicholas Sze added a comment -

          Created HDFS-2029.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #710 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/710/)
          HDFS-1907. Fix position read for reading still-being-written file in DFSInputStream. Contributed by John George

          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1131124
          Files :

          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSInputStream.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteRead.java
          • /hadoop/hdfs/trunk/CHANGES.txt

          Todd Lipcon added a comment -

          No problem, Nicholas. Do you have any thoughts on the logic question I raised? It seems like we either have a bug, or we have some methods whose comments/names need updating. Or it's entirely possible that I'm being dumb (it's before noon, after all).

          John George added a comment -

          If the last block is in progress, then readPastEnd will be true (I think), because locatedBlocks.getFileLength() will return only the "committed" blocks' length, and hence length + offset will be greater than that.

          But in that case the length will be negative, and that is not good...

          Tsz Wo Nicholas Sze added a comment -

          > if the last block is in-progress, but you read from within that block, then readPastEnd will be false. ...

          readPastEnd is true since

          • offset + length is the last byte position of the request, and
          • locatedBlocks.getFileLength() is the length of completed blocks.

          So, offset + length is still greater than locatedBlocks.getFileLength()
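
          For example (hypothetical numbers): with a 64 MB block size and one and a half blocks written, locatedBlocks.getFileLength() is 64 MB (the complete block only), so even a one-byte read at offset = 64 MB gives offset + length = 64 MB + 1 > 64 MB, and readPastEnd is true.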

          John George added a comment -

          I meant "complete" blocks as well, not "committed" blocks.

          Todd Lipcon added a comment -

          Aha, I see. Perhaps we can improve the readability of the code by renaming LocatedBlocks.getFileLength to LocatedBlocks.getLengthOfCompleteBlocks()? I was previously under the mistaken impression that getFileLength() would include the "most up to date" length (e.g., include a partial block if it had been closed and re-opened for append).

          John George added a comment -

          Sorry, I was not very clear in my previous statements. Today, I seem to have a tendency to hit the Add button before I am done writing...

          Todd,
          I think you have a very valid example of position read. If the read is specifically for bytes from only the currently_being_written_block then readPastEnd will be true, but length will turn negative.

           length = locatedBlocks.getFileLength() - offset;
          

          Here, offset is greater than locatedBlocks.getFileLength().

          This is an issue, right? Or am I missing something...
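
          For example (hypothetical numbers): if locatedBlocks.getFileLength() is 64 MB and the read starts at offset = 64 MB + 10, the adjusted length becomes 64 MB - (64 MB + 10) = -10.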

          John George added a comment -

          Filed HDFS-2034 to track this issue.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #688 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/688/)


            People

            • Assignee:
              John George
            • Reporter:
              CW Chung
            • Votes:
              0
            • Watchers:
              2
