Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-265 Revisit append
  3. HDFS-570

When opening a file for read, make the file length avaliable to client.

    Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: Append Branch
    • Fix Version/s: Append Branch
    • Component/s: hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      In order to support read consistency, get last block length from a data-node when opening a file being written to.

      Description

      In order to support read consistency, DFSClient needs the file length at the file opening time. In the current implmentation, DFSClient obtains the file length at the file opening time but the length is inaccurate if the file is being written.

      For more details, see Section 4 in the append design doc.

      1. h570_20090926.patch
        26 kB
        Konstantin Shvachko
      2. h570_20090926.patch
        26 kB
        Konstantin Shvachko
      3. h570_20090925c.patch
        27 kB
        Tsz Wo Nicholas Sze
      4. h570_20090925b.patch
        25 kB
        Tsz Wo Nicholas Sze
      5. h570_20090925.patch
        23 kB
        Tsz Wo Nicholas Sze
      6. h570_20090924.patch
        17 kB
        Tsz Wo Nicholas Sze
      7. h570_20090922.patch
        15 kB
        Tsz Wo Nicholas Sze
      8. h570_20090828.patch
        25 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Tsz Wo Nicholas Sze made changes -
          Release Note In order to support read consistency, get last block length from a data-node when opening a file being written to.
          Konstantin Boudnik made changes -
          Link This issue is related to HDFS-735 [ HDFS-735 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-6307 [ HADOOP-6307 ]
          Konstantin Shvachko made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Resolution Fixed [ 1 ]
          Hide
          Konstantin Shvachko added a comment -

          I just committed this. Thank you Nicholas.

          Show
          Konstantin Shvachko added a comment - I just committed this. Thank you Nicholas.
          Konstantin Shvachko made changes -
          Attachment h570_20090926.patch [ 12420640 ]
          Hide
          Konstantin Shvachko added a comment -

          Added the comment.

          Show
          Konstantin Shvachko added a comment - Added the comment.
          Hide
          Konstantin Shvachko added a comment -

          OK. Never mind. getFileLength() includes the length of the last block, which is not known to the name-node yet, while locatedBlocks.getFileLength() does not. The naming is confusing, but the code is correct. I'll add a comment.

          Show
          Konstantin Shvachko added a comment - OK. Never mind. getFileLength() includes the length of the last block, which is not known to the name-node yet, while locatedBlocks.getFileLength() does not. The naming is confusing, but the code is correct. I'll add a comment.
          Konstantin Shvachko made changes -
          Attachment h570_20090926.patch [ 12420639 ]
          Hide
          Konstantin Shvachko added a comment -

          Merged the patch with trunk.

          In getBlockAt() you first throw exception if offset >= getFileLength() and then check
          else if (offset >= locatedBlocks.getFileLength()) {
          Is this esle if a dead branch?

          Show
          Konstantin Shvachko added a comment - Merged the patch with trunk. In getBlockAt() you first throw exception if offset >= getFileLength() and then check else if (offset >= locatedBlocks.getFileLength()) { Is this esle if a dead branch?
          Hide
          Tsz Wo Nicholas Sze added a comment -
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          Passed all unit tests except for TestBackupNode and the ones in HDFS-647

          Show
          Tsz Wo Nicholas Sze added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Passed all unit tests except for TestBackupNode and the ones in HDFS-647
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090925c.patch [ 12420614 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090925c.patch: changed getBlockAt(..) to use the last block stored in the locatedBlocks.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090925c.patch: changed getBlockAt(..) to use the last block stored in the locatedBlocks.
          Hide
          Konstantin Boudnik added a comment -

          +1 Thanks Nicholas for the mods: the test is so much clearer now!

          Show
          Konstantin Boudnik added a comment - +1 Thanks Nicholas for the mods: the test is so much clearer now!
          Hide
          Tsz Wo Nicholas Sze added a comment -
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          Running unit tests ...

          Show
          Tsz Wo Nicholas Sze added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Running unit tests ...
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090925b.patch [ 12420608 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090925b.patch: fixed getBlockRange(..) to work with the new design and updated the tests.

          Thanks, Konstantin and Konstantin.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090925b.patch: fixed getBlockRange(..) to work with the new design and updated the tests. Thanks, Konstantin and Konstantin.
          Hide
          Konstantin Boudnik added a comment -

          The test method is public thus some JavaDoc would be in order. Also, for the sake for future reader unfamiliar with particulars of append implementation, I'd suggest to add a minimal comments on the test assumptions expressed through assert invocations.

          There's some commented statement left - shall they be removed?

          +1 otherwise.

          Show
          Konstantin Boudnik added a comment - The test method is public thus some JavaDoc would be in order. Also, for the sake for future reader unfamiliar with particulars of append implementation, I'd suggest to add a minimal comments on the test assumptions expressed through assert invocations. There's some commented statement left - shall they be removed? +1 otherwise.
          Hide
          Konstantin Shvachko added a comment -

          +1

          Show
          Konstantin Shvachko added a comment - +1
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090925.patch [ 12420594 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090925.patch: added a test.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090925.patch: added a test.
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090924.patch [ 12420504 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090924.patch: incorporated Konstantin's comments. Will see whether it is easy to add a unit test.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090924.patch: incorporated Konstantin's comments. Will see whether it is easy to add a unit test.
          Hide
          Konstantin Shvachko added a comment -
          1. For the last block if it is under construction you should return BlockInfoUnderConstruction.getExpectedLocations() rather than those reported by data-nodes. The list of reported nodes may be empty, because the write has not finished yet, but the bytes should still be readable.
            I guess we don't have a test for this condition therefore this was not revealed during your testing.
          2. Could you please replace the new member LocatedBlocks.isLastBlockUnderConstruction with isComplete.
            I am planning to replace BlockInfoUnderConstruction.isUnderConstruction() to isComplete(). Thisi is because in current implementation block under-construction means 2 things: that the state is UNDER_CONSTRUCTION and that the state is not COMPLETE, which is confusing. Sorry my fault.

          It would be really good to have this patch ready tomorrow.

          Show
          Konstantin Shvachko added a comment - For the last block if it is under construction you should return BlockInfoUnderConstruction.getExpectedLocations() rather than those reported by data-nodes. The list of reported nodes may be empty, because the write has not finished yet, but the bytes should still be readable. I guess we don't have a test for this condition therefore this was not revealed during your testing. Could you please replace the new member LocatedBlocks.isLastBlockUnderConstruction with isComplete . I am planning to replace BlockInfoUnderConstruction.isUnderConstruction() to isComplete() . Thisi is because in current implementation block under-construction means 2 things: that the state is UNDER_CONSTRUCTION and that the state is not COMPLETE, which is confusing. Sorry my fault. It would be really good to have this patch ready tomorrow.
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090922.patch [ 12420316 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090922.patch: updated with trunk. Will add new unit tests.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090922.patch: updated with trunk. Will add new unit tests.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > Could you explain more?

          As I said before, you mix in your patch changes related to the current jira with unrelated refactoring of the code.
          I listed most of the cases (1-8) that do not belong to the functionality you are implementing.
          And which

          1. Obscure understanding of the new functionality you actually introduce.
          2. Make it hard to continue merging the trunk with the branch.

          My proposal is to separate the implementation of the visible length from the refactoring of the code into 2 separate patches.
          The refactoring should be applied then to the trunk and to the branch.
          My personal preference is to postpone the refactoring until append is merged to the trunk.

          There is no code refactoring in LocatedBlock except for the imports. What do you mean by "LocatedBlock 3. Does not need any of the changes."?

          getReplicaInfo() is currently a private method of FSDataset. You are adding it to FSDatasetInterface. Based on the usage of the method I don't see a need for that.

          If you look at the patch again, you will find that getReplicaInfo(..) is called in DataNode. It cannot be invoked if it is a private method of FSDataset.

          Show
          Tsz Wo Nicholas Sze added a comment - > Could you explain more? As I said before, you mix in your patch changes related to the current jira with unrelated refactoring of the code. I listed most of the cases (1-8) that do not belong to the functionality you are implementing. And which 1. Obscure understanding of the new functionality you actually introduce. 2. Make it hard to continue merging the trunk with the branch. My proposal is to separate the implementation of the visible length from the refactoring of the code into 2 separate patches. The refactoring should be applied then to the trunk and to the branch. My personal preference is to postpone the refactoring until append is merged to the trunk. There is no code refactoring in LocatedBlock except for the imports. What do you mean by "LocatedBlock 3. Does not need any of the changes."? getReplicaInfo() is currently a private method of FSDataset. You are adding it to FSDatasetInterface. Based on the usage of the method I don't see a need for that. If you look at the patch again, you will find that getReplicaInfo(..) is called in DataNode. It cannot be invoked if it is a private method of FSDataset.
          Hide
          Konstantin Shvachko added a comment -

          > Could you explain more?

          As I said before, you mix in your patch changes related to the current jira with unrelated refactoring of the code.
          I listed most of the cases (1-8) that do not belong to the functionality you are implementing.
          And which

          1. Obscure understanding of the new functionality you actually introduce.
          2. Make it hard to continue merging the trunk with the branch.

          My proposal is to separate the implementation of the visible length from the refactoring of the code into 2 separate patches.
          The refactoring should be applied then to the trunk and to the branch.
          My personal preference is to postpone the refactoring until append is merged to the trunk.

          > 5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet.
          > FSDatasetInterface is an interface. By definition, all methods in an interface must be abstract.

          getReplicaInfo() is currently a private method of FSDataset. You are adding it to FSDatasetInterface. Based on the usage of the method I don't see a need for that.

          Show
          Konstantin Shvachko added a comment - > Could you explain more? As I said before, you mix in your patch changes related to the current jira with unrelated refactoring of the code. I listed most of the cases (1-8) that do not belong to the functionality you are implementing. And which Obscure understanding of the new functionality you actually introduce. Make it hard to continue merging the trunk with the branch. My proposal is to separate the implementation of the visible length from the refactoring of the code into 2 separate patches. The refactoring should be applied then to the trunk and to the branch. My personal preference is to postpone the refactoring until append is merged to the trunk. > 5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet. > FSDatasetInterface is an interface. By definition, all methods in an interface must be abstract. getReplicaInfo() is currently a private method of FSDataset. You are adding it to FSDatasetInterface. Based on the usage of the method I don't see a need for that.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          ClientDatanodeProtocol
          2. JavaDoc for getReplicaVisibleLength() is confusing. Could you please also make it 3 lines rather than 1.

          Why 3 lines?

          LocatedBlock
          3. Does not need any of the changes.

          Could you explain more?

          LocatedBlocks
          4. Here you do multiple field and method renames, combined with reformatting. I am lost.

          How can I help you?

          FSDatasetInterface
          5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet.

          FSDatasetInterface is an interface. By definition, all methods in an interface must be abstract.

          BlockManager
          6. You factored out a part of the code into a new method. I cannot see what the new changes are.

          The new method is involved in FSNamesystem. Could you take a look again?

          INodeFile
          8. It is not necessary to remove public from method declaration and remove unused method.

          I remove the method because I see the following comment in the code.

          // SHV !!! this is not used anywhere - remove
          

          The comment was introduced by you in HDFS-517. Could you explain what does it mean?

          Show
          Tsz Wo Nicholas Sze added a comment - ClientDatanodeProtocol 2. JavaDoc for getReplicaVisibleLength() is confusing. Could you please also make it 3 lines rather than 1. Why 3 lines? LocatedBlock 3. Does not need any of the changes. Could you explain more? LocatedBlocks 4. Here you do multiple field and method renames, combined with reformatting. I am lost. How can I help you? FSDatasetInterface 5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet. FSDatasetInterface is an interface . By definition, all methods in an interface must be abstract. BlockManager 6. You factored out a part of the code into a new method. I cannot see what the new changes are. The new method is involved in FSNamesystem. Could you take a look again? INodeFile 8. It is not necessary to remove public from method declaration and remove unused method. I remove the method because I see the following comment in the code. // SHV !!! this is not used anywhere - remove The comment was introduced by you in HDFS-517 . Could you explain what does it mean?
          Hide
          Konstantin Shvachko added a comment -

          1. I see a lot of changes that are pure refactoring: like adding "final" to class members and variables, or removing "public", or reshuffling imports. I am not against your changes, it is just that they make merging the branch with trunk even more difficult, and obscure the nature of the new functionality.

          ClientDatanodeProtocol
          2. JavaDoc for getReplicaVisibleLength() is confusing. Could you please also make it 3 lines rather than 1.

          LocatedBlock
          3. Does not need any of the changes.

          LocatedBlocks
          4. Here you do multiple field and method renames, combined with reformatting. I am lost.

          FSDatasetInterface
          5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet.

          BlockManager
          6. You factored out a part of the code into a new method. I cannot see what the new changes are.

          INode
          7. Please do not re-sort the imports

          INodeFile
          8. It is not necessary to remove public from method declaration and remove unused method.

          SimulatedFSDataset
          9. No need to remove the private constructor.

          Could you please revise your patch so that it only makes the essential changes.
          If you want to refactor please do it in a separate patch applied to trunk.

          Show
          Konstantin Shvachko added a comment - 1. I see a lot of changes that are pure refactoring: like adding "final" to class members and variables, or removing "public", or reshuffling imports. I am not against your changes, it is just that they make merging the branch with trunk even more difficult, and obscure the nature of the new functionality. ClientDatanodeProtocol 2. JavaDoc for getReplicaVisibleLength() is confusing. Could you please also make it 3 lines rather than 1. LocatedBlock 3. Does not need any of the changes. LocatedBlocks 4. Here you do multiple field and method renames, combined with reformatting. I am lost. FSDatasetInterface 5. Why do you need to abstract getReplicaInfo()? It does not seem that SimulatedFSDataset actually need it anywhere, at least not yet. BlockManager 6. You factored out a part of the code into a new method. I cannot see what the new changes are. INode 7. Please do not re-sort the imports INodeFile 8. It is not necessary to remove public from method declaration and remove unused method. SimulatedFSDataset 9. No need to remove the private constructor. Could you please revise your patch so that it only makes the essential changes. If you want to refactor please do it in a separate patch applied to trunk.
          Tsz Wo Nicholas Sze made changes -
          Hadoop Flags [Incompatible change]
          Hide
          Tsz Wo Nicholas Sze added a comment -
               [exec] +1 overall.
               [exec]
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec]
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec]
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec]
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec]
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnnings.
               [exec]
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          All tests pass except TestBackupNode (HDFS-192) and TestFiDataTransferProtocol. I ran TestFiDataTransferProtocol over a clean append-branch. It also failed.

          Show
          Tsz Wo Nicholas Sze added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. All tests pass except TestBackupNode ( HDFS-192 ) and TestFiDataTransferProtocol. I ran TestFiDataTransferProtocol over a clean append-branch. It also failed.
          Hide
          dhruba borthakur added a comment -

          This approach sounds good to me.

          Show
          dhruba borthakur added a comment - This approach sounds good to me.
          Tsz Wo Nicholas Sze made changes -
          Attachment h570_20090828.patch [ 12418031 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h570_20090828.patch: implemented the idea mentioned in my previous comment.

          Show
          Tsz Wo Nicholas Sze added a comment - h570_20090828.patch: implemented the idea mentioned in my previous comment.
          Tsz Wo Nicholas Sze made changes -
          Field Original Value New Value
          Description In order to support read consistency, DFSClient needs the file length at the file opening time.

          For more details, see Section 4 in the [append design doc|https://issues.apache.org/jira/secure/attachment/12415768/appendDesign2.pdf].
          In order to support read consistency, DFSClient needs the file length at the file opening time. In the current implmentation, DFSClient obtains the file length at the file opening time but the length is inaccurate if the file is being written.

          For more details, see Section 4 in the [append design doc|https://issues.apache.org/jira/secure/attachment/12415768/appendDesign2.pdf].
          Hide
          Tsz Wo Nicholas Sze added a comment -

          In order to support reading a file when it is being written. I suggest to have the following changes:

          • DFSClient obtains all the block lengths from the namenode for the blocks is not under construction. Note that only the last block is possibly under construction. Other blocks must be finalized.
          • For the last block, if it is under construction, DFSClient obtains the block length from one of the datanode.
          Show
          Tsz Wo Nicholas Sze added a comment - In order to support reading a file when it is being written. I suggest to have the following changes: DFSClient obtains all the block lengths from the namenode for the blocks is not under construction. Note that only the last block is possibly under construction. Other blocks must be finalized. For the last block, if it is under construction, DFSClient obtains the block length from one of the datanode.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I suggest to implement Algorithm 1 because it is closer to the current implementation. What do you think, Dhruba?

          Show
          Tsz Wo Nicholas Sze added a comment - I suggest to implement Algorithm 1 because it is closer to the current implementation. What do you think, Dhruba?
          Hide
          dhruba borthakur added a comment -

          Are we implementing algorithm 1 or algorithm 2 (as described in the append design doc)?

          Show
          dhruba borthakur added a comment - Are we implementing algorithm 1 or algorithm 2 (as described in the append design doc)?
          Tsz Wo Nicholas Sze created issue -

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development