Hadoop HDFS / HDFS-609

Creating a file with the append flag does not work in HDFS

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      HADOOP-5438 introduced a create API with flags. There are a couple of issues when the APPEND flag is set.
      1. The APPEND flag does not work in HDFS. Append is not as simple as changing a FileINode to a FileINodeUnderConstruction. It also needs to reopen the last block for append if the last block is not full, and to handle the CRC when the last CRC chunk is not full.
      2. The API is not well thought out. It has parameters like the replication factor and block size. Those parameters do not make any sense if the APPEND flag is set, but they give application users the wrong impression that append could change a file's block size and replication factor.
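
      For context, a minimal sketch of how the flag-based create from HADOOP-5438 is invoked. The path, permission, buffer size, replication, and block size values below are illustrative assumptions only; the point is that replication and blockSize must still be supplied even when only APPEND is intended (issue 2 above), and per issue 1 this call does not actually work against HDFS.

          import java.util.EnumSet;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.CreateFlag;
          import org.apache.hadoop.fs.FSDataOutputStream;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;
          import org.apache.hadoop.fs.permission.FsPermission;

          public class CreateWithAppendFlag {
            public static void main(String[] args) throws Exception {
              FileSystem fs = FileSystem.get(new Configuration());
              Path file = new Path("/tmp/append-demo.txt");  // illustrative path

              // The create-with-flags overload from HADOOP-5438. With APPEND set,
              // the replication and blockSize arguments cannot apply to an existing
              // file, but the signature still requires them (issue 2 above).
              FSDataOutputStream out = fs.create(file,
                  FsPermission.getDefault(),
                  EnumSet.of(CreateFlag.APPEND),
                  4096,                         // bufferSize
                  (short) 3,                    // replication: misleading on append
                  64L * 1024 * 1024,            // blockSize: misleading on append
                  null);                        // no Progressable
              out.writeBytes("appended data\n");
              out.close();
            }
          }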


          Activity

          dhruba borthakur added a comment -

          I will take a first pass at it.

          Eli Collins added a comment -

          Is there anything left to do here? Looks like #1 and #2 are already addressed on trunk.

          Todd Lipcon added a comment -

          I disagree - I don't think these are addressed in trunk.

          #1) The APPEND flag seems to track through to startFileInternal in FSNamesystem, which, as Hairong mentioned, just converts the INode but does not properly pass back a LocatedBlock for the last block, or convert it to under-construction status.
          #2) There still don't seem to be any checks that prevent a user from passing a block size or replication factor when CreateFlag.APPEND is specified (see the sketch after this list).
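
          For illustration only, a guard of the kind Todd says is missing might look like the following. This is a hypothetical sketch, not actual FSNamesystem code; the method name and parameters are assumptions.

              import java.util.EnumSet;
              import org.apache.hadoop.fs.CreateFlag;

              // Hypothetical check: reject create-time parameters that cannot
              // apply when appending to an existing file.
              static void validateAppendArgs(EnumSet<CreateFlag> flags,
                  short replication, long blockSize,
                  short defaultReplication, long defaultBlockSize) {
                if (flags.contains(CreateFlag.APPEND)
                    && (replication != defaultReplication
                        || blockSize != defaultBlockSize)) {
                  throw new IllegalArgumentException(
                      "replication and blockSize cannot be changed when "
                      + "CreateFlag.APPEND is specified");
                }
              }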

          Eli Collins added a comment -

          Thanks for the clarification, Todd. I was looking at the FileSystem#append APIs; it looks like those didn't get removed when they were merged with create, and apparently create with append was checked in without a test?

          Tom White added a comment -

          So are we saying that append doesn't work when calling create() with the append flag, but it works when calling append()? For the 0.21 release we could either fix this (any volunteers?) or throw an unsupported exception for create with the append flag.
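
          For 0.21, the second option Tom mentions could be as simple as the sketch below. The guard's name, placement, and message are assumptions for illustration, not the committed fix.

              import java.util.EnumSet;
              import org.apache.hadoop.fs.CreateFlag;

              // Hypothetical guard: fail fast instead of silently mishandling
              // create() when the APPEND flag is present.
              static void rejectCreateWithAppend(EnumSet<CreateFlag> flags) {
                if (flags.contains(CreateFlag.APPEND)) {
                  throw new UnsupportedOperationException(
                      "create() with CreateFlag.APPEND is not supported; "
                      + "use FileSystem#append() instead");
                }
              }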

          Eli Collins added a comment -

          Updating fix version since this needs to be fixed on trunk as well.

          Eli Collins added a comment -

          Updated HADOOP-5438; it seems like we need to figure out the right create/append API before we put it in a release.

          Tom White added a comment -

          This is the HDFS portion of HADOOP-6826.

          dhruba borthakur added a comment -

          Code changes look fine.

          Tom White added a comment -

          I've committed this. (I ran test-patch before doing so, which passed.)

          Konstantin Shvachko added a comment -

          test-patch does not run tests. We should keep following common practice and go through the patch-available stage, shouldn't we?

          Tom White added a comment -

          Konstantin, this was one of those situations where the changes for this JIRA had to be applied at the same time as its parent HADOOP-5438, so it wasn't possible to have Hudson check this patch either before or after HADOOP-5438 was committed; I did it manually instead. You're right about the tests. I should have mentioned that I did run tests at the time I posted the patch - I'm re-running them now to be sure I didn't miss anything.

          Konstantin Shvachko added a comment -

          Thanks, Tom, for clarifying. There were no comments about running tests.


            People

            • Assignee:
              Tom White
            • Reporter:
              Hairong Kuang
            • Votes:
              0
            • Watchers:
              8
