Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1363

startFileInternal should return the last block of the file opened for append as an under-construction block

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      FSNamesystem.startFileInternal should convert the last block of the file opened for append to an under-construction block and return it. This will let remove the second synchronized section in FSNamesystem.appendFile() and avoid redundant computations and potential inconsistencies as stated in HDFS-1152.

      1. appendFileSync-0.21.patch
        8 kB
        Konstantin Shvachko
      2. appendFileSync.patch
        12 kB
        Konstantin Shvachko
      3. appendFileSync.patch
        11 kB
        Konstantin Shvachko

        Issue Links

          Activity

          Hide
          Konstantin Shvachko added a comment -

          The block conversion should be done in FSNamesystem.startFileInternal(), when the file itself is converted to an under-construction file.
          The last block of the file should be returned to FSNamesystem.appendFile(), which will make computation of the last block in appendFile() redundant.

          Show
          Konstantin Shvachko added a comment - The block conversion should be done in FSNamesystem.startFileInternal() , when the file itself is converted to an under-construction file. The last block of the file should be returned to FSNamesystem.appendFile() , which will make computation of the last block in appendFile() redundant.
          Hide
          Konstantin Shvachko added a comment -

          This is a patch for trunk.

          • It eliminates the second synchronized section in appendFile().
          • I moved all this logic related to generating LocatedBlock inside startFileInternal(),
          • which now can return a LocatedBlock if called from appendFile().
          • I reintroduce setBlockTokens() in FSNamesystem, otherwise I could not reuse getBlockLocation() in BlockManager.
          • The patch removes the part of the code responsible for cleaning up replicas of the last block from respective DataNodeDescriptors. This logic was introduced in HADOOP-5134 for append-20 in order to avoid replication of blocks under construction. We do not need this anymore as blocks under-construction are not replicated in current impl, and getBlockLocations() always returns expected locations for such blocks, ignoring the triplets.
          Show
          Konstantin Shvachko added a comment - This is a patch for trunk. It eliminates the second synchronized section in appendFile(). I moved all this logic related to generating LocatedBlock inside startFileInternal(), which now can return a LocatedBlock if called from appendFile(). I reintroduce setBlockTokens() in FSNamesystem, otherwise I could not reuse getBlockLocation() in BlockManager. The patch removes the part of the code responsible for cleaning up replicas of the last block from respective DataNodeDescriptors. This logic was introduced in HADOOP-5134 for append-20 in order to avoid replication of blocks under construction. We do not need this anymore as blocks under-construction are not replicated in current impl, and getBlockLocations() always returns expected locations for such blocks, ignoring the triplets.
          Hide
          Jakob Homan added a comment -

          Review:

          • BlockManager::createLocatedBlock(). Originally this method did more than just call the LocatedBlock constructor (back when it was part of FSNameSystem). Now, since this is all it's doing and is called just two places, maybe we can just remove it?
          • FSNamesystem.java:822. Javadoc for setBlockTokens refers to old, combined BATs, which we don't use any more.
          • Since the contract of startFileInternal has been changed (now may null or a LocatedBlock) and the method itself is quite convoluted, it would be good to spell this out in the method's javadoc.

          Otherwise, looks good as a refactor.

          Show
          Jakob Homan added a comment - Review: BlockManager::createLocatedBlock(). Originally this method did more than just call the LocatedBlock constructor (back when it was part of FSNameSystem). Now, since this is all it's doing and is called just two places, maybe we can just remove it? FSNamesystem.java:822. Javadoc for setBlockTokens refers to old, combined BATs, which we don't use any more. Since the contract of startFileInternal has been changed (now may null or a LocatedBlock) and the method itself is quite convoluted, it would be good to spell this out in the method's javadoc. Otherwise, looks good as a refactor.
          Hide
          Jakob Homan added a comment -

          Canceling patch post review. Also, looks like Hudson is AWOL again, so you may wish to run tests manually.

          Show
          Jakob Homan added a comment - Canceling patch post review. Also, looks like Hudson is AWOL again, so you may wish to run tests manually.
          Hide
          Konstantin Shvachko added a comment -

          Enhansed JavaDocs and eliminated createLocatedBlock() as Jakob suggested.
          The tests are running.

          Show
          Konstantin Shvachko added a comment - Enhansed JavaDocs and eliminated createLocatedBlock() as Jakob suggested. The tests are running.
          Hide
          Konstantin Shvachko added a comment -

          The tests ran fine, except the usual suspects: TestTrash and TestFileConcurrentReader.
          test-patch complains there are no new tests included. I find it hard to envision all possible ways of spliiting a synchronized section into two or more sections making it hard to test. And the main functionality is tested well by existing test cases.

          .     [exec] -1 overall.
                [exec]     +1 @author.  The patch does not contain any @author tags.
                [exec]     -1 tests included.  The patch doesn't appear to include any new or modified test
                [exec]                         Please justify why no new tests are needed for this patch.
                [exec]                         Also please list what manual steps were performed to verify this patch.
                [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
                [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
                [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
                [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
                [exec]     +1 system tests framework.  The patch passed system tests framework compile.
                [exec] ======================================================================
          
          Show
          Konstantin Shvachko added a comment - The tests ran fine, except the usual suspects: TestTrash and TestFileConcurrentReader. test-patch complains there are no new tests included. I find it hard to envision all possible ways of spliiting a synchronized section into two or more sections making it hard to test. And the main functionality is tested well by existing test cases. . [exec] -1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] -1 tests included. The patch doesn't appear to include any new or modified test [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] +1 system tests framework. The patch passed system tests framework compile. [exec] ======================================================================
          Hide
          Jakob Homan added a comment -

          +1.

          Show
          Jakob Homan added a comment - +1.
          Hide
          Konstantin Shvachko added a comment -

          A patch for 0.21 branch.

          Show
          Konstantin Shvachko added a comment - A patch for 0.21 branch.
          Hide
          Konstantin Shvachko added a comment -

          I just committed this.

          Show
          Konstantin Shvachko added a comment - I just committed this.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #388 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/388/)
          HDFS-1363. Eliminate second synchronized sections in appendFile(). Contributed by Konstantin Shvachko.

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #388 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/388/ ) HDFS-1363 . Eliminate second synchronized sections in appendFile(). Contributed by Konstantin Shvachko.

            People

            • Assignee:
              Konstantin Shvachko
              Reporter:
              Konstantin Shvachko
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development