Hadoop Common
  1. Hadoop Common
  2. HADOOP-5015

Separate block/replica management code from FSNamesystem

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently FSNamesystem contains a big amount of code that manages blocks and replicas. The code scatters in FSNamesystem and it is hard to read and maintain. It would be nice to move the code to a separate class called, for example, BlockManager.

      1. blkmanager.patch
        135 kB
        Suresh Srinivas
      2. blkmanager.patch
        136 kB
        Suresh Srinivas

        Issue Links

          Activity

          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Patch Available Patch Available
          111d 19h 31m 1 Suresh Srinivas 04/May/09 19:13
          Patch Available Patch Available Resolved Resolved
          2d 3h 21m 1 Tsz Wo Nicholas Sze 06/May/09 22:34
          Resolved Resolved Closed Closed
          474d 22h 1 Tom White 24/Aug/10 20:34
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Hide
          Suresh Srinivas added a comment -

          Created 5782 to revert some of the formatting changes

          Show
          Suresh Srinivas added a comment - Created 5782 to revert some of the formatting changes
          Suresh Srinivas made changes -
          Link This issue blocks HADOOP-5782 [ HADOOP-5782 ]
          Hide
          Hudson added a comment -

          Integrated in Hadoop-trunk #828 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/828/)
          . Separate block management code from FSNamesystem. Contributed by Suresh Srinivas

          Show
          Hudson added a comment - Integrated in Hadoop-trunk #828 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/828/ ) . Separate block management code from FSNamesystem. Contributed by Suresh Srinivas
          Hide
          Raghu Angadi added a comment - - edited

          > To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java.

          There are a lot of formatting changes that break existing patches for HDFS :

          e.g. segment (ideally it should have been a clean cut-n-paste) :

               synchronized (neededReplications) {
          -      out.println("Metasave: Blocks waiting for replication: " +
          -                  neededReplications.size());
          +      out.println("Metasave: Blocks waiting for replication: "
          +          + neededReplications.size());
                 for (Block block : neededReplications) {
          -        List<DatanodeDescriptor> containingNodes =
          -                                          new ArrayList<DatanodeDescriptor>();
          +        List<DatanodeDescriptor> containingNodes = new ArrayList<DatanodeDescriptor>();
                   NumberReplicas numReplicas = new NumberReplicas();
                   // source node returned is not used
                   chooseSourceDatanode(block, containingNodes, numReplicas);
          -        int usableReplicas = numReplicas.liveReplicas() +
          -                             numReplicas.decommissionedReplicas();
          +        int usableReplicas = numReplicas.liveReplicas()
          +            + numReplicas.decommissionedReplicas();
                   // l: == live:, d: == decommissioned c: == corrupt e: == excess
          -        out.print(block + " (replicas:" +
          -                  " l: " + numReplicas.liveReplicas() +
          -                  " d: " + numReplicas.decommissionedReplicas() +
          -                  " c: " + numReplicas.corruptReplicas() +
          -                  " e: " + numReplicas.excessReplicas() +
          -                  ((usableReplicas > 0)? "" : " MISSING") + ")");
          +        out.print(block + " (replicas:" + " l: " + numReplicas.liveReplicas()
          +            + " d: " + numReplicas.decommissionedReplicas() + " c: "
          +            + numReplicas.corruptReplicas() + " e: "
          +            + numReplicas.excessReplicas()
          +            + ((usableReplicas > 0) ? "" : " MISSING") + ")");
           
          -        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block);
          -             jt.hasNext();) {
          +        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block); jt
          +            .hasNext();) {
                     DatanodeDescriptor node = jt.next();
                     out.print(" " + node + " : ");
                   }
                   out.println("");
          

          Without such changes, it would have been much simpler to port the patches (just by changing the file name in patch file.

          Show
          Raghu Angadi added a comment - - edited > To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java. There are a lot of formatting changes that break existing patches for HDFS : e.g. segment (ideally it should have been a clean cut-n-paste) : synchronized (neededReplications) { - out.println("Metasave: Blocks waiting for replication: " + - neededReplications.size()); + out.println("Metasave: Blocks waiting for replication: " + + neededReplications.size()); for (Block block : neededReplications) { - List<DatanodeDescriptor> containingNodes = - new ArrayList<DatanodeDescriptor>(); + List<DatanodeDescriptor> containingNodes = new ArrayList<DatanodeDescriptor>(); NumberReplicas numReplicas = new NumberReplicas(); // source node returned is not used chooseSourceDatanode(block, containingNodes, numReplicas); - int usableReplicas = numReplicas.liveReplicas() + - numReplicas.decommissionedReplicas(); + int usableReplicas = numReplicas.liveReplicas() + + numReplicas.decommissionedReplicas(); // l: == live:, d: == decommissioned c: == corrupt e: == excess - out.print(block + " (replicas:" + - " l: " + numReplicas.liveReplicas() + - " d: " + numReplicas.decommissionedReplicas() + - " c: " + numReplicas.corruptReplicas() + - " e: " + numReplicas.excessReplicas() + - ((usableReplicas > 0)? "" : " MISSING") + ")"); + out.print(block + " (replicas:" + " l: " + numReplicas.liveReplicas() + + " d: " + numReplicas.decommissionedReplicas() + " c: " + + numReplicas.corruptReplicas() + " e: " + + numReplicas.excessReplicas() + + ((usableReplicas > 0) ? "" : " MISSING") + ")"); - for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block); - jt.hasNext();) { + for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block); jt + .hasNext();) { DatanodeDescriptor node = jt.next(); out.print(" " + node + " : "); } out.println(""); Without such changes, it would have been much simpler to port the patches (just by changing the file name in patch file.
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Suresh!

          Show
          Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, Suresh!
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          I will commit this later today.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 patch looks good. I will commit this later today.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407169/blkmanager.patch
          against trunk revision 771661.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 18 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407169/blkmanager.patch against trunk revision 771661. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/console This message is automatically generated.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The code refactoriing is good. It helps a lot in code readability.

          Show
          Tsz Wo Nicholas Sze added a comment - The code refactoriing is good. It helps a lot in code readability.
          Tsz Wo Nicholas Sze made changes -
          Link This issue incorporates HADOOP-5574 [ HADOOP-5574 ]
          Suresh Srinivas made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop Flags [Reviewed]
          Suresh Srinivas made changes -
          Attachment blkmanager.patch [ 12407169 ]
          Hide
          Suresh Srinivas added a comment -

          Incorporated all the comments in the attached patch, except the following:

          1. Not moving generationStamp to blockManager as it is used in FSNamesystem and updated during file creation
          2. blockInvalidateLimit can be renamed in a separate change
          3. replicationRecheckInterval should be moved into ReplicationMonitor in a separate change
          Show
          Suresh Srinivas added a comment - Incorporated all the comments in the attached patch, except the following: Not moving generationStamp to blockManager as it is used in FSNamesystem and updated during file creation blockInvalidateLimit can be renamed in a separate change replicationRecheckInterval should be moved into ReplicationMonitor in a separate change
          Hide
          dhruba borthakur added a comment -

          This patch is not related to HADOOP-3799. This patch attempts to separate out the block manager code from the namespace manager code whereas HADOOP-3799 makes the replica-placement pluggable.

          Show
          dhruba borthakur added a comment - This patch is not related to HADOOP-3799 . This patch attempts to separate out the block manager code from the namespace manager code whereas HADOOP-3799 makes the replica-placement pluggable.
          Hide
          Jeff Hammerbacher added a comment -

          Hey,

          Could someone with a deeper knowledge of HDFS internals comment on what work would need to be done after this refactoring to complete https://issues.apache.org/jira/browse/HADOOP-3799? Any insight here would be much appreciated!

          Later,
          Jeff

          Show
          Jeff Hammerbacher added a comment - Hey, Could someone with a deeper knowledge of HDFS internals comment on what work would need to be done after this refactoring to complete https://issues.apache.org/jira/browse/HADOOP-3799? Any insight here would be much appreciated! Later, Jeff
          Hide
          dhruba borthakur added a comment -

          This refactoring looks great!

          Show
          dhruba borthakur added a comment - This refactoring looks great!
          Konstantin Shvachko made changes -
          Link This issue is part of HADOOP-4563 [ HADOOP-4563 ]
          Konstantin Shvachko made changes -
          Summary Seprate block/replica management code from FSNamesystem Separate block/replica management code from FSNamesystem
          Hide
          Konstantin Shvachko added a comment -

          This is a nice refactoring. A few comments:

          1. FSDirectory should have a method getBlockManager(). This will let you avoid unnecessary transient methods in FSNamesystem that only exist to call the respective BlockManager methods.
          2. BlocksWithLocations getBlocks() should remain in FSNamesystem. I think BlockManager should not be exposed to external types like BlocksWithLocations. Besides getBlocks() does not work with any intrinsic fields of BlockManager.
          3. It seems to me that ReplicationTargetChooser replicator should be also moved into BlockManager.
          4. I am not sure about the following fields, which may be related to block management as well, but the patch left them inside FSNamesystem:
            • generationStamp seem to be related
            • blockInvalidateLimit looks like unrelated, we may later want to rename it to datanodeInvalidateLimit.
            • ReplicationMonitor along with replicationRecheckInterval - probably related.
          Show
          Konstantin Shvachko added a comment - This is a nice refactoring. A few comments: FSDirectory should have a method getBlockManager() . This will let you avoid unnecessary transient methods in FSNamesystem that only exist to call the respective BlockManager methods. BlocksWithLocations getBlocks() should remain in FSNamesystem . I think BlockManager should not be exposed to external types like BlocksWithLocations . Besides getBlocks() does not work with any intrinsic fields of BlockManager . It seems to me that ReplicationTargetChooser replicator should be also moved into BlockManager . I am not sure about the following fields, which may be related to block management as well, but the patch left them inside FSNamesystem : generationStamp seem to be related blockInvalidateLimit looks like unrelated, we may later want to rename it to datanodeInvalidateLimit . ReplicationMonitor along with replicationRecheckInterval - probably related.
          Suresh Srinivas made changes -
          Assignee Suresh Srinivas [ sureshms ]
          Suresh Srinivas made changes -
          Attachment blkmanager.patch [ 12407044 ]
          Hide
          Suresh Srinivas added a comment -

          As a first step towards separating out block management functionality from FSNamesystem.java, I have introduced a new class BlockManager.java. This new class is to be only used by FSNamesystem, using the synchronization as it exists today. To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java. After this change, we could have more iterations to organize the code better with in BlockManager.java.

          Show
          Suresh Srinivas added a comment - As a first step towards separating out block management functionality from FSNamesystem.java, I have introduced a new class BlockManager.java. This new class is to be only used by FSNamesystem , using the synchronization as it exists today. To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java. After this change, we could have more iterations to organize the code better with in BlockManager.java.
          Hide
          Hong Tang added a comment -

          My wish list:

          • Blocks are addressed through numeric IDs (160-bit or even longer, ID never recycled).
          • Inode Block (a block that contains the Block IDs of a file's data blocks.)
          • NN maintains mapping of path => Inode Block ID.
          Show
          Hong Tang added a comment - My wish list: Blocks are addressed through numeric IDs (160-bit or even longer, ID never recycled). Inode Block (a block that contains the Block IDs of a file's data blocks.) NN maintains mapping of path => Inode Block ID.
          Hairong Kuang made changes -
          Field Original Value New Value
          Link This issue incorporates HADOOP-5016 [ HADOOP-5016 ]
          Hide
          dhruba borthakur added a comment -

          This will be the first step to maybe run the BlockManager separately from the namespace manager! That will be pretty cool!

          Show
          dhruba borthakur added a comment - This will be the first step to maybe run the BlockManager separately from the namespace manager! That will be pretty cool!
          Hairong Kuang created issue -

            People

            • Assignee:
              Suresh Srinivas
              Reporter:
              Hairong Kuang
            • Votes:
              1 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development