Hadoop HDFS
HDFS-1366

reduce namenode startup time by optimising checkBlockInfo while loading fsimage

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      The namenode spends about 10 minutes reading a 14 GB fsimage file into memory and creating all the in-memory data structures. A jstack-based profiler clearly shows that most of the time during the fsimage load is spent in BlocksMap.checkBlockInfo. There is an easy way to optimize this method, especially for this code path.

        Activity

        Matt Foley added a comment -

        The code has changed since this ticket was opened. In March I did some experiments, and at that time there was no longer a BlocksMap.checkBlockInfo() method, and the call sequence was:

        FSImage.loadFSImage()
          FSImageFormat.Loader.load()
            FSImageFormat.Loader.loadFullNameINodes()
              FSDirectory.addToParent()
                BlockManager.addINode()
                  BlocksMap.addINode()
        

        BlocksMap.addINode() did this:

          BlockInfo addINode(BlockInfo b, INodeFile iNode) {
            BlockInfo info = blocks.get(b);
            if (info != b) {
              info = b;
              blocks.put(info);
            }
            info.setINode(iNode);
            return info;
          }
        

        which could be replaced by

          BlockInfo addINode(BlockInfo b, INodeFile iNode) {
            blocks.put(b);
            b.setINode(iNode);
            return b;
          }
        

        Calling blocks.get() before conditionally calling blocks.put() in this way is a waste regardless of whether we are reading the FSImage or calling addINode() for any other purpose, because the costs of put and get are about the same, and the result of just calling put is identical to that of the above code. However, when I put this into a simple proof-of-principle patch (attached - not ready for prime time) and tried it, I only got a 6% improvement in FSImage load time.
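        The equivalence claim above can be checked with a small stand-alone mock. All names below are hypothetical stand-ins, not the real HDFS classes, and a HashMap keyed by block id stands in for the real LightWeightGSet; under those assumptions, both variants leave the map in the same state and return the same linked block:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mock of the blocks map and its entries, used only to
// illustrate that the two addINode variants produce identical results.
public class AddINodeEquivalence {
    static class INodeFile { }

    static class BlockInfo {
        final long blockId;
        INodeFile inode;
        BlockInfo(long blockId) { this.blockId = blockId; }
        void setINode(INodeFile f) { this.inode = f; }
    }

    // get-then-put, mirroring the original BlocksMap.addINode()
    static BlockInfo addINodeOld(Map<Long, BlockInfo> blocks, BlockInfo b, INodeFile f) {
        BlockInfo info = blocks.get(b.blockId);
        if (info != b) {
            info = b;
            blocks.put(info.blockId, info);
        }
        info.setINode(f);
        return info;
    }

    // unconditional put, mirroring the proposed replacement
    static BlockInfo addINodeNew(Map<Long, BlockInfo> blocks, BlockInfo b, INodeFile f) {
        blocks.put(b.blockId, b);
        b.setINode(f);
        return b;
    }

    public static void main(String[] args) {
        Map<Long, BlockInfo> m1 = new HashMap<>();
        Map<Long, BlockInfo> m2 = new HashMap<>();
        INodeFile file = new INodeFile();
        boolean sameResults = true;
        for (long id = 0; id < 1000; id++) {
            BlockInfo r1 = addINodeOld(m1, new BlockInfo(id), file);
            BlockInfo r2 = addINodeNew(m2, new BlockInfo(id), file);
            // both variants must map the inserted block and link it to the inode
            sameResults &= (m1.get(id) == r1) && (m2.get(id) == r2)
                        && (r1.inode == file) && (r2.inode == file);
        }
        System.out.println(sameResults && m1.size() == m2.size());
    }
}
```

        The only behavioral difference is a redundant no-op put when the identical object is re-added, which is why skipping the get changes nothing observable.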

        dhruba borthakur added a comment -
        
        "main" prio=10 tid=0x0000000040114800 nid=0x3f15 runnable [0x000000004022a000]
           java.lang.Thread.State: RUNNABLE
                at org.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo.getNext(BlocksMap.java:277)
                at org.apache.hadoop.hdfs.util.LightWeightGSet.get(LightWeightGSet.java:121)
                at org.apache.hadoop.hdfs.server.namenode.BlocksMap.checkBlockInfo(BlocksMap.java:353)
                at org.apache.hadoop.hdfs.server.namenode.BlocksMap.addINode(BlocksMap.java:370)
                at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addToParent(FSDirectory.java:319)
                at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:956)
                at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:813)
                at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:370)
                at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:141)
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:328)
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:305)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:254)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:332)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1120)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1129)
        
        

        The idea here is that when the fsimage is being loaded, BlocksMap.checkBlockInfo() will never find the block in the blocksMap, i.e. the call to blocks.get() will always return null.
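        That observation can be illustrated with a minimal sketch, assuming a plain HashMap as a stand-in for the real blocks GSet (class and variable names here are hypothetical): during an image load the map starts empty and each block id is inserted exactly once, so the get() preceding the put() can never succeed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mock of the fsimage-load path: an initially empty blocks map
// into which every block id is inserted exactly once.
public class LoadPathLookup {
    public static void main(String[] args) {
        Map<Long, Object> blocks = new HashMap<>();
        int hits = 0;
        for (long blockId = 0; blockId < 100_000; blockId++) {
            if (blocks.get(blockId) != null) {  // the lookup checkBlockInfo() performs
                hits++;                          // never reached during a load
            }
            blocks.put(blockId, new Object());
        }
        System.out.println(hits);  // every get() returned null
    }
}
```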


          People

          • Assignee:
            Unassigned
          • Reporter:
            dhruba borthakur
          • Votes:
            0
          • Watchers:
            6

            Dates

            • Created:
              Updated:
              Resolved:
