Hadoop HDFS / HDFS-13671

Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0, 3.0.3
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Target Version/s:

      Description

      The NameNode hangs when deleting a large number of files/blocks. Stack trace:

      "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 tid=0x00007fb505b27800 nid=0x94c3 runnable [0x00007fa861361000]
         java.lang.Thread.State: RUNNABLE
      	at org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
      	at org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
      	at org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
      	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
      	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
      	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
      	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
      

      The current deletion logic in the NameNode has two main steps:

      • Collect the INodes and all blocks to be deleted, then delete the INodes.
      • Remove the blocks chunk by chunk in a loop.

      The first step should be the more expensive operation and take more time. However, we now consistently see the NN hang during the block removal step.
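
The second step can be illustrated with a minimal sketch. This is not the NameNode code: the class name, the chunk size of 1000, the use of a write lock per chunk, and `java.util.TreeSet` as a stand-in for the per-storage block set are all assumptions made for illustration. It only shows the shape of the loop, where every single removal inside a chunk pays the cost of the underlying set's `remove`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ChunkedBlockRemoval {
    // Hypothetical chunk size; the real value is not stated in this issue.
    static final int CHUNK_SIZE = 1000;

    // Assumed: a namesystem-style lock that is taken once per chunk.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Stand-in for the per-storage block set (java.util.TreeSet, not FoldedTreeSet).
    private final Set<Long> storedBlocks = new TreeSet<>();

    public void addBlock(long id) {
        storedBlocks.add(id);
    }

    public int size() {
        return storedBlocks.size();
    }

    /** Removes the given blocks chunk by chunk and returns the number of chunks processed. */
    public int removeBlocks(List<Long> toDelete) {
        Iterator<Long> it = toDelete.iterator();
        int chunks = 0;
        while (it.hasNext()) {
            lock.writeLock().lock();
            try {
                // Each remove() here is a lookup plus any rebalancing in the sorted set.
                for (int i = 0; i < CHUNK_SIZE && it.hasNext(); i++) {
                    storedBlocks.remove(it.next());
                }
            } finally {
                // Release between chunks so other operations can make progress.
                lock.writeLock().unlock();
            }
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        ChunkedBlockRemoval ns = new ChunkedBlockRemoval();
        List<Long> toDelete = new ArrayList<>();
        for (long id = 0; id < 2500; id++) {
            ns.addBlock(id);
            toDelete.add(id);
        }
        int chunks = ns.removeBlocks(toDelete);
        System.out.println("chunks=" + chunks + " remaining=" + ns.size());
    }
}
```

Under these assumptions, the time each chunk holds the lock is proportional to the per-block removal cost, which is why a slow `remove` in the block set shows up as a NameNode hang.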

      Looking into this, FoldedTreeSet was introduced as a new structure to improve performance when processing FBRs/IBRs (full and incremental block reports). But compared with the earlier implementation of the remove-block logic, FoldedTreeSet appears to be slower, since it takes additional time to rebalance tree nodes. When a large number of blocks must be removed, this cost becomes significant.
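
To make the cost difference concrete, here is a rough sketch comparing removal from a sorted tree with removal from a linked structure. It rests on two assumptions: `java.util.TreeSet` (a red-black tree) stands in for FoldedTreeSet, whose internals differ, and the earlier implementation is modeled as an intrusive doubly linked list in which a block can unlink itself without a search. All class and method names here are hypothetical.

```java
import java.util.Comparator;
import java.util.TreeSet;

class RemovalCostSketch {
    /** Counts comparator calls made while removing every id from a sorted tree set. */
    static long comparisonsForTreeRemoval(int n) {
        long[] calls = {0};
        Comparator<Long> counting = (a, b) -> {
            calls[0]++;
            return Long.compare(a, b);
        };
        TreeSet<Long> set = new TreeSet<>(counting);
        for (long id = 0; id < n; id++) {
            set.add(id);
        }
        calls[0] = 0; // count only the comparisons made during removal
        for (long id = 0; id < n; id++) {
            set.remove(id); // lookup by comparison, then structural rebalancing
        }
        return calls[0];
    }

    /** Node of a minimal intrusive doubly linked list. */
    static final class Node {
        final long id;
        Node prev;
        Node next;

        Node(long id) {
            this.id = id;
        }
    }

    static final class BlockList {
        Node head;
        int size;

        Node addFirst(long id) {
            Node n = new Node(id);
            n.next = head;
            if (head != null) {
                head.prev = n;
            }
            head = n;
            size++;
            return n;
        }

        /** Unlinks a node in constant time: no search and no comparisons. */
        void unlink(Node n) {
            if (n.prev != null) {
                n.prev.next = n.next;
            } else {
                head = n.next;
            }
            if (n.next != null) {
                n.next.prev = n.prev;
            }
            n.prev = null;
            n.next = null;
            size--;
        }
    }

    /** Adds n nodes, unlinks each through its own reference, and returns the remaining size. */
    static int remainingAfterListRemoval(int n) {
        BlockList list = new BlockList();
        Node[] nodes = new Node[n];
        for (int i = 0; i < n; i++) {
            nodes[i] = list.addFirst(i);
        }
        for (Node node : nodes) {
            list.unlink(node);
        }
        return list.size;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("tree comparisons during removal: " + comparisonsForTreeRemoval(n));
        System.out.println("list entries left after unlinking: " + remainingAfterListRemoval(n));
    }
}
```

The tree pays at least one comparison per removal, plus rebalancing, while the linked list pays none. This matches the `FoldedTreeSet.compare` frame at the top of the stack trace, but it is only an illustration of the trade-off, not a measurement of FoldedTreeSet itself.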

      For get-type operations, DatanodeStorageInfo only provides getBlockIterator, which returns a block iterator; there is no other get operation for a specified block. Do we still need to use FoldedTreeSet in DatanodeStorageInfo? As far as we know, FoldedTreeSet benefits get operations rather than updates. Maybe we can revert this to the earlier implementation.

              People

              • Assignee: Unassigned
              • Reporter: Yiqun Lin (linyiqun)
              • Votes: 5
              • Watchers: 59
