HDFS-7470: SecondaryNameNode needs twice the memory when calling reloadFromImageFile

Details

    • Reviewed

    Description

      Heap histogram at 2014-12-02 01:19:

      num #instances #bytes class name
      ----------------------------------------------
      1: 186449630 19326123016 [Ljava.lang.Object;
      2: 157366649 15107198304 org.apache.hadoop.hdfs.server.namenode.INodeFile
      3: 183409030 11738177920 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo
      4: 157358401 5244264024 [Lorg.apache.hadoop.hdfs.server.blockmanagement.BlockInfo;
      5: 3 3489661000 [Lorg.apache.hadoop.util.LightWeightGSet$LinkedElement;
      6: 29253275 1872719664 [B
      7: 3230821 284312248 org.apache.hadoop.hdfs.server.namenode.INodeDirectory
      8: 2756284 110251360 java.util.ArrayList
      9: 469158 22519584 org.apache.hadoop.fs.permission.AclEntry
      10: 847 17133032 [Ljava.util.HashMap$Entry;
      11: 188471 17059632 [C
      12: 314614 10067656 [Lorg.apache.hadoop.hdfs.server.namenode.INode$Feature;
      13: 234579 9383160 com.google.common.collect.RegularImmutableList
      14: 49584 6850280 <constMethodKlass>
      15: 49584 6356704 <methodKlass>
      16: 187270 5992640 java.lang.String
      17: 234579 5629896 org.apache.hadoop.hdfs.server.namenode.AclFeature

      Heap histogram at 2014-12-02 01:32:

      num #instances #bytes class name
      ----------------------------------------------
      1: 355838051 35566651032 [Ljava.lang.Object;
      2: 302272758 29018184768 org.apache.hadoop.hdfs.server.namenode.INodeFile
      3: 352500723 22560046272 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo
      4: 302264510 10075087952 [Lorg.apache.hadoop.hdfs.server.blockmanagement.BlockInfo;
      5: 177120233 9374983920 [B
      6: 3 3489661000 [Lorg.apache.hadoop.util.LightWeightGSet$LinkedElement;
      7: 6191688 544868544 org.apache.hadoop.hdfs.server.namenode.INodeDirectory
      8: 2799256 111970240 java.util.ArrayList
      9: 890728 42754944 org.apache.hadoop.fs.permission.AclEntry
      10: 330986 29974408 [C
      11: 596871 19099880 [Lorg.apache.hadoop.hdfs.server.namenode.INode$Feature;
      12: 445364 17814560 com.google.common.collect.RegularImmutableList
      13: 844 17132816 [Ljava.util.HashMap$Entry;
      14: 445364 10688736 org.apache.hadoop.hdfs.server.namenode.AclFeature
      15: 329789 10553248 java.lang.String
      16: 91741 8807136 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
      17: 49584 6850280 <constMethodKlass>
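To make the growth between the two histograms easier to read, the instance counts can be compared directly. The snippet below is only a small arithmetic sketch: the class name `HistoGrowth` and the helper `ratio` are made up for illustration, and the counts are copied from the 01:19 and 01:32 histograms above.

```java
/** Hypothetical helper for comparing the two histograms; not part of Hadoop. */
final class HistoGrowth {

    /** Growth factor between two instance counts. */
    static double ratio(long before, long after) {
        return (double) after / before;
    }

    public static void main(String[] args) {
        // Instance counts taken from the 01:19 and 01:32 histograms.
        System.out.printf("INodeFile:      %.2fx%n", ratio(157366649L, 302272758L));
        System.out.printf("BlockInfo:      %.2fx%n", ratio(183409030L, 352500723L));
        System.out.printf("INodeDirectory: %.2fx%n", ratio(3230821L, 6191688L));
    }
}
```

All three namespace classes grow by a factor a little above 1.9 in thirteen minutes, which is consistent with a second copy of the namespace being built while the first is still reachable.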

      And the stack trace shows it was doing reloadFromImageFile:

      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getInode(FSDirectory.java:2426)
      at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectorySection(FSImageFormatPBINode.java:160)
      at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:243)
      at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
      at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:121)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:902)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:888)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.reloadFromImageFile(FSImage.java:562)
      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:1048)
      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:536)
      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:388)
      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:354)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:356)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1630)
      at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:413)
      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:350)
      at java.lang.Thread.run(Thread.java:745)

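      Both the old and the newly loaded namespace appear to be held in memory at the same time. So before calling reloadFromImageFile, I think we need to release the old namesystem to keep the SecondaryNameNode from running out of memory.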
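The idea of releasing the old state before reloading can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the attached patch and not Hadoop code: `ReloadSketch`, its `namespace` map and the `reload` method are hypothetical stand-ins, and the method counts reachable entries instead of measuring real heap usage.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a checkpointing node; not Hadoop code. */
final class ReloadSketch {

    // Stand-in for the in-memory namespace (inode id -> name).
    private Map<Long, String> namespace = new HashMap<>();

    /**
     * Rebuilds the namespace with the given number of entries and returns the
     * peak number of entries reachable at once (old map plus new map).
     */
    int reload(int inodes, boolean releaseOldFirst) {
        if (releaseOldFirst) {
            // Drop the old state before loading so it can be garbage collected.
            namespace = new HashMap<>();
        }
        Map<Long, String> fresh = new HashMap<>();
        int peak = namespace.size();
        for (long id = 0; id < inodes; id++) {
            fresh.put(id, "inode-" + id);
            peak = Math.max(peak, namespace.size() + fresh.size());
        }
        // Without the early release, the old map only becomes unreachable here.
        namespace = fresh;
        return peak;
    }

    public static void main(String[] args) {
        ReloadSketch node = new ReloadSketch();
        node.reload(1000, false);                     // initial load
        System.out.println(node.reload(1000, false)); // old and new coexist
        System.out.println(node.reload(1000, true));  // old released first
    }
}
```

In this model the reload that keeps the old map reports twice as many reachable entries as the reload that releases it first, which mirrors the doubling seen in the histograms.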

      Attachments

        1. HDFS-7470.1.patch
          3 kB
          yunjiong zhao
        2. HDFS-7470.2.patch
          2 kB
          yunjiong zhao
        3. HDFS-7470.patch
          2 kB
          yunjiong zhao
        4. secondaryNameNode.jstack.txt
          14 kB
          yunjiong zhao

          People

            zhaoyunjiong yunjiong zhao