[HDFS-12042] Lazy initialize AbstractINodeDiffList#diffs for snapshots to reduce memory consumption - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

When a snapshot diff operation is performed on a NameNode that manages several million HDFS files and directories, the NameNode needs a lot of memory. Some of that memory is wasted on suboptimal data structures, such as empty or under-populated ArrayLists. Analyzing one heap dump with jxray (www.jxray.com), we observed the following problems with data structures:

9. BAD COLLECTIONS

Total collections: 99,707,902  Bad collections: 88,799,760  Overhead: 9,063,898K (18.2%)

Top bad collections:
Ovhd                 Problem    Num objs   Type
----------------------------------------------------------
3,056,014K (6.1%)    small      29435572   j.u.ArrayList
2,641,373K (5.3%)    1-elem     21837906   j.u.ArrayList
  864,215K (1.7%)    1-elem      5291813   j.u.TreeSet
  808,456K (1.6%)    1-elem      3045847   j.u.HashMap
  602,470K (1.2%)    empty      18549109   j.u.ArrayList
  441,563K (0.9%)    empty       4356975   j.u.TreeSet
  373,088K (0.7%)    empty       5297007   j.u.HashMap
  270,324K (0.5%)    small        931394   j.u.HashMap

The data structures created by HDFS code that suffer from the above problems are, in particular:

  4,228,182K (8.5%): j.u.ArrayList: 19412263 of small 2,111,087K (4.2%), 12932408 of 1-elem 1,717,585K (3.4%), 12784310 of empty 399,509K (0.8%)
     <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0 <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER

and

  575,557K (1.2%): j.u.ArrayList: 4363271 of 1-elem 409,056K (0.8%), 2439001 of small 166,482K (0.3%)
     <-- org.apache.hadoop.hdfs.server.namenode.INodeDirectory.children <-- org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- org.apache.hadoop.util.LightWeightGSet.entries <-- org.apache.hadoop.hdfs.server.namenode.INodeMap.map <-- org.apache.hadoop.hdfs.server.namenode.FSDirectory.inodeMap <-- org.apache.hadoop.hdfs.server.namenode.FSNamesystem.dir <-- org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeResourceMonitor.this$0 <-- org.apache.hadoop.util.Daemon.target <-- org.apache.hadoop.hdfs.server.namenode.FSDirectory.inodeMap <-- org.apache.hadoop.hdfs.server.namenode.FSNamesystem.dir <-- org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeResourceMonitor.this$0 <-- org.apache.hadoop.util.Daemon.target <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER

There are several different reference chains that all lead to FileDiffList.diffs or INodeDirectory.children. In the analyzed dump, these data structures waste about 12% of memory in total. By creating these lists lazily and/or with a capacity that better matches their actual size, we should be able to reclaim a significant part of this 12%.
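To illustrate the idea of lazy creation with a right-sized capacity, here is a minimal sketch. It is not the code from the attached patches: the class name LazyDiffList, the DEFAULT_CAPACITY value of 4, and the isAllocated() helper are hypothetical and chosen only for this example. The backing list stays null until the first element is added, so an owner with no diffs pays for a single reference field instead of an ArrayList object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical illustration of lazy list allocation.
 * This is NOT the actual AbstractINodeDiffList implementation.
 */
class LazyDiffList<T> {
  // Assumed value for illustration; a no-arg ArrayList grows to
  // capacity 10 on its first add, which is wasteful for 1-element lists.
  private static final int DEFAULT_CAPACITY = 4;

  // Null until the first add(), so an empty list allocates nothing.
  private List<T> diffs;

  /** Returns the backing list, or a shared immutable empty list if none exists. */
  List<T> asList() {
    return diffs != null ? diffs : Collections.<T>emptyList();
  }

  /** Allocates the backing list on first use, with a small initial capacity. */
  void add(T diff) {
    if (diffs == null) {
      diffs = new ArrayList<>(DEFAULT_CAPACITY);
    }
    diffs.add(diff);
  }

  /** Removes an element and drops the backing list once it becomes empty. */
  boolean remove(T diff) {
    if (diffs == null) {
      return false;
    }
    boolean removed = diffs.remove(diff);
    if (diffs.isEmpty()) {
      diffs = null; // let the empty ArrayList be garbage collected
    }
    return removed;
  }

  int size() {
    return diffs == null ? 0 : diffs.size();
  }

  /** Exposed only so the example can show when allocation happens. */
  boolean isAllocated() {
    return diffs != null;
  }
}
```

The trade-off in a design like this is that every accessor needs a null check, in exchange for not keeping millions of empty or oversized ArrayList instances alive.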

Attachments

HDFS-12042.01.patch   26/Jun/17 22:33   6 kB   Misha Dmitriev
HDFS-12042.02.patch   27/Jun/17 06:10   8 kB   Misha Dmitriev
HDFS-12042.03.patch   27/Jun/17 19:13   8 kB   Misha Dmitriev
HDFS-12042.04.patch   28/Jun/17 23:03   8 kB   Misha Dmitriev

Issue Links

is related to

HDFS-11881 NameNode consumes a lot of memory for snapshot diff report generation (Resolved)

People

Assignee: Misha Dmitriev

Reporter: Misha Dmitriev

Votes: 0

Watchers: 12

Dates

Created: 26/Jun/17 20:18

Updated: 02/Oct/19 17:13

Resolved: 30/Jun/17 18:51