Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Hadoop Flags: Reviewed
Description
The following approach can cut the memory footprint of each block replica in DataNode memory by about 45%. (This JIRA covers only ReplicaMap in the DataNode.) The details are:
In ReplicaMap,
private final Map<String, Map<Long, ReplicaInfo>> map = new HashMap<String, Map<Long, ReplicaInfo>>();
Currently the inner HashMap, Map<Long, ReplicaInfo>, stores the replicas in memory. Its key is the block id of the block replica, which ReplicaInfo already contains, so the memory spent on the key can be saved. Each HashMap Entry also carries object overhead. We can implement a lightweight set that is similar to LightWeightGSet but not of fixed size. (LightWeightGSet uses a fixed-size entries array, usually a large one, as in BlocksMap; because it never needs to resize, this avoids full GC.) The set should also support looking up an element by its key.
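The idea above can be sketched as a chained hash set whose elements carry their own key and "next" link. This is only an illustration under stated assumptions, not the patch attached to this issue: LinkedElement, Replica, and ResizableLightSet are hypothetical names, and the initial capacity of 16 and the 3/4 load factor are arbitrary choices.

```java
// Hypothetical sketch; none of these names come from the Hadoop code base.

/** An element that stores its own key and an intrusive "next" link. */
interface LinkedElement {
    long getKey();                    // e.g. the block id the replica already holds
    LinkedElement getNext();
    void setNext(LinkedElement next);
}

/** Stand-in for ReplicaInfo: the block id doubles as the lookup key. */
final class Replica implements LinkedElement {
    private final long blockId;
    private LinkedElement next;       // the single extra reference per replica

    Replica(long blockId) { this.blockId = blockId; }
    public long getKey() { return blockId; }
    public LinkedElement getNext() { return next; }
    public void setNext(LinkedElement next) { this.next = next; }
}

/** Chained hash set that grows its bucket array and supports lookup by key. */
final class ResizableLightSet<E extends LinkedElement> {
    private LinkedElement[] buckets = new LinkedElement[16]; // length stays a power of two
    private int size;

    private static int index(long key, int length) {
        int h = Long.hashCode(key);
        h ^= (h >>> 16);              // mix high bits into the low bits
        return h & (length - 1);
    }

    /** Returns the element with the given key, or null if absent. */
    @SuppressWarnings("unchecked")
    E get(long key) {
        for (LinkedElement e = buckets[index(key, buckets.length)]; e != null; e = e.getNext()) {
            if (e.getKey() == key) return (E) e;
        }
        return null;
    }

    /** Inserts or replaces; returns the previous element with the same key, or null. */
    E put(E element) {
        E previous = remove(element.getKey());
        if (size + 1 > buckets.length * 3 / 4) resize(buckets.length * 2);
        int i = index(element.getKey(), buckets.length);
        element.setNext(buckets[i]);  // push onto the head of the bucket chain
        buckets[i] = element;
        size++;
        return previous;
    }

    /** Unlinks and returns the element with the given key, or null if absent. */
    @SuppressWarnings("unchecked")
    E remove(long key) {
        int i = index(key, buckets.length);
        LinkedElement prev = null;
        for (LinkedElement e = buckets[i]; e != null; prev = e, e = e.getNext()) {
            if (e.getKey() == key) {
                if (prev == null) buckets[i] = e.getNext(); else prev.setNext(e.getNext());
                e.setNext(null);
                size--;
                return (E) e;
            }
        }
        return null;
    }

    int size() { return size; }

    /** Re-links every element into a larger bucket array; no per-entry objects are allocated. */
    private void resize(int newLength) {
        LinkedElement[] old = buckets;
        buckets = new LinkedElement[newLength];
        for (LinkedElement head : old) {
            while (head != null) {
                LinkedElement next = head.getNext();
                int i = index(head.getKey(), newLength);
                head.setNext(buckets[i]);
                buckets[i] = head;
                head = next;
            }
        }
    }
}
```

Because the chain link lives inside the element, the set allocates nothing per entry beyond the bucket array, which is where the saving over HashMap comes from. The cost is that an element can belong to only one such set at a time.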
The following is a comparison of the memory footprint if we implement the lightweight set as described.
We can save:
SIZE (bytes)   ITEM
20             The key: Long (12 bytes object overhead + 8 bytes long)
12             HashMap Entry object overhead
4              Reference to the key in Entry
4              Reference to the value in Entry
4              Hash in Entry
Total: -44 bytes
We need to add:
SIZE (bytes)   ITEM
4              A reference to the next element in ReplicaInfo
Total: +4 bytes
So in total we can save 40 bytes for each block replica.
Currently one finalized replica needs around 46 bytes (note: memory alignment is ignored here).
We can save 1 - (4 + 46) / (44 + 46) = 1 - 50/90 ≈ 44.4%, or about 45%, of the memory for each block replica in the DataNode.
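The estimate can be reproduced with a few lines of arithmetic. The sketch below only restates the figures from this description (4-byte references, alignment ignored); ReplicaFootprintEstimate and its member names are hypothetical and exist purely for illustration.

```java
// Hypothetical helper that restates the per-replica estimate from the description.
final class ReplicaFootprintEstimate {
    // Bytes no longer needed once the HashMap entry and boxed Long key are gone:
    // Long key (20) + Entry header (12) + key ref (4) + value ref (4) + hash (4).
    static final int SAVED_BYTES = 20 + 12 + 4 + 4 + 4;
    // Bytes added: one "next" reference inside ReplicaInfo.
    static final int ADDED_BYTES = 4;
    // Approximate size of one finalized replica, alignment ignored.
    static final int REPLICA_BYTES = 46;

    /** Per-replica footprint with the current HashMap-based ReplicaMap. */
    static int before() { return SAVED_BYTES + REPLICA_BYTES; }

    /** Per-replica footprint with the proposed lightweight set. */
    static int after() { return ADDED_BYTES + REPLICA_BYTES; }

    /** Fraction of per-replica memory saved by the change. */
    static double savedFraction() { return 1.0 - (double) after() / before(); }

    public static void main(String[] args) {
        System.out.println("before = " + before() + " bytes");
        System.out.println("after  = " + after() + " bytes");
        System.out.println("saved fraction = " + savedFraction());
    }
}
```

With these inputs the footprint drops from 90 to 50 bytes per replica, a saved fraction of 4/9. Real JVM figures will differ once object alignment and the reference size in use are taken into account.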