YourKit calculates the retained size: the retained size of an object is its shallow size plus the shallow sizes of the objects that are accessible, directly or indirectly, only from this object. In other words, the retained size is the amount of memory that the garbage collector will free when this object is collected.
With 10 thousand regions, the retained size of the two ConcurrentSkipListMaps is 7 MB.
With 100 thousand regions, the retained size is 75 MB. Of that, 19 MB are TableName objects, which points to an obvious optimization (I already had it in mind to save on 'equals', but the final size is crazy). On the same data set, we have 3.3 MB of ServerName objects.
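To make the definition concrete, here is a minimal sketch that computes a retained size on a toy object graph. It is not how YourKit does it internally; the class `RetainedSize`, the string object ids and the byte counts are all made up for illustration. The idea is simply: whatever is reachable from the GC roots today but would no longer be reachable if the target object disappeared is retained by that target.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: objects are string ids, references are adjacency lists. */
public class RetainedSize {

    /**
     * Sums the shallow sizes of every object that is reachable from the roots
     * now, but would become unreachable if 'target' were removed.
     */
    public static long retainedSize(Map<String, List<String>> refs,
                                    Map<String, Long> shallow,
                                    Set<String> roots,
                                    String target) {
        Set<String> all = reachable(refs, roots, null);
        Set<String> withoutTarget = reachable(refs, roots, target);
        long total = 0;
        for (String id : all) {
            if (!withoutTarget.contains(id)) {
                total += shallow.get(id);
            }
        }
        return total;
    }

    /** Depth-first traversal from the roots that never visits 'excluded'. */
    private static Set<String> reachable(Map<String, List<String>> refs,
                                         Set<String> roots,
                                         String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String root : roots) {
            if (!root.equals(excluded) && seen.add(root)) {
                stack.push(root);
            }
        }
        while (!stack.isEmpty()) {
            String current = stack.pop();
            for (String next : refs.getOrDefault(current, Collections.emptyList())) {
                if (!next.equals(excluded) && seen.add(next)) {
                    stack.push(next);
                }
            }
        }
        return seen;
    }
}
```

In this model, an object that is also referenced from elsewhere (for example a TableName shared by another structure) does not count toward the map's retained size, which is why sharing instances changes the numbers reported below.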
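One way to read the TableName optimization is to share a single instance per table instead of holding one copy per cached region. The sketch below is only an illustration under that assumption: `TableNameInterner` and its nested `Name` class are hypothetical stand-ins, not the HBase `TableName` API, and the actual patch may do this differently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical interning cache: one shared instance per distinct table name. */
public class TableNameInterner {

    /** Stand-in value object for illustration; not the HBase TableName class. */
    public static final class Name {
        private final String value;

        private Name(String value) {
            this.value = value;
        }

        public String value() {
            return value;
        }
    }

    private final ConcurrentMap<String, Name> cache = new ConcurrentHashMap<>();

    /** Returns the shared instance for this name, creating it on first use. */
    public Name intern(String value) {
        return cache.computeIfAbsent(value, Name::new);
    }

    /** Number of distinct names currently cached. */
    public int size() {
        return cache.size();
    }
}
```

With shared instances, 100k region entries for the same table point to one object rather than 100k copies, and equality checks can short-circuit on reference identity, which is the 'equals' saving mentioned above.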
Lastly, I don't think that a Map is the best data structure here; a Trie would be much better, since keys with a common prefix would share storage. I will have a look at this as well.
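For reference, a minimal byte-keyed trie could look like the sketch below. `ByteTrie` is a hypothetical class written for this comment, not something in HBase, and it only supports exact-match `put`/`get`. A real region cache would also need a "closest key at or before this row" lookup, which is left out here. It also uses a `HashMap` per node for brevity, which a memory-conscious implementation would replace with something more compact.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative trie over byte[] keys; keys with a common prefix share nodes. */
public class ByteTrie<V> {

    private static final class Node<V> {
        final Map<Byte, Node<V>> children = new HashMap<>();
        V value; // null when no key ends at this node
    }

    private final Node<V> root = new Node<>();
    private int nodeCount = 1; // counts the root

    /** Stores 'value' under 'key', creating only the nodes that are missing. */
    public void put(byte[] key, V value) {
        Node<V> node = root;
        for (byte b : key) {
            Node<V> next = node.children.get(b);
            if (next == null) {
                next = new Node<>();
                node.children.put(b, next);
                nodeCount++;
            }
            node = next;
        }
        node.value = value;
    }

    /** Returns the value stored under exactly 'key', or null if there is none. */
    public V get(byte[] key) {
        Node<V> node = root;
        for (byte b : key) {
            node = node.children.get(b);
            if (node == null) {
                return null;
            }
        }
        return node.value;
    }

    /** Total number of nodes, useful to see how much prefix sharing happens. */
    public int nodeCount() {
        return nodeCount;
    }
}
```

Inserting two 12-byte keys that differ only in their last byte creates 14 nodes (root included) instead of 25, which is the kind of sharing that would help when many region keys start with the same table name.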
With 100k regions, time is:
||time without the patch
||time with the patch
|| 50 million each
|| 83 seconds
With these results my opinion is that we should commit this patch as it is, because:
- 60 MB is acceptable for a client connected to a cluster with 100K regions.
- In most cases, the weak reference will just make the performance unpredictable. The remaining cases (regions that are rarely used, so we could drop them under memory pressure) do not justify the noise for the other cases.
- We can lower the memory footprint further if necessary, and that is likely a better solution than playing with the GC.