Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 7.7.2, 8.2, 8.3
- None
- None
Description
On a sizeable cluster with multi-shard, multi-replica collections, using LRUStatsCache led to excessive memory usage, which in turn caused severe performance problems.
Closer examination of the heap dumps showed that when LRUStatsCache.addToPerShardTermStats is called, it creates instances of FastLRUCache keyed by the passed shard argument. However, the value of this argument is not a simple shard name; it is a randomly ordered list of ALL replica URLs for that shard.
As a result, because of the combinatorial number of possible keys, the map in LRUStatsCache.perShardTermStats grew over time to contain ~2 million entries.
The fix seems to be simply to extract the shard name and use it as the cache key instead of the full string value of the shard parameter. The existing unit tests also need substantial improvement.
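To illustrate the key explosion and the proposed normalization, here is a minimal standalone sketch. It is not the committed patch. The class ShardKeySketch, its methods, the host names, and the "|"-separated URL format are hypothetical, and the regex assumes replica cores follow the default <collection>_<shard>_replica_<type><n> naming.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Solr.
class ShardKeySketch {
  // Assumption: core names look like <collection>_<shard>_replica_<type><n>.
  private static final Pattern SHARD_NAME = Pattern.compile("_(shard\\d+)_replica_");

  // Returns the shard name if it can be extracted, else the raw value unchanged.
  static String shardName(String shardParam) {
    Matcher m = SHARD_NAME.matcher(shardParam);
    return m.find() ? m.group(1) : shardParam;
  }

  // Shuffles the replica URLs 'rounds' times and counts the distinct cache keys
  // produced, either from the raw joined string or from the extracted shard name.
  static int countDistinctKeys(List<String> replicaUrls, int rounds, boolean normalize) {
    Random random = new Random(42); // fixed seed so runs are repeatable
    List<String> urls = new ArrayList<>(replicaUrls);
    Set<String> keys = new HashSet<>();
    for (int i = 0; i < rounds; i++) {
      Collections.shuffle(urls, random);
      String shardParam = String.join("|", urls);
      keys.add(normalize ? shardName(shardParam) : shardParam);
    }
    return keys.size();
  }

  public static void main(String[] args) {
    List<String> replicas = List.of(
        "http://host1:8983/solr/coll_shard1_replica_n1/",
        "http://host2:8983/solr/coll_shard1_replica_n2/",
        "http://host3:8983/solr/coll_shard1_replica_n3/");
    System.out.println("raw keys: " + countDistinctKeys(replicas, 1000, false));
    System.out.println("normalized keys: " + countDistinctKeys(replicas, 1000, true));
  }
}
```

With raw keys, the number of distinct entries for one shard is bounded only by the number of orderings of its replica URLs; with the extracted name, each shard maps to a single key. If several collections share one cache, the shard name alone could collide, so a real fix may need the collection name in the key as well.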
Attachments
Issue Links
- fixes
- SOLR-7759 DebugComponent's explain should be implemented as a distributed query
- Resolved