Details
Description
One of our nodes went berserk after a restart; Solr went completely nuts! So I opened VisualVM to keep an eye on it and spotted a different problem that occurs on all our Solr 6.4.2 and 6.5.0 nodes.
It appears Solr is leaking one SolrZkClient instance per second via DistributedQueue$ChildWatcher. The one-per-second rate is quite accurate for all nodes: there are about as many instances as there are seconds since Solr started. I know VisualVM's instance count includes objects that are yet to be collected, but the instance count does not drop after a forced garbage collection round.
It doesn't matter how many cores or collections the nodes carry or how heavy the traffic is.
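
For anyone who wants to check the one-per-second figure on their own nodes, the comparison between instance count and uptime can be sketched roughly as below. This is only an illustration: `LeakRate` and `leakRatePerSecond` are hypothetical names, not part of Solr, and both inputs are assumed to be read off manually (the SolrZkClient instance count from the profiler after a forced GC, and the node's uptime in seconds).

```java
// Hypothetical helper, not part of Solr: estimates how many instances are
// retained per second of uptime from two manually collected numbers.
class LeakRate {

    // Returns retained instances per second of uptime. A value close to 1.0
    // would match the "one SolrZkClient per second" observation.
    static double leakRatePerSecond(long instanceCount, long uptimeSeconds) {
        if (uptimeSeconds <= 0) {
            throw new IllegalArgumentException("uptimeSeconds must be positive");
        }
        return (double) instanceCount / uptimeSeconds;
    }

    public static void main(String[] args) {
        // args[0]: live instance count taken after a forced GC (assumption: read from the profiler)
        // args[1]: seconds since the node started (assumption: read from the node's uptime)
        long instanceCount = Long.parseLong(args[0]);
        long uptimeSeconds = Long.parseLong(args[1]);
        System.out.printf("%.3f instances retained per second%n",
                leakRatePerSecond(instanceCount, uptimeSeconds));
    }
}
```

The instance count could also be taken with the JDK's `jmap -histo:live <pid>`, which triggers a full GC before counting, so objects that are merely waiting for collection should not inflate the number.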