Each stale state response from the server causes CloudSolrClient to evict the requested collection from the local cache. At this point, the request is retried and the latest collection state is fetched live from ZooKeeper.
Nothing prevents multiple request threads from hitting ZooKeeper for the same collection and causing a thundering herd effect. There is synchronization that stops request threads from refreshing the state simultaneously, but that is not enough: each request thread that receives a stale state response still refreshes the state from ZK, one after another, once it acquires the lock.
We should compare the past and current znode versions of the cluster state so that a thread skips the fetch when the cached state has already been refreshed, which avoids redundant fetches from ZooKeeper.
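
A minimal sketch of the version check proposed above, not the actual CloudSolrClient code. All names here (`CachedState`, `VersionCheckedStateCache`, `refreshIfStale`, `zkFetcher`) are hypothetical, and the sketch assumes the stale entry stays in the cache instead of being evicted, so that its znode version can be compared after the lock is acquired.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a cached collection state; not Solr's own class.
final class CachedState {
    final int znodeVersion;

    CachedState(int znodeVersion) {
        this.znodeVersion = znodeVersion;
    }
}

// Hypothetical cache illustrating a version-checked refresh.
final class VersionCheckedStateCache {
    private final ConcurrentHashMap<String, CachedState> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
    // Assumption: stands in for the live read of collection state from ZooKeeper.
    private final Function<String, CachedState> zkFetcher;

    VersionCheckedStateCache(Function<String, CachedState> zkFetcher) {
        this.zkFetcher = zkFetcher;
    }

    // staleVersion is the znode version the caller used when the server
    // reported that its state was stale.
    CachedState refreshIfStale(String collection, int staleVersion) {
        Object lock = locks.computeIfAbsent(collection, k -> new Object());
        synchronized (lock) {
            CachedState current = cache.get(collection);
            // Another thread already fetched a newer version while this one
            // waited for the lock, so there is no need to go to ZooKeeper.
            if (current != null && current.znodeVersion > staleVersion) {
                return current;
            }
            CachedState fresh = zkFetcher.apply(collection);
            cache.put(collection, fresh);
            return fresh;
        }
    }
}
```

With this check, threads that queue up on the lock after seeing the same stale version trigger a single fetch: the first one refreshes the cache, and the rest find a newer version and return it. If ZooKeeper returns a state whose version is not newer than the one reported stale, each waiting thread in this sketch would still fetch, so a real fix would need to handle that case separately.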