For each shard in a distributed request, Solr currently picks a replica at random via ShufflingReplicaListTransformer. In setups with replication factor >1, this often means that subsequent requests (which one would hope/expect to leverage cached results from previous related requests) end up getting routed to a replica that hasn't seen any related requests.
The problem can be reproduced by issuing a relatively expensive query (maybe containing common terms?). The first request populates the queryResultCache on the replicas that served it. If replication factor >1 and there are a sufficient number of shards, subsequent requests will likely be routed to at least one replica that hasn't seen the query before. Because a distributed request is only as fast as its slowest shard response, the replicas with cold caches become a bottleneck, and from the client's perspective, many subsequent requests appear not to benefit from caching at all.
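
To put a rough number on "likely", here is a minimal sketch of the arithmetic. It assumes that each shard's replica is chosen independently and uniformly at random, and that only the replicas that served the first request have a warm cache. The function names (p_cold_hit, simulate_second_request) are made up for this illustration and are not part of Solr. Under those assumptions, the chance that a second identical request touches at least one cold replica is 1 - (1/R)^S for S shards and replication factor R.

```python
import random


def p_cold_hit(num_shards: int, replication_factor: int) -> float:
    """Probability that a second request hits at least one replica that the
    first request did not warm, assuming an independent, uniform random
    replica choice per shard (an assumption of this sketch)."""
    return 1.0 - (1.0 / replication_factor) ** num_shards


def simulate_second_request(num_shards: int, replication_factor: int,
                            trials: int = 10000, seed: int = 0) -> float:
    """Monte Carlo check of p_cold_hit: the fraction of trials in which the
    second request picks a different replica than the first on any shard."""
    rng = random.Random(seed)
    cold = 0
    for _ in range(trials):
        # One random replica index per shard, for each of the two requests.
        first = [rng.randrange(replication_factor) for _ in range(num_shards)]
        second = [rng.randrange(replication_factor) for _ in range(num_shards)]
        if first != second:
            cold += 1
    return cold / trials


if __name__ == "__main__":
    for shards, rf in [(1, 2), (4, 2), (8, 3)]:
        print(shards, rf, p_cold_hit(shards, rf),
              simulate_second_request(shards, rf))
```

With 4 shards and replication factor 2, for example, the formula gives 1 - (1/2)^4 = 0.9375, so under these assumptions the second request almost always waits on at least one cold replica.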