Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 6.0
Description
This is caused by a distributed data race. Core creation can happen at a time when the collection is not yet visible to the node. In that case a fallback code path is used that dereferences the collection state lazily (on demand), as opposed to setting a watch and keeping the state cached locally.
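The difference between the two code paths can be sketched with a toy model. This is a minimal illustration, not Solr's implementation: `FakeZk`, `LazyRef`, `WatchedRef`, and `fetches_for` are hypothetical names invented here, and the stand-in ZooKeeper only counts reads of the collection state.

```python
class FakeZk:
    """Hypothetical stand-in for ZooKeeper; counts reads of collection state."""

    def __init__(self, state):
        self.state = state
        self.fetches = 0
        self.watchers = []

    def get_state(self):
        # Every read is a network round trip in the real system.
        self.fetches += 1
        return self.state

    def watch(self, callback):
        # Register the watcher and deliver the current state once.
        self.watchers.append(callback)
        callback(self.get_state())

    def set_state(self, state):
        # On change, each watcher re-reads the state exactly once.
        self.state = state
        for callback in self.watchers:
            callback(self.get_state())


class LazyRef:
    """Fallback path: dereference the state on every access."""

    def __init__(self, zk):
        self.zk = zk

    def get(self):
        return self.zk.get_state()


class WatchedRef:
    """Normal path: set a watch and serve reads from a local cache."""

    def __init__(self, zk):
        self.cached = None
        zk.watch(self._update)

    def _update(self, state):
        self.cached = state

    def get(self):
        return self.cached


def fetches_for(ref_factory, requests):
    """Return how many ZK reads `requests` accesses cost for a given path."""
    zk = FakeZk({"shards": 2})
    ref = ref_factory(zk)
    for _ in range(requests):
        ref.get()
    return zk.fetches


print(fetches_for(LazyRef, 1000))     # one ZK read per request
print(fetches_for(WatchedRef, 1000))  # one ZK read in total
```

In this model the lazy reference costs one fetch per access, while the watched reference costs one fetch up front plus one per state change, regardless of request volume.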
Because of this, as requests to the core mount, the node issues a proportional number of ZK fetches for the collection state. On a large SolrCloud cluster, this generates several Gbps of TX traffic on the ZK nodes. Indexing throughput drops to a floor, and the ZK nodes run out of network bandwidth.
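A back-of-the-envelope estimate shows how per-request fetches can add up to Gbps of outbound traffic. The request rate and state size below are purely illustrative assumptions, not measurements from this report, and `zk_tx_gbps` is a hypothetical helper.

```python
def zk_tx_gbps(requests_per_sec, state_bytes):
    """Estimate ZK outbound traffic in Gbps, assuming one full
    collection-state fetch per request (the lazy fallback path)."""
    bits_per_sec = requests_per_sec * state_bytes * 8
    return bits_per_sec / 1e9


# Assumed figures: 5,000 requests/s across the cluster, 100 KB of state.
print(zk_tx_gbps(5_000, 100_000))  # 4.0
```

The estimate scales linearly in both inputs, which is why the effect is most visible on large clusters with large collection state and high request rates.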
On smaller SolrCloud clusters it is hard to run into, because the probability of this race materializing is lower.