Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
Description
Simulated large scale tests uncovered an interesting scenario that occurs also in real clusters where IndexSizeTrigger is used for controlling the maximum shard size.
As index size grows and the number of shards grows, if document assignment is more or less even then at equal intervals (on a log2 scale) there will be an avalanche of SPLITSHARD operations, because all shards will reach the critical size at approximately the same time.
A hundred or more split shard operations running in parallel may severely affect the cluster performance.
One possible approach to reduce the likelihood of this situation is to split shards not exactly in half but rather fudge the proportions around 60/40% in a random sequence, so that the resulting sub-sub-sub…shards would reach the thresholds at different times. This would require modifications to the SPLITSHARD command to allow this randomization.
Another approach would be to simply limit the maximum number of parallel split shard operations. However, this would slow down the process of reaching the balance (increase lag) and possibly violate other operational constraints due to some shards waiting too long for the split and significantly exceeding their max size.