Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Won't Fix
Description
The autoscaling policy supports spreading the replicas of a shard across availability zones (via sysprops), but that by itself does not guarantee that all availability zones are actually used, let alone used in a balanced way.
For example:
{"replica": "#EQUAL", "shard": "#EACH", "sysprop.az": "#EACH"}
The above policy might end up using only 2 of the 3 availability zones to spread the replicas (assuming there are enough nodes per zone to avoid violating any other policy rule) without generating any violations. So although we still have resilience against the loss of one AZ, we do not keep as much capacity as we could have if we had used all 3 zones equally. In summary, this is a cluster balancing problem.
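To make the balancing gap concrete, here is a minimal Python sketch of the kind of check the policy does not perform. It is not Solr code and does not use any Solr API: the function names, the `(shard, zone)` placement tuples, and the `tolerance` parameter are all hypothetical, chosen only to illustrate counting replicas per zone while including zones that host nothing.

```python
from collections import Counter


def zone_replica_counts(placements, all_zones):
    """Count replicas per zone, keeping zones that host no replica at 0.

    placements: iterable of (shard, zone) pairs (hypothetical format).
    all_zones:  every availability zone known to the cluster.
    """
    counts = Counter({zone: 0 for zone in all_zones})
    for _shard, zone in placements:
        counts[zone] += 1
    return dict(counts)


def unused_zones(placements, all_zones):
    """Return the zones that host no replica at all."""
    counts = zone_replica_counts(placements, all_zones)
    return sorted(zone for zone, n in counts.items() if n == 0)


def is_balanced(placements, all_zones, tolerance=1):
    """True if the busiest and the emptiest zone differ by at most `tolerance`."""
    counts = zone_replica_counts(placements, all_zones)
    if not counts:
        return True
    return max(counts.values()) - min(counts.values()) <= tolerance


zones = ["az1", "az2", "az3"]

# Each shard is spread across two zones, yet az3 is never used.
skewed = [("shard1", "az1"), ("shard1", "az2"),
          ("shard2", "az1"), ("shard2", "az2")]

# Same replica count, but all three zones carry load.
spread = [("shard1", "az1"), ("shard1", "az2"),
          ("shard2", "az2"), ("shard2", "az3")]
```

With these inputs, `unused_zones(skewed, zones)` reports `az3` and `is_balanced(skewed, zones)` is false, while the `spread` placement passes both checks. Both placements put every shard's replicas in distinct zones, which is why a per-shard spreading rule alone cannot tell them apart.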