I propose introducing a collection property called autoManageReplicas. It would be used only in the ZK-as-truth mode.
If set to true, then whenever the number of replicas for a shard falls below the replicationFactor and the down nodes do not come back up within a configurable amount of time, additional replicas will be started automatically. Conversely, if the actual number of replicas is already equal to or greater than replicationFactor when old (previously down) nodes come back up, they will not be allowed to join the shard and will be unloaded instead.
For now, I think we should not unload replicas that are already running, even if a shard has more replicas than replicationFactor. We can change that later if needed.
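
To make the proposed rules concrete, here is a rough sketch of the decision logic in Python. It is only an illustration of the behaviour described above, not actual Solr code: the names ShardState, Action, on_periodic_check, on_node_return and wait_seconds are all made up for this example, and tracking "down nodes have not come back" as a single elapsed-time value per shard is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                          # leave the shard as it is
    ADD_REPLICA = "add_replica"            # start an additional replica
    ALLOW_JOIN = "allow_join"              # returning node may rejoin the shard
    UNLOAD_RETURNING = "unload_returning"  # returning node's replica is unloaded


@dataclass
class ShardState:
    """Hypothetical snapshot of one shard, as read from ZooKeeper."""
    replication_factor: int
    live_replicas: int
    # Assumption: time (in seconds) the shard has been below replication_factor.
    seconds_below_factor: float = 0.0


def on_periodic_check(shard: ShardState, auto_manage_replicas: bool,
                      wait_seconds: float) -> Action:
    """Decide whether to start an extra replica for a shard."""
    if not auto_manage_replicas:
        return Action.NONE
    below = shard.live_replicas < shard.replication_factor
    if below and shard.seconds_below_factor >= wait_seconds:
        return Action.ADD_REPLICA
    # Running replicas are never unloaded here, even when the shard
    # has more replicas than replication_factor.
    return Action.NONE


def on_node_return(shard: ShardState, auto_manage_replicas: bool) -> Action:
    """Decide what happens when a previously down node comes back up."""
    if not auto_manage_replicas:
        return Action.ALLOW_JOIN
    if shard.live_replicas >= shard.replication_factor:
        return Action.UNLOAD_RETURNING
    return Action.ALLOW_JOIN
```

With replicationFactor=3, a shard that has had 2 live replicas for longer than the configured wait gets a new replica; if the old node returns after that, its replica is unloaded, while the 3 running replicas are left alone.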