Currently, 'autoAddReplicas=true' can be specified in the Collection Create API to automatically add replicas when a replica becomes unavailable. I propose moving this feature onto the autoscaling cluster policy rules design.
This will include the following:
- Trigger support for ‘nodeLost’ event type
- Modification of the existing implementation of ‘autoAddReplicas’ to automatically create the appropriate ‘nodeLost’ trigger.
- Any such auto-created trigger must be marked internally so that setting ‘autoAddReplicas=false’ via the Modify Collection API deletes or disables the corresponding trigger.
- Support for non-HDFS filesystems while retaining the optimization afforded by HDFS, i.e. the replacement replica can point to the existing data dir of the old replica.
- Deprecate/remove the feature of enabling/disabling ‘autoAddReplicas’ across the entire cluster using cluster properties in favor of using the suspend-trigger/resume-trigger APIs.
This will retain backward compatibility for the most part and keep a common use-case easy to enable as well as make it available to more people (i.e. people who don't use HDFS).
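To make the intended behaviour concrete, here is a minimal Python sketch of how 'autoAddReplicas' could map onto an auto-created 'nodeLost' trigger. Everything in it is an assumption for illustration only: the trigger name format, the 'autoCreated' marker field, the 'waitFor' default, and the function names are hypothetical and are not part of any existing API.

```python
# Hypothetical sketch only: names, fields and defaults are assumptions,
# not an existing Solr API.

AUTO_CREATED_MARKER = "autoCreated"  # assumed internal marker field


def trigger_name(collection):
    """Assumed naming scheme for the auto-created trigger."""
    return ".auto_add_replicas_" + collection


def node_lost_trigger(collection, wait_for="30s"):
    """Build the trigger definition implied by autoAddReplicas=true."""
    return {
        "name": trigger_name(collection),
        "event": "nodeLost",
        "waitFor": wait_for,  # assumed default delay before acting
        "enabled": True,
        AUTO_CREATED_MARKER: True,  # lets us find and disable it later
    }


def apply_auto_add_replicas(triggers, collection, auto_add_replicas):
    """Return a new trigger map reflecting the autoAddReplicas setting.

    Only triggers carrying the auto-created marker are modified, so
    triggers defined directly by the user are left untouched.
    """
    updated = dict(triggers)
    name = trigger_name(collection)
    existing = updated.get(name)
    if existing is None:
        if auto_add_replicas:
            updated[name] = node_lost_trigger(collection)
    elif existing.get(AUTO_CREATED_MARKER):
        updated[name] = {**existing, "enabled": bool(auto_add_replicas)}
    return updated
```

In this sketch, setting 'autoAddReplicas=false' disables the trigger rather than deleting it; deletion would be an equally valid reading of the proposal.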