As SolrCloud is now used at fairly large scale, most users end up writing their own cluster management tools. We should have a framework for cluster management in Solr.
In a discussion with Noble Paul, we outlined the following steps for implementing this:
- Basic API calls for cluster management, e.g. utilize added nodes, remove a node, etc. To begin with, these calls would need explicit invocation by users. Each call would also specify the strategy to use. For instance, there could be a strategy called optimizeCoreCount that aims for an even number of cores on each node. A strategy could optionally take parameters as well.
- Metrics and stats tracking, e.g. qps. These would be required for any advanced cluster management task, e.g. maintaining a qps of 'x' by auto-adding a replica (using a recipe). We would need collection/shard/node level views of metrics for this.
- Recipes: combinations of multiple sequential/parallel API calls based on rules. This would be complicated, especially as most of these would be long-running series of tasks that would have to be either rolled back or resumed in case of a failure.
- Event-based triggers that would not require explicit cluster management calls from end users.
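
To make the strategy idea from the first step more concrete, here is a minimal sketch of how something like optimizeCoreCount could plan its work. It is only an illustration: the class, the `Move` type, the `plan` method and the `maxSkew` parameter are all hypothetical names and not existing Solr APIs. It assumes the caller already knows the core count per node and only computes which cores to move; it does not talk to a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an "optimizeCoreCount" strategy; not an existing Solr API. */
class OptimizeCoreCount {

    /** One planned step: move a single core from one node to another. */
    static final class Move {
        final String fromNode;
        final String toNode;

        Move(String fromNode, String toNode) {
            this.fromNode = fromNode;
            this.toNode = toNode;
        }
    }

    /**
     * Plans single-core moves until the fullest and emptiest nodes differ by
     * at most maxSkew cores. maxSkew stands in for an optional strategy
     * parameter (an assumption); values below 1 are treated as 1, since a
     * perfectly even split is not always possible.
     */
    static List<Move> plan(Map<String, Integer> coresPerNode, int maxSkew) {
        int allowedSkew = Math.max(1, maxSkew);
        // Sorted copy: keeps the input untouched and makes the plan deterministic.
        Map<String, Integer> counts = new TreeMap<>(coresPerNode);
        List<Move> moves = new ArrayList<>();
        if (counts.size() < 2) {
            return moves; // nothing to balance
        }
        while (true) {
            String fullest = null;
            String emptiest = null;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                if (fullest == null || e.getValue() > counts.get(fullest)) {
                    fullest = e.getKey();
                }
                if (emptiest == null || e.getValue() < counts.get(emptiest)) {
                    emptiest = e.getKey();
                }
            }
            if (counts.get(fullest) - counts.get(emptiest) <= allowedSkew) {
                return moves; // balanced within the allowed skew
            }
            // Move one core from the fullest node to the emptiest node.
            counts.merge(fullest, -1, Integer::sum);
            counts.merge(emptiest, 1, Integer::sum);
            moves.add(new Move(fullest, emptiest));
        }
    }
}
```

For example, calling `OptimizeCoreCount.plan(Map.of("node1", 6, "node2", 2, "node3", 0), 1)` plans three moves off node1 and ends with core counts of 3, 3 and 2. A real strategy would also have to consider which cores to move (collection, shard, replica placement rules), which this sketch ignores.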