Perhaps it's a little too ambitious, but here is why I brought up the idea of the Overseer handling collection management every n seconds:
Let's say you have 4 nodes with 2 collections on them. You want each collection to use as many nodes as are available. Now you want to add a new node. To get it to participate in the existing collections, you have to configure them, or create new compatible cores over HTTP on the new node. Wouldn't it be nice if the Overseer just saw the new node, saw that the collections had repFactor=MAX_INT, and created the cores for you?
Also, consider failure scenarios:
If you remove a collection, what happens when a node that was down comes back and had a piece of that collection? Your collection will be back, living on that single node. An Overseer process could prune this off shortly after.
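To make the pruning case concrete, here is a rough sketch of how such a check might look. Everything in it is hypothetical: `find_orphaned_cores`, the shape of the core-to-collection mapping, and the collection names are made up for illustration and are not existing Solr APIs.

```python
def find_orphaned_cores(known_collections, cores_on_node):
    """Return the cores on a node whose collection no longer exists.

    known_collections: set of collection names currently registered (assumed
                       to come from ZooKeeper in a real system).
    cores_on_node:     dict mapping core name -> collection name, as reported
                       by a node that just came back up (hypothetical shape).
    """
    return sorted(
        core
        for core, collection in cores_on_node.items()
        if collection not in known_collections
    )


# A node returns holding a core for "old_logs", which was deleted while
# the node was down. Only that core is flagged for pruning.
stale = find_orphaned_cores(
    {"products", "users"},
    {"products_shard1": "products", "old_logs_shard1": "old_logs"},
)
```

The Overseer would then unload whatever ends up in `stale`, rather than letting the deleted collection reappear.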
So numShards/repFactor + Overseer smarts seems simple and good to me. But sometimes you may want to be precise in picking shards/replicas. Perhaps simply doing some kind of 'rack awareness' type feature down the road is the best way to control this though. You could create connections and weight costs using token markers for each node or something.
So I think maybe we would need a new zk node where Solr instances register, rather than cores? Then we know what is available to place replicas on - even if that Solr instance has no cores.
Then the Overseer would have a process that runs every n seconds (1 min?), looks at each collection and its repFactor and numShards, and adds or prunes given the current state.
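A minimal sketch of what one pass of that process could compute, assuming at most one replica of a given shard per node. The names here (`Collection`, `reconcile`, the `placement` mapping, the `("add" | "prune", ...)` action tuples) are all invented for illustration; this is not how Solr implements anything, just the add-or-prune idea written out.

```python
from dataclasses import dataclass

MAX_INT = 2**31 - 1  # stands in for "use every available node"


@dataclass
class Collection:
    name: str
    num_shards: int
    rep_factor: int


def reconcile(collection, live_nodes, placement):
    """Plan the adds/prunes needed to bring one collection to its target.

    live_nodes: set of node names currently registered (hypothetical).
    placement:  dict mapping shard index -> set of node names that
                currently host a replica of that shard (hypothetical).
    Returns a list of (action, collection, shard, node) tuples.
    """
    live = sorted(live_nodes)
    # Cannot place more replicas than there are live nodes.
    target = min(collection.rep_factor, len(live))
    actions = []
    for shard in range(collection.num_shards):
        hosts = placement.get(shard, set())
        # Replicas on nodes that are down do not count toward the target.
        current = [n for n in live if n in hosts]
        if len(current) < target:
            candidates = [n for n in live if n not in hosts]
            for node in candidates[: target - len(current)]:
                actions.append(("add", collection.name, shard, node))
        elif len(current) > target:
            for node in current[target:]:
                actions.append(("prune", collection.name, shard, node))
    return actions


# A fifth node joins a cluster whose collection wants "as many as possible".
plan = reconcile(
    Collection("c1", num_shards=1, rep_factor=MAX_INT),
    {"n1", "n2", "n3", "n4", "n5"},
    {0: {"n1", "n2", "n3", "n4"}},
)
```

Running this on a timer and executing the returned plan is what gives the "eventually consistent with some lag" behavior: the same function covers a new node showing up, a node returning after a missed operation, and a changed repFactor.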
This would also account for failures on collection creation or deletion. If a node was down and missed the operation, when it came back, within N seconds, the Overseer would add or prune with the restored node.
It handles a lot of failure scenarios (with some lag) and makes the interface to the user a lot simpler. Adding nodes can eventually mean just starting up a new node rather than requiring any config. It's also easy to deal with changing the replication factor. Just update it in zk, and when the Overseer process runs next, it will add and prune to match the latest value (given the number of nodes available).