> Especially since in order to be stable under the rebalancer
Oh guys, you are going too far! I am talking about a faster cycle of innovation and iteration. A pluggable interface allows the Hadoop community to experiment with newer methods of block placement. Only once such a placement algorithm proves beneficial does the related question of "how do we make the balancer work with the new placement policy" come into play. If experiments show that there is no viable alternative policy, then the question of "does the balancer work with a pluggable policy" is moot.
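To make it concrete, here is roughly the shape of interface I have in mind. This is just a sketch with made-up names (chooseTarget, isPlacementValid), not a proposal for the actual API:

    import java.util.List;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    /**
     * Sketch of a pluggable block placement policy. A real version
     * would also need cluster topology, excluded nodes, etc.
     */
    public interface BlockPlacementPolicy {
      /**
       * Choose target datanodes for a new block of the given file.
       * @param srcPath     path of the file the block belongs to
       * @param numReplicas number of replicas to place
       * @param writer      datanode the client writes from, or null
       * @param candidates  datanodes eligible to hold a replica
       * @return chosen targets, in pipeline order
       */
      List<DatanodeInfo> chooseTarget(String srcPath, int numReplicas,
          DatanodeInfo writer, List<DatanodeInfo> candidates);

      /**
       * Check whether an existing replica set satisfies the policy,
       * so that tools like fsck (and later the balancer) can verify it.
       */
      boolean isPlacementValid(List<DatanodeInfo> replicas, int numReplicas);
    }

An experimental policy then only has to implement these two methods; the namenode does not need to know anything about how a particular implementation spreads replicas.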
> hdfs probably needs to store metadata with the files or blocks
I do not like this approach. It makes HDFS heavy, clunky and difficult to maintain. Have you seen what happened to other file systems that tried to do everything inside them, e.g. DCE-DFS? It is possible that HDFS might allow generic blobs to be stored with files (aka extended file attributes) where application-specific data can be stored. But that should be disassociated from any "requirement" that the archival policy be stored with the file metadata.
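If HDFS ever grows such extended attributes, an application could tag files with its own archival hints without HDFS interpreting them. A sketch of what that might look like, assuming hypothetical setXAttr/getXAttr methods on FileSystem (no such API exists in HDFS today) and a made-up attribute name:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ArchivalHintExample {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
            URI.create("hdfs://namenode:8020"), new Configuration());
        Path file = new Path("/warehouse/logs/part-00000");

        // The application, not HDFS, decides what the blob means.
        fs.setXAttr(file, "user.archival.policy", "cold".getBytes("UTF-8"));

        // A separate archival tool reads the hint back later.
        byte[] hint = fs.getXAttr(file, "user.archival.policy");
        System.out.println("archival policy: " + new String(hint, "UTF-8"));
      }
    }

The point is that the blob stays opaque to HDFS; the archival tool owns its meaning, so the file system stays lean.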
Again folks, I agree completely with you that a "finished product" needs to encompass the balancer. But to start experimenting to figure out whether a different placement policy is beneficial at all, I need the pluggability feature; otherwise I have to keep changing my Hadoop source code every time I want to run an experiment. My experiments will probably take three to six months, especially because I want to benchmark results at large scale.
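All the pluggability has to do for these experiments is let a configuration key pick the policy class, so that swapping implementations is a config change instead of a source change. A rough sketch, reusing the hypothetical BlockPlacementPolicy interface from above (the dfs.block.placement.classname key and DefaultPlacementPolicy are made-up names):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.ReflectionUtils;

    public class PlacementPolicyLoader {
      /**
       * Instantiate whichever BlockPlacementPolicy the configuration
       * names, falling back to the default when the key is unset.
       */
      public static BlockPlacementPolicy getInstance(Configuration conf) {
        Class<? extends BlockPlacementPolicy> clazz =
            conf.getClass("dfs.block.placement.classname",  // hypothetical key
                          DefaultPlacementPolicy.class,     // hypothetical default
                          BlockPlacementPolicy.class);
        return ReflectionUtils.newInstance(clazz, conf);
      }
    }

With something like this in place, I can run one benchmark per candidate policy by editing a single config line between runs.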
For installations that go with the default policy, there is no impact at all.