Todd - first of all, no one is blocking anything.
Hey Suresh. I'll try to answer a few of your questions above from the perspective of HBase and MR.
This jira was started with the premise that this new feature was useful to MapReduce and HBase (http://s.apache.org/NJY). So, I assumed there would be some work in that direction.
If that is the case, I don't see how the suggestion to do the work in a dev-branch before merging to mainline is blocking anything. It is something we have done many times over for YARN, HDFS HA, etc.
Personally, if anyone was doing this work on MR, I'd be very interested in collaborating, heck - learning.
However, given my experience on MR, I'd classify it as high-risk, but very, very interesting, research, since on mid-sized clusters (a few hundred nodes) and beyond, the scheduling overhead might more than negate the I/O gains. Hence, again, doing that in a dev-branch is absolutely the right thing to do from a project and risk management perspective.
This isn't the first time an API has been added to the trunk code before downstream users exist.
Yes, this wouldn't be the first time we made that mistake.
Clearly, we have been dealing with the consequences of our previous mistakes for a while now. Arguing that this is a good reason to do the same again is not cogent.
As I mentioned above, we have at least one customer who would like to use this feature in their code to get better disk efficiency. They need to run against an actual release, not a dev branch build. This is the primary use case we're targeting right now. I want to be perfectly honest: the HBase/MR examples I gave above are not on our immediate roadmap; they just serve as proof that this isn't a one-off/niche improvement.
Now, clearly, you don't plan to do any work on either HBase or MR anytime soon and you have a different roadmap for a client.
If you had made that clear sooner, the conversation would be different.
Essentially, for the foreseeable future this will be dead code which is not going to be beneficial to anyone in the community... yet, the burden of maintenance etc. will remain.
No, that is not a big deal since this particular change has a fairly small cross-section - but it might be harder to make the argument for a future, more extensive change of this kind. Clearly, if it's a plugin etc., it's easier to digest.
IAC, I don't wish to debate this further.
Importantly, we should switch this feature off by default so that people who use it understand that it isn't necessarily supported - at least until we have a real use-case for it in the community.