Saved the best for last
Daryn, thanks for bringing up HADOOP-8139, as it is indeed something we want to address on Windows.
Let's first make sure that I understand the problem correctly. The Jira is about the '\' character being used as an escape character for metachars, and the replace("\\", "/") in Path breaks this. Your current fix in 0.23 addresses the problem on Unix by not doing this "problematic" replace, but leaves Windows with the problem. Please correct me if I'm mistaken, as it's a long discussion.
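To check that we mean the same thing, here is a minimal sketch of the conflict as I understand it. It uses plain String operations rather than the actual Path code, and the class and method names are made up for illustration.

```java
public class BackslashReplaceSketch {
    // Hypothetical stand-in for an unconditional backslash-to-slash replace.
    // This is an assumption for illustration, not the real Path implementation.
    public static String normalize(String path) {
        return path.replace("\\", "/");
    }

    public static void main(String[] args) {
        // A Windows-style path: here the replace is what the user wants.
        String windowsPath = "c:\\some\\path";
        System.out.println(normalize(windowsPath));

        // A glob where '\' escapes the '*' metachar: the replace turns the
        // escape into a path separator, so the pattern changes meaning.
        String escapedGlob = "/logs/file\\*";
        System.out.println(normalize(escapedGlob));
    }
}
```

The same replace that makes the Windows path usable rewrites the escaped glob into a directory plus a wildcard, which is the tension between the two use cases.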
> After a long discussion in HADOOP-8139, it was decided that only RFC standard URIs will be supported by hadoop. Paths using "\" are not going to be supported.
@Daryn: I would prefer to move the discussion toward how Hadoop can support "\" on Windows, and work with the community on an acceptable solution. Asking users to enter input paths in the form "c:/some/path" does not seem like the right thing to do. Please let me know if you agree with me here. I would prefer that we address HADOOP-8139 in a separate change, as this change moves us forward with Windows support and does not break Unix behavior.
> file:/// should not allow authority or port - it is for local file systems.
@Sanjay: I was just trying to illustrate the problem, sorry for the confusion.
> There's no reason you have to, you can always use new Path(String).
@Daryn: Actually, this does not work for paths that are symlinks. For example, new Path("/some/path#symlink") will encode the "#" character internally, so we lose the symlink behavior. This is why I believe this is a good change. If you take a look at the changes I made to GenericOptionsParser.java, you can see how this simplifies things at the call site.
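A minimal sketch of the encoding issue, using only java.net.URI. It assumes that Path(String) builds its URI through the multi-argument URI constructor, which quotes characters that are not legal in the path component. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentEncodingSketch {
    // Treats the whole string as a path component (assumed to mirror what
    // happens when the string is handed over as a path). The multi-argument
    // URI constructor quotes '#' because it is illegal inside a path.
    public static URI fromPathString(String path) {
        try {
            return new URI(null, null, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses the string as a full URI, so '#' starts the fragment.
    public static URI fromUriString(String uri) {
        return URI.create(uri);
    }

    public static void main(String[] args) {
        URI asPath = fromPathString("/some/path#symlink");
        System.out.println(asPath + " fragment=" + asPath.getFragment());

        URI asUri = fromUriString("/some/path#symlink");
        System.out.println(asUri + " fragment=" + asUri.getFragment());
    }
}
```

In the first case the "#" ends up percent-encoded inside the path and no fragment is available; in the second the fragment "symlink" survives, which is the behavior the symlink handling relies on.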