
Key: HADOOP-1891
Type: Bug
Status: Closed
Resolution: Won't Fix
Priority: Major
Assignee: Mahadev konar
Reporter: Olga Natkovich
Votes: 0
Watchers: 0
Hadoop Common

"." is converted to an empty path

Created: 14/Sep/07 01:36 AM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: 0.14.1
Fix Version/s: None

Time Tracking:
Not Specified

Environment: Linux
Issue Links:
Reference
 

Resolution Date: 19/Feb/08 11:46 PM


Description
Path p = new Path(".");
System.out.println("path=(" + p.toString() +")");

path=()



Chris Douglas added a comment - 25/Sep/07 12:17 AM
I'm uncertain of the correct behavior here. Absent a filesystem (or a configuration to determine the default filesystem), there's no "working directory" to resolve. Regrettably, this:
Configuration conf = new Configuration();
Path cwd = new Path(".");
Path kid1 = new Path(cwd, "blah");
Path kid2 = new Path(FileSystem.get(conf).getWorkingDirectory(), "blah");
// kid1: blah
// kid2: /home/user/blah

is neither intuitive nor succinct. Paths are evaluated at construction and segments matching dot are summarily excised as part of URI normalization. Since this fails to do what one would reasonably expect with "." as a path, would it make sense to throw in this case? Certainly, Path doesn't have enough information to do much else.
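
The normalization step described above can be reproduced without Hadoop. The following is a minimal sketch that uses only java.net.URI, on the assumption that Path delegates to the same normalization; the class and method names (UriDotDemo, normalizedPath) are made up for illustration and are not part of Hadoop.

```java
import java.net.URI;

// Illustration only: exercises java.net.URI directly, not Hadoop's Path.
final class UriDotDemo {

    // Returns the path component that remains after URI normalization.
    static String normalizedPath(String raw) {
        return URI.create(raw).normalize().getPath();
    }

    public static void main(String[] args) {
        // A lone "." segment is removed, leaving an empty path.
        System.out.println("path=(" + normalizedPath(".") + ")");
        // "./foo" and "foo" normalize to the same relative path.
        System.out.println("path=(" + normalizedPath("./foo") + ")");
        System.out.println("path=(" + normalizedPath("foo") + ")");
    }
}
```

With "." as input, the normalized path is the empty string, which matches the output in the description.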


dhruba borthakur added a comment - 25/Sep/07 05:54 AM
My thinking is that new Path(".") should throw an exception if there isn't enough information to convert it into an absolute path name.

Doug Cutting added a comment - 25/Sep/07 04:54 PM
> Since this fails to do what one would reasonably expect with "." as a path [ ... ]

Hmm. It does what I'd expect. "./foo" and "foo" name the same file, no? What's unexpected?

> new Path(".") should throw an exception [ ... ]

I don't see why. Having an unresolved Path that represents the connected directory seems reasonable to me.


Owen O'Malley added a comment - 25/Sep/07 05:51 PM
Paths can be relative and that is handy. Most applications want to make them fully qualified sooner rather than later, but I don't think an exception is the right answer.

Chris Douglas added a comment - 25/Sep/07 06:16 PM - edited
> Hmm. It does what I'd expect. "./foo" and "foo" name the same file, no? What's unexpected?

Well, "." and fs.getWorkingDirectory() aren't the same thing, as in the above example. That was surprising to me, at least. Path can keep enough information after URI normalization to know that the original was a relative path when the string is "./foo", but not when it's simply ".".

Path already throws when it gets an empty string; would it be reasonable to assume that a Path successfully constructed as the empty string refers to the working directory? I can't think of a situation where reporting its URI as Path.CUR_DIR would be an error. It would also work in new Path("foo/bar", "../.."), etc.

What problem is this causing?

[Edit]
We'd also see fewer bugs like HADOOP-1902
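
The fallback proposed in this comment (report an empty normalized path as the current directory) might look roughly like the sketch below. It is written against java.net.URI rather than Hadoop's Path, and CurDirFallback, displayPath, and the local CUR_DIR constant are hypothetical names standing in for whatever Path would actually do; the constant is assumed to mirror Path.CUR_DIR.

```java
import java.net.URI;

// Hypothetical sketch of the proposed fallback; not Hadoop code.
final class CurDirFallback {

    // Assumed to mirror Path.CUR_DIR.
    static final String CUR_DIR = ".";

    // Normalizes a relative path string and reports an empty result as ".".
    static String displayPath(String raw) {
        if (raw.isEmpty()) {
            // The thread notes that Path already rejects an empty string.
            throw new IllegalArgumentException("empty path string");
        }
        String normalized = URI.create(raw).normalize().getPath();
        return normalized.isEmpty() ? CUR_DIR : normalized;
    }

    public static void main(String[] args) {
        System.out.println("path=(" + displayPath(".") + ")");
        System.out.println("path=(" + displayPath("./foo") + ")");
    }
}
```

Because an empty input is rejected up front, an empty result can only come from normalization, so mapping it back to "." is unambiguous in this sketch.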


Doug Cutting added a comment - 25/Sep/07 06:44 PM
> Well, "." and fs.getWorkingDirectory() aren't the same thing, as in the above example.

Can you describe what you'd expect the example to print? Perhaps the fix is to avoid normalizing URIs until they are dereferenced within a FileSystem implementation? That way "./foo" would print as "./foo" rather than just "foo".
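
One way to picture the suggestion above is a path object that keeps the caller's string untouched and normalizes only when it is resolved against a working directory. This is a rough sketch, not Hadoop code: DeferredPath and resolveAgainst are invented names, and a plain java.net.URI stands in for the working directory that a FileSystem would supply.

```java
import java.net.URI;

// Hypothetical sketch: normalization is deferred until resolution.
final class DeferredPath {

    private final String raw;

    DeferredPath(String raw) {
        this.raw = raw;
    }

    // Prints exactly what the caller supplied, e.g. "./foo" stays "./foo".
    @Override
    public String toString() {
        return raw;
    }

    // Resolves (and thereby normalizes) against a working-directory URI.
    // The working directory should end with "/" so it is treated as a directory.
    URI resolveAgainst(URI workingDir) {
        return workingDir.resolve(raw);
    }

    public static void main(String[] args) {
        URI cwd = URI.create("/home/user/"); // assumed working directory
        DeferredPath p = new DeferredPath("./foo");
        System.out.println(p);
        System.out.println(p.resolveAgainst(cwd).getPath());
    }
}
```

In this shape "." remains printable as "." and only acquires a concrete meaning once a working directory is available, which is the situation the earlier comments describe.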


Chris Douglas added a comment - 25/Sep/07 08:17 PM
I see now what you meant, and I retract my point: the existing behavior matches expectations, except as in the original example.

Coupled with HADOOP-1909, I like the idea of leaving Paths relative until dereferenced within a FileSystem. Would it make sense to go further and require all Paths to be dereferenced this way? There's a lot of string manipulation and special-casing in Path, particularly for Windows filesystems. Pushing that out to the FS seems like a reasonable abstraction. Introducing a new type would also let users employ POSIX semantics for Paths, but URI semantics for Hadoop Paths (as in HADOOP-1858). The new type could even be a subtype of Path, where Path assumes the default FileSystem where it's used in a URI context (just as it does now). It would be a pervasive/risky change, though...


Doug Cutting added a comment - 25/Sep/07 08:46 PM
> There's a lot of string manipulation and special-casing in Path, particularly for Windows filesystems. Pushing that out to the FS seems like a reasonable abstraction.

One problem is that there's lots of code that passes things returned by File#getPath() to 'new Path(String)', and Windows file names are invalid URI paths. When we added Path.java to Hadoop we needed to do so back compatibly, since lots of user code manipulates file names and we didn't want to break it.

To avoid processing Windows-specifics in Path.java and stay compatible, we'd need to either avoid creating URIs in a Path at all, or we'd have to escape backslashes and colons in the URI's path, and have FileSystem implementations remove those escapes. Perhaps that would work, although it might be hard to make it back-compatible with existing code.

I've pulled a lot of my hair out in the process of getting Path to work on Windows and am personally reluctant to revisit this. But feel free to experiment and see if you can find a cleaner approach.


Mahadev konar added a comment - 19/Feb/08 11:46 PM
I am closing this bug as Won't Fix, since having "." return an empty path suffices for now.