Description
org.apache.hadoop.fs.Globber.glob() breaks when a searched directory
contains a file whose simple name contains a colon.
The problem seems to be in the code currently at lines 257 and 258
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Globber.java#L257:
256: // Set the child path based on the parent path.
257: child.setPath(new Path(candidate.getPath(),
258:     child.getPath().getName()));
That last line should probably be:
new Path(null, null, child.getPath().getName())));
The bug in the current code is that:
1) child.getPath().getName() gets the simple name (last segment) of the child Path as a raw string (not necessarily the corresponding relative Path string), and
2) that raw string is passed as Path(Path, String)'s second argument, which takes a Path string.
When that raw string contains a colon (e.g., xxx:yyy), it looks like a Path string that specifies a scheme ("xxx") and a relative path ("yyy"). That combination isn't allowed, so trying to construct a Path from it (as Path(Path, String) does internally) throws an exception, aborting the entire glob() call.
Adding the call to Path(String, String, String) does the equivalent of converting the raw string "xxx:yyy" to the Path string "./xxx:yyy", so the part before the colon is not taken as a scheme.
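The parsing behavior described above can be illustrated without Hadoop on the classpath. The following is only a rough sketch using java.net.URI: the class ColonNameDemo and its methods parsePathString and failsToParse are hypothetical names, and the scheme-splitting rule (first colon that precedes any slash) is my simplified assumption about how a Path string is interpreted, not a copy of Hadoop's code.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ColonNameDemo {

    /**
     * Hypothetical stand-in for parsing a Path string: anything before a
     * colon that precedes the first slash is treated as the scheme.
     * (Assumed, simplified rule; not Hadoop's actual implementation.)
     */
    public static URI parsePathString(String pathString) {
        String scheme = null;
        String rest = pathString;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            rest = pathString.substring(colon + 1);
        }
        try {
            // java.net.URI rejects a scheme combined with a relative path.
            return new URI(scheme, null, rest, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns true if the given string cannot be parsed by the sketch above. */
    public static boolean failsToParse(String pathString) {
        try {
            parsePathString(pathString);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Raw simple name with a colon: "xxx" is taken as a scheme.
        System.out.println("xxx:yyy fails: " + failsToParse("xxx:yyy"));

        // Same name prefixed with "./": the slash comes before the colon,
        // so no scheme is detected and the whole string is the path.
        URI ok = parsePathString("./xxx:yyy");
        System.out.println("scheme=" + ok.getScheme() + " path=" + ok.getPath());
    }
}
```

In this sketch the raw name is rejected because it parses as a scheme plus a relative path, while the "./"-prefixed form is accepted with no scheme, which is the effect the proposed change relies on.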
Attachments
Issue Links
- is depended upon by: DRILL-1805 Colon in file simple name in directory causes view not found (Open)