Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Description
Scanning the file system for view files fails (resulting in "Table 'vv' not found" errors) if the directory being scanned for view files contains a file whose simple name (last pathname segment) contains a colon.
For example, the unit test method testDRILL_811View in Drill's ./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java fails if /tmp contains a file with a name such as "aptitude-root.1528:JIsVaZ".
The cause is that Hadoop filesystem glob-pattern-matching code (org.apache.hadoop.fs.Globber's glob(), calling org.apache.hadoop.fs.Path's Path(Path,String)) mixes up relative file pathname strings and relative URI-style Path strings.
The problem seems to be in glob(): it calls child.getPath().getName() to get the raw final segment of the pathname and then passes that segment as the second argument to the Path(Path, String) constructor, which expects URI/Path syntax. The raw segment is never encoded into a relative URI/Path string. Because of the colon, that encoding requires prepending "./" (as the Path(String, String, String) constructor does internally).
It seems that glob() should first use Path(String, String, String) to handle that encoding and then call Path(Path, Path).
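The ambiguity can be reproduced without Hadoop. The following is a minimal sketch that uses only java.net.URI as a stand-in for Path's URI-style parsing. It is not Hadoop's code: the class ColonNameDemo and its helpers schemeOf and toRelativeReference are hypothetical names introduced for illustration, and the "./" check is an assumed approximation of the encoding described above.

```java
import java.net.URI;

/** Illustrative only: a java.net.URI stand-in for the colon ambiguity, not Hadoop code. */
final class ColonNameDemo {

    /** Hypothetical helper: parse a string as a URI reference and return its scheme (null if none). */
    static String schemeOf(String reference) {
        return URI.create(reference).getScheme();
    }

    /**
     * Hypothetical helper: turn a raw file name into a relative reference.
     * If a colon appears before any slash, the text in front of it would be
     * read as a URI scheme, so "./" is prepended to force a relative path.
     */
    static String toRelativeReference(String rawName) {
        int colon = rawName.indexOf(':');
        int slash = rawName.indexOf('/');
        boolean looksLikeScheme = colon != -1 && (slash == -1 || colon < slash);
        return looksLikeScheme ? "./" + rawName : rawName;
    }

    public static void main(String[] args) {
        String rawName = "aptitude-root.1528:JIsVaZ";
        URI parent = URI.create("file:///tmp/");

        // Raw name: everything before the colon is taken as a scheme,
        // so resolving against the parent discards the parent directory.
        System.out.println("raw scheme:      " + schemeOf(rawName));
        System.out.println("raw resolved:    " + parent.resolve(URI.create(rawName)));

        // Encoded name: no scheme, so the name stays under the parent directory.
        String encoded = toRelativeReference(rawName);
        System.out.println("encoded scheme:  " + schemeOf(encoded));
        System.out.println("encoded path:    " + parent.resolve(URI.create(encoded)).getPath());
    }
}
```

In this sketch the raw name resolves to a URI that no longer refers to anything under /tmp, while the "./"-prefixed form resolves to a path inside the parent directory. That mirrors why the raw segment should be encoded before it is combined with the parent.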
Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.
Issue Links
- depends upon: HADOOP-12455 fs.Globber breaks on colon in filename; doesn't use Path's handling for colons (Patch Available)
- is related to: DRILL-2784 handle filenames with colons (":") (Open)