  1. Apache Drill
  2. DRILL-1805

Colon in file simple name in directory causes view not found


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: None
    • Labels: None

    Description

      Scanning the file system for view files fails (resulting in "Table 'vv' not found" errors) if the directory being scanned for view files contains a file whose simple name (last pathname segment) contains a colon.

      For example, the unit test method testDRILL_811View in Drill's ./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java fails if /tmp contains a file with a name such as "aptitude-root.1528:JIsVaZ".

      The cause is that Hadoop's filesystem glob-pattern-matching code (glob() in org.apache.hadoop.fs.Globber, which calls the Path(Path, String) constructor of org.apache.hadoop.fs.Path) mixes up relative file pathname strings and relative URI-style Path strings.

      The problem seems to be in glob(): it calls child.getPath().getName() to get the raw final segment of the pathname and then passes that segment as the second argument to the Path(Path, String) constructor, which expects URI/Path syntax. Because the segment contains a colon, it would first need to be encoded as a relative URI/Path string by prepending "./" (as the Path(String, String, String) constructor does internally); glob() does not do that.
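
      The ambiguity can be reproduced without Hadoop. The sketch below uses java.net.URI as a stand-in for Hadoop's Path parsing (an assumption made so that it runs with only the JDK; it is not Hadoop's code, and the class name ColonSegmentDemo is hypothetical). It shows that the raw segment is read as "scheme:rest", while the "./"-prefixed form is read as a relative path.

```java
import java.net.URI;

public class ColonSegmentDemo {

    /** Returns the scheme java.net.URI parses out of s, or null if there is none. */
    static String schemeOf(String s) {
        return URI.create(s).getScheme();
    }

    public static void main(String[] args) {
        // File name taken from the description above.
        String raw = "aptitude-root.1528:JIsVaZ";

        // Raw segment: everything before the colon is taken as a URI scheme.
        System.out.println(schemeOf(raw));                    // prints aptitude-root.1528

        // With "./" prepended, the colon is no longer in scheme position.
        System.out.println(schemeOf("./" + raw));             // prints null
        System.out.println(URI.create("./" + raw).getPath()); // prints ./aptitude-root.1528:JIsVaZ
    }
}
```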

      It seems that glob() should first use the Path(String, String, String) constructor to handle that encoding and then pass the resulting Path to the Path(Path, Path) constructor.
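
      A rough sketch of that encode-then-resolve order follows, again using java.net.URI rather than Hadoop's Path so that it runs with only the JDK. The class RelativeSegmentResolver and both method names are hypothetical, and the helper handles only the colon case (other characters that are illegal in a URI would still need escaping). It is an illustration of the idea, not a proposed patch.

```java
import java.net.URI;

public class RelativeSegmentResolver {

    /**
     * Prepends "./" when a colon appears before any slash, so that the
     * segment cannot be mistaken for "scheme:rest".
     */
    static String toRelativeReference(String rawSegment) {
        int colon = rawSegment.indexOf(':');
        int slash = rawSegment.indexOf('/');
        boolean colonInFirstSegment = colon != -1 && (slash == -1 || colon < slash);
        return colonInFirstSegment ? "./" + rawSegment : rawSegment;
    }

    /** Encodes the raw segment first, then resolves it against the parent directory. */
    static URI resolveChild(URI parentDir, String rawSegment) {
        return parentDir.resolve(URI.create(toRelativeReference(rawSegment)));
    }

    public static void main(String[] args) {
        URI tmp = URI.create("file:///tmp/");
        String raw = "aptitude-root.1528:JIsVaZ";

        // Encoded first: the child ends up under the parent directory.
        System.out.println(resolveChild(tmp, raw).getPath());

        // Not encoded: the "child" is treated as its own URI and the parent is ignored.
        System.out.println(tmp.resolve(URI.create(raw)).getPath());
    }
}
```

      With the encoding step, the first call yields the path /tmp/aptitude-root.1528:JIsVaZ; without it, the resolved URI has no path at all, which is analogous to the lookup failure described above.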

      Action items:
      1) Report the bug to the Hadoop project.
      2) Review Drill's handling and propagation of the error.


            People

              Assignee: Unassigned
              Reporter: Daniel Barclay (dsbos)
              Votes: 1
              Watchers: 2
