  1. Apache Drill
  2. DRILL-7392

Exclude some files when requesting directory

    Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently Drill ignores files starting with dot ('.') or underscore ('_').

      When querying a directory that contains files of different types or schemas at multiple levels of the tree, it would be useful and more flexible to also have option(s) to exclude some files by extension, or perhaps with a regexp.

      For example:

      myTable
      |--D1
         |--file1.csv
         |--file2.csv
      |--D2
         |--SubD2
            |--file1.csv
         |--file1.csv
         |--file1.xml
         |--file1.json
      

      Without entering into a debate about what a good organisation of the data would be, the current way to query all the CSV files in this example is:

      SELECT * FROM ....`myTable/*/*.csv`
      UNION
      SELECT * FROM ....`myTable/*/*/*.csv`
      

      It would be useful to be able to query myTable directly, for example:

      /* ALTER SESSION SET exclude_files='xml,json' */
      /* or */
      /* ALTER SESSION SET only_files='csv' */
      SELECT * FROM myTable
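
      The intended filtering could be sketched as follows. This is only an illustration of the requested semantics in Python, not Drill code: the function `select_files` and its parameters `exclude_files` and `only_files` are hypothetical names borrowed from the proposed session options above, and treating the options as lists of extensions is an assumption.

```python
from pathlib import PurePosixPath


def is_hidden(name):
    """Mirror the rule stated in the description: names starting with '.' or '_' are ignored."""
    return name.startswith((".", "_"))


def select_files(paths, exclude_files=None, only_files=None):
    """Return the paths that would be scanned (hypothetical semantics).

    exclude_files: extensions to skip, e.g. ["xml", "json"]
    only_files:    if given, keep only these extensions, e.g. ["csv"]
    """
    exclude = {e.lower() for e in (exclude_files or [])}
    only = {e.lower() for e in (only_files or [])}
    selected = []
    for p in paths:
        path = PurePosixPath(p)
        if is_hidden(path.name):
            continue  # existing behaviour described in the ticket
        ext = path.suffix.lstrip(".").lower()
        if only and ext not in only:
            continue  # proposed only_files option
        if ext in exclude:
            continue  # proposed exclude_files option
        selected.append(p)
    return selected


# The example tree from the description, flattened to paths.
TREE = [
    "myTable/D1/file1.csv",
    "myTable/D1/file2.csv",
    "myTable/D2/SubD2/file1.csv",
    "myTable/D2/file1.csv",
    "myTable/D2/file1.xml",
    "myTable/D2/file1.json",
]

csv_only = select_files(TREE, only_files=["csv"])
no_xml_json = select_files(TREE, exclude_files=["xml", "json"])
```

      With the example tree, both calls select the same four CSV files at either depth, which is the result the UNION query above obtains today.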
      


            People

            • Assignee:
              Unassigned
              Reporter:
              benj641 benj
            • Votes:
              0
              Watchers:
              1
