  1. Apache Drill
  2. DRILL-1131

Drill should ignore files starting with . or _


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: Storage - Parquet
    • Labels: None

    Description

      Files whose names begin with . or _ are ignored by Hive and other tools, because these are typically logs and status files written out by tools like MapReduce. Drill should not read them when querying a directory containing a list of Parquet files.

      Currently it fails with the error:
      message: "Failure while setting up Foreman. < AssertionError:[ Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#78:ProjectRel.NONE.ANY([]).[](child=rel#15:Subset#1.ENUMERABLE.ANY([]).[],p_partkey=$1,p_type=$2), rel#8:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, drillTestDirDencTpchSF100, part])] ] < DrillRuntimeException:[ java.io.IOException: Could not read footer: java.io.IOException: Could not read footer for file com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ Could not read footer: java.io.IOException: Could not read footer for file com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ Could not read footer for file com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ Open failed for file: /drill/testdata/dencSF100/part/.impala_insert_staging, error: Invalid argument (22) ]"
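
      The requested rule can be sketched as below. This is a minimal illustration using only the Java standard library, not Drill's actual code: the class name HiddenFileFilter and its methods are hypothetical, and a real fix would presumably apply the same name check wherever Drill expands a directory into the list of files to scan.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Drill: it only illustrates the naming rule.
public class HiddenFileFilter {

    // True when the file name starts with '.' or '_'.
    public static boolean isHidden(String fileName) {
        return fileName.startsWith(".") || fileName.startsWith("_");
    }

    // Returns the sorted names of the entries directly under dir,
    // skipping any entry whose name marks it as hidden.
    public static List<String> listVisible(Path dir) throws IOException {
        List<String> visible = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                String name = entry.getFileName().toString();
                if (!isHidden(name)) {
                    visible.add(name);
                }
            }
        }
        Collections.sort(visible);
        return visible;
    }
}
```

      With this check, an entry such as .impala_insert_staging from the error above would be skipped before any attempt to read a Parquet footer from it.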

            People

              Assignee: Timothy Farkas
              Reporter: Ramana Inukonda Nagaraj
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated: