Apache Drill / DRILL-2517

Apply Partition pruning before reading files during planning

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.8.0
    • Fix Version/s: 1.6.0
    • Labels: None

      Description

      During the planning stage, Drill still tries to read Parquet files even when they do not match the partition filter.

      For example, if a directory excluded by the filter contains an invalid Parquet file, the query fails:

      0: jdbc:drill:zk=local> select sum(price) from dfs.tmp.purchases where dir0 = 1;
      Query failed: IllegalArgumentException: file:/tmp/purchases/4/0_0_0.parquet is not a Parquet file (too small)
      

      The reason is that partition pruning happens after the Parquet plugin tries to read the footer of each file.

      Ideally, partition pruning would happen before the format plugin gets involved.
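
      The ordering requested here can be illustrated with a minimal Python sketch. It is not Drill's implementation: the function names (prune_by_dir0, check_footer, plan_scan) are hypothetical, and the footer check is a simplified stand-in that only verifies the minimum file size and the trailing "PAR1" magic bytes. The point is that files are filtered by directory name first, so an invalid file in a pruned directory is never opened.

```python
import os

PARQUET_MAGIC = b"PAR1"
# Leading magic (4 bytes) + footer length (4 bytes) + trailing magic (4 bytes).
MIN_PARQUET_SIZE = 12


def prune_by_dir0(root, paths, dir0_value):
    """Keep files whose first directory level under root equals dir0_value.

    Uses only the path string, so no file is opened at this step.
    """
    kept = []
    for path in paths:
        parts = os.path.relpath(path, root).split(os.sep)
        # parts[0] is the first subdirectory (dir0); the last part is the file name.
        if len(parts) > 1 and parts[0] == str(dir0_value):
            kept.append(path)
    return kept


def check_footer(path):
    """Simplified stand-in for a footer read: size and trailing magic only."""
    if os.path.getsize(path) < MIN_PARQUET_SIZE:
        raise ValueError(f"{path} is not a Parquet file (too small)")
    with open(path, "rb") as f:
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        if f.read(len(PARQUET_MAGIC)) != PARQUET_MAGIC:
            raise ValueError(f"{path} is not a Parquet file (bad magic)")


def plan_scan(root, paths, dir0_value):
    """Prune by directory first, then touch only the surviving files."""
    survivors = prune_by_dir0(root, paths, dir0_value)
    for path in survivors:
        check_footer(path)
    return survivors
```

      With this ordering, a call such as plan_scan("/tmp/purchases", paths, 1) would never open /tmp/purchases/4/0_0_0.parquet, so the invalid file in the example above could not fail the query.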

            People

            • Assignee: Kunal Khatua (kkhatua)
            • Reporter: Adam Gilmore (dragoncurve)
            • Reviewer: Kunal Khatua
            • Votes: 0
            • Watchers: 7
