Apache Drill / DRILL-1610

Partition pruning not scanning all the required partitions when we have hierarchical partitions

Details

    Description

      git.commit.id.abbrev=6dca24a

      My data is organized into multiple year partitions, and each year partition contains 12 month partitions.
      The queries below should scan the records in:
      1. all month partitions under the year 1993
      2. the 'dec' month partition under every year

      However, Drill seems to ignore the second case. This happens with both Hive and DFS.
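
      To make the expected behaviour concrete, here is a minimal Python sketch of which partitions a pruner would need to keep for a filter of the form (year = 1993 AND l_orderkey > 29600) OR month = 'dec'. It is not Drill's implementation. The year and month values and the function names are hypothetical, chosen only for illustration, and the sketch assumes that a predicate on a non-partition column has to be treated as possibly true at planning time.

```python
from itertools import product

# Hypothetical subset of the partition values; the real data set has
# several years with 12 month partitions each.
YEARS = [1992, 1993, 1994]
MONTHS = ["jan", "feb", "dec"]


def may_match(year, month):
    """Return True if some row in partition (year, month) could satisfy
    (year = 1993 AND l_orderkey > 29600) OR month = 'dec'.

    l_orderkey is not a partition column, so its predicate cannot be
    evaluated per partition and is conservatively assumed possible.
    """
    orderkey_pred_possible = True  # assumption: unknown at planning time
    return (year == 1993 and orderkey_pred_possible) or month == "dec"


def partitions_to_scan(years, months):
    """List every (year, month) partition that cannot be pruned."""
    return [(y, m) for y, m in product(years, months) if may_match(y, m)]


expected = partitions_to_scan(YEARS, MONTHS)
for year, month in expected:
    print(year, month)
```

      Under these assumptions the 'dec' partition of every year stays in the scan list alongside all months of 1993, which is the second case that the report says is being dropped.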

      The queries are below.

      Hive:

      explain plan for select 
        l_orderkey, 
        l_partkey, 
        l_quantity, 
        l_shipdate, 
        l_shipinstruct 
      from hive.lineitem_hierarchical where (`year` IN (1993) and l_orderkey>29600) or `month` IN ('dec');

      DFS:

      explain plan for select columns[0],
        columns[1],
        columns[4],
        columns[10],
        columns[13]
      from `/lineitem_hierarchical` where (dir0=1993 and columns[0]>29600) or dir1='dec';

      Attached are the sample data, the Hive DDL, and the explain plan outputs.

      Attachments

        1. lineitem_hierarchical.zip (2.06 MB, Rahul Kumar Challapalli)
        2. hive.ddl (10 kB, Rahul Kumar Challapalli)
        3. explain_plan_hive.txt (66 kB, Rahul Kumar Challapalli)
        4. explain_plan_dfs.txt (11 kB, Rahul Kumar Challapalli)
        5. DRILL-1610.patch (3 kB, Mehant Baid)

          People

            Assignee: Unassigned
            Reporter: Rahul Kumar Challapalli (rkins)
            Votes: 0
            Watchers: 3
