Apache Drill / DRILL-3538

We do not prune partitions when we count over partitioning key and filter over partitioning key


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: Execution - Flow
    • Labels: None
    • Environment: 4 node cluster on CentOS

    Description

      We do not prune partitions when we count over the partitioning key and the predicate involves the partitioning key. The CTAS used was:

      create table t3214 partition by (key2) as select cast(key1 as double) key1, cast(key2 as char(1)) key2 from `twoKeyJsn.json`;
      

      case 1) We do not do partition pruning in this case.

      0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key2) from t3214 where key2 = 'm';
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        Project(EXPR$0=[$0])
      00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])
      

      case 2) We do not do partition pruning in this case.

      0: jdbc:drill:schema=dfs.tmp> explain plan for select count(*) from t3214 where key2 = 'm';
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        Project(EXPR$0=[$0])
      00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@211930a2])
      

      case 3) We do not do partition pruning in this case.

      0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key1) from t3214 where key2 = 'm';
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        Project(EXPR$0=[$0])
      00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@23fea3b0])
      

      case 4) We do prune here.

      0: jdbc:drill:schema=dfs.tmp> explain plan for select avg(key1) from t3214 where key2 = 'm';
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), $1)):ANY NOT NULL])
      00-02        StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
      00-03          StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
      00-04            Project(key1=[$1])
      00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
      

      case 5) We do prune here.

      0: jdbc:drill:schema=dfs.tmp> explain plan for select min(key1) from t3214 where key2 = 'm';
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        StreamAgg(group=[{}], EXPR$0=[MIN($0)])
      00-03          StreamAgg(group=[{}], EXPR$0=[MIN($0)])
      00-04            Project(key1=[$1])
      00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
      

      Commit id that I am testing on: 17e580a7
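
      The five plans above differ mainly in their leaf Scan operator: cases 4 and 5 show a ParquetGroupScan with numFiles=1, while cases 1 to 3 show a PojoRecordReader scan that lists no files. A minimal sketch of a helper that pulls this out of a text plan follows. The function name `summarize_scan` is hypothetical, and the patterns are assumed only from the plan text shown in this report, not from any Drill API.

```python
import re


def summarize_scan(plan_text):
    """Describe the leaf Scan operator found in a Drill text plan.

    Assumption: the plan contains a line of the form
    'Scan(groupscan=[...' as in the plans quoted in this report.
    """
    match = re.search(r"Scan\(groupscan=\[(.*)", plan_text)
    if match is None:
        # No Scan operator line was found in the text.
        return {"scan": None, "num_files": None}

    body = match.group(1)
    if body.startswith("ParquetGroupScan"):
        # The Parquet scan reports how many files it will read.
        files = re.search(r"numFiles=(\d+)", body)
        count = int(files.group(1)) if files else None
        return {"scan": "ParquetGroupScan", "num_files": count}
    if "PojoRecordReader" in body:
        # This scan lists no files, so no file count can be read from it.
        return {"scan": "PojoRecordReader", "num_files": None}
    return {"scan": "other", "num_files": None}
```

      For the plans in this report, the helper would report a file count only for cases 4 and 5. The text plans for cases 1 to 3 carry no file list, so they cannot confirm or rule out pruning on their own.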


          People

            amansinha100 Aman Sinha
            khfaraaz Khurram Faraaz