[DRILL-3538] We do not prune partitions when we count over partitioning key and filter over partitioning key - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Not A Problem
Affects Version/s: 1.2.0
Fix Version/s: 1.3.0
Component/s: Execution - Flow
Labels: None
Environment: 4-node cluster on CentOS

Description

Partition pruning does not happen when a query does a count over the partitioning key and the predicate also involves the partitioning key. The CTAS used was:

create table t3214 partition by (key2) as select cast(key1 as double) key1, cast(key2 as char(1)) key2 from `twoKeyJsn.json`;

case 1) Partition pruning does not happen in this case.

0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key2) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])

case 2) Partition pruning does not happen in this case.

0: jdbc:drill:schema=dfs.tmp> explain plan for select count(*) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@211930a2])

case 3) Partition pruning does not happen in this case.

0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@23fea3b0])

case 4) Partition pruning happens here (the scan reads a single file, numFiles=1).

0: jdbc:drill:schema=dfs.tmp> explain plan for select avg(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), $1)):ANY NOT NULL])
00-02        StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
00-03          StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
00-04            Project(key1=[$1])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])

case 5) Partition pruning happens here (the scan reads a single file, numFiles=1).

0: jdbc:drill:schema=dfs.tmp> explain plan for select min(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-03          StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-04            Project(key1=[$1])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])

Commit ID tested: 17e580a7
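
To compare the five plans above mechanically, the scan line of each plan can be reduced to the group scan class and the reported file count. The following is a minimal sketch, not part of Drill: the function name `summarize_scan` and both regular expressions are hypothetical helpers, written only against the plan text quoted in this report.

```python
import re

# Hypothetical helpers, written against the plan text quoted in this report.
# SCAN_RE captures the class name that follows "groupscan=[", for example
# "ParquetGroupScan" or "org.apache.drill.exec.store.pojo.PojoRecordReader".
SCAN_RE = re.compile(r"Scan\(groupscan=\[\s*([\w.$]+)")
NUM_FILES_RE = re.compile(r"numFiles=(\d+)")


def summarize_scan(plan_text):
    """Return (scan_class, num_files) for the first Scan line in a plan.

    num_files is None when the scan line does not report numFiles.
    Returns None when the plan text contains no Scan line.
    """
    for line in plan_text.splitlines():
        match = SCAN_RE.search(line)
        if match:
            # Keep only the simple class name, dropping any package prefix.
            scan_class = match.group(1).rsplit(".", 1)[-1]
            files = NUM_FILES_RE.search(line)
            return scan_class, (int(files.group(1)) if files else None)
    return None
```

Applied to the plans in this report, cases 1 to 3 reduce to a `PojoRecordReader` scan with no `numFiles` attribute, while cases 4 and 5 reduce to a `ParquetGroupScan` with `numFiles=1`.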

People

Assignee: Aman Sinha

Reporter: Khurram Faraaz

Reviewer: Khurram Faraaz

Votes: 0

Watchers: 2

Dates

Created: 21/Jul/15 23:57

Updated: 10/Dec/15 23:30

Resolved: 01/Nov/15 18:32