Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Not A Problem
1.2.0
None
Environment: 4 node cluster on CentOS
Description
We are not doing partition pruning when we do a count over the partitioning key and the predicate involves the partitioning key. The CTAS used was:
create table t3214 partition by (key2) as select cast(key1 as double) key1, cast(key2 as char(1)) key2 from `twoKeyJsn.json`;
case 1) We do not do partition pruning in this case.
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key2) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])
case 2) We do not do partition pruning in this case.
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(*) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@211930a2])
case 3) We do not do partition pruning in this case.
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@23fea3b0])
case 4) We do prune here.
0: jdbc:drill:schema=dfs.tmp> explain plan for select avg(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), $1)):ANY NOT NULL])
00-02        StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
00-03          StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
00-04            Project(key1=[$1])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
case 5) We do prune here.
0: jdbc:drill:schema=dfs.tmp> explain plan for select min(key1) from t3214 where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-03          StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-04            Project(key1=[$1])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
Commit id that I am testing on: 17e580a7
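
To compare the five plans at a glance, the Scan line of each text plan can be summarized with a small helper. This is only a sketch and not part of Drill: the function name summarize_scan is made up for illustration, and it assumes the plan text has the same shape as the output pasted in the cases above. The plan text shows no file list for a PojoRecordReader scan, so the helper returns None for the file count in that case.

```python
import re


def summarize_scan(plan_text):
    """Return (scan_kind, num_files) for the Scan operator in a Drill text plan.

    Hypothetical helper: it only pattern-matches the plan text and knows
    nothing about Drill internals.
    """
    match = re.search(r"Scan\(groupscan=\[(.*)\]\)", plan_text, re.DOTALL)
    if match is None:
        return ("none", None)
    groupscan = match.group(1)
    if groupscan.startswith("ParquetGroupScan"):
        # numFiles is printed by ParquetGroupScan in the plans for cases 4 and 5.
        files = re.search(r"numFiles=(\d+)", groupscan)
        return ("parquet", int(files.group(1)) if files else None)
    if "PojoRecordReader" in groupscan:
        # Cases 1 to 3: the plan text lists no files for this scan.
        return ("pojo", None)
    return ("other", None)


# Scan lines copied from case 1 and case 4.
case1 = "00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])"
case4 = (
    "00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath "
    "[path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, "
    "numFiles=1, columns=[`key2`, `key1`]]])"
)

print(summarize_scan(case1))
print(summarize_scan(case4))
```

Running it on the case 1 and case 4 scan lines separates the count queries (PojoRecordReader scan) from the avg and min queries (ParquetGroupScan over a single file).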