Description
Currently there are three distinct failures.
All three cases require hive.cbo.returnpath.hiveop=true.
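A minimal setup sketch for reproducing the cases below. The schema of abcd is not given in this description, so the four int columns are an assumption, as is the explicit hive.cbo.enable setting (the return path is part of CBO):

```sql
-- Assumed schema: the description only names the table and columns a, b, c, d.
create table if not exists abcd (a int, b int, c int, d int);

-- Assumed prerequisite: CBO must be on for the return path to take effect.
set hive.cbo.enable=true;

-- Required for all three cases, per the description.
set hive.cbo.returnpath.hiveop=true;
```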
1) The first case is a wrong result for the following query:
explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd;
This occurs due to a bug in HiveCalciteUtil.getExprNodes: while looking up the expression corresponding to an aggregate function's argument, the wrong index is used.
2) An out-of-bounds exception for the following:
set hive.map.aggr=false;
explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd;
This exception occurs while converting the Calcite aggregation to Hive's group by operator.
3) Once the exception in the above case is fixed, the same query with hive.map.aggr=false gives wrong results. The problem in this case is that, while creating the expression for an aggregate function's argument, we end up with the wrong column info from the underlying reduce sink operator.
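One way to check for the wrong results described in cases 1 and 3 is to run the query without explain, once with the return path disabled and once with it enabled, and compare the two result rows. This is only a sketch: the shortened select list is an assumption made for readability, and the full query from the description can be substituted.

```sql
-- Baseline: return path disabled.
set hive.cbo.returnpath.hiveop=false;
set hive.map.aggr=false;
select count(1), count(a), count(distinct a), count(distinct a,b), count(distinct a,b,c,d) from abcd;

-- Same query with the return path enabled; the row should match the baseline.
set hive.cbo.returnpath.hiveop=true;
select count(1), count(a), count(distinct a), count(distinct a,b), count(distinct a,b,c,d) from abcd;
```

Repeating the comparison with hive.map.aggr=true covers case 1 as well.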