Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Bug
1.15.0
None
None
Description
test case:
@Test public void testDistinctCount0() {
  final HepProgram program = HepProgram.builder()
      .addRuleInstance(AggregateExpandDistinctAggregatesRule.INSTANCE)
      .addRuleInstance(AggregateProjectMergeRule.INSTANCE)
      .build();
  checkPlanning(program,
      "select type, count(distinct acctno), sum(distinct balance)"
          + " from customer.account group by type");
}
current result:
<TestCase name="testDistinctCount0">
  <Resource name="sql">
    <![CDATA[select type, count(distinct acctno) from customer.account group by type]]>
  </Resource>
  <Resource name="planBefore">
    <![CDATA[
LogicalAggregate(group=[{0}], EXPR$1=[COUNT(DISTINCT $1)], EXPR$2=[SUM(DISTINCT $2)])
  LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
    LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
]]>
  </Resource>
  <Resource name="planAfter">
    <![CDATA[
LogicalProject(TYPE=[$0], EXPR$1=[$1], EXPR$2=[CAST($2):INTEGER NOT NULL])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1) FILTER $3], EXPR$2=[SUM($2) FILTER $4])
    LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
      LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2], $g=[$3])
        LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {1, 2}]], $g=[GROUPING($1, $0, $2)])
          LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
]]>
  </Resource>
</TestCase>
However, the resulting plan is wrong.
First, if we use only AggregateExpandDistinctAggregatesRule.INSTANCE to optimize the query, the resulting plan is correct:
@Test public void testDistinctCount0() {
  final HepProgram program = HepProgram.builder()
      .addRuleInstance(AggregateExpandDistinctAggregatesRule.INSTANCE)
      //.addRuleInstance(AggregateProjectMergeRule.INSTANCE)
      .build();
  checkPlanning(program,
      "select type, count(distinct acctno), sum(distinct balance)"
          + " from customer.account group by type");
}

LogicalProject(TYPE=[$0], EXPR$1=[$1], EXPR$2=[CAST($2):INTEGER NOT NULL])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1) FILTER $3], EXPR$2=[SUM($2) FILTER $4])
    LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
      LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {0, 2}]], $g=[GROUPING($0, $1, $2)])
        LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
          LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
Then, when AggregateProjectMergeRule.INSTANCE is added, it changes the sub-plan from
LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {0, 2}]], $g=[GROUPING($0, $1, $2)])
  LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
to
LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {1, 2}]], $g=[GROUPING($1, $0, $2)])
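The index shift in this rewrite can be sketched with a small stand-alone example. It assumes that merging the project into the aggregate amounts to mapping every group-set index through the project's source columns (TYPE reads $1, ACCTNO reads $0, BALANCE reads $2). The class and method names are hypothetical and are not part of Calcite.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class GroupSetRemap {
    // Hypothetical helper: project output i reads input column source[i],
    // so each group-set index i is replaced by source[i].
    static TreeSet<Integer> remap(List<Integer> groupSet, int[] source) {
        TreeSet<Integer> result = new TreeSet<>();
        for (int i : groupSet) {
            result.add(source[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
        int[] source = {1, 0, 2};
        System.out.println(remap(Arrays.asList(0, 1), source)); // [0, 1]
        System.out.println(remap(Arrays.asList(0, 2), source)); // [1, 2]
    }
}
```

Under that assumption, {0, 1} maps to itself because the two columns only swap places, while {0, 2} becomes {1, 2}. The same mapping turns GROUPING($0, $1, $2) into GROUPING($1, $0, $2).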
Note that the groups were changed from
groups=[[{0, 1}, {0, 2}]]
to
groups=[[{0, 1}, {1, 2}]]
but the filter values generated by groupValue in AggregateExpandDistinctAggregatesRule are not changed in the project:
LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
// filter values before AggregateProjectMergeRule added
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(0,1,2), ImmutableBitSet.of(0,1)) is 1
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(0,1,2), ImmutableBitSet.of(0,2)) is 2
// filter values after AggregateProjectMergeRule added
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(1,0,2), ImmutableBitSet.of(0,1)) is 1
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(1,0,2), ImmutableBitSet.of(1,2)) is 4
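The values 1, 2 and 4 quoted above can be reproduced with a stand-alone sketch. It assumes groupValue walks the full group set in ascending column order, most significant bit first, and sets a bit for every column missing from the given group set. This is an illustrative re-implementation under that assumption, not Calcite's actual code; the class name and the setOf helper are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class GroupValueSketch {
    // Assumed behaviour: iterate the full group set in ascending order,
    // highest bit first, and set the bit when the column is NOT in groupSet.
    static long groupValue(TreeSet<Integer> fullGroupSet, Set<Integer> groupSet) {
        long value = 0;
        long bit = 1L << (fullGroupSet.size() - 1);
        for (int column : fullGroupSet) {
            if (!groupSet.contains(column)) {
                value |= bit;
            }
            bit >>= 1;
        }
        return value;
    }

    // Hypothetical convenience constructor for a sorted set of column indexes.
    static TreeSet<Integer> setOf(Integer... columns) {
        return new TreeSet<>(Arrays.asList(columns));
    }

    public static void main(String[] args) {
        // A set has no insertion order, so (1, 0, 2) is the same set as (0, 1, 2).
        TreeSet<Integer> full = setOf(1, 0, 2);
        System.out.println(groupValue(full, setOf(0, 1))); // 1
        System.out.println(groupValue(full, setOf(0, 2))); // 2
        System.out.println(groupValue(full, setOf(1, 2))); // 4
    }
}
```

Because the full group set is a set, listing its members as (1, 0, 2) or (0, 1, 2) gives the same result in this sketch. The difference between 2 and 4 comes only from the group set changing from {0, 2} to {1, 2}.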