TL;DR: When CoreRules.AGGREGATE_REMOVE is fired by a HepPlanner, the rule removes the Aggregate but, instead of returning the Aggregate's input, it returns a VolcanoPlanner RelSubset wrapping that input, which leads to unpredictable consequences.
Details: This is a strange issue that only occurs when several factors coincide.
I first reproduced it in my application with the following query (on TPCH):
The issue can also be reproduced in Calcite itself by adding this test to HepPlannerTest:
The important elements are the following. First, our RelOptCluster has a VolcanoPlanner as its planner, so any relNode.getCluster().getPlanner() call returns a VolcanoPlanner instance. Nevertheless, we also apply some rules via a HepPlanner. I think this is quite a common strategy among Calcite clients to improve performance: first apply a subset of rules that are always beneficial via a HepPlanner, and then apply the "main" set of rules via the cost-based VolcanoPlanner.
Secondly, we have AggregateRemoveRule, which we use in the HepPlanner phase.
This rule contains the following code:
Notice the line that uses call.getPlanner() to call the prune method. call.getPlanner() returns the planner that is actually firing the rule, in this case a HepPlanner, so we end up calling HepPlanner#prune, which is fine.
However, another line in the rule calls the static RelOptRule#convert method:
Notice that in this case the planner is obtained from the relNode's cluster, which in our setup is the VolcanoPlanner, and that is potentially problematic. Further down, if the relNode's traits match the outTraits, nothing is done and the same relNode is returned, so there is no problem. If they do not match, RelOptPlanner#changeTraits is called, i.e. VolcanoPlanner#changeTraits, and this is where the problem originates: in our scenario VolcanoPlanner#changeTraits returns a Volcano RelSubset, which the HepPlanner that triggered the rule cannot handle, and this leads to the incorrect plan returned by the HepPlanner.
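To make the mismatch easier to see, here is a toy model in plain Java. It is only a sketch, not Calcite code: Planner, HEP, VOLCANO and this convert method are hypothetical stand-ins, nodes and traits are reduced to strings, and it assumes that the Hep-like planner hands the node back unchanged while the Volcano-like planner wraps it in a subset.

```java
/** Toy model only: none of these types are Calcite classes. */
final class ConvertPlannerSketch {

  /** Hypothetical stand-in for a planner, reduced to changeTraits. */
  interface Planner {
    String changeTraits(String rel, String toTraits);
  }

  /** Assumption: the Hep-like planner hands the node back unchanged. */
  static final Planner HEP = (rel, toTraits) -> rel;

  /** Assumption: the Volcano-like planner wraps the node in a subset. */
  static final Planner VOLCANO = (rel, toTraits) -> "RelSubset(" + rel + ")";

  /**
   * Models the convert step: return the node as-is when the traits match
   * exactly, otherwise delegate to whichever planner is passed in.
   */
  static String convert(Planner planner, String rel, String relTraits, String toTraits) {
    if (relTraits.equals(toTraits)) {
      return rel;
    }
    return planner.changeTraits(rel, toTraits);
  }

  public static void main(String[] args) {
    String rel = "LogicalValues";
    // Planner taken from the cluster (Volcano-like in the reported setup).
    System.out.println(convert(VOLCANO, rel, "sorted", "unsorted"));
    // Planner taken from the rule call (Hep-like in the reported setup).
    System.out.println(convert(HEP, rel, "sorted", "unsorted"));
    // Matching traits: the planner is never consulted.
    System.out.println(convert(VOLCANO, rel, "unsorted", "unsorted"));
  }
}
```

In this model only the first call produces a RelSubset wrapper; the other two return the plain node. That mirrors the report: the outcome depends on which planner the conversion consults, not on the rule itself.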
With our original query (a LogicalValues with sorted values), we reach RelOptRule#convert with the RelNode being a LogicalValues with Convention.NONE plus a sorted Collation, while the toTraits are those of the LogicalAggregate that we are removing: Convention.NONE plus an unsorted Collation. Since the traits of the LogicalValues do not match the LogicalAggregate traits (the two Collations differ), RelOptPlanner#changeTraits is called and the problem occurs. I am not sure why RelTraitSet#matches is used here (it computes an exact match, hence returns false) rather than RelTraitSet#satisfies, which would have returned true because a sorted LogicalValues satisfies the unsorted Collation, but I assume there is a reason for that.
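The difference between an exact match and "satisfies" can be sketched in plain Java. This is a simplified illustration, not the Calcite implementation: a collation is modeled as a list of sorted field indexes, the sorted case is assumed to be a sort on field 0 purely for illustration, and "satisfies" is approximated as "the required collation is a prefix of the actual one".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: collations are plain lists of field indexes. */
final class CollationSketch {

  /** Exact comparison, in the spirit of RelTraitSet#matches. */
  static boolean matches(List<Integer> actual, List<Integer> required) {
    return actual.equals(required);
  }

  /**
   * Simplified "satisfies": the required collation must be a prefix of the
   * actual one, so an empty requirement is satisfied by any ordering.
   */
  static boolean satisfies(List<Integer> actual, List<Integer> required) {
    return actual.size() >= required.size()
        && actual.subList(0, required.size()).equals(required);
  }

  public static void main(String[] args) {
    // Hypothetical values: input sorted on field 0, aggregate unsorted.
    List<Integer> sortedValues = Arrays.asList(0);
    List<Integer> unsortedAggregate = Collections.emptyList();

    System.out.println(matches(sortedValues, unsortedAggregate));   // false
    System.out.println(satisfies(sortedValues, unsortedAggregate)); // true
    // When both are unsorted, even the exact comparison succeeds.
    System.out.println(matches(unsortedAggregate, unsortedAggregate)); // true
  }
}
```

Under these assumptions the exact comparison fails for sorted input, which is the path that ends in changeTraits, while the prefix-based check would have let the node pass through unchanged.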
As a workaround, if the LogicalValues elements are NOT in order, the problem is avoided: we then deal with a LogicalValues with an unsorted Collation, so inside RelOptRule#convert the LogicalValues traits (Convention.NONE plus an unsorted Collation) match the LogicalAggregate ones (also Convention.NONE plus an unsorted Collation). The method therefore returns without calling RelOptPlanner#changeTraits, and the problem does not occur. This can be confirmed by modifying the proposed test: