Description
with t as (select true c)
select t.c
from t
group by t.c
having mean(t.c) > 0
This query throws "Column 't.c' does not exist. Did you mean one of the following? [t.c]"
However, mean(boolean) is not a supported function signature, so the error should instead be "cannot resolve 'mean(t.c)' due to data type mismatch: function average requires numeric or interval types, not boolean".
This happens because:
- mean(boolean) in HAVING is not marked as resolved by the ResolveFunctions rule.
- As a result, in ResolveAggregateFunctions, the TempResolvedColumn wrapper in mean(TempResolvedColumn(t.c)) cannot be removed (only a resolved aggregate can have its TempResolvedColumn removed).
- When a later rule in the batch is applied, the TempResolvedColumn is reverted and the expression becomes mean(`t.c`), so mean loses the information about t.c.
- In the last step, the analyzer can therefore only report that t.c was not found.
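The chain above can be sketched with a toy expression tree. This is only an illustration under stated assumptions: `TempColumnDemo`, `Agg`, `TempResolved`, `ResolvedAttr`, `UnresolvedAttr`, and the two helper functions are hypothetical stand-ins, not Spark's analyzer classes or rules.

```scala
object TempColumnDemo {
  // Hypothetical stand-ins for analyzer expressions (not Spark's real classes).
  sealed trait Expr
  final case class UnresolvedAttr(name: String) extends Expr
  final case class TempResolved(name: String) extends Expr // plays the role of TempResolvedColumn
  final case class ResolvedAttr(name: String) extends Expr
  final case class Agg(fn: String, child: Expr, resolved: Boolean) extends Expr

  // Assumed behaviour of the HAVING handling: the wrapper is removed
  // only when the enclosing aggregate is already resolved.
  def unwrapResolvedAggs(e: Expr): Expr = e match {
    case Agg(fn, TempResolved(n), true) => Agg(fn, ResolvedAttr(n), true)
    case other                          => other
  }

  // Assumed behaviour of the later cleanup: any wrapper that is still
  // present is reverted to a plain, unresolved name.
  def revertLeftovers(e: Expr): Expr = e match {
    case Agg(fn, TempResolved(n), r) => Agg(fn, UnresolvedAttr(n), r)
    case other                       => other
  }

  def analyze(e: Expr): Expr = revertLeftovers(unwrapResolvedAggs(e))

  def main(args: Array[String]): Unit = {
    // mean(boolean): the aggregate is unresolved, so the column binding is lost.
    println(analyze(Agg("mean", TempResolved("t.c"), resolved = false)))
    // A resolved aggregate keeps its bound column.
    println(analyze(Agg("mean", TempResolved("t.c"), resolved = true)))
  }
}
```

In this model, the unresolved aggregate ends up holding only a bare name, which is why the final check can complain about nothing more specific than a missing column.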
mean(boolean) in HAVING is not marked as resolved by the ResolveFunctions rule because:
- It uses the default Expression code for populating the `resolved` field:
lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isSuccess
- During analysis, mean(boolean) is mean(TempResolvedColumn(boolean)), so childrenResolved is true.
- However, checkInputDataTypes() fails (Average.scala#L55).
- So Average's `resolved` ends up false, which in turn leads to the wrong error message.
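The interaction between `childrenResolved` and `checkInputDataTypes()` can be reproduced in a small self-contained model. Only the `resolved` definition is taken from the line quoted above; `ResolvedDemo`, `Expr`, `TempCol`, `Mean`, and the simplified type hierarchy are hypothetical names made up for this sketch and do not match Spark's real signatures.

```scala
object ResolvedDemo {
  // Simplified, hypothetical type system for the sketch.
  sealed trait DataType
  case object BooleanType extends DataType
  case object DoubleType extends DataType

  sealed trait TypeCheckResult { def isSuccess: Boolean }
  case object TypeCheckSuccess extends TypeCheckResult { val isSuccess = true }
  final case class TypeCheckFailure(message: String) extends TypeCheckResult {
    val isSuccess = false
  }

  trait Expr {
    def children: Seq[Expr]
    def dataType: DataType
    def checkInputDataTypes(): TypeCheckResult = TypeCheckSuccess
    def childrenResolved: Boolean = children.forall(_.resolved)
    // Same shape as the default definition quoted in the description.
    lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isSuccess
  }

  // Stand-in for TempResolvedColumn: a leaf that already counts as resolved.
  final case class TempCol(name: String, dataType: DataType) extends Expr {
    val children: Seq[Expr] = Nil
  }

  // Stand-in for Average: rejects non-numeric input in its type check.
  final case class Mean(child: Expr) extends Expr {
    val children: Seq[Expr] = Seq(child)
    val dataType: DataType = DoubleType
    override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
      case DoubleType => TypeCheckSuccess
      case other      => TypeCheckFailure(s"numeric type required, not $other")
    }
  }

  def main(args: Array[String]): Unit = {
    val agg = Mean(TempCol("t.c", BooleanType))
    assert(agg.childrenResolved)                 // the wrapped column is resolved
    assert(!agg.checkInputDataTypes().isSuccess) // but the type check fails
    assert(!agg.resolved)                        // so the aggregate stays unresolved
  }
}
```

The point of the sketch is that a single boolean flag cannot distinguish "a child is unresolved" from "the children are fine but the types do not match", so a rule that only looks at `resolved` treats a type mismatch like a missing column.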